# Skinbase Vision Stack: Usage Guide

This document explains how to run and use the Skinbase Vision Stack (Gateway + CLIP, BLIP, YOLO, Qdrant services).

## Overview

- Services: `gateway`, `clip`, `blip`, `yolo`, `qdrant`, `qdrant-svc`. Each is a FastAPI app except `qdrant`, which is the official Qdrant database.
- Gateway is the public API endpoint; the other services are internal.

## Model overview

- **CLIP** (Contrastive Language–Image Pre-training): maps images and text into a shared embedding space. Used for zero-shot image tagging and similarity search; returns ranked tags with confidence scores.

- **BLIP** (Bootstrapping Language-Image Pre-training): a vision–language model for image captioning and multimodal generation. BLIP produces human-readable captions (multiple `variants` supported), and caption length can be capped with `max_length`.

- **YOLO** (You Only Look Once): a family of real-time object-detection models. YOLO returns detected objects with `class`, `confidence`, and `bbox` (bounding box coordinates); use `conf` to filter out low-confidence detections.

- **Qdrant**: a high-performance vector similarity search engine. It stores CLIP image embeddings and enables reverse image search (finding similar images). The `qdrant-svc` wrapper auto-embeds images via CLIP before upserting.
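
To make the shared embedding space concrete, the sketch below ranks candidate tags by cosine similarity between an image embedding and tag text embeddings, which is the general idea behind zero-shot tagging. The three-dimensional vectors and the `rank_tags` helper are invented for illustration; real CLIP embeddings have hundreds of dimensions, and the scoring code inside the `clip` service may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_a * norm_b)


def rank_tags(image_vec, tag_vecs, limit=5):
    """Return up to `limit` (tag, score) pairs, best match first."""
    scored = [(tag, cosine_similarity(image_vec, vec)) for tag, vec in tag_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy embeddings (hypothetical values, not real CLIP output).
image = [0.9, 0.1, 0.0]
tags = {
    "landscape": [1.0, 0.0, 0.0],
    "portrait": [0.0, 1.0, 0.0],
    "abstract": [0.0, 0.0, 1.0],
}
ranked = rank_tags(image, tags, limit=2)
print(ranked)
```

The same similarity measure underlies the reverse image search described for Qdrant: stored image vectors are compared against the query vector instead of against tag vectors.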

## Prerequisites

- Docker Desktop (with `docker compose`) or any other Docker environment that provides `docker compose`.
- Recommended: at least 8 GB of RAM for CPU-only operation; more for larger models or GPU use.

## Start the stack

Run from repository root:

```bash
docker compose up -d --build
```

Stop:

```bash
docker compose down
```

View logs:

```bash
docker compose logs -f
docker compose logs -f gateway
```

## Health

Check the gateway health endpoint:

```bash
curl https://vision.klevze.net/health
```
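
Because model weights load at service startup, the gateway may not answer immediately after `docker compose up`. The following is a minimal sketch of a readiness poll in Python. The `wait_for_health` and `http_status` helpers, the attempt count, and the delay are illustrative choices, and treating HTTP 200 as "healthy" is an assumption, since this guide does not describe the response of `/health`.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def wait_for_health(url, attempts=30, delay=2.0, probe=http_status, sleep=time.sleep):
    """Poll `url` until `probe` reports 200; return True on success."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to wait after the final attempt
    return False


# Example (performs real network requests):
# wait_for_health("https://vision.klevze.net/health")
```

Passing `probe` and `sleep` as parameters keeps the loop easy to test without touching the network.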

## Universal analyze (ALL)

Analyze an image by URL (gateway aggregates CLIP, BLIP, YOLO):

```bash
curl -X POST https://vision.klevze.net/analyze/all \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

File upload (multipart):

```bash
curl -X POST https://vision.klevze.net/analyze/all/file \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

Parameters:
- `limit` (optional): integer; maximum number of tag/caption items to return.
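
For callers that prefer Python over curl, the URL request above might be built as follows using only the standard library. `BASE_URL`, `build_analyze_request`, and `send` are names chosen for this sketch; the response is decoded as JSON, but its exact schema is not documented here, so no fields are assumed.

```python
import json
import urllib.request

BASE_URL = "https://vision.klevze.net"  # gateway host used in the curl examples


def build_analyze_request(image_url, limit=None, endpoint="/analyze/all"):
    """Build (but do not send) a JSON POST request for the gateway."""
    payload = {"url": image_url}
    if limit is not None:
        payload["limit"] = int(limit)  # `limit` is optional
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_analyze_request("https://files.skinbase.org/img/aa/bb/cc/md.webp", limit=5)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# result = send(req)  # uncomment to call the live gateway
```

The `endpoint` argument lets the same helper target the per-service URL endpoints such as `/analyze/clip`.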

## Individual services (via gateway)

These endpoints call the specific service through the gateway.

### CLIP: tags

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/clip \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/clip/file \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

Return: JSON list of tags with confidence scores.

### BLIP: captioning

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/blip \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","variants":3}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/blip/file \
  -F "file=@/path/to/image.webp" \
  -F "variants=3" \
  -F "max_length=60"
```

Parameters:
- `variants`: number of caption variants to return.
- `max_length`: optional maximum caption length.

Return: one or more caption strings (optionally with scores).

### YOLO: object detection

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/yolo \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","conf":0.25}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/yolo/file \
  -F "file=@/path/to/image.webp" \
  -F "conf=0.25"
```

Parameters:
- `conf`: confidence threshold (0.0–1.0); detections below it are filtered out.

Return: detected objects with `class`, `confidence`, and `bbox` (bounding box coordinates).
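
Detections can also be filtered further on the client, for example to keep only certain classes. The sketch below assumes the response can be treated as a list of objects carrying the `class`, `confidence`, and `bbox` fields named above; the list shape, the `[x1, y1, x2, y2]` bbox layout, and the sample values are assumptions made for illustration.

```python
def filter_detections(detections, min_conf=0.25, classes=None):
    """Keep detections at or above `min_conf`, optionally restricted to `classes`."""
    kept = [
        d for d in detections
        if d["confidence"] >= min_conf and (classes is None or d["class"] in classes)
    ]
    # Most confident detections first.
    return sorted(kept, key=lambda d: d["confidence"], reverse=True)


# Hypothetical sample data; the bbox layout [x1, y1, x2, y2] is an assumption.
sample = [
    {"class": "person", "confidence": 0.91, "bbox": [10, 20, 110, 220]},
    {"class": "dog", "confidence": 0.18, "bbox": [200, 40, 260, 120]},
    {"class": "car", "confidence": 0.47, "bbox": [300, 60, 480, 200]},
]

for det in filter_detections(sample, min_conf=0.25):
    print(det["class"], det["confidence"])
```

Passing `conf` in the request is still preferable when only a threshold is needed, since it avoids transferring detections that will be discarded.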

### Qdrant: vector storage & similarity search

The Qdrant integration lets you store image embeddings and find visually similar images. Embeddings are generated automatically by the CLIP service.

#### Upsert (store) an image by URL

```bash
curl -X POST https://vision.klevze.net/vectors/upsert \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","id":"img-001","metadata":{"category":"wallpaper","source":"upload"}}'
```

Parameters:
- `url` (required): image URL to embed and store.
- `id` (optional): custom string ID for the point; auto-generated if omitted.
- `metadata` (optional): arbitrary key-value payload stored alongside the vector.
- `collection` (optional): target collection name (defaults to `images`).

#### Upsert by file upload

```bash
curl -X POST https://vision.klevze.net/vectors/upsert/file \
  -F "file=@/path/to/image.webp" \
  -F 'id=img-002' \
  -F 'metadata_json={"category":"photo"}'
```

#### Upsert a pre-computed vector

```bash
curl -X POST https://vision.klevze.net/vectors/upsert/vector \
  -H "Content-Type: application/json" \
  -d '{"vector":[0.1,0.2,...],"id":"img-003","metadata":{"custom":"data"}}'
```
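
The `...` in the request above stands for the remaining vector components, so the command cannot be pasted as is; a real request needs the full vector. The sketch below builds the JSON body in Python and performs basic checks first. `build_vector_upsert` is a hypothetical helper, and the 512-component length is an assumption taken from the `vector_dim` value in the collection-creation example in this guide; adjust it to the dimension of your collection.

```python
import json
import math

EXPECTED_DIM = 512  # assumption: matches the collection's vector dimension


def build_vector_upsert(vector, point_id=None, metadata=None, dim=EXPECTED_DIM):
    """Return a JSON string for /vectors/upsert/vector after basic validation."""
    values = [float(x) for x in vector]
    if len(values) != dim:
        raise ValueError(f"expected {dim} components, got {len(values)}")
    if not all(math.isfinite(x) for x in values):
        raise ValueError("vector contains NaN or infinity")
    payload = {"vector": values}
    if point_id is not None:
        payload["id"] = point_id
    if metadata:
        payload["metadata"] = metadata
    return json.dumps(payload)


# Dummy vector for illustration; a real one would come from an embedding model.
body = build_vector_upsert([0.0] * 512, point_id="img-003", metadata={"custom": "data"})
print(len(json.loads(body)["vector"]))
```

Validating on the client gives a clearer error than a rejected request when a vector of the wrong length slips through.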

#### Search similar images by URL

```bash
curl -X POST https://vision.klevze.net/vectors/search \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

Parameters:
- `url` (required): query image URL.
- `limit` (optional, default 5): maximum number of results.
- `score_threshold` (optional): minimum cosine similarity (0.0–1.0).
- `filter_metadata` (optional): filter results by metadata, e.g. `{"category":"wallpaper"}`.
- `collection` (optional): collection to search.

Return: a list of objects with `id`, `score`, and `metadata`, sorted by similarity.
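
To illustrate what `score_threshold`, `filter_metadata`, and `limit` mean, the sketch below applies the same kind of filtering on the client to a result list in the documented shape. This is not the server's implementation: exact-match semantics for the metadata filter is an assumption, and the sample results are invented.

```python
def apply_search_options(results, score_threshold=None, filter_metadata=None, limit=5):
    """Mimic the search parameters on a list of {"id", "score", "metadata"} dicts."""
    wanted = filter_metadata or {}
    kept = []
    for item in results:
        if score_threshold is not None and item["score"] < score_threshold:
            continue
        meta = item.get("metadata") or {}
        # Assumed semantics: every requested key must match exactly.
        if any(meta.get(key) != value for key, value in wanted.items()):
            continue
        kept.append(item)
    kept.sort(key=lambda item: item["score"], reverse=True)
    return kept[:limit]


# Hypothetical results for illustration.
results = [
    {"id": "img-001", "score": 0.93, "metadata": {"category": "wallpaper"}},
    {"id": "img-002", "score": 0.81, "metadata": {"category": "photo"}},
    {"id": "img-007", "score": 0.42, "metadata": {"category": "wallpaper"}},
]
top = apply_search_options(results, score_threshold=0.5, filter_metadata={"category": "wallpaper"})
print([item["id"] for item in top])
```

In practice these options belong in the request, so that filtering happens inside Qdrant and only matching points are returned.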

#### Search by file upload

```bash
curl -X POST https://vision.klevze.net/vectors/search/file \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

#### Search by pre-computed vector

```bash
curl -X POST https://vision.klevze.net/vectors/search/vector \
  -H "Content-Type: application/json" \
  -d '{"vector":[0.1,0.2,...],"limit":5}'
```

#### Collection management

List all collections:

```bash
curl https://vision.klevze.net/vectors/collections
```

Get collection info:

```bash
curl https://vision.klevze.net/vectors/collections/images
```

Create a custom collection:

```bash
curl -X POST https://vision.klevze.net/vectors/collections \
  -H "Content-Type: application/json" \
  -d '{"name":"my_collection","vector_dim":512,"distance":"cosine"}'
```

Delete a collection:

```bash
curl -X DELETE https://vision.klevze.net/vectors/collections/my_collection
```

#### Delete points

```bash
curl -X POST https://vision.klevze.net/vectors/delete \
  -H "Content-Type: application/json" \
  -d '{"ids":["img-001","img-002"]}'
```

#### Get a point by ID

```bash
curl https://vision.klevze.net/vectors/points/img-001
```

## Request/Response notes

- For URL requests use `Content-Type: application/json`.
- For uploads use `multipart/form-data` with a `file` field.
- The gateway aggregates and normalizes outputs for `/analyze/all`.

## Running a single service

To run only one service via docker compose:

```bash
docker compose up -d --build clip
```

Or run locally (Python env) from the service folder:

```bash
# inside clip/ or blip/ or yolo/
uvicorn main:app --host 0.0.0.0 --port 8000
```

## Production tips

- Add authentication (API keys or OAuth) at the gateway.
- Add rate-limiting and per-client quotas.
- Keep model services on an internal Docker network.
- For GPU use, enable the NVIDIA container runtime and update the service Dockerfiles and Compose profiles.
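
Rate limiting with per-client quotas is commonly implemented as a token bucket. The sketch below shows the idea in plain Python; it is not part of the stack. `TokenBucket`, `RateLimiter`, and the default capacity and refill rate are hypothetical, the wiring into the gateway (for example as a FastAPI dependency keyed by API key) is left out, and the in-memory state only works for a single gateway process.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens/second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """One bucket per client key (for example, an API key)."""

    def __init__(self, capacity=10, rate=1.0, clock=time.monotonic):
        self.capacity, self.rate, self.clock = capacity, rate, clock
        self.buckets = {}

    def allow(self, client_key):
        bucket = self.buckets.get(client_key)
        if bucket is None:
            bucket = self.buckets[client_key] = TokenBucket(self.capacity, self.rate, self.clock)
        return bucket.allow()
```

A gateway handler would call `limiter.allow(api_key)` before forwarding a request and answer with HTTP 429 when it returns `False`. With several gateway replicas, the counters would need to live in a shared store instead.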

## Troubleshooting

- Service fails to start: check `docker compose logs <service>` for model load errors.
- High memory use or OOM: increase host memory or reduce the model footprint; consider running on GPUs.
- Slow startup: model weights load when a service starts, so expect extra time.

## Extending

- Swap or update models in each service by editing that service's `main.py`.
- Add request validation, timeouts, and retries in the gateway to improve robustness.
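
As one possible shape for the retry part, the sketch below wraps a call in retries with exponential backoff. `call_with_retries` and its parameters are hypothetical, and the gateway's actual HTTP client is not shown in this guide, so the wrapped call is left as a generic function.

```python
import time


def call_with_retries(func, attempts=3, base_delay=0.5, retry_on=(OSError,), sleep=time.sleep):
    """Call `func()`; on a retryable error wait base_delay * 2**n and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


# Example (hypothetical internal URL; performs a real request if uncommented):
# import urllib.request
# data = call_with_retries(
#     lambda: urllib.request.urlopen("http://clip:8000/health", timeout=10).read()
# )
```

`OSError` covers connection errors and timeouts, including `urllib.error.URLError`. Retries suit idempotent calls such as analysis requests; each attempt should carry its own timeout, as in the commented example, so a hung service cannot stall the gateway.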

## Files of interest

- `docker-compose.yml`: composition and service definitions.
- `gateway/`: gateway FastAPI server.
- `clip/`, `blip/`, `yolo/`: service implementations and Dockerfiles.
- `qdrant/`: Qdrant API wrapper service (FastAPI).
- `common/`: shared helpers (e.g., image I/O).