# Skinbase Vision Stack — Usage Guide

This document explains how to run and use the Skinbase Vision Stack (Gateway + CLIP, BLIP, YOLO, Qdrant, Card Renderer, Maturity, and optional LLM services).

## Overview

- Services: `gateway`, `clip`, `blip`, `yolo`, `qdrant`, `qdrant-svc`, `card-renderer`, `maturity`, `llm`. Each is a FastAPI app except `qdrant`; `llm` is a thin FastAPI shim that manages an internal `llama-server` process.
- Gateway is the public API endpoint; the other services are internal.

## Model overview

- **CLIP**: Contrastive Language–Image Pretraining — maps images and text into a shared embedding space. Used for zero-shot image tagging, similarity search, and returning ranked tags with confidence scores.

- **BLIP**: Bootstrapping Language-Image Pre-training — a vision–language model for image captioning and multimodal generation. BLIP produces human-readable captions (multiple `variants` supported) and can be tuned with `max_length`.

- **YOLO**: You Only Look Once — a family of real-time object-detection models. YOLO returns detected objects with `class`, `confidence`, and `bbox` (bounding box coordinates); use `conf` to filter low-confidence detections.

- **Qdrant**: High-performance vector similarity search engine. Stores CLIP image embeddings and enables reverse image search (find similar images). The `qdrant-svc` wrapper auto-embeds images via CLIP before upserting.

- **Card Renderer**: Generates branded social-card images (e.g. Open Graph previews) from artwork images. Applies smart center-weighted cropping, gradient overlays, title/username/tag text, and an optional logo. Returns binary image bytes (WebP by default). Template: `nova-artwork-v1`.

- **Maturity**: Dedicated NSFW/maturity classifier. Accepts an image and returns a normalized safety signal including `maturity_label` (`safe`/`mature`), `confidence`, raw `score`, optional sublabels (e.g. `nsfw`), and an `action_hint` (`safe`, `review`, `flag_high`) designed for Nova moderation workflows. Powered by `Falconsai/nsfw_image_detection` (ViT-based, HuggingFace). Thresholds are configurable via environment variables.

- **LLM**: Internal text-generation service backed by `llama.cpp` and a GGUF Qwen3 model. Exposed through the gateway for non-streaming chat completions and model discovery. Intended for Nova workflows such as creator bios, metadata suggestions, moderation helper text, and other short internal generation tasks.

## Prerequisites

- Docker Desktop (with `docker compose`) or an equivalent Docker environment.
- Recommended: at least 8 GB RAM for CPU-only use; more for larger models or GPU use.

## Start the stack

Before starting the stack, create a `.env` file for runtime secrets and environment overrides.

Minimum example:

```bash
API_KEY=your_api_key_here
HUGGINGFACE_TOKEN=your_huggingface_token_here
```

Notes:
- `API_KEY` protects gateway endpoints.
- `HUGGINGFACE_TOKEN` is required if the configured BLIP model requires Hugging Face authentication.
- Startup uses container healthchecks, so initial boot can take longer while models download and warm up.

Optional maturity configuration (can be added to `.env` to override defaults):

```bash
MATURITY_MODEL=Falconsai/nsfw_image_detection
MATURITY_THRESHOLD_MATURE=0.80
MATURITY_THRESHOLD_REVIEW=0.60
MATURITY_ENABLED=true
```

- `MATURITY_THRESHOLD_MATURE`: score above this → `mature` + `flag_high` (default `0.80`).
- `MATURITY_THRESHOLD_REVIEW`: score above this but below mature threshold → `mature` + `review` (default `0.60`).
- `MATURITY_ENABLED`: set to `false` to disable maturity endpoints at the gateway without removing the service.

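As an illustration of how these variables and their defaults fit together, here is a minimal Python sketch that reads them from a mapping such as `os.environ`. The function name `load_maturity_config` and the sanity check on threshold ordering are assumptions made for this example; the actual service may parse its environment differently.

```python
def load_maturity_config(env):
    """Read maturity settings from a mapping (e.g. os.environ), applying the documented defaults.

    Hypothetical helper for illustration; not taken from the service code.
    """
    mature = float(env.get("MATURITY_THRESHOLD_MATURE", "0.80"))
    review = float(env.get("MATURITY_THRESHOLD_REVIEW", "0.60"))
    # Only the literal string "false" (any case) disables the endpoints in this sketch.
    enabled = env.get("MATURITY_ENABLED", "true").strip().lower() != "false"
    # Assumed sanity check: the review band must sit at or below the mature band.
    if not 0.0 <= review <= mature <= 1.0:
        raise ValueError("expected 0 <= review threshold <= mature threshold <= 1")
    return {
        "model": env.get("MATURITY_MODEL", "Falconsai/nsfw_image_detection"),
        "mature": mature,
        "review": review,
        "enabled": enabled,
    }
```

Calling `load_maturity_config(os.environ)` at startup would surface a misordered pair of thresholds early instead of at request time.
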
Optional LLM configuration:

```bash
LLM_URL=http://llm:8080
LLM_ENABLED=false
LLM_TIMEOUT=120
LLM_DEFAULT_MODEL=qwen3-1.7b-instruct-q4_k_m
LLM_MAX_TOKENS_DEFAULT=256
LLM_MAX_TOKENS_HARD_LIMIT=1024
LLM_MAX_REQUEST_BYTES=65536

# Local llm profile only
MODEL_PATH=/models/Qwen3-1.7B-Instruct-Q4_K_M.gguf
LLM_CONTEXT_SIZE=4096
LLM_THREADS=4
LLM_GPU_LAYERS=0
LLM_EXTRA_ARGS=
```

Run from the repository root:

```bash
docker compose up -d --build
```

This starts only the default vision stack.

To also start the local LLM service:

```bash
docker compose --profile llm up -d --build
```

Before enabling the `llm` profile, provision the GGUF model described in [models/qwen3/README.md](models/qwen3/README.md) and set `LLM_ENABLED=true` in `.env`.

For small production hosts, the preferred setup is usually to keep the gateway local and point `LLM_URL` at a separate private LLM host:

```bash
LLM_ENABLED=true
LLM_URL=http://private-llm-host:8080
```

Stop:

```bash
docker compose down
```

View logs:

```bash
docker compose logs -f
docker compose logs -f gateway
```

## Health

Check the gateway health endpoint:

```bash
curl https://vision.klevze.net/health
```

Check LLM-specific gateway health:

```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/ai/health
```

## LLM smoke test checklist

Use this sequence on a machine with Docker available after you have mounted the GGUF model and enabled the gateway with `LLM_ENABLED=true`.

1. Start the gateway with the `llm` profile.

```bash
docker compose --profile llm up -d --build gateway llm
```

2. Confirm the LLM service came up cleanly.

```bash
docker compose ps llm
docker compose logs --tail=100 llm
```

3. Check the repo-owned internal health endpoint.

```bash
curl http://127.0.0.1:8080/health
```

Expected fields: `status`, `model`, `context_size`, `threads`.

4. Confirm the gateway sees the LLM backend.

```bash
curl -H "X-API-Key: <your-api-key>" http://127.0.0.1:8003/health
curl -H "X-API-Key: <your-api-key>" http://127.0.0.1:8003/ai/health
```

5. Verify model discovery.

```bash
curl -H "X-API-Key: <your-api-key>" http://127.0.0.1:8003/v1/models
curl -H "X-API-Key: <your-api-key>" http://127.0.0.1:8003/ai/models
```

6. Run a small chat request through the gateway.

```bash
curl -X POST http://127.0.0.1:8003/v1/chat/completions \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a concise assistant for Skinbase Nova."},
      {"role": "user", "content": "Write one short admin help sentence about reviewing wallpaper metadata."}
    ],
    "max_tokens": 60,
    "stream": false
  }'
```

7. If startup or health fails, inspect the relevant logs.

```bash
docker compose logs --tail=200 llm
docker compose logs --tail=200 gateway
```

## Universal analyze (ALL)

Analyze an image by URL (gateway aggregates CLIP, BLIP, YOLO):

```bash
curl -X POST https://vision.klevze.net/analyze/all \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

File upload (multipart):

```bash
curl -X POST https://vision.klevze.net/analyze/all/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

Parameters:
- `limit`: optional integer to limit returned tag/caption items.

## Individual services (via gateway)

These endpoints call the specific service through the gateway.

### CLIP — tags

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/clip \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/clip/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F "limit=5"
```

Return: JSON list of tags with confidence scores.

### BLIP — captioning

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/blip \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","variants":3}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/blip/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F "variants=3" \
  -F "max_length=60"
```

Parameters:
- `variants`: number of caption variants to return.
- `max_length`: optional maximum caption length.

Return: one or more caption strings (optionally with scores).

### YOLO — object detection

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/yolo \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","conf":0.25}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/yolo/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F "conf=0.25"
```

Parameters:
- `conf`: confidence threshold (0.0–1.0).

Return: detected objects with `class`, `confidence`, and `bbox` (bounding box coordinates).

### Maturity — NSFW / maturity analysis

Analyzes an image for mature or NSFW content and returns a structured signal intended for Nova moderation workflows.

URL request:

```bash
curl -X POST https://vision.klevze.net/analyze/maturity \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp"}'
```

File upload:

```bash
curl -X POST https://vision.klevze.net/analyze/maturity/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp"
```

Example response:

```json
{
  "maturity_label": "mature",
  "confidence": 0.94,
  "score": 0.94,
  "labels": ["nsfw"],
  "model": "Falconsai/nsfw_image_detection",
  "threshold_used": 0.80,
  "analysis_time_ms": 183.0,
  "source": "maturity-service",
  "action_hint": "flag_high",
  "advisory": "High-confidence mature content detected"
}
```

Response fields:

| Field | Type | Description |
|---|---|---|
| `maturity_label` | string | `safe` or `mature` |
| `confidence` | float | Confidence in the label decision (0–1). For `safe`, this is `1 - score`. |
| `score` | float | Raw NSFW probability from the model (0–1). |
| `labels` | array | Sublabels when mature: currently `["nsfw"]`. Empty for safe results. |
| `model` | string | Model identifier / HuggingFace model ID. |
| `threshold_used` | float | The threshold value that determined the label. |
| `analysis_time_ms` | float | Inference time in milliseconds. |
| `source` | string | Always `maturity-service`. |
| `action_hint` | string | `safe`, `review`, or `flag_high`. Use this in Nova to drive blur/queue/flag decisions. |
| `advisory` | string | Short human-readable explanation. |

`action_hint` decision logic:
- `flag_high`: score ≥ `MATURITY_THRESHOLD_MATURE` (default 0.80) — high-confidence mature, flag for moderation.
- `review`: score ≥ `MATURITY_THRESHOLD_REVIEW` (default 0.60) but below mature threshold — possible mature, queue for human review.
- `safe`: score below both thresholds — content appears safe.

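The decision logic above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the service's code: the function name `classify_maturity` is hypothetical, the comparisons are inclusive as in the list above, and `confidence` for a `mature` result is assumed to equal the raw score (consistent with the example response, where both are `0.94`).

```python
def classify_maturity(score, mature_threshold=0.80, review_threshold=0.60):
    """Map a raw NSFW score (0-1) to label, confidence, and action hint.

    Hypothetical helper; thresholds default to the documented values.
    """
    if score >= mature_threshold:
        # High-confidence mature content.
        return {"maturity_label": "mature", "confidence": score, "action_hint": "flag_high"}
    if score >= review_threshold:
        # Possibly mature: still labelled mature, but routed to human review.
        return {"maturity_label": "mature", "confidence": score, "action_hint": "review"}
    # Below both thresholds: confidence in the safe label is 1 - score.
    return {"maturity_label": "safe", "confidence": 1 - score, "action_hint": "safe"}
```

For example, a score of `0.70` falls in the review band with the default thresholds, so the sketch returns `mature` with `action_hint` set to `review`.
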
If the maturity service is unavailable the gateway returns a `502` or `503` error. **Nova must not treat a gateway failure as a `safe` result** — retry or queue for later processing.

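On the client side, that rule might look like the following sketch. The function name `maturity_decision` and the return values `retry_later` and `error` are hypothetical and only illustrate the idea that a failed call never collapses into `safe`.

```python
def maturity_decision(status_code, body=None):
    """Turn a gateway response into a moderation decision.

    Hypothetical client-side helper: only a successful response with an
    action_hint is trusted; everything else is retried or surfaced as an error.
    """
    if status_code == 200 and isinstance(body, dict) and "action_hint" in body:
        return body["action_hint"]  # "safe", "review", or "flag_high"
    if status_code in (502, 503):
        return "retry_later"  # service unavailable: queue for later processing
    return "error"  # anything else needs investigation, never "safe"
```

A caller would typically leave the image in its pending state for `retry_later` and `error` rather than publishing it.
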
## LLM / Chat endpoints

The gateway validates requests, clamps `max_tokens` to configured limits, rejects oversized payloads, and normalizes downstream failures into JSON under an `error` key.

### OpenAI-style chat completions

```bash
curl -X POST https://vision.klevze.net/v1/chat/completions \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a concise assistant for Skinbase Nova."},
      {"role": "user", "content": "Write a short biography for a creator known for sci-fi environments."}
    ],
    "temperature": 0.7,
    "max_tokens": 220,
    "stream": false
  }'
```

Supported request fields:
- `messages` (required)
- `temperature`
- `max_tokens`
- `stream` (`false` only in v1)
- `top_p`
- `stop`
- `presence_penalty`
- `frequency_penalty`

Validation rules:
- At least one message is required.
- Roles must be `system`, `user`, or `assistant`.
- Empty message content is rejected.
- Oversized request bodies return `413`.
- `max_tokens` is clamped to `LLM_MAX_TOKENS_HARD_LIMIT`.

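To make the validation rules concrete, here is a rough Python sketch of the message checks and the `max_tokens` clamp. It is not the gateway's implementation: the function name `validate_chat_request` is hypothetical, errors are raised as plain `ValueError` rather than mapped to HTTP statuses, and the request-size check is left out.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_chat_request(payload, default_max_tokens=256, hard_limit=1024):
    """Validate a chat payload and return a copy with max_tokens clamped.

    Hypothetical helper; the defaults mirror LLM_MAX_TOKENS_DEFAULT and
    LLM_MAX_TOKENS_HARD_LIMIT from the configuration example.
    """
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("at least one message is required")
    for message in messages:
        if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
            raise ValueError("role must be system, user, or assistant")
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            raise ValueError("message content must not be empty")
    if payload.get("stream", False):
        raise ValueError("streaming is not supported")
    cleaned = dict(payload)
    requested = int(payload.get("max_tokens", default_max_tokens))
    # Clamp into [1, hard_limit] instead of rejecting large values.
    cleaned["max_tokens"] = max(1, min(requested, hard_limit))
    cleaned["stream"] = False
    return cleaned
```

With the defaults, a request asking for `max_tokens: 5000` would be forwarded with `max_tokens: 1024`, and a request without the field would use `256`.
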
### Project-friendly chat response

```bash
curl -X POST https://vision.klevze.net/ai/chat \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful metadata assistant."},
      {"role": "user", "content": "Suggest five tags for a fantasy castle wallpaper."}
    ]
  }'
```

Example response:

```json
{
  "model": "qwen3-1.7b-instruct-q4_k_m",
  "content": "fantasy castle, moonlit fortress, medieval towers, epic landscape, digital painting",
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 48,
    "completion_tokens": 19,
    "total_tokens": 67
  }
}
```

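The flattened shape above relates to the OpenAI-style completion format in a simple way. The sketch below shows one plausible mapping, assuming the upstream response follows the usual `choices[0].message.content` layout; `to_project_response` is a hypothetical name, and the gateway's actual conversion may differ.

```python
def to_project_response(completion):
    """Flatten an OpenAI-style chat completion into the project-friendly shape.

    Hypothetical helper; assumes at least one choice is present.
    """
    choice = completion["choices"][0]
    return {
        "model": completion.get("model"),
        "content": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "usage": completion.get("usage", {}),
    }
```

Only the first choice is used, which matches a non-streaming, single-completion request.
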
### Model discovery

```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/v1/models
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/ai/models
```

### Failure modes

- `401`: missing or invalid API key
- `413`: request body exceeds `LLM_MAX_REQUEST_BYTES`
- `422`: validation failure or unsupported streaming request
- `503`: LLM disabled or upstream unavailable
- `504`: upstream timeout

## Vector DB (Qdrant)

Use the Qdrant gateway endpoints to store image embeddings and find visually similar images. Embeddings are generated automatically by the CLIP service.

Qdrant point IDs must be either an unsigned integer or a UUID string. If you send another string value, the wrapper may replace it with a generated UUID and store the original value in metadata as `_original_id`.

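The ID rule can be illustrated with a short Python sketch. This is not the wrapper's code: `normalize_point_id` is a hypothetical name, and the choice of a random `uuid4` for replaced IDs is an assumption made for the example.

```python
import uuid


def normalize_point_id(value=None):
    """Return (point_id, original_id) following the unsigned-integer-or-UUID rule.

    Hypothetical helper: original_id is None unless the input had to be replaced.
    """
    if value is None:
        return str(uuid.uuid4()), None  # no ID supplied: generate one
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value, None  # unsigned integers are valid as-is
    text = str(value)
    try:
        return str(uuid.UUID(text)), None  # valid UUID string, canonical form
    except ValueError:
        # Any other string: generate a UUID and keep the original so it can
        # be stored in metadata under `_original_id`.
        return str(uuid.uuid4()), text
```

For an application ID such as `img-001`, the sketch returns a fresh UUID together with `"img-001"`, which the caller would place in the point's metadata.
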
#### Upsert (store) an image by URL

```bash
curl -X POST https://vision.klevze.net/vectors/upsert \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","id":"550e8400-e29b-41d4-a716-446655440000","metadata":{"category":"wallpaper","source":"upload"}}'
```

Parameters:
- `url` (required): image URL to embed and store.
- `id` (optional): point ID. Use an unsigned integer or UUID string. If omitted, a UUID is auto-generated.
- `metadata` (optional): arbitrary key-value payload stored alongside the vector.
- `collection` (optional): target collection name (defaults to `images`).

#### Upsert by file upload

```bash
curl -X POST https://vision.klevze.net/vectors/upsert/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F 'id=550e8400-e29b-41d4-a716-446655440001' \
  -F 'metadata_json={"category":"photo"}'
```

#### Upsert a pre-computed vector

```bash
curl -X POST https://vision.klevze.net/vectors/upsert/vector \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"vector":[0.1,0.2,...],"id":"550e8400-e29b-41d4-a716-446655440002","metadata":{"custom":"data"}}'
```

#### Search similar images by URL

```bash
curl -X POST https://vision.klevze.net/vectors/search \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
```

Parameters:
- `url` (required): query image URL.
- `limit` (optional, default 5): number of results.
- `score_threshold` (optional): minimum cosine similarity (0.0–1.0).
- `filter_metadata` (optional): filter results by payload fields, e.g. `{"is_public":true,"category_id":3}`.
- `collection` (optional): collection to search.
- `hnsw_ef` (optional, int): override the HNSW ef parameter at query time. Higher = better recall, slightly more latency.
- `exact` (optional, bool, default false): brute-force exact search. Avoid on large collections.
- `indexed_only` (optional, bool, default false): restrict search to fully indexed segments only. Useful during bulk ingest.

Return: list of `{"id", "score", "metadata"}` sorted by similarity.

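When calling the search endpoint from application code, it can help to assemble the JSON body in one place. The following is a small sketch, assuming that unset optional fields may simply be omitted from the request; `build_search_payload` is a hypothetical helper, not part of the stack.

```python
def build_search_payload(url, limit=5, score_threshold=None, filter_metadata=None,
                         collection=None, hnsw_ef=None, exact=False, indexed_only=False):
    """Build the JSON body for a URL-based similarity search, omitting unset options."""
    if score_threshold is not None and not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")
    payload = {"url": url, "limit": limit}
    optional = {
        "score_threshold": score_threshold,
        "filter_metadata": filter_metadata,
        "collection": collection,
        "hnsw_ef": hnsw_ef,
    }
    # Only send options the caller actually set.
    payload.update({key: value for key, value in optional.items() if value is not None})
    if exact:
        payload["exact"] = True
    if indexed_only:
        payload["indexed_only"] = True
    return payload
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body with `Content-Type: application/json`.
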
#### Search by file upload

```bash
curl -X POST https://vision.klevze.net/vectors/search/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F "limit=5" \
  -F 'filter_metadata_json={"is_public":true}'
```

All URL search parameters are available as form fields; use `filter_metadata_json` (JSON string) for filters.

#### Search by pre-computed vector

```bash
curl -X POST https://vision.klevze.net/vectors/search/vector \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"vector":[0.1,0.2,...],"limit":5,"hnsw_ef":128}'
```

#### Collection management

List all collections:
```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections
```

Get collection info:
```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections/images
```

Create a custom collection:
```bash
curl -X POST https://vision.klevze.net/vectors/collections \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"name":"my_collection","vector_dim":512,"distance":"cosine"}'
```

Delete a collection:
```bash
curl -H "X-API-Key: <your-api-key>" -X DELETE https://vision.klevze.net/vectors/collections/my_collection
```

#### Full diagnostic inspect

Returns HNSW config, optimizer config, quantization, segment count, payload index coverage percentages, and RAM footprint estimate for every collection.

```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/inspect
```

#### Payload index management

Payload indexes are critical for fast filtered vector search. Always create indexes for fields used in `filter_metadata` filters.

```bash
# List existing indexes
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections/images/indexes

# Create a single index
curl -X POST https://vision.klevze.net/vectors/collections/images/indexes \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"field":"is_public","type":"bool"}'

# Ensure multiple indexes exist (idempotent — safe to run multiple times)
curl -X POST https://vision.klevze.net/vectors/collections/images/ensure-indexes \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"fields":[{"field":"is_public","type":"bool"},{"field":"is_deleted","type":"bool"},{"field":"category_id","type":"integer"},{"field":"user_id","type":"keyword"}]}'
```

Supported index types: `keyword`, `integer`, `float`, `bool`, `geo`, `datetime`, `text`, `uuid`.

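One way to keep indexes in step with the filters an application uses is to derive the `ensure-indexes` body from a sample filter. The sketch below is only an illustration: the helper names are hypothetical, and the mapping from Python types to index types (`str` to `keyword`, and so on) is an assumption that covers just the four simplest types.

```python
def infer_index_type(value):
    """Guess a payload index type from a sample filter value (assumed mapping)."""
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "keyword"
    raise TypeError(f"no index type guess for {type(value).__name__}")


def ensure_indexes_body(filter_metadata):
    """Build an ensure-indexes request body covering every field in a filter."""
    return {
        "fields": [
            {"field": name, "type": infer_index_type(value)}
            for name, value in filter_metadata.items()
        ]
    }
```

Fields that need `geo`, `datetime`, `text`, or `uuid` indexes would still have to be listed by hand.
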
#### Collection configuration (HNSW / optimizer / quantization)

Updates HNSW, optimizer, or scalar quantization settings on an existing collection without data loss. HNSW graph and segment changes apply to newly created segments.

```bash
curl -X POST https://vision.klevze.net/vectors/collections/images/configure \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "hnsw_m": 16,
    "hnsw_ef_construct": 200,
    "hnsw_on_disk": false,
    "indexing_threshold": 20000,
    "default_segment_number": 4,
    "quantization_type": "int8",
    "quantization_quantile": 0.99,
    "quantization_always_ram": true
  }'
```

Parameters:
- `hnsw_m` (int, 4–64): edges per node in the HNSW graph.
- `hnsw_ef_construct` (int, 10–1000): ef during index construction.
- `hnsw_on_disk` (bool): store HNSW graph on disk (saves RAM, slightly slower queries).
- `indexing_threshold` (int): minimum vector changes before a segment is indexed.
- `default_segment_number` (int, 1–32): target segment count for parallelism.
- `quantization_type` (string, `"int8"` or null): enable scalar quantization (~4× RAM reduction).
- `quantization_quantile` (float, 0.5–1.0, default 0.99): calibration quantile.
- `quantization_always_ram` (bool, default true): keep quantized vectors in RAM.

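The "~4×" figure for `int8` quantization follows from component size alone: a float32 component takes 4 bytes, an int8 component takes 1. The arithmetic below is a back-of-the-envelope sketch for 512-dimensional vectors (the dimension used in the collection example earlier); it counts raw vector storage only and ignores the HNSW graph, payloads, and any full-precision copies kept alongside. The collection size of one million is an arbitrary illustration.

```python
def vector_ram_bytes(num_vectors, dim, bytes_per_component):
    """Raw storage for the vectors alone (no index, payload, or overhead)."""
    return num_vectors * dim * bytes_per_component


NUM_VECTORS = 1_000_000  # illustrative collection size, not a measured figure
DIM = 512

float32_bytes = vector_ram_bytes(NUM_VECTORS, DIM, 4)  # 4 bytes per float32
int8_bytes = vector_ram_bytes(NUM_VECTORS, DIM, 1)     # 1 byte per int8
reduction = float32_bytes / int8_bytes

print(f"float32: {float32_bytes / 1e9:.3f} GB")
print(f"int8:    {int8_bytes / 1e9:.3f} GB")
print(f"reduction: {reduction:.1f}x")
```

The `/vectors/inspect` endpoint remains the better source for real numbers, since it reports an estimate for the actual collections.
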
#### Delete points

```bash
curl -X POST https://vision.klevze.net/vectors/delete \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"ids":["550e8400-e29b-41d4-a716-446655440000","550e8400-e29b-41d4-a716-446655440001"]}'
```

#### Get a point by ID

```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/points/550e8400-e29b-41d4-a716-446655440000
```

#### Get a point by original application ID

If the wrapper had to replace your string `id` with a generated UUID, the original value is preserved in metadata as `_original_id`.

```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/points/by-original-id/img-001
```

## Card Renderer

The card renderer generates branded social-card images from artwork photos. It applies smart center-weighted cropping, a gradient overlay, title/subtitle/username/category text, optional tags, and an optional logo.

Default output: 1200×630 WebP (`nova-artwork-v1` template).

### List available templates

```bash
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/cards/templates
```

### Render a card from a URL

```bash
curl -X POST https://vision.klevze.net/cards/render \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://files.skinbase.org/img/aa/bb/cc/md.webp",
    "title": "Artwork Title",
    "subtitle": "Optional subtitle",
    "username": "@artist",
    "category": "Digital Art",
    "tags": ["surreal", "landscape"],
    "template": "nova-artwork-v1",
    "width": 1200,
    "height": 630,
    "output": "webp",
    "quality": 90,
    "show_logo": true
  }'
```

Returns binary image bytes with `Content-Type: image/webp`.

### Render a card from a file upload

```bash
curl -X POST https://vision.klevze.net/cards/render/file \
  -H "X-API-Key: <your-api-key>" \
  -F "file=@/path/to/image.webp" \
  -F "title=Artwork Title" \
  -F "username=@artist" \
  -F "template=nova-artwork-v1" \
  -F "show_logo=true"
```

Returns binary image bytes.

### Get card layout metadata (no image rendered)

```bash
curl -X POST https://vision.klevze.net/cards/render/meta \
  -H "X-API-Key: <your-api-key>" \
  -H "Content-Type: application/json" \
  -d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title"}'
```

Returns crop coordinates and layout data without producing an image.

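To give a feel for what crop coordinates mean for a 1200×630 card, here is a plain center-crop calculation in Python. This is deliberately simpler than the service's "smart center-weighted" cropping, whose exact algorithm is not described here; the function name `center_crop_box` and the `(left, top, right, bottom)` tuple format are assumptions for this example.

```python
def center_crop_box(src_w, src_h, dst_w=1200, dst_h=630):
    """Return (left, top, right, bottom) of the largest centered region
    of the source image that matches the target aspect ratio."""
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w = src_h * dst_w // dst_h
        left = (src_w - crop_w) // 2
        return (left, 0, left + crop_w, src_h)
    # Source is taller than (or equal to) the target: keep full width, trim top and bottom.
    crop_h = src_w * dst_h // dst_w
    top = (src_h - crop_h) // 2
    return (0, top, src_w, top + crop_h)
```

For a 1920×1080 source, the box is `(0, 36, 1920, 1044)`: the full width is kept and 36 pixels are trimmed from the top and bottom before scaling down to 1200×630.
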
## Request/Response notes

- For URL requests use `Content-Type: application/json`.
- For uploads use `multipart/form-data` with a `file` field.
- Most gateway endpoints require the `X-API-Key` header.
- Remote image URLs must resolve to public hosts and return an image content type.
- The gateway aggregates and normalizes outputs for `/analyze/all`.

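A client can pre-check image URLs before sending them, to avoid round trips that the gateway would reject anyway. The sketch below covers only the scheme and literal-address cases; it does not resolve hostnames or inspect the content type, so it is weaker than whatever the gateway enforces. The function name `is_plausibly_public_url` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_plausibly_public_url(url):
    """Cheap client-side pre-check for an image URL (no DNS lookup)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: a hostname that would still need to be resolved
        # and checked by the server.
        return True
    # Rejects loopback, private, link-local, and other non-global addresses.
    return address.is_global
```

A URL such as `http://127.0.0.1/a.png` fails this check, while `https://files.skinbase.org/img/aa/bb/cc/md.webp` passes it and is left for the gateway to validate fully.
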
## Running a single service

To run only one service via docker compose:

```bash
docker compose up -d --build clip
```

Or run locally (Python env) from the service folder:

```bash
# inside clip/ or blip/ or yolo/
uvicorn main:app --host 0.0.0.0 --port 8000
```

## Production tips

- Add authentication (API keys or OAuth) at the gateway.
- Add rate-limiting and per-client quotas.
- Keep model services on an internal Docker network.
- For GPU: enable NVIDIA runtime and update service Dockerfiles / compose profiles.

## Troubleshooting

- Service fails to start: check `docker compose logs <service>` for model load errors.
- BLIP startup error about Hugging Face auth: set `HUGGINGFACE_TOKEN` in `.env` and rebuild `blip`.
- Qdrant upsert error about invalid point ID: use a UUID or unsigned integer for `id`, or omit it and use the returned generated `id`.
- Image URL rejected before download: the URL may point to localhost, a private IP, a non-`http/https` scheme, or a non-image content type.
- High memory / OOM: increase host memory or reduce model footprint; consider GPUs.
- Slow startup: model weights load on service startup — expect extra time. The maturity service (`start_period: 90s`) may take longer on first boot as it downloads the classifier weights (~330 MB). Mount `~/.cache/huggingface` as a volume to persist across rebuilds.
- Maturity endpoint returns `503`: `MATURITY_ENABLED` is set to `false` in environment configuration.
- Maturity endpoint returns `502`: the maturity container is unhealthy or still starting up; wait and retry.

## Extending

- Swap or update models in each service by editing that service's `main.py`.
- Add request validation, timeouts, and retries in the gateway to improve robustness.

## Files of interest

- `docker-compose.yml` — composition and service definitions.
- `gateway/` — gateway FastAPI server.
- `clip/`, `blip/`, `yolo/` — service implementations and Dockerfiles.
- `maturity/` — NSFW/maturity classifier service (ViT-based, HuggingFace `Falconsai/nsfw_image_detection`).
- `qdrant/` — Qdrant API wrapper service (FastAPI).
- `card-renderer/` — card rendering service (FastAPI).
- `common/` — shared helpers (e.g., image I/O).