Compare commits
4 Commits
58ee1b3bdd
...
3f925e17d5
| Author | SHA1 | Date | |
|---|---|---|---|
| | 3f925e17d5 | | |
| | 6ea91c3452 | | |
| | 609485a0f0 | | |
| | c7ea347e2b | | |
76
README.md
@@ -1,6 +1,6 @@
|
|||||||
# Skinbase Vision Stack (CLIP + BLIP + YOLO + Qdrant) – Dockerized FastAPI
|
# Skinbase Vision Stack (CLIP + BLIP + YOLO + Qdrant + Card Renderer) – Dockerized FastAPI
|
||||||
|
|
||||||
This repository provides **four standalone vision services** (CLIP / BLIP / YOLO / Qdrant)
|
This repository provides **five standalone vision services** (CLIP / BLIP / YOLO / Qdrant / Card Renderer)
|
||||||
and a **Gateway API** that can call them individually or together.
|
and a **Gateway API** that can call them individually or together.
|
||||||
|
|
||||||
## Services & Ports
|
## Services & Ports
|
||||||
@@ -11,6 +11,7 @@ and a **Gateway API** that can call them individually or together.
|
|||||||
- `yolo`: internal only
|
- `yolo`: internal only
|
||||||
- `qdrant`: vector DB (port `6333` exposed for direct access)
|
- `qdrant`: vector DB (port `6333` exposed for direct access)
|
||||||
- `qdrant-svc`: internal Qdrant API wrapper
|
- `qdrant-svc`: internal Qdrant API wrapper
|
||||||
|
- `card-renderer`: internal card rendering service
|
||||||
|
|
||||||
## Run
|
## Run
|
||||||
|
|
||||||
@@ -129,14 +130,17 @@ curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/up
|
|||||||
```bash
|
```bash
|
||||||
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/search \
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/search \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5}'
|
-d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","limit":5,"filter_metadata":{"is_public":true}}'
|
||||||
```
|
```
|
||||||
|
|
||||||
|
Optional search parameters: `hnsw_ef` (int), `exact` (bool), `indexed_only` (bool), `score_threshold` (float), `filter_metadata` (object).
|
||||||
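The optional parameters above can also be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name `build_search_payload` is hypothetical, and it only serializes the request body for `/vectors/search` using the field names listed above. Options left at their defaults are omitted so the service applies its own defaults.

```python
import json
from typing import Any, Dict, Optional


def build_search_payload(
    url: str,
    limit: int = 5,
    *,
    hnsw_ef: Optional[int] = None,
    exact: bool = False,
    indexed_only: bool = False,
    score_threshold: Optional[float] = None,
    filter_metadata: Optional[Dict[str, Any]] = None,
) -> str:
    """Serialize a /vectors/search body (hypothetical helper, for illustration)."""
    body: Dict[str, Any] = {"url": url, "limit": limit}
    # Only send optional fields that the caller actually set.
    if hnsw_ef is not None:
        body["hnsw_ef"] = hnsw_ef
    if exact:
        body["exact"] = True
    if indexed_only:
        body["indexed_only"] = True
    if score_threshold is not None:
        body["score_threshold"] = score_threshold
    if filter_metadata:
        body["filter_metadata"] = filter_metadata
    return json.dumps(body)


if __name__ == "__main__":
    print(build_search_payload(
        "https://files.skinbase.org/img/aa/bb/cc/md.webp",
        hnsw_ef=128,
        filter_metadata={"is_public": True},
    ))
```

The resulting string can be passed to `curl -d` or to any HTTP client together with the `X-API-Key` header.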
|
|
||||||
### Search similar images by file upload
|
### Search similar images by file upload
|
||||||
```bash
|
```bash
|
||||||
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/search/file \
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/search/file \
|
||||||
-F "file=@/path/to/image.webp" \
|
-F "file=@/path/to/image.webp" \
|
||||||
-F "limit=5"
|
-F "limit=5" \
|
||||||
|
-F 'filter_metadata_json={"is_public":true}'
|
||||||
```
|
```
|
||||||
|
|
||||||
### List collections
|
### List collections
|
||||||
@@ -149,6 +153,38 @@ curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collection
|
|||||||
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections/images
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections/images
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Full diagnostic inspect
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/inspect
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns HNSW config, optimizer config, quantization, segment count, payload index coverage, and RAM estimate for every collection.
|
||||||
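The RAM estimate follows the formula used by the `/inspect` handler in `qdrant/main.py` in this change set: vector count × dimension × 4 bytes × a 1.5 safety factor, reported in MiB. The sketch below reproduces that arithmetic so the figure can be sanity-checked offline. The function name `estimate_ram_mb` is an assumption made for illustration, and the estimate covers raw float32 vectors only, not payloads.

```python
def estimate_ram_mb(vectors_count: int, dim: int = 512, safety_factor: float = 1.5) -> float:
    """Rough RAM footprint in MiB for float32 vectors (illustrative helper)."""
    # 4 bytes per float32 component, scaled by the safety factor.
    total_bytes = vectors_count * dim * 4 * safety_factor
    # 1_048_576 bytes per MiB, rounded to one decimal like the service does.
    return round(total_bytes / 1_048_576, 1)


if __name__ == "__main__":
    # One million 512-dimensional CLIP embeddings.
    print(estimate_ram_mb(1_000_000))
```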
|
|
||||||
|
### Payload index management
|
||||||
|
```bash
|
||||||
|
# List indexes
|
||||||
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections/images/indexes
|
||||||
|
|
||||||
|
# Create a single index
|
||||||
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/collections/images/indexes \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"field":"is_public","type":"bool"}'
|
||||||
|
|
||||||
|
# Ensure multiple indexes (idempotent)
|
||||||
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/collections/images/ensure-indexes \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"fields":[{"field":"is_public","type":"bool"},{"field":"category_id","type":"integer"}]}'
|
||||||
|
```
|
||||||
|
|
||||||
|
Supported index types: `keyword`, `integer`, `float`, `bool`, `geo`, `datetime`, `text`, `uuid`.
|
||||||
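When index definitions live in application code, it can help to validate the type names before calling `ensure-indexes`. The sketch below is a hypothetical client-side helper (`build_ensure_indexes_body` is not part of this repository). It checks each type against the supported list above and produces the JSON body shown in the curl example.

```python
import json
from typing import Dict

# Index types listed as supported in this README.
SUPPORTED_INDEX_TYPES = {
    "keyword", "integer", "float", "bool", "geo", "datetime", "text", "uuid",
}


def build_ensure_indexes_body(fields: Dict[str, str]) -> str:
    """Map {field_name: index_type} to an ensure-indexes request body."""
    unknown = sorted({t for t in fields.values() if t not in SUPPORTED_INDEX_TYPES})
    if unknown:
        raise ValueError(f"Unsupported index type(s): {unknown}")
    return json.dumps(
        {"fields": [{"field": name, "type": kind} for name, kind in fields.items()]}
    )


if __name__ == "__main__":
    print(build_ensure_indexes_body({"is_public": "bool", "category_id": "integer"}))
```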
|
|
||||||
|
### Collection configuration (HNSW / optimizer / quantization)
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/collections/images/configure \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"hnsw_m":16,"hnsw_ef_construct":200,"indexing_threshold":20000,"quantization_type":"int8"}'
|
||||||
|
```
|
||||||
|
|
||||||
### Delete points
|
### Delete points
|
||||||
```bash
|
```bash
|
||||||
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/delete \
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/delete \
|
||||||
@@ -158,6 +194,38 @@ curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/vectors/de
|
|||||||
|
|
||||||
If you let the wrapper generate a UUID, use the returned `id` value for later `get`, `search`, or `delete` operations.
|
If you let the wrapper generate a UUID, use the returned `id` value for later `get`, `search`, or `delete` operations.
|
||||||
|
|
||||||
|
## Card Renderer
|
||||||
|
|
||||||
|
### List available templates
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/cards/templates
|
||||||
|
```
|
||||||
|
|
||||||
|
### Render a card from a URL
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/cards/render \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title","username":"@artist","template":"nova-artwork-v1"}'
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns binary image bytes (WebP by default).
|
||||||
|
|
||||||
|
### Render a card from a file upload
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/cards/render/file \
|
||||||
|
-F "file=@/path/to/image.webp" \
|
||||||
|
-F "title=Artwork Title" \
|
||||||
|
-F "username=@artist" \
|
||||||
|
-F "template=nova-artwork-v1"
|
||||||
|
```
|
||||||
|
|
||||||
|
### Get card layout metadata (no image rendered)
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" -X POST https://vision.klevze.net/cards/render/meta \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title"}'
|
||||||
|
```
|
||||||
|
|
||||||
## Notes
|
## Notes
|
||||||
|
|
||||||
- This is a **starter scaffold**. Models are loaded at service startup.
|
- This is a **starter scaffold**. Models are loaded at service startup.
|
||||||
|
|||||||
144
USAGE.md
@@ -4,7 +4,7 @@ This document explains how to run and use the Skinbase Vision Stack (Gateway + C
|
|||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
- Services: `gateway`, `clip`, `blip`, `yolo`, `qdrant`, `qdrant-svc` (FastAPI each, except `qdrant` which is the official Qdrant DB).
|
- Services: `gateway`, `clip`, `blip`, `yolo`, `qdrant`, `qdrant-svc`, `card-renderer` (FastAPI each, except `qdrant` which is the official Qdrant DB).
|
||||||
- Gateway is the public API endpoint; the other services are internal.
|
- Gateway is the public API endpoint; the other services are internal.
|
||||||
|
|
||||||
## Model overview
|
## Model overview
|
||||||
@@ -17,6 +17,8 @@ This document explains how to run and use the Skinbase Vision Stack (Gateway + C
|
|||||||
|
|
||||||
- **Qdrant**: High-performance vector similarity search engine. Stores CLIP image embeddings and enables reverse image search (find similar images). The `qdrant-svc` wrapper auto-embeds images via CLIP before upserting.
|
- **Qdrant**: High-performance vector similarity search engine. Stores CLIP image embeddings and enables reverse image search (find similar images). The `qdrant-svc` wrapper auto-embeds images via CLIP before upserting.
|
||||||
|
|
||||||
|
- **Card Renderer**: Generates branded social-card images (e.g. Open Graph previews) from artwork images. Applies smart center-weighted cropping, gradient overlays, title/username/tag text, and an optional logo. Returns binary image bytes (WebP by default). Template: `nova-artwork-v1`.
|
||||||
|
|
||||||
## Prerequisites
|
## Prerequisites
|
||||||
|
|
||||||
- Docker Desktop (with `docker compose`) or a Docker environment.
|
- Docker Desktop (with `docker compose`) or a Docker environment.
|
||||||
@@ -219,8 +221,11 @@ Parameters:
|
|||||||
- `url` (required): query image URL.
|
- `url` (required): query image URL.
|
||||||
- `limit` (optional, default 5): number of results.
|
- `limit` (optional, default 5): number of results.
|
||||||
- `score_threshold` (optional): minimum cosine similarity (0.0–1.0).
|
- `score_threshold` (optional): minimum cosine similarity (0.0–1.0).
|
||||||
- `filter_metadata` (optional): filter results by metadata, e.g. `{"category":"wallpaper"}`.
|
- `filter_metadata` (optional): filter results by payload fields, e.g. `{"is_public":true,"category_id":3}`.
|
||||||
- `collection` (optional): collection to search.
|
- `collection` (optional): collection to search.
|
||||||
|
- `hnsw_ef` (optional, int): override the HNSW ef parameter at query time. Higher = better recall, slightly more latency.
|
||||||
|
- `exact` (optional, bool, default false): brute-force exact search. Avoid on large collections.
|
||||||
|
- `indexed_only` (optional, bool, default false): restrict search to fully indexed segments only. Useful during bulk ingest.
|
||||||
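How the wrapper turns `filter_metadata` into a Qdrant filter is not shown in this part of the diff. A plausible reading, given that `qdrant/main.py` imports `Filter`, `FieldCondition`, and `MatchValue`, is one exact-match condition per key, all combined with AND. The sketch below expresses that assumption with plain dictionaries in the shape of Qdrant's REST filter JSON. The function name `to_qdrant_filter` is hypothetical.

```python
from typing import Any, Dict, List, Optional


def to_qdrant_filter(filter_metadata: Dict[str, Any]) -> Optional[Dict[str, List[Dict[str, Any]]]]:
    """Build a Qdrant-style `must` filter from flat key/value pairs (assumed mapping)."""
    if not filter_metadata:
        # No filter requested: search the whole collection.
        return None
    return {
        "must": [
            {"key": key, "match": {"value": value}}
            for key, value in filter_metadata.items()
        ]
    }


if __name__ == "__main__":
    print(to_qdrant_filter({"is_public": True, "category_id": 3}))
```

Every key used this way should have a payload index, otherwise Qdrant has to evaluate the condition without index support.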
|
|
||||||
Return: list of `{"id", "score", "metadata"}` sorted by similarity.
|
Return: list of `{"id", "score", "metadata"}` sorted by similarity.
|
||||||
|
|
||||||
@@ -230,16 +235,19 @@ Return: list of `{"id", "score", "metadata"}` sorted by similarity.
|
|||||||
curl -X POST https://vision.klevze.net/vectors/search/file \
|
curl -X POST https://vision.klevze.net/vectors/search/file \
|
||||||
-H "X-API-Key: <your-api-key>" \
|
-H "X-API-Key: <your-api-key>" \
|
||||||
-F "file=@/path/to/image.webp" \
|
-F "file=@/path/to/image.webp" \
|
||||||
-F "limit=5"
|
-F "limit=5" \
|
||||||
|
-F 'filter_metadata_json={"is_public":true}'
|
||||||
```
|
```
|
||||||
|
|
||||||
|
All URL search parameters are available as form fields; use `filter_metadata_json` (JSON string) for filters.
|
||||||
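Because multipart form fields are plain strings, the filter has to travel as JSON text and be decoded on the receiving side. The gateway code in this diff forwards `filter_metadata_json` unchanged, so the decoding step below is an assumption about what the receiving service might do. `parse_filter_metadata_json` is a hypothetical name.

```python
import json
from typing import Any, Dict, Optional


def parse_filter_metadata_json(raw: Optional[str]) -> Dict[str, Any]:
    """Decode the `filter_metadata_json` form field into a dict (illustrative)."""
    if raw is None or not raw.strip():
        # Field omitted or empty: no filtering.
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"filter_metadata_json is not valid JSON: {exc.msg}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("filter_metadata_json must decode to a JSON object")
    return parsed


if __name__ == "__main__":
    print(parse_filter_metadata_json('{"is_public": true}'))
```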
|
|
||||||
#### Search by pre-computed vector
|
#### Search by pre-computed vector
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
curl -X POST https://vision.klevze.net/vectors/search/vector \
|
curl -X POST https://vision.klevze.net/vectors/search/vector \
|
||||||
-H "X-API-Key: <your-api-key>" \
|
-H "X-API-Key: <your-api-key>" \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
-d '{"vector":[0.1,0.2,...],"limit":5}'
|
-d '{"vector":[0.1,0.2,...],"limit":5,"hnsw_ef":128}'
|
||||||
```
|
```
|
||||||
|
|
||||||
#### Collection management
|
#### Collection management
|
||||||
@@ -267,6 +275,67 @@ Delete a collection:
|
|||||||
curl -H "X-API-Key: <your-api-key>" -X DELETE https://vision.klevze.net/vectors/collections/my_collection
|
curl -H "X-API-Key: <your-api-key>" -X DELETE https://vision.klevze.net/vectors/collections/my_collection
|
||||||
```
|
```
|
||||||
|
|
||||||
|
#### Full diagnostic inspect
|
||||||
|
|
||||||
|
Returns HNSW config, optimizer config, quantization, segment count, payload index coverage percentages, and RAM footprint estimate for every collection.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/inspect
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Payload index management
|
||||||
|
|
||||||
|
Payload indexes are critical for fast filtered vector search. Always create indexes for fields used in `filter_metadata` filters.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# List existing indexes
|
||||||
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/collections/images/indexes
|
||||||
|
|
||||||
|
# Create a single index
|
||||||
|
curl -X POST https://vision.klevze.net/vectors/collections/images/indexes \
|
||||||
|
-H "X-API-Key: <your-api-key>" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"field":"is_public","type":"bool"}'
|
||||||
|
|
||||||
|
# Ensure multiple indexes exist (idempotent — safe to run multiple times)
|
||||||
|
curl -X POST https://vision.klevze.net/vectors/collections/images/ensure-indexes \
|
||||||
|
-H "X-API-Key: <your-api-key>" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"fields":[{"field":"is_public","type":"bool"},{"field":"is_deleted","type":"bool"},{"field":"category_id","type":"integer"},{"field":"user_id","type":"keyword"}]}'
|
||||||
|
```
|
||||||
|
|
||||||
|
Supported index types: `keyword`, `integer`, `float`, `bool`, `geo`, `datetime`, `text`, `uuid`.
|
||||||
|
|
||||||
|
#### Collection configuration (HNSW / optimizer / quantization)
|
||||||
|
|
||||||
|
Updates HNSW, optimizer, or scalar quantization settings on an existing collection without data loss. HNSW graph and segment changes apply to newly created segments.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -X POST https://vision.klevze.net/vectors/collections/images/configure \
|
||||||
|
-H "X-API-Key: <your-api-key>" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"hnsw_m": 16,
|
||||||
|
"hnsw_ef_construct": 200,
|
||||||
|
"hnsw_on_disk": false,
|
||||||
|
"indexing_threshold": 20000,
|
||||||
|
"default_segment_number": 4,
|
||||||
|
"quantization_type": "int8",
|
||||||
|
"quantization_quantile": 0.99,
|
||||||
|
"quantization_always_ram": true
|
||||||
|
}'
|
||||||
|
```
|
||||||
|
|
||||||
|
Parameters:
|
||||||
|
- `hnsw_m` (int, 4–64): edges per node in the HNSW graph.
|
||||||
|
- `hnsw_ef_construct` (int, 10–1000): ef during index construction.
|
||||||
|
- `hnsw_on_disk` (bool): store HNSW graph on disk (saves RAM, slightly slower queries).
|
||||||
|
- `indexing_threshold` (int): segment size, in kilobytes of vector data, above which Qdrant builds the HNSW index for that segment.
|
||||||
|
- `default_segment_number` (int, 1–32): target segment count for parallelism.
|
||||||
|
- `quantization_type` (string, `"int8"` or null): enable scalar quantization (~4× RAM reduction).
|
||||||
|
- `quantization_quantile` (float, 0.5–1.0, default 0.99): calibration quantile.
|
||||||
|
- `quantization_always_ram` (bool, default true): keep quantized vectors in RAM.
|
||||||
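The documented ranges can be checked on the client before sending a `configure` request. This is a minimal sketch under the assumption that the ranges listed above are the ones the service enforces. `validate_configure_body` is a hypothetical helper and is not taken from the repository.

```python
from typing import Any, Dict, List

# Inclusive integer ranges as documented in the parameter list above.
_INT_RANGES = {
    "hnsw_m": (4, 64),
    "hnsw_ef_construct": (10, 1000),
    "default_segment_number": (1, 32),
}


def validate_configure_body(body: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the body looks valid."""
    errors: List[str] = []
    for key, (low, high) in _INT_RANGES.items():
        value = body.get(key)
        if value is not None and not (low <= value <= high):
            errors.append(f"{key} must be between {low} and {high}, got {value}")
    quantile = body.get("quantization_quantile")
    if quantile is not None and not (0.5 <= quantile <= 1.0):
        errors.append(f"quantization_quantile must be between 0.5 and 1.0, got {quantile}")
    if body.get("quantization_type") not in (None, "int8"):
        errors.append('quantization_type must be "int8" or null')
    return errors


if __name__ == "__main__":
    print(validate_configure_body({"hnsw_m": 2, "quantization_type": "int4"}))
```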
|
|
||||||
#### Delete points
|
#### Delete points
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
@@ -290,6 +359,67 @@ If the wrapper had to replace your string `id` with a generated UUID, the origin
|
|||||||
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/points/by-original-id/img-001
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/vectors/points/by-original-id/img-001
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Card Renderer
|
||||||
|
|
||||||
|
The card renderer generates branded social-card images from artwork photos. It applies smart center-weighted cropping, a gradient overlay, title/subtitle/username/category text, optional tags, and an optional logo.
|
||||||
|
|
||||||
|
Default output: 1200×630 WebP (`nova-artwork-v1` template).
|
||||||
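The renderer's actual "smart center-weighted" crop is not shown in this document. As a rough illustration of the geometry involved, the sketch below computes a plain centered crop that matches the default 1200×630 aspect ratio. The function name `center_crop_box` and the purely geometric approach are assumptions, so the real service may weight the crop differently.

```python
from typing import Tuple


def center_crop_box(
    src_w: int, src_h: int, target_w: int = 1200, target_h: int = 630
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered crop with the target aspect ratio."""
    # Compare aspect ratios with integer cross-multiplication to avoid float noise.
    if src_w * target_h > src_h * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = round(src_h * target_w / target_h)
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = round(src_w * target_h / target_w)
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


if __name__ == "__main__":
    print(center_crop_box(3000, 1000))
```

The returned box can then be scaled to the output size. `POST /cards/render/meta` reports the crop coordinates the service actually chose.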
|
|
||||||
|
### List available templates
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -H "X-API-Key: <your-api-key>" https://vision.klevze.net/cards/templates
|
||||||
|
```
|
||||||
|
|
||||||
|
### Render a card from a URL
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -X POST https://vision.klevze.net/cards/render \
|
||||||
|
-H "X-API-Key: <your-api-key>" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{
|
||||||
|
"url": "https://files.skinbase.org/img/aa/bb/cc/md.webp",
|
||||||
|
"title": "Artwork Title",
|
||||||
|
"subtitle": "Optional subtitle",
|
||||||
|
"username": "@artist",
|
||||||
|
"category": "Digital Art",
|
||||||
|
"tags": ["surreal", "landscape"],
|
||||||
|
"template": "nova-artwork-v1",
|
||||||
|
"width": 1200,
|
||||||
|
"height": 630,
|
||||||
|
"output": "webp",
|
||||||
|
"quality": 90,
|
||||||
|
"show_logo": true
|
||||||
|
}'
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns binary image bytes with `Content-Type: image/webp`.
|
||||||
|
|
||||||
|
### Render a card from a file upload
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -X POST https://vision.klevze.net/cards/render/file \
|
||||||
|
-H "X-API-Key: <your-api-key>" \
|
||||||
|
-F "file=@/path/to/image.webp" \
|
||||||
|
-F "title=Artwork Title" \
|
||||||
|
-F "username=@artist" \
|
||||||
|
-F "template=nova-artwork-v1" \
|
||||||
|
-F "show_logo=true"
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns binary image bytes.
|
||||||
|
|
||||||
|
### Get card layout metadata (no image rendered)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -X POST https://vision.klevze.net/cards/render/meta \
|
||||||
|
-H "X-API-Key: <your-api-key>" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
-d '{"url":"https://files.skinbase.org/img/aa/bb/cc/md.webp","title":"Artwork Title"}'
|
||||||
|
```
|
||||||
|
|
||||||
|
Returns crop coordinates and layout data without producing an image.
|
||||||
|
|
||||||
## Request/Response notes
|
## Request/Response notes
|
||||||
|
|
||||||
- For URL requests use `Content-Type: application/json`.
|
- For URL requests use `Content-Type: application/json`.
|
||||||
@@ -340,9 +470,5 @@ uvicorn main:app --host 0.0.0.0 --port 8000
|
|||||||
- `gateway/` — gateway FastAPI server.
|
- `gateway/` — gateway FastAPI server.
|
||||||
- `clip/`, `blip/`, `yolo/` — service implementations and Dockerfiles.
|
- `clip/`, `blip/`, `yolo/` — service implementations and Dockerfiles.
|
||||||
- `qdrant/` — Qdrant API wrapper service (FastAPI).
|
- `qdrant/` — Qdrant API wrapper service (FastAPI).
|
||||||
|
- `card-renderer/` — card rendering service (FastAPI).
|
||||||
- `common/` — shared helpers (e.g., image I/O).
|
- `common/` — shared helpers (e.g., image I/O).
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
If you want, I can merge these same contents into the project `README.md`,
|
|
||||||
create a Postman collection, or add example response schemas for each endpoint.
|
|
||||||
|
|||||||
@@ -77,6 +77,7 @@ services:
|
|||||||
- CLIP_URL=http://clip:8000
|
- CLIP_URL=http://clip:8000
|
||||||
- COLLECTION_NAME=images
|
- COLLECTION_NAME=images
|
||||||
- VECTOR_DIM=512
|
- VECTOR_DIM=512
|
||||||
|
- SEARCH_HNSW_EF=128
|
||||||
depends_on:
|
depends_on:
|
||||||
qdrant:
|
qdrant:
|
||||||
condition: service_healthy
|
condition: service_healthy
|
||||||
|
|||||||
@@ -243,13 +243,21 @@ async def vectors_search_file(
|
|||||||
limit: int = Form(5),
|
limit: int = Form(5),
|
||||||
score_threshold: Optional[float] = Form(None),
|
score_threshold: Optional[float] = Form(None),
|
||||||
collection: Optional[str] = Form(None),
|
collection: Optional[str] = Form(None),
|
||||||
|
hnsw_ef: Optional[int] = Form(None),
|
||||||
|
exact: bool = Form(False),
|
||||||
|
indexed_only: bool = Form(False),
|
||||||
|
filter_metadata_json: Optional[str] = Form(None),
|
||||||
):
|
):
|
||||||
data = await file.read()
|
data = await file.read()
|
||||||
fields: Dict[str, Any] = {"limit": int(limit)}
|
fields: Dict[str, Any] = {"limit": int(limit), "exact": exact, "indexed_only": indexed_only}
|
||||||
if score_threshold is not None:
|
if score_threshold is not None:
|
||||||
fields["score_threshold"] = float(score_threshold)
|
fields["score_threshold"] = float(score_threshold)
|
||||||
if collection is not None:
|
if collection is not None:
|
||||||
fields["collection"] = collection
|
fields["collection"] = collection
|
||||||
|
if hnsw_ef is not None:
|
||||||
|
fields["hnsw_ef"] = int(hnsw_ef)
|
||||||
|
if filter_metadata_json is not None:
|
||||||
|
fields["filter_metadata_json"] = filter_metadata_json
|
||||||
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
return await _post_file(client, f"{QDRANT_SVC_URL}/search/file", data, fields)
|
return await _post_file(client, f"{QDRANT_SVC_URL}/search/file", data, fields)
|
||||||
|
|
||||||
@@ -284,6 +292,13 @@ async def vectors_collection_info(name: str):
|
|||||||
return await _get_json(client, f"{QDRANT_SVC_URL}/collections/{name}")
|
return await _get_json(client, f"{QDRANT_SVC_URL}/collections/{name}")
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/vectors/inspect")
|
||||||
|
async def vectors_inspect():
|
||||||
|
"""Full diagnostic summary for all Qdrant collections (HNSW, optimizer, payload indexes, RAM estimate)."""
|
||||||
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
|
return await _get_json(client, f"{QDRANT_SVC_URL}/inspect")
|
||||||
|
|
||||||
|
|
||||||
@app.delete("/vectors/collections/{name}")
|
@app.delete("/vectors/collections/{name}")
|
||||||
async def vectors_delete_collection(name: str):
|
async def vectors_delete_collection(name: str):
|
||||||
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
@@ -416,3 +431,33 @@ async def cards_render_meta(payload: Dict[str, Any]):
|
|||||||
"""Return crop and layout metadata for a card render (no image produced)."""
|
"""Return crop and layout metadata for a card render (no image produced)."""
|
||||||
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
return await _post_json(client, f"{CARD_RENDERER_URL}/render/meta", payload)
|
return await _post_json(client, f"{CARD_RENDERER_URL}/render/meta", payload)
|
||||||
|
|
||||||
|
|
||||||
|
# ---- Qdrant administration endpoints (index management + collection config) ----
|
||||||
|
|
||||||
|
@app.get("/vectors/collections/{name}/indexes")
|
||||||
|
async def vectors_collection_indexes(name: str):
|
||||||
|
"""List payload indexes for a collection."""
|
||||||
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
|
return await _get_json(client, f"{QDRANT_SVC_URL}/collections/{name}/indexes")
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/vectors/collections/{name}/indexes")
|
||||||
|
async def vectors_create_payload_index(name: str, payload: Dict[str, Any]):
|
||||||
|
"""Create a payload index on a field in a collection."""
|
||||||
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
|
return await _post_json(client, f"{QDRANT_SVC_URL}/collections/{name}/indexes", payload)
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/vectors/collections/{name}/ensure-indexes")
|
||||||
|
async def vectors_ensure_indexes(name: str, payload: Dict[str, Any]):
|
||||||
|
"""Idempotently ensure payload indexes exist for a list of fields."""
|
||||||
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
|
return await _post_json(client, f"{QDRANT_SVC_URL}/collections/{name}/ensure-indexes", payload)
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/vectors/collections/{name}/configure")
|
||||||
|
async def vectors_configure_collection(name: str, payload: Dict[str, Any]):
|
||||||
|
"""Update HNSW and optimizer configuration for a collection."""
|
||||||
|
async with httpx.AsyncClient(timeout=VISION_TIMEOUT) as client:
|
||||||
|
return await _post_json(client, f"{QDRANT_SVC_URL}/collections/{name}/configure", payload)
|
||||||
|
|||||||
325
qdrant/main.py
@@ -16,6 +16,12 @@ from qdrant_client.models import (
|
|||||||
Filter,
|
Filter,
|
||||||
FieldCondition,
|
FieldCondition,
|
||||||
MatchValue,
|
MatchValue,
|
||||||
|
HnswConfigDiff,
|
||||||
|
OptimizersConfigDiff,
|
||||||
|
SearchParams,
|
||||||
|
PayloadSchemaType,
|
||||||
|
ScalarQuantizationConfig,
|
||||||
|
ScalarType,
|
||||||
)
|
)
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
@@ -27,6 +33,8 @@ QDRANT_PORT = int(os.getenv("QDRANT_PORT", "6333"))
|
|||||||
CLIP_URL = os.getenv("CLIP_URL", "http://clip:8000")
|
CLIP_URL = os.getenv("CLIP_URL", "http://clip:8000")
|
||||||
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "images")
|
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "images")
|
||||||
VECTOR_DIM = int(os.getenv("VECTOR_DIM", "512"))
|
VECTOR_DIM = int(os.getenv("VECTOR_DIM", "512"))
|
||||||
|
# hnsw_ef at query time: higher = better recall, slightly more latency (Qdrant default ~100)
|
||||||
|
SEARCH_HNSW_EF = int(os.getenv("SEARCH_HNSW_EF", "128"))
|
||||||
|
|
||||||
app = FastAPI(title="Skinbase Qdrant Service", version="1.0.0")
|
app = FastAPI(title="Skinbase Qdrant Service", version="1.0.0")
|
||||||
client: QdrantClient = None # type: ignore[assignment]
|
client: QdrantClient = None # type: ignore[assignment]
|
||||||
@@ -44,12 +52,21 @@ def startup():
|
|||||||
|
|
||||||
|
|
||||||
def _ensure_collection():
|
def _ensure_collection():
|
||||||
"""Create the default collection if it does not exist yet."""
|
"""Create the default collection with production-friendly defaults if it does not exist yet."""
|
||||||
collections = [c.name for c in client.get_collections().collections]
|
collections = [c.name for c in client.get_collections().collections]
|
||||||
if COLLECTION_NAME not in collections:
|
if COLLECTION_NAME not in collections:
|
||||||
client.create_collection(
|
client.create_collection(
|
||||||
collection_name=COLLECTION_NAME,
|
collection_name=COLLECTION_NAME,
|
||||||
vectors_config=VectorParams(size=VECTOR_DIM, distance=Distance.COSINE),
|
vectors_config=VectorParams(size=VECTOR_DIM, distance=Distance.COSINE),
|
||||||
|
hnsw_config=HnswConfigDiff(
|
||||||
|
m=16,
|
||||||
|
ef_construct=200, # higher than default 100 = better index quality
|
||||||
|
on_disk=False, # keep HNSW graph in RAM for fast traversal
|
||||||
|
),
|
||||||
|
optimizers_config=OptimizersConfigDiff(
|
||||||
|
indexing_threshold=20000,  # build the HNSW index once a segment exceeds ~20,000 kB of vectors
|
||||||
|
default_segment_number=4, # parallelism-friendly segment count
|
||||||
|
),
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -77,6 +94,9 @@ class SearchUrlRequest(BaseModel):
|
|||||||
score_threshold: Optional[float] = Field(default=None, ge=0.0, le=1.0)
|
score_threshold: Optional[float] = Field(default=None, ge=0.0, le=1.0)
|
||||||
collection: Optional[str] = None
|
collection: Optional[str] = None
|
||||||
filter_metadata: Dict[str, Any] = Field(default_factory=dict)
|
filter_metadata: Dict[str, Any] = Field(default_factory=dict)
|
||||||
|
hnsw_ef: Optional[int] = Field(default=None, ge=1, le=512, description="Override ef at query time. Higher = better recall, slightly higher latency.")
|
||||||
|
exact: bool = Field(default=False, description="Brute-force exact search. Avoid on large collections.")
|
||||||
|
indexed_only: bool = Field(default=False, description="Search only fully indexed segments. Useful during bulk ingest.")
|
||||||
|
|
||||||
|
|
||||||
class SearchVectorRequest(BaseModel):
|
class SearchVectorRequest(BaseModel):
|
||||||
@@ -85,6 +105,9 @@ class SearchVectorRequest(BaseModel):
|
|||||||
score_threshold: Optional[float] = Field(default=None, ge=0.0, le=1.0)
|
score_threshold: Optional[float] = Field(default=None, ge=0.0, le=1.0)
|
||||||
collection: Optional[str] = None
|
collection: Optional[str] = None
|
||||||
filter_metadata: Dict[str, Any] = Field(default_factory=dict)
|
filter_metadata: Dict[str, Any] = Field(default_factory=dict)
|
||||||
|
hnsw_ef: Optional[int] = Field(default=None, ge=1, le=512)
|
||||||
|
exact: bool = False
|
||||||
|
indexed_only: bool = False
|
||||||
|
|
||||||
|
|
||||||
class DeleteRequest(BaseModel):
|
class DeleteRequest(BaseModel):
|
||||||
@@ -189,6 +212,79 @@ def health():
|
|||||||
return {"status": "error", "detail": str(e)}
|
return {"status": "error", "detail": str(e)}
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/inspect")
|
||||||
|
def inspect():
|
||||||
|
"""Return a full diagnostic summary for every collection.
|
||||||
|
|
||||||
|
Covers: vector counts, segment counts, HNSW config, optimizer config,
|
||||||
|
quantization, payload indexes and their coverage. Designed for production
|
||||||
|
health checks and the Qdrant optimization workflow.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
all_collections = client.get_collections().collections
|
||||||
|
except Exception as exc:
|
||||||
|
return {"status": "error", "detail": str(exc)}
|
||||||
|
|
||||||
|
result = {}
|
||||||
|
for col_desc in all_collections:
|
||||||
|
name = col_desc.name
|
||||||
|
try:
|
||||||
|
info = client.get_collection(name)
|
||||||
|
cfg = info.config
|
||||||
|
hnsw = cfg.hnsw_config
|
||||||
|
opt = cfg.optimizer_config
|
||||||
|
quant = cfg.quantization_config
|
||||||
|
params = cfg.params
|
||||||
|
|
||||||
|
# Estimate raw RAM footprint: vectors * dim * 4 bytes * 1.5 safety factor
|
||||||
|
vec_count = info.vectors_count or 0
|
||||||
|
vec_dim = (
|
||||||
|
params.vectors.size
|
||||||
|
if hasattr(params.vectors, "size")
|
||||||
|
else VECTOR_DIM
|
||||||
|
)
|
||||||
|
ram_estimate_mb = round(vec_count * vec_dim * 4 * 1.5 / 1_048_576, 1)
|
||||||
|
|
||||||
|
result[name] = {
|
||||||
|
"status": info.status.value if info.status else None,
|
||||||
|
"optimizer_status": str(info.optimizer_status) if info.optimizer_status else None,
|
||||||
|
"vectors_count": vec_count,
|
||||||
|
"indexed_vectors_count": info.indexed_vectors_count,
|
||||||
|
"points_count": info.points_count,
|
||||||
|
"segments_count": info.segments_count,
|
||||||
|
"ram_estimate_mb": ram_estimate_mb,
|
||||||
|
"hnsw": {
|
||||||
|
"m": hnsw.m,
|
||||||
|
"ef_construct": hnsw.ef_construct,
|
||||||
|
"on_disk": hnsw.on_disk,
|
||||||
|
"full_scan_threshold": hnsw.full_scan_threshold,
|
||||||
|
"max_indexing_threads": hnsw.max_indexing_threads,
|
||||||
|
} if hnsw else None,
|
||||||
|
"optimizer": {
|
||||||
|
"indexing_threshold": opt.indexing_threshold,
|
||||||
|
"default_segment_number": opt.default_segment_number,
|
||||||
|
"max_segment_size": opt.max_segment_size,
|
||||||
|
"memmap_threshold": opt.memmap_threshold,
|
||||||
|
"flush_interval_sec": opt.flush_interval_sec,
|
||||||
|
} if opt else None,
|
||||||
|
"quantization": str(quant) if quant else None,
|
||||||
|
"payload_indexes": {
|
||||||
|
k: {
|
||||||
|
"type": v.data_type.value if hasattr(v.data_type, "value") else str(v.data_type),
|
||||||
|
"points": v.points,
|
||||||
|
"coverage_pct": round(v.points / max(vec_count, 1) * 100, 1),
|
||||||
|
}
|
||||||
|
for k, v in (info.payload_schema or {}).items()
|
||||||
|
},
|
||||||
|
"payload_index_count": len(info.payload_schema or {}),
|
||||||
|
"search_hnsw_ef": SEARCH_HNSW_EF,
|
||||||
|
}
|
||||||
|
except Exception as exc:
|
||||||
|
result[name] = {"error": str(exc)}
|
||||||
|
|
||||||
|
return {"collections": result, "total": len(result)}
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Collection management
|
# Collection management
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
@@ -204,9 +300,13 @@ def create_collection(req: CollectionRequest):
|
|||||||
if req.name in collections:
|
if req.name in collections:
|
||||||
raise HTTPException(409, f"Collection '{req.name}' already exists")
|
raise HTTPException(409, f"Collection '{req.name}' already exists")
|
||||||
|
|
||||||
|
# Apply the same production defaults as _ensure_collection so all
|
||||||
|
# collections start with tuned HNSW and optimizer settings.
|
||||||
client.create_collection(
|
client.create_collection(
|
||||||
collection_name=req.name,
|
collection_name=req.name,
|
||||||
vectors_config=VectorParams(size=req.vector_dim, distance=dist),
|
vectors_config=VectorParams(size=req.vector_dim, distance=dist),
|
||||||
|
hnsw_config=HnswConfigDiff(m=16, ef_construct=200, on_disk=False),
|
||||||
|
optimizers_config=OptimizersConfigDiff(indexing_threshold=20000, default_segment_number=4),
|
||||||
)
|
)
|
||||||
return {"created": req.name, "vector_dim": req.vector_dim, "distance": req.distance}
|
return {"created": req.name, "vector_dim": req.vector_dim, "distance": req.distance}
|
||||||
|
|
||||||
@@ -221,11 +321,40 @@ def list_collections():
|
|||||||
def collection_info(name: str):
|
def collection_info(name: str):
|
||||||
try:
|
try:
|
||||||
info = client.get_collection(name)
|
info = client.get_collection(name)
|
||||||
|
cfg = info.config
|
||||||
|
hnsw = cfg.hnsw_config
|
||||||
|
opt = cfg.optimizer_config
|
||||||
|
quant = cfg.quantization_config
|
||||||
return {
|
return {
|
||||||
"name": name,
|
"name": name,
|
||||||
"vectors_count": info.vectors_count,
|
"vectors_count": info.vectors_count,
|
||||||
|
"indexed_vectors_count": info.indexed_vectors_count,
|
||||||
"points_count": info.points_count,
|
"points_count": info.points_count,
|
||||||
|
"segments_count": info.segments_count,
|
||||||
"status": info.status.value if info.status else None,
|
"status": info.status.value if info.status else None,
|
||||||
|
"optimizer_status": str(info.optimizer_status) if info.optimizer_status else None,
|
||||||
|
"hnsw": {
|
||||||
|
"m": hnsw.m,
|
||||||
|
"ef_construct": hnsw.ef_construct,
|
||||||
|
"on_disk": hnsw.on_disk,
|
||||||
|
"full_scan_threshold": hnsw.full_scan_threshold,
|
||||||
|
"max_indexing_threads": hnsw.max_indexing_threads,
|
||||||
|
} if hnsw else None,
|
||||||
|
"optimizer": {
|
||||||
|
"indexing_threshold": opt.indexing_threshold,
|
||||||
|
"default_segment_number": opt.default_segment_number,
|
||||||
|
"max_segment_size": opt.max_segment_size,
|
||||||
|
"memmap_threshold": opt.memmap_threshold,
|
||||||
|
"flush_interval_sec": opt.flush_interval_sec,
|
||||||
|
} if opt else None,
|
||||||
|
"quantization": str(quant) if quant else None,
|
||||||
|
"payload_schema": {
|
||||||
|
k: {
|
||||||
|
"type": v.data_type.value if hasattr(v.data_type, "value") else str(v.data_type),
|
||||||
|
"points": v.points,
|
||||||
|
}
|
||||||
|
for k, v in (info.payload_schema or {}).items()
|
||||||
|
},
|
||||||
}
|
}
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
raise HTTPException(404, str(e))
|
raise HTTPException(404, str(e))
|
||||||
@@ -325,7 +454,7 @@ def upsert_vector(req: UpsertVectorRequest):
|
|||||||
async def search_url(req: SearchUrlRequest):
|
async def search_url(req: SearchUrlRequest):
|
||||||
"""Embed an image by URL via CLIP, then search Qdrant for similar vectors."""
|
"""Embed an image by URL via CLIP, then search Qdrant for similar vectors."""
|
||||||
vector = await _embed_url(req.url)
|
vector = await _embed_url(req.url)
|
||||||
return _do_search(vector, req.limit, req.score_threshold, req.collection, req.filter_metadata)
|
return _do_search(vector, req.limit, req.score_threshold, req.collection, req.filter_metadata, req.hnsw_ef, req.exact, req.indexed_only)
|
||||||
|
|
||||||
|
|
||||||
@app.post("/search/file")
|
@app.post("/search/file")
|
||||||
@@ -334,17 +463,28 @@ async def search_file(
|
|||||||
limit: int = Form(5),
|
limit: int = Form(5),
|
||||||
score_threshold: Optional[float] = Form(None),
|
score_threshold: Optional[float] = Form(None),
|
||||||
collection: Optional[str] = Form(None),
|
collection: Optional[str] = Form(None),
|
||||||
|
hnsw_ef: Optional[int] = Form(None),
|
||||||
|
exact: bool = Form(False),
|
||||||
|
indexed_only: bool = Form(False),
|
||||||
|
filter_metadata_json: Optional[str] = Form(None),
|
||||||
):
|
):
|
||||||
"""Embed an uploaded image via CLIP, then search Qdrant for similar vectors."""
|
"""Embed an uploaded image via CLIP, then search Qdrant for similar vectors."""
|
||||||
|
import json
|
||||||
|
filter_metadata: Dict[str, Any] = {}
|
||||||
|
if filter_metadata_json:
|
||||||
|
try:
|
||||||
|
filter_metadata = json.loads(filter_metadata_json)
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
raise HTTPException(400, "filter_metadata_json must be valid JSON")
|
||||||
data = await file.read()
|
data = await file.read()
|
||||||
vector = await _embed_bytes(data)
|
vector = await _embed_bytes(data)
|
||||||
return _do_search(vector, int(limit), score_threshold, collection, {})
|
return _do_search(vector, int(limit), score_threshold, collection, filter_metadata, hnsw_ef, exact, indexed_only)
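The form-field decoding added to `search_file` can be looked at in isolation. Below is a minimal sketch, assuming a hypothetical helper named `parse_filter_metadata` that is not part of this diff, and raising `ValueError` where the endpoint would raise `HTTPException(400, ...)`. It also rejects JSON that decodes to something other than an object, a check the diff itself does not make.

```python
import json
from typing import Any, Dict, Optional


def parse_filter_metadata(raw: Optional[str]) -> Dict[str, Any]:
    """Hypothetical helper: decode the filter_metadata_json form field."""
    if not raw:
        # Missing or empty field means "no filter".
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("filter_metadata_json must be valid JSON") from exc
    if not isinstance(parsed, dict):
        # Assumption: filters are always key/value objects, as in the README example.
        raise ValueError("filter_metadata_json must be a JSON object")
    return parsed
```

A multipart client would send the filter as a string field, for example `filter_metadata_json={"is_public": true}`.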
|
||||||
|
|
||||||
|
|
||||||
@app.post("/search/vector")
|
@app.post("/search/vector")
|
||||||
def search_vector(req: SearchVectorRequest):
|
def search_vector(req: SearchVectorRequest):
|
||||||
"""Search Qdrant using a pre-computed vector."""
|
"""Search Qdrant using a pre-computed vector."""
|
||||||
return _do_search(req.vector, req.limit, req.score_threshold, req.collection, req.filter_metadata)
|
return _do_search(req.vector, req.limit, req.score_threshold, req.collection, req.filter_metadata, req.hnsw_ef, req.exact, req.indexed_only)
|
||||||
|
|
||||||
|
|
||||||
def _do_search(
|
def _do_search(
|
||||||
@@ -353,9 +493,13 @@ def _do_search(
|
|||||||
score_threshold: Optional[float],
|
score_threshold: Optional[float],
|
||||||
collection: Optional[str],
|
collection: Optional[str],
|
||||||
filter_metadata: Dict[str, Any],
|
filter_metadata: Dict[str, Any],
|
||||||
|
hnsw_ef: Optional[int] = None,
|
||||||
|
exact: bool = False,
|
||||||
|
indexed_only: bool = False,
|
||||||
):
|
):
|
||||||
col = _col(collection)
|
col = _col(collection)
|
||||||
qfilter = _build_filter(filter_metadata)
|
qfilter = _build_filter(filter_metadata)
|
||||||
|
ef = hnsw_ef if hnsw_ef is not None else SEARCH_HNSW_EF
|
||||||
|
|
||||||
results = client.query_points(
|
results = client.query_points(
|
||||||
collection_name=col,
|
collection_name=col,
|
||||||
@@ -363,6 +507,7 @@ def _do_search(
|
|||||||
limit=limit,
|
limit=limit,
|
||||||
score_threshold=score_threshold,
|
score_threshold=score_threshold,
|
||||||
query_filter=qfilter,
|
query_filter=qfilter,
|
||||||
|
search_params=SearchParams(hnsw_ef=ef, exact=exact, indexed_only=indexed_only),
|
||||||
)
|
)
|
||||||
|
|
||||||
hits = []
|
hits = []
|
||||||
@@ -438,3 +583,175 @@ def get_point_by_original_id(original_id: str, collection: Optional[str] = None)
|
|||||||
raise
|
raise
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
raise HTTPException(404, str(e))
|
raise HTTPException(404, str(e))
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Payload index management
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
_SCHEMA_TYPE_MAP: Dict[str, PayloadSchemaType] = {
|
||||||
|
t.value: t for t in PayloadSchemaType
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _resolve_schema_type(type_str: str) -> PayloadSchemaType:
|
||||||
|
schema = _SCHEMA_TYPE_MAP.get(type_str.lower())
|
||||||
|
if schema is None:
|
||||||
|
raise HTTPException(400, f"Unknown index type '{type_str}'. Valid: {', '.join(_SCHEMA_TYPE_MAP)}")
|
||||||
|
return schema
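The lookup pattern used by `_resolve_schema_type` (a dict built from enum values, queried with a lowercased string) can be sketched without the Qdrant client. The `FieldType` enum below is a stand-in invented for illustration, not `PayloadSchemaType`, and the sketch raises `ValueError` instead of `HTTPException`.

```python
from enum import Enum
from typing import Dict


class FieldType(str, Enum):
    """Stand-in enum for illustration only; not the Qdrant PayloadSchemaType."""
    KEYWORD = "keyword"
    INTEGER = "integer"
    BOOL = "bool"


# Map each enum value ("keyword", "integer", ...) to its member.
_TYPE_MAP: Dict[str, FieldType] = {t.value: t for t in FieldType}


def resolve_field_type(type_str: str) -> FieldType:
    """Resolve a user-supplied type name case-insensitively."""
    resolved = _TYPE_MAP.get(type_str.lower())
    if resolved is None:
        raise ValueError(
            f"Unknown index type '{type_str}'. Valid: {', '.join(_TYPE_MAP)}"
        )
    return resolved
```

Building the map from the enum means the list of valid names in the error message tracks the enum's members automatically.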
|
||||||
|
|
||||||
|
|
||||||
|
class PayloadIndexRequest(BaseModel):
|
||||||
|
field: str
|
||||||
|
type: str = Field(default="keyword", description="keyword | integer | float | bool | geo | datetime | text | uuid")
|
||||||
|
collection: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
class EnsureIndexesRequest(BaseModel):
|
||||||
|
"""List of field specs, each with 'field' and optional 'type' keys."""
|
||||||
|
fields: List[Dict[str, str]]
|
||||||
|
collection: Optional[str] = None
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/collections/{name}/indexes")
|
||||||
|
def collection_indexes(name: str):
|
||||||
|
"""List all payload indexes for a collection."""
|
||||||
|
try:
|
||||||
|
info = client.get_collection(name)
|
||||||
|
schema = info.payload_schema or {}
|
||||||
|
return {
|
||||||
|
"collection": name,
|
||||||
|
"indexes": {
|
||||||
|
k: {
|
||||||
|
"type": v.data_type.value if hasattr(v.data_type, "value") else str(v.data_type),
|
||||||
|
"points": v.points,
|
||||||
|
}
|
||||||
|
for k, v in schema.items()
|
||||||
|
},
|
||||||
|
"count": len(schema),
|
||||||
|
}
|
||||||
|
except Exception as e:
|
||||||
|
raise HTTPException(404, str(e))
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/collections/{name}/indexes")
|
||||||
|
def create_index(name: str, req: PayloadIndexRequest):
|
||||||
|
"""Create a payload index on a single field."""
|
||||||
|
col = req.collection or name
|
||||||
|
schema = _resolve_schema_type(req.type)
|
||||||
|
try:
|
||||||
|
client.create_payload_index(
|
||||||
|
collection_name=col,
|
||||||
|
field_name=req.field,
|
||||||
|
field_schema=schema,
|
||||||
|
)
|
||||||
|
return {"collection": col, "field": req.field, "type": req.type, "status": "created"}
|
||||||
|
except Exception as e:
|
||||||
|
raise HTTPException(500, str(e))
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/collections/{name}/ensure-indexes")
|
||||||
|
def ensure_indexes(name: str, req: EnsureIndexesRequest):
|
||||||
|
"""Idempotently ensure payload indexes exist for a list of fields.
|
||||||
|
|
||||||
|
Skips fields that are already indexed; only creates the missing ones.
|
||||||
|
Example body: {"fields": [{"field": "is_public", "type": "bool"}, {"field": "category_id", "type": "integer"}]}
|
||||||
|
"""
|
||||||
|
col = req.collection or name
|
||||||
|
try:
|
||||||
|
info = client.get_collection(col)
|
||||||
|
except Exception as e:
|
||||||
|
raise HTTPException(404, str(e))
|
||||||
|
|
||||||
|
existing = set(info.payload_schema.keys()) if info.payload_schema else set()
|
||||||
|
created: List[str] = []
|
||||||
|
skipped: List[str] = []
|
||||||
|
|
||||||
|
for field_spec in req.fields:
|
||||||
|
field = field_spec.get("field")
|
||||||
|
type_str = field_spec.get("type", "keyword")
|
||||||
|
if not field:
|
||||||
|
raise HTTPException(400, "Each field spec must include a 'field' key")
|
||||||
|
if field in existing:
|
||||||
|
skipped.append(field)
|
||||||
|
continue
|
||||||
|
schema = _resolve_schema_type(type_str)
|
||||||
|
try:
|
||||||
|
client.create_payload_index(
|
||||||
|
collection_name=col,
|
||||||
|
field_name=field,
|
||||||
|
field_schema=schema,
|
||||||
|
)
|
||||||
|
created.append(field)
|
||||||
|
except Exception as exc:
|
||||||
|
raise HTTPException(500, f"Failed to index '{field}': {exc}")
|
||||||
|
|
||||||
|
return {"collection": col, "created": created, "skipped": skipped}
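The skip-or-create decision in `ensure_indexes` can be separated from the client calls. The following is a sketch under that assumption; `plan_indexes` is a hypothetical name, and it only decides which fields need an index rather than creating anything.

```python
from typing import Dict, Iterable, List, Set, Tuple


def plan_indexes(
    existing: Set[str], specs: Iterable[Dict[str, str]]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split field specs into (field, type) pairs to create and fields to skip."""
    to_create: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for spec in specs:
        field = spec.get("field")
        if not field:
            raise ValueError("Each field spec must include a 'field' key")
        if field in existing:
            # Already indexed: calling the endpoint again is a no-op for this field.
            skipped.append(field)
            continue
        # Same default as the endpoint: "keyword" when no type is given.
        to_create.append((field, spec.get("type", "keyword")))
    return to_create, skipped
```

Because already-indexed fields are only recorded as skipped, repeating the same request after a successful run creates nothing new.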
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Collection HNSW + optimizer configuration
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class CollectionConfigRequest(BaseModel):
|
||||||
|
hnsw_m: Optional[int] = Field(default=None, ge=4, le=64, description="Edges per node in the HNSW graph.")
|
||||||
|
hnsw_ef_construct: Optional[int] = Field(default=None, ge=10, le=1000, description="ef during index construction. Changes apply to new segments only.")
|
||||||
|
hnsw_on_disk: Optional[bool] = Field(default=None, description="Store HNSW graph on disk (saves RAM, slightly slower queries).")
|
||||||
|
indexing_threshold: Optional[int] = Field(default=None, ge=0, description="Segment size in KB of vectors above which an HNSW index is built; 0 disables indexing.")
|
||||||
|
default_segment_number: Optional[int] = Field(default=None, ge=1, le=32, description="Target number of segments for parallelism.")
|
||||||
|
# Scalar quantization — reduces RAM ~4x, often speeds up search on large collections.
|
||||||
|
# Set quantization_type='int8' to enable. Use always_ram=True to keep quantized
|
||||||
|
# vectors in RAM (recommended on VPS with limited memory but fast disk).
|
||||||
|
quantization_type: Optional[str] = Field(default=None, description="Enable scalar quantization: 'int8'. Set to null to keep current setting.")
|
||||||
|
quantization_quantile: float = Field(default=0.99, ge=0.5, le=1.0, description="Quantile of vector component values used to set the quantization bounds; more extreme values are clipped (0.99 recommended).")
|
||||||
|
quantization_always_ram: bool = Field(default=True, description="Keep quantized vectors in RAM even when raw vectors are on disk.")
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/collections/{name}/configure")
|
||||||
|
def configure_collection(name: str, req: CollectionConfigRequest):
|
||||||
|
"""Apply HNSW and optimizer configuration updates to an existing collection.
|
||||||
|
|
||||||
|
Changes are applied in-place without data loss or re-ingestion.
|
||||||
|
Note: hnsw_m and hnsw_ef_construct only affect newly created segments.
|
||||||
|
"""
|
||||||
|
hnsw_kwargs = {k: v for k, v in {
|
||||||
|
"m": req.hnsw_m,
|
||||||
|
"ef_construct": req.hnsw_ef_construct,
|
||||||
|
"on_disk": req.hnsw_on_disk,
|
||||||
|
}.items() if v is not None}
|
||||||
|
|
||||||
|
opt_kwargs = {k: v for k, v in {
|
||||||
|
"indexing_threshold": req.indexing_threshold,
|
||||||
|
"default_segment_number": req.default_segment_number,
|
||||||
|
}.items() if v is not None}
|
||||||
|
|
||||||
|
# Build optional scalar quantization config
|
||||||
|
quant_config = None
|
||||||
|
if req.quantization_type is not None:
|
||||||
|
if req.quantization_type.lower() != "int8":
|
||||||
|
raise HTTPException(400, f"Unsupported quantization_type '{req.quantization_type}'. Only 'int8' is supported.")
|
||||||
|
quant_config = ScalarQuantizationConfig(
|
||||||
|
type=ScalarType.INT8,
|
||||||
|
quantile=req.quantization_quantile,
|
||||||
|
always_ram=req.quantization_always_ram,
|
||||||
|
)
|
||||||
|
|
||||||
|
if not hnsw_kwargs and not opt_kwargs and quant_config is None:
|
||||||
|
raise HTTPException(400, "No configuration fields provided")
|
||||||
|
|
||||||
|
try:
|
||||||
|
client.update_collection(
|
||||||
|
collection_name=name,
|
||||||
|
hnsw_config=HnswConfigDiff(**hnsw_kwargs) if hnsw_kwargs else None,
|
||||||
|
optimizers_config=OptimizersConfigDiff(**opt_kwargs) if opt_kwargs else None,
|
||||||
|
quantization_config=quant_config,
|
||||||
|
)
|
||||||
|
return {
|
||||||
|
"collection": name,
|
||||||
|
"status": "updated",
|
||||||
|
"hnsw_changes": hnsw_kwargs,
|
||||||
|
"optimizer_changes": opt_kwargs,
|
||||||
|
"quantization": {"type": req.quantization_type, "quantile": req.quantization_quantile, "always_ram": req.quantization_always_ram} if quant_config else None,
|
||||||
|
}
|
||||||
|
except Exception as exc:
|
||||||
|
raise HTTPException(500, str(exc))
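`configure_collection` builds its partial update by dropping every field the caller left unset. A minimal sketch of that filtering step, using a hypothetical helper name `drop_unset`:

```python
from typing import Any, Dict, Optional


def drop_unset(**fields: Optional[Any]) -> Dict[str, Any]:
    """Keep only the keyword arguments whose value is not None."""
    # Compare against None rather than testing truthiness, so that
    # explicit values such as False or 0 are still sent.
    return {key: value for key, value in fields.items() if value is not None}
```

The `is not None` comparison matters for a field like `hnsw_on_disk`, where `False` is a real setting and must not be treated as "unset".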
|
||||||
|
|||||||