optimizations

This commit is contained in:
2026-03-28 19:15:39 +01:00
parent 0b25d9570a
commit cab4fbd83e
509 changed files with 1016804 additions and 1605 deletions

@@ -2,6 +2,8 @@
Covers the trending system, following feed, personalized homepage, similar artworks, unified activity feed, and all input signal collection that powers the ranking formula.
This document also covers the v3 AI discovery layer: vision metadata extraction, vector indexing, AI similar-artwork search, reverse image search, and the hybrid feed section controls.
---
## Table of Contents
@@ -19,13 +21,14 @@ Covers the trending system, following feed, personalized homepage, similar artwo
11. [Caching Strategy](#11-caching-strategy)
12. [Scheduled Jobs](#12-scheduled-jobs)
13. [Testing](#13-testing)
14. [Operational Runbook](#14-operational-runbook)
14. [AI Discovery v3](#14-ai-discovery-v3)
15. [Operational Runbook](#15-operational-runbook)
---
## 1. Architecture Overview
```
```text
Browser
├─ POST /api/art/{id}/view → ArtworkViewController
@@ -74,7 +77,7 @@ Browser
**Deduplication (layered):**
| Layer | Mechanism | Scope |
|---|---|---|
| --- | --- | --- |
| Client-side | `sessionStorage` key `sb_viewed_{id}` set before the request | Browser tab lifetime |
| Server-side | `$request->session()->put('art_viewed.{id}', true)` | Laravel session lifetime |
| Throttle | `throttle:5,10` route middleware | Per-IP per-artwork |
@@ -83,7 +86,7 @@ The React component `ArtworkActions.jsx` fires a `useEffect` on mount that check
**What gets incremented:**
```
```text
artwork_stats.views +1 (all-time)
artwork_stats.views_24h +1 (zeroed nightly)
artwork_stats.views_7d +1 (zeroed weekly)
@@ -101,6 +104,7 @@ Via `ArtworkStatsService::incrementViews()` with `defer: true` (Redis when avail
**Throttle:** 10 requests per minute per IP
The endpoint:
1. Inserts a row in `artwork_downloads` (persistent event log with `created_at`)
2. Increments `artwork_stats.downloads`, `downloads_24h`, `downloads_7d`
3. Returns `{"ok": true, "url": "<highest-res thumbnail URL>"}` for the native browser download
@@ -109,7 +113,7 @@ The `<a download>` buttons in `ArtworkActions.jsx` call `trackDownload()` on cli
**What gets incremented:**
```
```text
artwork_downloads INSERT (event log, persisted forever)
artwork_stats.downloads +1 (all-time)
artwork_stats.downloads_24h +1 (recomputed from log nightly)
@@ -124,7 +128,7 @@ Via `ArtworkStatsService::incrementDownloads()` with `defer: true`.
### 2.3 Other signals (already existed)
| Signal | Endpoint / Service | Written to |
|---|---|---|
| --- | --- | --- |
| Favorite toggle | `POST /api/artworks/{id}/favorite` | `user_favorites`, `artwork_stats.favorites` |
| Reaction toggle | `POST /api/artworks/{id}/reactions` | `artwork_reactions` |
| Award | `ArtworkAwardController` | `artwork_award_stats.score_total` |
@@ -156,7 +160,7 @@ The trending formula needs _recent_ activity, not all-time totals. `artwork_stat
The solution is four cached window columns refreshed on a schedule:
| Column | Meaning | Reset cadence |
|---|---|---|
| --- | --- | --- |
| `views_24h` | Views since last midnight reset | Nightly at 03:30 |
| `views_7d` | Views since last Monday reset | Weekly (Mon) at 03:30 |
| `downloads_24h` | Downloads in last 24 h | Nightly at 03:30 (recomputed from log) |
@@ -194,7 +198,7 @@ Uses chunked PHP loop (no `GREATEST()` / `INTERVAL` MySQL syntax) → works in b
### 4.1 Formula
```
```text
score = (award_score × 5.0)
+ (favorites × 3.0)
+ (reactions × 2.0)
@@ -210,7 +214,7 @@ Weights are constants in `TrendingService` (`W_AWARD`, `W_FAVORITE`, etc.) — a
### 4.2 Output columns
| Artworks column | Meaning |
|---|---|
| --- | --- |
| `trending_score_24h` | Score using `views_24h` + `downloads_24h`; targets artworks ≤ 7 days old |
| `trending_score_7d` | Score using `views_7d` + `downloads_7d`; targets artworks ≤ 30 days old |
| `last_trending_calculated_at` | Timestamp of last calculation |
@@ -244,7 +248,7 @@ php artisan skinbase:recalculate-trending --chunk=500 # smaller DB
All routes under `/discover/*` are registered in `routes/web.php` and handled by `App\Http\Controllers\Web\DiscoverController`. All use **Meilisearch sorting** — no SQL `ORDER BY` in the hot path.
| Route | Name | Sort key | Auth |
|---|---|---|---|
| --- | --- | --- | --- |
| `/discover/trending` | `discover.trending` | `trending_score_7d:desc` | No |
| `/discover/fresh` | `discover.fresh` | `created_at:desc` | No |
| `/discover/top-rated` | `discover.top-rated` | `likes:desc` | No |
@@ -260,7 +264,7 @@ All routes under `/discover/*` are registered in `routes/web.php` and handled by
### Logic
```
```text
1. Get user's following IDs from user_followers
2. If empty → show empty state (see below)
3. If present → Artwork::whereIn('user_id', $followingIds)
@@ -321,7 +325,7 @@ When the user follows nobody:
Computes preferences from the user's **favourited artworks**:
| Output key | Source |
|---|---|
| --- | --- |
| `top_tags` (up to 5) | Tags on artworks in `artwork_favourites` |
| `top_categories` (up to 3) | Categories on artworks in `artwork_favourites` |
| `followed_creators` | IDs from `user_followers` |
@@ -354,7 +358,7 @@ Falls back to `getTrendingFromDb()` — `orderByDesc('trending_score_7d')` with
Meilisearch filters are built in priority order:
```
```text
is_public = true
is_approved = true
id != {source_id}
@@ -377,7 +381,7 @@ Meilisearch's own ranking then sorts by relevance within those filters. Results
### `activity_events` schema
| Column | Type | Notes |
|---|---|---|
| --- | --- | --- |
| `id` | bigint PK | |
| `actor_id` | bigint FK users | Who did the action |
| `type` | varchar | `upload` `comment` `favorite` `award` `follow` |
@@ -389,7 +393,7 @@ Meilisearch's own ranking then sorts by relevance within those filters. Results
### Where events are recorded
| Event type | Recording point |
|---|---|
| --- | --- |
| `upload` | `UploadController::finish()` on publish |
| `follow` | `FollowService::follow()` |
| `award` | `ArtworkAwardController::store()` |
@@ -412,6 +416,7 @@ The controller enriches each event batch with its target objects in a single que
Configured in `config/scout.php` under `meilisearch.index-settings`.
Push settings to a running instance:
```bash
php artisan scout:sync-index-settings
```
@@ -419,6 +424,7 @@ php artisan scout:sync-index-settings
### Artworks index settings
**Searchable attributes** (ranked in order):
1. `title`
2. `tags`
3. `author_name`
@@ -451,7 +457,7 @@ php artisan scout:sync-index-settings
## 11. Caching Strategy
| Data | Cache key | TTL | Driver |
|---|---|---|---|
| --- | --- | --- | --- |
| Homepage trending | `homepage.trending.{limit}` | 5 min | Redis/file |
| Homepage fresh | `homepage.fresh.{limit}` | 5 min | Redis/file |
| Homepage hero | `homepage.hero` | 5 min | Redis/file |
@@ -461,6 +467,7 @@ php artisan scout:sync-index-settings
| Similar artworks | `api.similar.{artwork_id}` | 5 min | Redis/file |
**Rules:**
- Personalized data (`from_following`, `by_tags`, `by_categories`) is **not** independently cached — it falls inside `allForUser()` which is called fresh per request.
- Long-running cache busting: the trending command and reset command do not explicitly clear cache — the TTL is short enough that stale data self-expires within one trending cycle.
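
The self-expiry rule above can be illustrated with a small language-neutral sketch. It is not the project's Laravel cache code: `TtlCache`, `homepage_trending`, and the injected `load` callable are hypothetical names, and only the `homepage.trending.{limit}` key shape and the 5-minute TTL come from the table.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """In-memory stand-in for the Redis/file cache; entries expire by TTL only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def remember(self, key: str, ttl_seconds: float, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: serve the cached value
        value = compute()  # missing or expired: rebuild and store
        self._store[key] = (now + ttl_seconds, value)
        return value


def homepage_trending(cache: TtlCache, limit: int, load: Callable[[int], Any]) -> Any:
    # Key shape and 5-minute TTL follow the caching table; `load` is hypothetical.
    return cache.remember(f"homepage.trending.{limit}", 5 * 60, lambda: load(limit))
```

Because nothing clears the key explicitly, a recalculated trending score becomes visible at the latest one TTL after the previous cache fill.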
@@ -471,7 +478,7 @@ php artisan scout:sync-index-settings
All registered in `routes/console.php` via `Schedule::command()`.
| Time | Command | Purpose |
|---|---|---|
| --- | --- | --- |
| Every 30 min | `skinbase:recalculate-trending --period=24h` | Update `trending_score_24h` |
| Every 30 min | `skinbase:recalculate-trending --period=7d --skip-index` | Update `trending_score_7d` (background) |
| 03:00 daily | `uploads:cleanup` | Remove stale draft uploads |
@@ -489,7 +496,7 @@ All registered in `routes/console.php` via `Schedule::command()`.
All tests live under `tests/Feature/Discovery/`.
| Test file | Coverage |
|---|---|
| --- | --- |
| `ActivityEventRecordingTest.php` | `ActivityEvent::record()`, all 5 types, actor relation, meta, route smoke tests for the activity feed |
| `FollowingFeedTest.php` | Auth redirect, empty state fallback, pagination, creator exclusion |
| `HomepagePersonalizationTest.php` | Guest vs auth homepage sections, preferences shape, 200 responses |
@@ -499,11 +506,13 @@ All tests live under `tests/Feature/Discovery/`.
| `WindowedStatsTest.php` | `incrementViews/Downloads` update all 3 columns, reset command zeros views, recomputes downloads from log, window boundary correctness |
Run all discovery tests:
```bash
php artisan test tests/Feature/Discovery/
```
Run specific suite:
```bash
php artisan test tests/Feature/Discovery/SignalTrackingTest.php
```
@@ -512,7 +521,74 @@ php artisan test tests/Feature/Discovery/SignalTrackingTest.php
---
## 14. Operational Runbook
## 14. AI Discovery v3
### 14.1 Overview
The v3 layer augments the existing recommendation engine with:
- CLIP-derived embeddings and tags
- BLIP captions
- YOLO object detections
- vector-gateway similarity search
- hybrid feed reranking and section generation
Primary request paths:
- `GET /api/art/{id}/similar-ai`
- `POST /api/search/image`
- `POST /api/uploads/{id}/vision-suggest`
Primary async jobs:
- `AutoTagArtworkJob`
- `GenerateArtworkEmbeddingJob`
- `SyncArtworkVectorIndexJob`
- `BackfillArtworkVectorIndexJob`
### 14.2 Core configuration
Vision gateway:
- `VISION_ENABLED`
- `VISION_GATEWAY_URL`
- `VISION_GATEWAY_TIMEOUT`
- `VISION_GATEWAY_CONNECT_TIMEOUT`
Vector gateway:
- `VISION_VECTOR_GATEWAY_ENABLED`
- `VISION_VECTOR_GATEWAY_URL`
- `VISION_VECTOR_GATEWAY_API_KEY`
- `VISION_VECTOR_GATEWAY_COLLECTION`
- `VISION_VECTOR_GATEWAY_UPSERT_ENDPOINT`
- `VISION_VECTOR_GATEWAY_SEARCH_ENDPOINT`
Hybrid feed:
- `DISCOVERY_V3_ENABLED`
- `DISCOVERY_V3_CACHE_TTL_MINUTES`
- `DISCOVERY_V3_VECTOR_SIMILARITY_WEIGHT`
- `DISCOVERY_V3_VECTOR_BASE_SCORE`
- `DISCOVERY_V3_MAX_SEED_ARTWORKS`
- `DISCOVERY_V3_VECTOR_CANDIDATE_POOL`
AI section sizing:
- `DISCOVERY_V3_SECTION_SIMILAR_STYLE_LIMIT`
- `DISCOVERY_V3_SECTION_YOU_MAY_ALSO_LIKE_LIMIT`
- `DISCOVERY_V3_SECTION_VISUALLY_RELATED_LIMIT`
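
As a rough illustration of how the hybrid feed settings might be consumed, the sketch below reads four of the variables listed above and combines them into a candidate score. The default values, the `DiscoveryV3Config` type, and the `base + weight × similarity` combination are all assumptions made for this example; the document does not state the actual formula or defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DiscoveryV3Config:
    enabled: bool
    cache_ttl_minutes: int
    vector_similarity_weight: float
    vector_base_score: float


def _flag(value: str) -> bool:
    # Accept the usual truthy spellings found in .env files.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env: Mapping[str, str] = os.environ) -> DiscoveryV3Config:
    # Every default here is a placeholder, not the project's real default.
    return DiscoveryV3Config(
        enabled=_flag(env.get("DISCOVERY_V3_ENABLED", "false")),
        cache_ttl_minutes=int(env.get("DISCOVERY_V3_CACHE_TTL_MINUTES", "5")),
        vector_similarity_weight=float(env.get("DISCOVERY_V3_VECTOR_SIMILARITY_WEIGHT", "1.0")),
        vector_base_score=float(env.get("DISCOVERY_V3_VECTOR_BASE_SCORE", "0.0")),
    )


def vector_candidate_score(config: DiscoveryV3Config, similarity: float) -> float:
    # Assumed combination: a fixed base score plus weighted vector similarity.
    return config.vector_base_score + config.vector_similarity_weight * similarity
```

Passing the environment as a mapping keeps the sketch testable without touching real process variables.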
### 14.3 Behavior notes
- Publishing an upload does not block on AI processing: derivatives can complete, and the AI jobs are queued after the upload is finalized.
- The synchronous `vision-suggest` endpoint is only for immediate upload-step prefill and does not replace the queued persistence path.
- `similar-ai` and reverse image search return vector-gateway results only when the gateway is configured; otherwise they fail closed with explicit JSON reasons.
- Discovery sections are now tunable from config rather than fixed in code, so section limits and weights can be adjusted in production without code changes.
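
The fail-closed behavior of `similar-ai` can be sketched as follows. This is a language-neutral illustration, not the controller code: the status codes, the `reason` strings, the response shape, and the injected `search` callable are hypothetical; only the rule "no configured gateway means an explicit JSON reason instead of results" comes from the notes above.

```python
from typing import Any, Callable, Dict, List, Optional


def similar_ai_response(
    gateway_enabled: bool,
    gateway_url: Optional[str],
    search: Callable[[str, int], List[int]],
    artwork_id: int,
    limit: int = 12,
) -> Dict[str, Any]:
    # Fail closed: never fall back silently to another ranking source.
    if not gateway_enabled:
        return {"status": 503, "body": {"data": [], "reason": "vector_gateway_disabled"}}
    if not gateway_url:
        return {"status": 503, "body": {"data": [], "reason": "vector_gateway_not_configured"}}

    # Gateway is configured: return its results, trimmed to the requested size.
    ids = search(gateway_url, artwork_id)
    return {"status": 200, "body": {"data": ids[:limit]}}
```

Returning a machine-readable reason lets the frontend hide the AI section cleanly instead of rendering an empty or misleading list.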
---
## 15. Operational Runbook
### Trending scores are stuck / not updating