# MySQL Slow Query Optimization Plan **Source:** `/var/log/mysql/slow.log` — 68,950 total queries, 151,576s total exec time **Period:** 2026-04-04 → 2026-04-26 (22 days) **Threshold:** 500ms **Server:** server3 / database: `skinbase` --- ## Summary Stats | Metric | Value | |---|---| | Total queries logged | 68,950 | | Unique query fingerprints | 139 | | Total execution time | 151,576s (~42h) | | Average exec time | 2s | | 95th percentile | 3s | | Total rows examined | 15.39B | | Total bytes sent | 40.79GB | --- ## Priority 1 — Critical (fix immediately) ### P1-A: Correlated subquery counting artworks per tag (Query 8) **Total time:** 1,004s · **Calls:** 1,138 · **Rows examined/call:** ~240k **Current query:** ```sql SELECT tags.*, (SELECT count(*) FROM artworks INNER JOIN artwork_tag ON artworks.id = artwork_tag.artwork_id WHERE tags.id = artwork_tag.tag_id AND artworks.deleted_at IS NULL) AS artworks_count FROM tags ORDER BY artworks_count DESC LIMIT 10 ``` **Problem:** N+1 correlated subquery — one full `artworks JOIN artwork_tag` count per tag row. **Fix options (pick one):** 1. **Best — cached counter column:** Add `artworks_count INT DEFAULT 0` to `tags`, maintained by an Eloquent observer on `artwork_tag` attach/detach. Query becomes `SELECT * FROM tags ORDER BY artworks_count DESC LIMIT 10` — instant. 2. **Quick — JOIN + GROUP BY:** ```sql SELECT t.*, COALESCE(cnt.c, 0) AS artworks_count FROM tags t LEFT JOIN ( SELECT at.tag_id, COUNT(*) AS c FROM artwork_tag at JOIN artworks a ON a.id = at.artwork_id AND a.deleted_at IS NULL GROUP BY at.tag_id ) cnt ON cnt.tag_id = t.id ORDER BY artworks_count DESC LIMIT 10; ``` **Migration needed:** `php artisan make:migration add_artworks_count_to_tags` --- ### P1-B: Correlated subquery counting artworks per user (Query 33) **Total time:** 85s · **Calls:** 131 · **Rows examined/call:** ~61k **Current query:** ```sql SELECT users.*, (SELECT count(*) FROM artworks WHERE users.id = artworks.user_id AND is_approved=1 AND is_public=1 ...) AS artworks_count FROM users HAVING artworks_count > 0 ORDER BY artworks_count DESC LIMIT 6 ``` **Problem:** Same correlated N+1 pattern per user. `HAVING` on a subquery forces a full users scan. **Fix:** Use the existing `user_statistics` table. Add `public_artworks_count INT DEFAULT 0` if not present, maintained by artwork publish/unpublish observer. Then: ```sql SELECT users.*, us.public_artworks_count AS artworks_count FROM users JOIN user_statistics us ON us.user_id = users.id WHERE us.public_artworks_count > 0 AND users.deleted_at IS NULL ORDER BY us.public_artworks_count DESC LIMIT 6; ``` --- ### P1-C: Jobs table LIKE scan on JSON payload (Query 38) **Total time:** 64s · **Calls:** 58 · **Rows examined/call:** ~55k **Current query:** ```sql SELECT count(*) FROM jobs WHERE payload LIKE '%AutoTagArtworkJob%' AND payload LIKE '%69756%' ``` **Problem:** Full scan of the `jobs` table JSON payload column — no index possible on LIKE '%...%'. **Fix:** Replace this deduplication check with a Redis key or a dedicated `job_dedup` table with an indexed `(job_class, subject_id)` column. Example: ```php // Instead of scanning jobs table: if (Cache::has("auto-tag-queued:{$artworkId}")) return; Cache::put("auto-tag-queued:{$artworkId}", true, now()->addHours(1)); AutoTagArtworkJob::dispatch($artwork); ``` --- ### P1-D: `SELECT DISTINCT artwork_id` from snapshots with no index (Query 12) **Total time:** 923s · **Calls:** 280 · **Rows examined/call:** ~3.97M (max 8.3M!) **Current query:** ```sql SELECT DISTINCT artwork_id FROM artwork_metric_snapshots_hourly WHERE bucket_hour BETWEEN '...' AND '...' ``` **Problem:** No covering index on `(bucket_hour, artwork_id)`. Full or large range scan every time. **Fix:** Add a compound index: ```sql ALTER TABLE artwork_metric_snapshots_hourly ADD INDEX idx_bucket_artwork (bucket_hour, artwork_id); ``` Migration: `php artisan make:migration add_index_bucket_artwork_to_metric_snapshots` --- ### P1-E: Full-text LIKE searches on artworks title + description (Queries 50, 53) **Total time:** ~62s combined · **Calls:** ~48 · Pattern: `WHERE title LIKE '%keyword%' OR description LIKE '%keyword%' OR ...` (20+ OR conditions) **Problem:** Leading wildcard LIKE cannot use B-tree indexes. Full table scan every time. **Fix:** Use **Meilisearch** (already in use for artwork search). Route AI-tag search queries through `ArtworkSearchService` instead of raw LIKE. For any fallback that must stay in MySQL: ```sql -- Add FULLTEXT index: ALTER TABLE artworks ADD FULLTEXT INDEX ft_artwork_text (title, description); -- Then use MATCH..AGAINST instead of LIKE: WHERE MATCH(title, description) AGAINST ('+moon +lunar' IN BOOLEAN MODE) ``` --- ## Priority 2 — High Impact ### P2-A: Artwork aggregate stats queries — top 2 time consumers (Queries 1 & 2) **Total time:** 83,766s + 34,068s = **117,834s** (78% of all slow query time) **Calls:** 37,124 + 13,789 = ~51k · **Avg:** 2.2s · **Rows examined/call:** ~215–234k These are the same heavy SELECT pattern loading per-artwork stats from multiple tables: ```sql SELECT a.id, a.user_id, a.published_at, a.is_public, a.is_approved, (a.thumb_ext IS NOT NULL AND a.thumb_ext != '') AS has_thumbnail, COALESCE(ast.views, 0) AS views_all, COALESCE(ast.downloads, 0) AS downloads_all, COALESCE(ast.favorites, 0) AS favourites_all, COALESCE(cc.cnt, 0) AS comments_count, COALESCE(sc.cnt, 0) AS shares_count, COALESCE(ast.views_7d, 0) AS views_7d, ... FROM artworks a LEFT JOIN artwork_stats ast ON ast.artwork_id = a.id LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_favourites WHERE created_at >= ...) fav7 ... LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_comments ...) cc ... LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_shares ...) sc ... WHERE a.is_public = 1 AND a.is_approved = 1 AND a.deleted_at IS NULL AND a.published_at <= NOW() ORDER BY a.id ASC LIMIT 500 ``` **Problems:** - Inline derived subqueries for `fav7`, `cc`, `sc` run per page — not cached. - `ORDER BY a.id` with filters requires index on `(deleted_at, is_public, is_approved, published_at, id)`. - 51k calls/22 days = ~2,300 calls/day = every ~37 seconds, all day long. This is a **scheduled job or background process** churning through artworks. **Fix — composite index (immediate):** ```sql ALTER TABLE artworks ADD INDEX idx_public_approved_published (deleted_at, is_public, is_approved, published_at, id); ``` **Fix — pre-aggregate counts (medium term):** Ensure `artwork_stats` already stores `favorites_7d`, `comments_count`, `shares_count`. If so, move all counts to `artwork_stats` maintenance jobs and remove the inline derived joins. The SELECT becomes a single fast `LEFT JOIN artwork_stats`. **Fix — reduce call frequency:** If this is a scheduler-driven scan, batch it into chunks with exponential backoff and persist cursor position so it doesn't re-scan from scratch every run. --- ### P2-B: User stats calculation — 28 second average (Query 3) **Total time:** 8,847s · **Calls:** 320 · **Avg:** 28s · **Max:** 180s! Complex query joining `users`, `user_xp_logs`, `user_followers`, `artwork_likes`, `artworks`, `artwork_metric_snapshots_hourly`. **Fix:** This should **never run on demand**. Route it to: 1. A scheduled background job that pre-aggregates into `user_statistics` (runs every N minutes). 2. The controller/service reads from `user_statistics` only — single-row lookup by `user_id`. Also ensure `user_statistics` has `INDEX idx_user_id (user_id)`. --- ### P2-C: `artwork_metric_snapshots_hourly` heavy join query (Query 4) **Total time:** 6,328s · **Calls:** 237 · **Avg:** 27s · **Max:** 52s ```sql SELECT artworks.*, ... FROM artworks JOIN artwork_metric_snapshots_hourly amsh ON amsh.artwork_id = artworks.id JOIN artwork_likes al ON ... JOIN artwork_downloads ad ON ... JOIN artwork_comments ac ON ... WHERE amsh.bucket_hour BETWEEN ... AND ... ``` **Fix:** 1. Add index from P1-D: `idx_bucket_artwork (bucket_hour, artwork_id)`. 2. Pre-aggregate hourly snapshots into daily/weekly summary tables and query those instead. 3. Reduce time range of `BETWEEN` clause if querying recent data only. --- ### P2-D: rank_artwork_scores queries — ~10 variants (Queries 26, 27, 29, 37, 39, 41, 42, 43, 46) **Total time:** ~750s combined · ~870 combined calls · **Avg:** ~750ms · **Rows examined/call:** ~140–150k Pattern: ```sql SELECT ras.artwork_id, a.user_id, ras.score_trending FROM rank_artwork_scores ras INNER JOIN artworks a ON a.id = ras.artwork_id AND a.is_public=1 AND a.is_approved=1 AND a.deleted_at IS NULL WHERE ras.model_version = 'rank_v2' ORDER BY ras.score_trending DESC LIMIT 200 ``` **Problem:** `WHERE model_version = 'rank_v2' ORDER BY score_X DESC` — no composite index covers both. **Fix — add partial/composite indexes:** ```sql ALTER TABLE rank_artwork_scores ADD INDEX idx_mv_trending (model_version, score_trending DESC), ADD INDEX idx_mv_new_hot (model_version, score_new_hot DESC), ADD INDEX idx_mv_best (model_version, score_best DESC), ADD INDEX idx_mv_score (model_version, score_new_hot, score_trending, score_best); ``` **Fix — cache ranking results:** These are pre-computed ranking scores. Cache the TOP 200 list per `(model_version, score_column)` for 5–15 minutes in Redis. The ranking job already runs on a schedule — warm the cache at the end of each ranking job run. --- ### P2-E: `artworks` public count via full scan (Query 30) **Total time:** 100s · **Calls:** 127 · **Avg:** 790ms · **Rows examined/call:** 97k ```sql SELECT count(*) FROM artworks WHERE deleted_at IS NULL AND is_approved=1 AND is_public=1 AND published_at IS NOT NULL AND published_at <= NOW() ``` **Fix — maintain a counter cache:** ```php // In config or cache: Cache::remember('artworks.public_count', 300, fn() => Artwork::public()->published()->count()); ``` Or store the count in a `site_statistics` / `system_settings` table, updated by the publish observer. --- ### P2-F: artworks ORDER BY id scanner for publish pipeline (Queries 18, 22, 23, 40, 47) **Total time:** ~650s combined · Many calls · Pattern: `SELECT * FROM artworks WHERE ... ORDER BY id ASC` or `ORDER BY trending_score_7d DESC` **Problem:** `SELECT *` loads all columns including large `description` blobs. The publish/ranking pipeline only needs IDs. **Fix:** - Use `SELECT id` or `SELECT id, user_id` instead of `SELECT *`. - Ensure `trending_score_7d` is indexed if using `ORDER BY trending_score_7d`. - Add index from P2-A: `idx_public_approved_published`. --- ## Priority 3 — Medium Impact ### P3-A: Popular tags query with tag_interaction_daily_metrics (Query 6) **Total time:** 2,234s · **Calls:** 2,349 · **Avg:** 951ms Joining `artworks → artwork_tag → tags → tag_interaction_daily_metrics`. 246k rows examined/call. **Fix:** - Cache the result: popular tags change slowly — cache 5–15 minutes. - Add index on `tag_interaction_daily_metrics (tag_id, metric_date)`. - Precompute `tag_interaction_daily_metrics` aggregates into a `tag_trending_scores` table. --- ### P3-B: Browse/gallery category + tag joins (Query 7) **Total time:** 1,758s · **Calls:** 2,344 · **Avg:** 750ms `artworks + categories + artwork_category + artwork_tag` — 80M total rows examined. **Fix:** - Verify indexes: `artwork_category(category_id, artwork_id)`, `artwork_tag(tag_id, artwork_id)`. - Pagination: ensure cursor/keyset pagination is used, not `OFFSET`. - Cache browse results per category (already partially done in HomepageService). --- ### P3-C: artwork_metric_snapshots_hourly backup full scan (Query 15) **Total time:** 646s · **Calls:** 93 · **Max:** 127s · **User:** `backuper` `SELECT /*!40001 SQL_NO_CACHE */ * FROM artwork_metric_snapshots_hourly` — mysqldump reading full table. **Fix (ops):** - Partition `artwork_metric_snapshots_hourly` by month on `bucket_hour`. Backup only reads the active partition. - Or: exclude this table from hot backup and back it up separately during low-traffic window (02:00–04:00 UTC). - Archive data older than 90 days to a cold table. --- ### P3-D: `artworks LEFT JOIN artwork_stats` with OR condition (Query 32) **Total time:** 86s · **Calls:** 110 ```sql WHERE (artworks.created_at >= '...' OR (s.ranking_score IS NOT NULL AND s.ranking_score > 0)) ``` **Problem:** OR prevents index usage on `created_at`. **Fix:** Rewrite as UNION: ```sql SELECT id FROM artworks WHERE created_at >= '...' AND deleted_at IS NULL AND is_approved=1 UNION SELECT a.id FROM artworks a JOIN artwork_stats s ON s.artwork_id = a.id WHERE s.ranking_score > 0 AND a.deleted_at IS NULL AND a.is_approved=1 ``` --- ### P3-E: GROUP BY user_id from artworks (Query 31) **Total time:** 92s · **Calls:** 110 ```sql SELECT a.user_id, COALESCE(us.followers_count, 0), COALESCE(us.favorites_received_count, 0) FROM artworks a LEFT JOIN user_statistics us ON us.user_id = a.user_id WHERE a.is_public=1 AND a.is_approved=1 AND a.deleted_at IS NULL GROUP BY a.user_id, us.followers_count, us.favorites_received_count ``` **Problem:** Full artworks scan + GROUP BY without a covering index. **Fix:** Add index `(deleted_at, is_public, is_approved, user_id)` on artworks. Or pre-aggregate into `user_statistics` and query that directly without touching `artworks`. --- ### P3-F: Forum posts moderation scan (Query 62) **Total time:** 12s · **Calls:** 15 ```sql SELECT * FROM forum_posts WHERE (moderation_checked = 0 OR last_ai_scan_at IS NULL OR updated_at > last_ai_scan_at) AND id IS NOT NULL AND deleted_at IS NULL ORDER BY id ASC LIMIT 50 ``` **Fix:** Add a partial index for unmoderated posts: ```sql ALTER TABLE forum_posts ADD INDEX idx_needs_moderation (moderation_checked, last_ai_scan_at, id) WHERE moderation_checked = 0; ``` Or maintain a `moderation_queue` table with only pending IDs. --- ## Recommended Index Migrations Apply these in order (fastest wins first): ```bash php artisan make:migration add_performance_indexes_batch1 ``` ```php public function up(): void { // P1-D: snapshot bucket+artwork lookup Schema::table('artwork_metric_snapshots_hourly', function (Blueprint $table) { $table->index(['bucket_hour', 'artwork_id'], 'idx_bucket_artwork'); }); // P2-A + P2-F: public artwork scans & ORDER BY id Schema::table('artworks', function (Blueprint $table) { $table->index( ['deleted_at', 'is_public', 'is_approved', 'published_at', 'id'], 'idx_public_approved_published_id' ); $table->index( ['deleted_at', 'is_public', 'is_approved', 'user_id'], 'idx_public_approved_user' ); }); // P2-D: rank_artwork_scores per model_version Schema::table('rank_artwork_scores', function (Blueprint $table) { $table->index(['model_version', 'score_trending'], 'idx_mv_trending'); $table->index(['model_version', 'score_new_hot'], 'idx_mv_new_hot'); $table->index(['model_version', 'score_best'], 'idx_mv_best'); $table->index(['model_version', 'score_new_hot'], 'idx_mv_score'); }); // P3-A: tag daily metrics Schema::table('tag_interaction_daily_metrics', function (Blueprint $table) { $table->index(['tag_id', 'metric_date'], 'idx_tag_date'); }); // P1-A: tag artworks_count cached column Schema::table('tags', function (Blueprint $table) { $table->unsignedInteger('artworks_count')->default(0)->after('slug'); $table->index('artworks_count', 'idx_artworks_count'); }); } ``` --- ## Caching Quick Wins (no schema change needed) | Surface | Cache Key | TTL | Notes | |---|---|---|---| | Popular tags | `tags.popular.10` | 15 min | Currently: correlated subquery per request | | Ranking top-200 lists | `rank.{model}.{score}.200` | 10 min | Warm at end of ranking job | | Public artwork count | `artworks.public_count` | 5 min | Used in sitemaps, stats | | User artworks_count | `user.{id}.artworks_count` | 5 min | Warm on publish/unpublish | | Group leaderboard | `leaderboard.group.monthly.5` | 30 min | Already in leaderboards service | --- ## Implementation Roadmap | Phase | Items | Effort | Expected gain | |---|---|---|---| | **Week 1** | P1-A correlated tag query, P1-C jobs LIKE, P1-D snapshot index, all P2-D rank indexes | Low–Med | ~15–20% reduction in slow queries | | **Week 2** | P2-A artworks composite index, P1-B user artworks_count, caching quick wins table | Med | ~40–50% reduction; kills #1 and #2 slow query families | | **Week 3** | P2-B user stats background job, P2-C snapshot pre-aggregation, P1-E fulltext index | High | ~65–75% reduction; kills 28s queries | | **Month 2** | P2-F SELECT * → SELECT id, P3-C partition snapshots table, P3-D OR→UNION rewrite | High | Remaining tail | --- ## Monitoring After Changes ```bash # Reset slow query log on server mysql -e "FLUSH SLOW LOGS;" # Re-run pt-query-digest after 1 week: pt-query-digest /var/log/mysql/slow.log > /tmp/slow-report-after.txt # Check query plan for top queries: EXPLAIN SELECT ... \G ``` Key metrics to watch: - Total slow query count per day (target: -50% in week 2) - `Rows_examined` for artwork queries (target: <10k instead of 234k) - MySQL CPU usage during ranking job windows