Files
SkinbaseNova/docs/slow-query-optimization-plan.md

17 KiB
Raw Blame History

MySQL Slow Query Optimization Plan

Source: /var/log/mysql/slow.log — 68,950 total queries, 151,576s total exec time
Period: 2026-04-04 → 2026-04-26 (22 days)
Threshold: 500ms
Server: server3 / database: skinbase


Summary Stats

Metric Value
Total queries logged 68,950
Unique query fingerprints 139
Total execution time 151,576s (~42h)
Average exec time 2s
95th percentile 3s
Total rows examined 15.39B
Total bytes sent 40.79GB

Priority 1 — Critical (fix immediately)

P1-A: Correlated subquery counting artworks per tag (Query 8)

Total time: 1,004s · Calls: 1,138 · Rows examined/call: ~240k

Current query:

SELECT tags.*, 
  (SELECT count(*) FROM artworks 
   INNER JOIN artwork_tag ON artworks.id = artwork_tag.artwork_id 
   WHERE tags.id = artwork_tag.tag_id AND artworks.deleted_at IS NULL) 
   AS artworks_count
FROM tags
ORDER BY artworks_count DESC
LIMIT 10

Problem: N+1 correlated subquery — one full artworks JOIN artwork_tag count per tag row.

Fix options (pick one):

  1. Best — cached counter column: Add artworks_count INT DEFAULT 0 to tags, maintained by an Eloquent observer on artwork_tag attach/detach. Query becomes SELECT * FROM tags ORDER BY artworks_count DESC LIMIT 10 — instant.
  2. Quick — JOIN + GROUP BY:
SELECT t.*, COALESCE(cnt.c, 0) AS artworks_count
FROM tags t
LEFT JOIN (
  SELECT at.tag_id, COUNT(*) AS c
  FROM artwork_tag at
  JOIN artworks a ON a.id = at.artwork_id AND a.deleted_at IS NULL
  GROUP BY at.tag_id
) cnt ON cnt.tag_id = t.id
ORDER BY artworks_count DESC
LIMIT 10;

Migration needed: php artisan make:migration add_artworks_count_to_tags


P1-B: Correlated subquery counting artworks per user (Query 33)

Total time: 85s · Calls: 131 · Rows examined/call: ~61k

Current query:

SELECT users.*, 
  (SELECT count(*) FROM artworks 
   WHERE users.id = artworks.user_id AND is_approved=1 AND is_public=1 ...) 
   AS artworks_count
FROM users
HAVING artworks_count > 0
ORDER BY artworks_count DESC
LIMIT 6

Problem: Same correlated N+1 pattern per user. HAVING on a subquery forces a full users scan.

Fix: Use the existing user_statistics table. Add public_artworks_count INT DEFAULT 0 if not present, maintained by artwork publish/unpublish observer. Then:

SELECT users.*, us.public_artworks_count AS artworks_count
FROM users
JOIN user_statistics us ON us.user_id = users.id
WHERE us.public_artworks_count > 0 AND users.deleted_at IS NULL
ORDER BY us.public_artworks_count DESC
LIMIT 6;

P1-C: Jobs table LIKE scan on JSON payload (Query 38)

Total time: 64s · Calls: 58 · Rows examined/call: ~55k

Current query:

SELECT count(*) FROM jobs 
WHERE payload LIKE '%AutoTagArtworkJob%' AND payload LIKE '%69756%'

Problem: Full scan of the jobs table JSON payload column — no index possible on LIKE '%...%'.

Fix: Replace this deduplication check with a Redis key or a dedicated job_dedup table with an indexed (job_class, subject_id) column. Example:

// Instead of scanning jobs table:
if (Cache::has("auto-tag-queued:{$artworkId}")) return;
Cache::put("auto-tag-queued:{$artworkId}", true, now()->addHours(1));
AutoTagArtworkJob::dispatch($artwork);

P1-D: SELECT DISTINCT artwork_id from snapshots with no index (Query 12)

Total time: 923s · Calls: 280 · Rows examined/call: ~3.97M (max 8.3M!)

Current query:

SELECT DISTINCT artwork_id 
FROM artwork_metric_snapshots_hourly
WHERE bucket_hour BETWEEN '...' AND '...'

Problem: No covering index on (bucket_hour, artwork_id). Full or large range scan every time.

Fix: Add a compound index:

ALTER TABLE artwork_metric_snapshots_hourly 
  ADD INDEX idx_bucket_artwork (bucket_hour, artwork_id);

Migration: php artisan make:migration add_index_bucket_artwork_to_metric_snapshots


P1-E: Full-text LIKE searches on artworks title + description (Queries 50, 53)

Total time: ~62s combined · Calls: ~48 · Pattern: WHERE title LIKE '%keyword%' OR description LIKE '%keyword%' OR ... (20+ OR conditions)

Problem: Leading wildcard LIKE cannot use B-tree indexes. Full table scan every time.

Fix: Use Meilisearch (already in use for artwork search). Route AI-tag search queries through ArtworkSearchService instead of raw LIKE. For any fallback that must stay in MySQL:

-- Add FULLTEXT index:
ALTER TABLE artworks ADD FULLTEXT INDEX ft_artwork_text (title, description);
-- Then use MATCH..AGAINST instead of LIKE:
WHERE MATCH(title, description) AGAINST ('+moon +lunar' IN BOOLEAN MODE)

Priority 2 — High Impact

P2-A: Artwork aggregate stats queries — top 2 time consumers (Queries 1 & 2)

Total time: 83,766s + 34,068s = 117,834s (78% of all slow query time)
Calls: 37,124 + 13,789 = ~51k · Avg: 2.2s · Rows examined/call: ~215234k

These are the same heavy SELECT pattern loading per-artwork stats from multiple tables:

SELECT a.id, a.user_id, a.published_at, a.is_public, a.is_approved,
  (a.thumb_ext IS NOT NULL AND a.thumb_ext != '') AS has_thumbnail,
  COALESCE(ast.views, 0) AS views_all,
  COALESCE(ast.downloads, 0) AS downloads_all,
  COALESCE(ast.favorites, 0) AS favourites_all,
  COALESCE(cc.cnt, 0) AS comments_count,
  COALESCE(sc.cnt, 0) AS shares_count,
  COALESCE(ast.views_7d, 0) AS views_7d,
  ...
FROM artworks a
LEFT JOIN artwork_stats ast ON ast.artwork_id = a.id
LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_favourites WHERE created_at >= ...) fav7 ...
LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_comments ...) cc ...
LEFT JOIN (SELECT artwork_id, COUNT(*) cnt FROM artwork_shares ...) sc ...
WHERE a.is_public = 1 AND a.is_approved = 1 AND a.deleted_at IS NULL
  AND a.published_at <= NOW()
ORDER BY a.id ASC
LIMIT 500

Problems:

  • Inline derived subqueries for fav7, cc, sc run per page — not cached.
  • ORDER BY a.id with filters requires index on (deleted_at, is_public, is_approved, published_at, id).
  • 51k calls/22 days = ~2,300 calls/day = every ~37 seconds, all day long. This is a scheduled job or background process churning through artworks.

Fix — composite index (immediate):

ALTER TABLE artworks 
  ADD INDEX idx_public_approved_published (deleted_at, is_public, is_approved, published_at, id);

Fix — pre-aggregate counts (medium term): Ensure artwork_stats already stores favorites_7d, comments_count, shares_count. If so, move all counts to artwork_stats maintenance jobs and remove the inline derived joins. The SELECT becomes a single fast LEFT JOIN artwork_stats.

Fix — reduce call frequency: If this is a scheduler-driven scan, batch it into chunks with exponential backoff and persist cursor position so it doesn't re-scan from scratch every run.


P2-B: User stats calculation — 28 second average (Query 3)

Total time: 8,847s · Calls: 320 · Avg: 28s · Max: 180s!

Complex query joining users, user_xp_logs, user_followers, artwork_likes, artworks, artwork_metric_snapshots_hourly.

Fix: This should never run on demand. Route it to:

  1. A scheduled background job that pre-aggregates into user_statistics (runs every N minutes).
  2. The controller/service reads from user_statistics only — single-row lookup by user_id.

Also ensure user_statistics has INDEX idx_user_id (user_id).


P2-C: artwork_metric_snapshots_hourly heavy join query (Query 4)

Total time: 6,328s · Calls: 237 · Avg: 27s · Max: 52s

SELECT artworks.*, ... 
FROM artworks
JOIN artwork_metric_snapshots_hourly amsh ON amsh.artwork_id = artworks.id
JOIN artwork_likes al ON ...
JOIN artwork_downloads ad ON ...
JOIN artwork_comments ac ON ...
WHERE amsh.bucket_hour BETWEEN ... AND ...

Fix:

  1. Add index from P1-D: idx_bucket_artwork (bucket_hour, artwork_id).
  2. Pre-aggregate hourly snapshots into daily/weekly summary tables and query those instead.
  3. Reduce time range of BETWEEN clause if querying recent data only.

P2-D: rank_artwork_scores queries — ~10 variants (Queries 26, 27, 29, 37, 39, 41, 42, 43, 46)

Total time: ~750s combined · ~870 combined calls · Avg: ~750ms · Rows examined/call: ~140150k

Pattern:

SELECT ras.artwork_id, a.user_id, ras.score_trending
FROM rank_artwork_scores ras
INNER JOIN artworks a ON a.id = ras.artwork_id AND a.is_public=1 AND a.is_approved=1 AND a.deleted_at IS NULL
WHERE ras.model_version = 'rank_v2'
ORDER BY ras.score_trending DESC
LIMIT 200

Problem: WHERE model_version = 'rank_v2' ORDER BY score_X DESC — no composite index covers both.

Fix — add partial/composite indexes:

ALTER TABLE rank_artwork_scores
  ADD INDEX idx_mv_trending   (model_version, score_trending DESC),
  ADD INDEX idx_mv_new_hot    (model_version, score_new_hot DESC),
  ADD INDEX idx_mv_best       (model_version, score_best DESC),
  ADD INDEX idx_mv_score      (model_version, score_new_hot, score_trending, score_best);

Fix — cache ranking results: These are pre-computed ranking scores. Cache the TOP 200 list per (model_version, score_column) for 515 minutes in Redis. The ranking job already runs on a schedule — warm the cache at the end of each ranking job run.


P2-E: artworks public count via full scan (Query 30)

Total time: 100s · Calls: 127 · Avg: 790ms · Rows examined/call: 97k

SELECT count(*) FROM artworks 
WHERE deleted_at IS NULL AND is_approved=1 AND is_public=1 
  AND published_at IS NOT NULL AND published_at <= NOW()

Fix — maintain a counter cache:

// In config or cache:
Cache::remember('artworks.public_count', 300, fn() => Artwork::public()->published()->count());

Or store the count in a site_statistics / system_settings table, updated by the publish observer.


P2-F: artworks ORDER BY id scanner for publish pipeline (Queries 18, 22, 23, 40, 47)

Total time: ~650s combined · Many calls · Pattern: SELECT * FROM artworks WHERE ... ORDER BY id ASC or ORDER BY trending_score_7d DESC

Problem: SELECT * loads all columns including large description blobs. The publish/ranking pipeline only needs IDs.

Fix:

  • Use SELECT id or SELECT id, user_id instead of SELECT *.
  • Ensure trending_score_7d is indexed if using ORDER BY trending_score_7d.
  • Add index from P2-A: idx_public_approved_published.

Priority 3 — Medium Impact

Total time: 2,234s · Calls: 2,349 · Avg: 951ms

Joining artworks → artwork_tag → tags → tag_interaction_daily_metrics. 246k rows examined/call.

Fix:

  • Cache the result: popular tags change slowly — cache 515 minutes.
  • Add index on tag_interaction_daily_metrics (tag_id, metric_date).
  • Precompute tag_interaction_daily_metrics aggregates into a tag_trending_scores table.

P3-B: Browse/gallery category + tag joins (Query 7)

Total time: 1,758s · Calls: 2,344 · Avg: 750ms

artworks + categories + artwork_category + artwork_tag — 80M total rows examined.

Fix:

  • Verify indexes: artwork_category(category_id, artwork_id), artwork_tag(tag_id, artwork_id).
  • Pagination: ensure cursor/keyset pagination is used, not OFFSET.
  • Cache browse results per category (already partially done in HomepageService).

P3-C: artwork_metric_snapshots_hourly backup full scan (Query 15)

Total time: 646s · Calls: 93 · Max: 127s · User: backuper

SELECT /*!40001 SQL_NO_CACHE */ * FROM artwork_metric_snapshots_hourly — mysqldump reading full table.

Fix (ops):

  • Partition artwork_metric_snapshots_hourly by month on bucket_hour. Backup only reads the active partition.
  • Or: exclude this table from hot backup and back it up separately during low-traffic window (02:0004:00 UTC).
  • Archive data older than 90 days to a cold table.

P3-D: artworks LEFT JOIN artwork_stats with OR condition (Query 32)

Total time: 86s · Calls: 110

WHERE (artworks.created_at >= '...' OR (s.ranking_score IS NOT NULL AND s.ranking_score > 0))

Problem: OR prevents index usage on created_at.

Fix: Rewrite as UNION:

SELECT id FROM artworks WHERE created_at >= '...' AND deleted_at IS NULL AND is_approved=1
UNION
SELECT a.id FROM artworks a JOIN artwork_stats s ON s.artwork_id = a.id 
WHERE s.ranking_score > 0 AND a.deleted_at IS NULL AND a.is_approved=1

P3-E: GROUP BY user_id from artworks (Query 31)

Total time: 92s · Calls: 110

SELECT a.user_id, COALESCE(us.followers_count, 0), COALESCE(us.favorites_received_count, 0)
FROM artworks a 
LEFT JOIN user_statistics us ON us.user_id = a.user_id
WHERE a.is_public=1 AND a.is_approved=1 AND a.deleted_at IS NULL
GROUP BY a.user_id, us.followers_count, us.favorites_received_count

Problem: Full artworks scan + GROUP BY without a covering index.

Fix: Add index (deleted_at, is_public, is_approved, user_id) on artworks. Or pre-aggregate into user_statistics and query that directly without touching artworks.


P3-F: Forum posts moderation scan (Query 62)

Total time: 12s · Calls: 15

SELECT * FROM forum_posts
WHERE (moderation_checked = 0 OR last_ai_scan_at IS NULL OR updated_at > last_ai_scan_at)
  AND id IS NOT NULL AND deleted_at IS NULL
ORDER BY id ASC LIMIT 50

Fix: Add a partial index for unmoderated posts:

ALTER TABLE forum_posts ADD INDEX idx_needs_moderation (moderation_checked, last_ai_scan_at, id)
  WHERE moderation_checked = 0;

Or maintain a moderation_queue table with only pending IDs.


Apply these in order (fastest wins first):

php artisan make:migration add_performance_indexes_batch1
public function up(): void
{
    // P1-D: snapshot bucket+artwork lookup
    Schema::table('artwork_metric_snapshots_hourly', function (Blueprint $table) {
        $table->index(['bucket_hour', 'artwork_id'], 'idx_bucket_artwork');
    });

    // P2-A + P2-F: public artwork scans & ORDER BY id
    Schema::table('artworks', function (Blueprint $table) {
        $table->index(
            ['deleted_at', 'is_public', 'is_approved', 'published_at', 'id'],
            'idx_public_approved_published_id'
        );
        $table->index(
            ['deleted_at', 'is_public', 'is_approved', 'user_id'],
            'idx_public_approved_user'
        );
    });

    // P2-D: rank_artwork_scores per model_version
    Schema::table('rank_artwork_scores', function (Blueprint $table) {
        $table->index(['model_version', 'score_trending'],  'idx_mv_trending');
        $table->index(['model_version', 'score_new_hot'],   'idx_mv_new_hot');
        $table->index(['model_version', 'score_best'],      'idx_mv_best');
        $table->index(['model_version', 'score_new_hot'],   'idx_mv_score');
    });

    // P3-A: tag daily metrics
    Schema::table('tag_interaction_daily_metrics', function (Blueprint $table) {
        $table->index(['tag_id', 'metric_date'], 'idx_tag_date');
    });

    // P1-A: tag artworks_count cached column
    Schema::table('tags', function (Blueprint $table) {
        $table->unsignedInteger('artworks_count')->default(0)->after('slug');
        $table->index('artworks_count', 'idx_artworks_count');
    });
}

Caching Quick Wins (no schema change needed)

Surface Cache Key TTL Notes
Popular tags tags.popular.10 15 min Currently: correlated subquery per request
Ranking top-200 lists rank.{model}.{score}.200 10 min Warm at end of ranking job
Public artwork count artworks.public_count 5 min Used in sitemaps, stats
User artworks_count user.{id}.artworks_count 5 min Warm on publish/unpublish
Group leaderboard leaderboard.group.monthly.5 30 min Already in leaderboards service

Implementation Roadmap

Phase Items Effort Expected gain
Week 1 P1-A correlated tag query, P1-C jobs LIKE, P1-D snapshot index, all P2-D rank indexes LowMed ~1520% reduction in slow queries
Week 2 P2-A artworks composite index, P1-B user artworks_count, caching quick wins table Med ~4050% reduction; kills #1 and #2 slow query families
Week 3 P2-B user stats background job, P2-C snapshot pre-aggregation, P1-E fulltext index High ~6575% reduction; kills 28s queries
Month 2 P2-F SELECT * → SELECT id, P3-C partition snapshots table, P3-D OR→UNION rewrite High Remaining tail

Monitoring After Changes

# Reset slow query log on server
mysql -e "FLUSH SLOW LOGS;"

# Re-run pt-query-digest after 1 week:
pt-query-digest /var/log/mysql/slow.log > /tmp/slow-report-after.txt

# Check query plan for top queries:
EXPLAIN SELECT ... \G

Key metrics to watch:

  • Total slow query count per day (target: -50% in week 2)
  • Rows_examined for artwork queries (target: <10k instead of 234k)
  • MySQL CPU usage during ranking job windows