sql-optimization

# SQL Optimization (Deep Workflow) Optimization without measurement is guesswork. Structure the work as **observe → explain (plan) → change → verify**, with explicit attention to **correctness**, **locks**, and **write amplification** from indexes. ## When to Offer This Workflow **Trigger conditions:** - Slow queries, growing P95/P99, replication lag, lock waits - ORM-generated SQL surprises; N+1 at DB layer - Index explosion, bloat, or “we added indexes everywhere” **Initial offer:** Use **six stages**: (1) frame the problem, (2) reproduce & measure, (3) read execution plans, (4) schema & indexes, (5) query & transaction tuning, (6) verify & guardrail. Confirm **engine** (PostgreSQL, MySQL, SQL Server, etc.) and **environment** (prod-like data volume). --- ## Stage 1: Frame the Problem **Goal:** Define **SLO**, **scope**, and **non-goals**. ### Questions 1. Which **queries** or **endpoints** are slow? User-facing vs batch? 2. **Regression**—did deploy, data volume, or stats change? 3. **Isolation level** and **consistency** requirements—can we read replicas? 4. **Write risk**: is this table write-heavy? Index cost? **Exit condition:** One-line **problem statement** with metric (e.g., “p95 2.4s on `/reports` at 10k RPS”). --- ## Stage 2: Reproduce & Measure **Goal:** **Stable repro** with representative **cardinality** and **parameters**. ### Actions - Capture **exact SQL**, parameters, and **frequency** - Use **EXPLAIN (ANALYZE, BUFFERS)** or equivalent—engine-specific - Check **buffer cache** effects: cold vs warm cache; run twice when needed - Compare **prod stats** vs staging—row counts, histograms ### Pitfalls - Optimizing on empty dev DB - Different **parameter sniffing** values changing plan choice **Exit condition:** Baseline numbers + **plan hash** or saved plan for A/B. --- ## Stage 3: Read Execution Plans **Goal:** Name the **dominant cost**: seq scan, bad join order, sort, hash spill, nested loop explosion. ### Interpret (adapt to engine) - **Seq scan** on large tables—filter selectivity? missing index? stats? - **Index scan** vs **bitmap** vs **index only**—covering indexes trade-offs - **Joins**: wrong order, missing stats, outdated NDV - **Sort/hash** spills to disk—work_mem / memory grants - **Locks**: `FOR UPDATE`, long transactions, hot row updates **Exit condition:** Hypothesis tied to **plan node(s)**, not generic “add index.” --- ## Stage 4: Schema & Indexes **Goal:** **Right indexes** for read paths without destroying writes. ### Strategy - **Composite index column order**: equality → range; avoid redundant indexes - **Partial indexes** for hot subsets - **Covering** indexes vs table bloat—measure write cost - **Foreign keys** and **constraints** affecting plans - **Statistics**: `ANALYZE`, extended stats, histograms—when stale stats lie ### Advanced (when relevant) - **Partitioning** for prune + maintenance - **Materialized views** / pre-aggregation for heavy reports **Exit condition:** DDL proposal with **rationale** and **rollback** (drop index concurrently if supported). --- ## Stage 5: Query & Transaction Tuning **Goal:** Sometimes the fix is **SQL rewrite**, not hardware. ### Techniques - Reduce **rows touched** early (CTEs vs inline—engine-dependent) - **Pagination** without OFFSET on huge pages (keyset) - **Batch** vs row-by-row; **UNION ALL** vs OR - **N+1**: batch queries, joins, data loader patterns - **Transactions**: shorten locks; avoid unnecessary `SELECT FOR UPDATE` - **ORM**: eager vs lazy loading discipline **Exit condition:** New plan shows lower cost / measured latency; **lock time** acceptable. --- ## Stage 6: Verify & Guardrail **Goal:** Improvement **holds** under load and doesn’t regress neighbors. ### Verify - Re-run **EXPLAIN ANALYZE** with production-like parameters - Load test or shadow traffic if available - **Monitor**: buffer hit ratio, index bloat, replication lag ### Guardrails - **Query timeouts** and **statement_timeout** where safe - **Alerts** on sequential scans on large tables if observability supports --- ## Final Review Checklist - [ ] Baseline and target metrics documented - [ ] Plan-based root cause, not guesswork - [ ] Index/DDL changes justified vs write load - [ ] Transaction/lock behavior considered - [ ] Verification on realistic data and load ## Tips for Effective Guidance - Always mention **parameter sniffing** and **stale statistics** as frequent culprits. - Warn when **adding indexes** on very write-heavy tables without measuring bloat. - Prefer **keyset pagination** education for large lists. ## Handling Deviations - **No EXPLAIN access**: infer from symptoms + ORM logs + index list; recommend safe staging repro. - **Vendor DB**: name that **hints** and features differ—avoid PostgreSQL-only advice on SQL Server without caveat.

sql-optimization

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

sql-optimization