Backend interview questions

Indexing Interview Questions

Indexes are promises: faster lookups in exchange for slower writes and more disk. Interviewers listen for whether you can trace a query from SQL text to execution plan, name why the optimizer ignored your shiny index, and discuss maintenance (bloat, autovacuum, rebuild windows) without being prompted. If you have ever fixed a production slowdown caused by a missing composite index—or worse, a wrong composite order—that story belongs in this prep bucket.

Composite indexes: order matters

Lead with equality predicates, then range predicates; know when a sort can be satisfied from the index and when you are still sorting in memory. If you mention covering indexes, explain the trade-off: wider indexes speed specific reads but fatten writes and cache pressure.

Statistics and bad estimates

Connect to caching and APIs

Fast queries still do not fix chatty services. After indexing, study Redis caching and REST vs GraphQL so you can explain when the database was never the bottleneck—and when masking it with cache created a consistency puzzle.

Automatic tuning mail you actually read (Azure SQL)

Before/after plan shape (conceptual DMVs)

-- What the engine thought was expensive last week
SELECT TOP 20
  qs.query_id,
  rs.avg_duration / 1000.0 AS avg_ms,
  qt.query_sql_text
FROM sys.query_store_query_text qt
JOIN sys.query_store_query q ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats rs ON p.plan_id = rs.plan_id
ORDER BY rs.avg_duration DESC;

You paste something like this (or the portal equivalent) when explaining "why we added ix_orders_tenant_status." Real software problem: report season, CPU pegged, CFO waiting. Interview gold is linking Query Store insight → index → verified plan change → calmer DTU graph.

Questions with sample answers

These are interview-ready outlines—sound human by swapping in your own metrics, team names, and war stories. The examples are generic on purpose so you can map them to what you actually shipped.

  1. Primary prompt

    A query filters on (tenant_id, status, created_at)—propose a composite index order and defend it.

    (tenant_id, status, created_at DESC) if equality on tenant+status then range sort on created_at— matches filter+order; cardinality of status matters—low-cardinality middle sometimes swapped based on optimizer stats.

  2. Primary prompt

    Covering index vs heap lookup—when does the optimizer choose each?

    Covering index includes all selected columns—index-only scan, fewer lookups; if select * wide, optimizer may pick heap fetch—check EXPLAIN.

  3. Primary prompt

    How do you detect a missing index in production using wait events or slow query logs?

    pg_stat_statements / Azure SQL Query Store, high logical reads with table scan, PAGEIOLATCH waits, slow query log > threshold—rank by total CPU*hours.

  4. Primary prompt

    What is your process for dropping an index that might still be load-bearing?

    Mark unused in staging, use hypothetical/ghost index if available, drop concurrently, monitor plan regressions, keep recreate migration ready.

Follow-ups interviewers often ask

Expect nested "why?" questions—brief answers here; expand with your production defaults.

  1. Follow-up

    How do partial indexes help—and what pitfalls do they introduce?

    Smaller index for WHERE active subset—faster writes; pitfall: queries must match predicate or index ignored.

  2. Follow-up

    What happens to write amplification as you add more indexes?

    Each index updated per row write—OLTP slows; measure insert/update p99 before adding covering indexes everywhere.

  3. Follow-up

    How do you handle index maintenance during peak traffic?

    Online rebuild options, low-priority maintenance windows, partition-level rebuild, avoid full table locks—schedule by region.

  4. Follow-up

    Explain a time statistics were stale and how you fixed it.

    Sudden plan flip after bulk load—ran ANALYZE, increased auto-stats frequency, fixed parameter sniffing with OPTIMIZE FOR UNKNOWN or forced plan cautiously.

  5. Follow-up

    When would you partition instead of index harder?

    Very large tables with time-range queries—partition prune beats giant index; retention/archival story; beware too many partitions metadata cost.