Backend Interview Questions
Backend loops reward people who can connect the database, the API layer, and the messy reality of concurrent users. After a few years, you are expected to talk about migrations without downtime, idempotency keys, retry storms, and the difference between "works on my laptop" and "holds up when marketing sends an email blast." Thin answers name technologies; strong answers name constraints and how you measured them.
Interviewers are not trying to trick you with buzzwords. They want to know whether you can pick a storage engine and API style that match product requirements—and explain the cost when requirements change. If you have ever rolled back a deploy, sat in a read-replica lag incident, or argued about GraphQL complexity limits, those moments are gold. Translate them into calm, specific language.
The guides below split the backend universe into four pressure points teams argue about in real life: how data is stored, how queries stay fast, how clients talk to servers, and how caches behave when everyone hits the same hot key. Drill each with both theory and a story from something you actually ran.
Trending Sub-topics
- SQL vs NoSQL — Schema flexibility vs joins, transactions, reporting needs, and the political moment the team outgrows a document store.
- Indexing — Selectivity, composite column order, covering indexes, and the EXPLAIN output you screenshared in Slack at 2 a.m.
- REST vs GraphQL — Caching at the edge, versioning culture, N+1 in the real world, and when a BFF saves everyone a headache.
- Redis Caching — TTL discipline, eviction policies, cache stampede mitigation, and honesty about eventual consistency.
Signals that read as senior
You ask what consistency the product needs before you draw boxes. You mention observability early: structured logs, metrics around queue depth, traces across service boundaries. You describe rollouts—feature flags, canaries, shadow traffic—without being asked. Those habits signal you have carried features to production, not only to a pull request.
Pair concepts with numbers you have seen
Even approximate recall helps. "Our p95 doubled when we added an index on the wrong column order" beats "indexes are good." "We capped GraphQL depth after an accidental cartesian explosion" beats "GraphQL is flexible." Interviewers remember specifics.
If you are light in one area
Say so, then show how you would close the gap: read a chapter of your database docs, run a load test on a staging clone, shadow a DBA or SRE for a day. Curiosity and a plan beat pretending you have operated a stack you have only read about.
Backend snippets: SQL, cache, and Azure-shaped pain
Point at the code, then tell the story: who paged you, what metric moved, what you rolled back. Azure shows up as managed SQL, Redis, or Cosmos—same ideas, different dashboards.
Azure SQL / Postgres-style: seek instead of scan (pattern)
-- invoices for one tenant last 7 days — needs composite index
SELECT id, amount_cents, created_at
FROM invoices
WHERE tenant_id = @tenant
AND created_at >= DATEADD(day, -7, SYSUTCDATETIME())
ORDER BY created_at DESC;
-- supporting index (column order matters)
CREATE INDEX ix_invoices_tenant_created
ON invoices (tenant_id, created_at DESC)
INCLUDE (id, amount_cents);
The interview win is verbal: "tenant_id equality first, range on created_at second, include columns to avoid key lookups." In production on Azure SQL, you confirm with actual plans, not vibes, and you watch DTU or vCore spikes when marketing runs a surprise report.
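The same seek-vs-scan conversation can be rehearsed without an Azure subscription. Below is a minimal sketch using Python's bundled sqlite3 as a stand-in engine; the table and index names mirror the example above, and the exact plan text is engine-specific, so treat it as the pattern, not the production check.

```python
import sqlite3

# Stand-in demo: sqlite3 ships with Python, so we can show seek vs scan
# without a database server. The principle (equality column first, range
# column second) carries over to Azure SQL and Postgres.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        tenant_id TEXT NOT NULL,
        amount_cents INTEGER NOT NULL,
        created_at TEXT NOT NULL
    )
""")
conn.execute(
    "CREATE INDEX ix_invoices_tenant_created"
    " ON invoices (tenant_id, created_at DESC)"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id, amount_cents, created_at FROM invoices "
    "WHERE tenant_id = ? AND created_at >= ? "
    "ORDER BY created_at DESC",
    ("t-42", "2024-01-01"),
).fetchall()
print(plan)  # should mention the index (SEARCH ... USING INDEX), not a full scan
```

In an interview, the point to land is that you read the plan before and after adding the index, and that swapping the column order (created_at first) would turn the tenant filter back into a scan.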
Cache-aside with TTL (Python + Azure Cache for Redis mindset)
import json

async def get_report(cache, db, tenant_id: str):
    key = f"report:{tenant_id}"
    hit = await cache.get(key)
    if hit:
        return json.loads(hit)
    rows = await db.fetch("SELECT ... WHERE tenant_id = $1", tenant_id)
    payload = json.dumps(rows)
    await cache.setex(key, 300, payload)  # 5 min TTL — tune per SLA
    return rows
Real problem: finance refreshes the same heavy query every thirty seconds during month-end. You shorten the TTL when data must be fresher, add a manual invalidation hook after writes, and watch evictions in Azure Monitor so a memory-pressure event does not look like an app bug. Say that sentence and you sound like someone who owns on-call.
Questions with sample answers
These are interview-ready outlines—sound human by swapping in your own metrics, team names, and war stories. The examples are generic on purpose so you can map them to what you actually shipped.
Primary prompt
We need to add full-text style search over orders; current queries use heavy LIKE scans. What is your migration and indexing plan?
Add Postgres tsvector column + GIN index or sync to OpenSearch; backfill job; dual-write search index; switch reads behind feature flag; remove LIKE in hot path.
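If it helps to have something runnable while rehearsing this answer, SQLite's FTS5 module (bundled with CPython's sqlite3 on most builds) demonstrates the same LIKE-scan-to-inverted-index move; the table and column names here are illustrative, not from any real schema.

```python
import sqlite3

# FTS5 virtual table as a stand-in for a Postgres tsvector column or an
# OpenSearch index: MATCH consults an inverted index instead of scanning rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE order_search USING fts5(order_id, notes)")
conn.executemany(
    "INSERT INTO order_search (order_id, notes) VALUES (?, ?)",
    [("o-1", "expedite shipping to warehouse"),
     ("o-2", "customer asked for refund"),
     ("o-3", "shipping label reprinted")],
)
# The old hot path was: WHERE notes LIKE '%shipping%'  -- full scan every time
rows = conn.execute(
    "SELECT order_id FROM order_search WHERE order_search MATCH ?",
    ("shipping",),
).fetchall()
print(rows)  # [('o-1',), ('o-3',)]
```

The migration story stays the same regardless of engine: backfill the index, dual-write, flip reads behind a flag, then delete the LIKE path.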
Primary prompt
Compare REST and GraphQL for a mobile app with spotty connectivity and aggressive caching requirements.
REST's coarse-grained resources cache well with standard HTTP; GraphQL is flexible but harder to cache at the CDN—use persisted queries + GET; offline cache keyed by normalized entities either way.
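A tiny sketch of the persisted-queries idea, with invented names throughout: the client registers query text ahead of time and sends only a hash, so the request becomes a plain, CDN-cacheable GET rather than an opaque POST body.

```python
import hashlib

# Illustrative registry: in real deployments this is built at compile time
# from the client's .graphql files and shipped to the server.
REGISTRY: dict[str, str] = {}

def register(query: str) -> str:
    """Hash the query text; the client will send only this id."""
    qid = hashlib.sha256(query.encode()).hexdigest()[:16]
    REGISTRY[qid] = query
    return qid

def handle_get(qid: str, variables: dict) -> dict:
    """Server side of GET /graphql?id=<qid>&variables=..."""
    query = REGISTRY.get(qid)
    if query is None:
        return {"status": 400, "error": "unknown persisted query"}
    # A real server would execute `query` with `variables` here.
    return {"status": 200, "cache_control": "public, max-age=60"}

qid = register("query Orders($n: Int) { orders(first: $n) { id } }")
print(handle_get(qid, {"n": 10})["status"])  # 200 — and the CDN can cache it
```

Because the id is stable and the method is GET, edge caches treat it like any REST resource, which is the whole point for a mobile app on spotty connectivity.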
Primary prompt
Redis hit rate dropped after a deploy. How do you triage: app bug, key churn, eviction policy, or infra?
Compare the release diff for a key prefix/version change; check maxmemory evictions, Redis SLOWLOG, client connection count, and accidental flushes—one dashboard per cause.
Primary prompt
Explain a time you chose weaker consistency to ship faster—and how you protected users from visible corruption.
Example: read replicas for a dashboard—label it "may lag a few minutes"; writes stay strongly consistent; a nightly reconciliation job keeps analytics honest against the billing source of truth.
Follow-ups interviewers often ask
Expect nested "why?" questions—brief answers here; expand with your production defaults.
Follow-up
What metrics prove the database is the bottleneck vs the application layer?
DB active sessions, wait stats, CPU on the DB vs the app pool, query duration in the APM trace—if the app sits idle waiting on the DB, it's the DB.
Follow-up
How do you handle schema evolution when old clients are still in the wild?
Additive columns nullable default, expand/contract, API versioning, feature detection—never break old mobile without migration path.
Follow-up
What idempotency story do you need for retries from mobile clients?
Idempotency-Key header on POST, server stores request hash 24h, return same response—critical for payments and creates.
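The outline above can be sketched in a few lines. Everything here is illustrative: an in-memory dict stands in for a store with a 24h TTL (Redis or a DB table), and the response shape is made up.

```python
import hashlib
import json

# key -> (hash of request body, stored response)
_seen: dict[str, tuple[str, dict]] = {}

def handle_create(idempotency_key: str, body: dict) -> dict:
    body_hash = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    if idempotency_key in _seen:
        stored_hash, stored_response = _seen[idempotency_key]
        if stored_hash != body_hash:
            # Same key, different payload: client bug, refuse loudly.
            return {"status": 409, "error": "key reused with different payload"}
        return stored_response               # replay: retries are safe
    response = {"status": 201, "order_id": f"ord_{len(_seen) + 1}"}
    _seen[idempotency_key] = (body_hash, response)
    return response

first = handle_create("key-1", {"amount_cents": 500})
retry = handle_create("key-1", {"amount_cents": 500})
print(first == retry)  # True — the mobile retry gets the identical response
```

The 409 branch is the part interviewers probe: replaying a stored response is only safe if you prove the retry carries the same payload.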
Follow-up
How would you load-test this path without masking issues that only appear with real data skew?
Copy anonymized prod distribution, long-tail tenants included, chaos latency on dependencies—not uniform random only.
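To make "not uniform random only" concrete, here is a small sketch of generating a skewed tenant mix for a load test; the Zipf-style weights are an assumption for illustration, not measured from any real system.

```python
import random

random.seed(7)  # reproducible load-test traffic
tenants = [f"t-{i}" for i in range(1000)]
# Zipf-ish weights: tenant 0 is the whale, the tail is long and thin.
weights = [1 / (rank + 1) for rank in range(1000)]
sample = random.choices(tenants, weights=weights, k=10_000)

top_share = sample.count("t-0") / len(sample)
print(round(top_share, 2))  # the head tenant dominates; uniform would give ~0.001
```

Feeding this distribution into your load generator surfaces hot-key cache behavior and lock contention that a uniform draw hides completely.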
Follow-up
When would you refuse to add a cache—and what alternative would you propose instead?
When consistency requirements or debugging cost too high—fix query/index first, materialized view, or read replica with acceptable lag.