
Build a Redis-like Distributed In-Memory Cache

  This tests:

  • System design depth

  • Understanding of distributed systems

  • Trade-off navigation (CAP, consistency, latency)

  • Real-world edge case handling

Let’s go step by step and design a Redis-like cache from first principles, without relying on cloud-managed services.


🚀 Goal: Build a Redis-like Distributed In-Memory Cache


🧾 1. Requirements Gathering (Clarify with interviewer)

🔹 Functional

  • Support GET, SET, DEL, TTL

  • Handle concurrent reads/writes

  • Cache keys across multiple nodes

  • Optional: Support pub/sub, data structures (hash, list)

🔹 Non-Functional

  • Low latency (<1ms typical)

  • High availability & fault tolerance

  • Scalable horizontally

  • Eventual or strong consistency

  • Memory-optimized with TTL eviction

Back-of-the-envelope estimation is crucial in system design interviews: it demonstrates pragmatism, the ability to roughly size a system, and sound trade-off making.

Let’s break it down for your Redis-like Distributed In-Memory Cache System:


🧠 Scenario:

Let’s say you're designing this for an AI/ML pipeline system, like Google's CMCS ML. It caches:

  • Intermediate model data

  • Feature store results

  • Token metadata

  • Configuration data


📌 Estimation Goals:

We’ll estimate for:

  • Number of keys: e.g., 100 million

  • Size per key: e.g., 1 KB on average

  • Total memory footprint: GB / TB scale

  • QPS (queries per second): read/write traffic

  • Node count and distribution

  • Network bandwidth

  • TTL / eviction rates

⚙️ Step-by-Step Estimation

🔹 1. Number of Keys

Let’s say each ML workflow (pipeline) generates:

  • 10k intermediate cacheable entries

  • 1M workflows per day (across all users)


10k keys/workflow × 1M workflows/day = 10B keys/day

But not all of these stay in memory; assume we keep the hottest 10% cached:

  • 10B × 10% = 1B keys cached at peak


🔹 2. Average Key Size

Let’s assume:

  • Key name: ~100 bytes

  • Value: ~900 bytes

  • TTL/metadata: ~20 bytes overhead

Total ≈ 1 KB per key


📦 3. Total Memory Requirement

1B keys × 1KB = 1,000,000,000 KB = ~1 TB
So you’d need ~1 TB of RAM across your cluster

Let’s budget for 30% overhead (replication, GC, fragmentation):

➡️ Effective: ~1.3 TB RAM


🧵 4. QPS (Queries Per Second)

Assume:

  • Each key gets ~10 reads per day → 10B reads/day

  • 1% of keys get hit 90% of the time (Zipfian)

  • Reads: 10B/day ≈ 115,740 reads/sec

  • Writes: 1B/day ≈ 11,500 writes/sec

Target QPS:
  • Read QPS: 100K–150K

  • Write QPS: 10K–20K


🧑‍🤝‍🧑 5. Number of Nodes

If 1 machine supports:

  • 64 GB usable memory

  • 10K QPS (to be safe)

  • 10 Gbps NIC

Then:

  • RAM: 1.3 TB / 64 GB ≈ 20 nodes

  • QPS: 150K / 10K = 15 nodes

  • Plan for ~25–30 nodes (for headroom and HA)


🔁 6. Replication Overhead

Assuming:

  • 1 replica per shard for failover

  • 2× memory and network cost

➡️ RAM required: ~2.6 TB
➡️ Bandwidth: write traffic doubles (~20K writes/sec × 1 KB ≈ 20 MB/sec replication stream)


📶 7. Network Bandwidth

Let’s estimate:

  • 150K reads/sec × 1KB = 150 MB/s

  • 20K writes/sec × 1KB = 20 MB/s

  • Replication = 20 MB/s

📌 Each node should handle:

  • Read bandwidth: ~6 MB/s

  • Write + replication: ~2 MB/s

  • Easily handled by a 10 Gbps (~1.25 GB/s) NIC


⏳ 8. Eviction Rate

Assuming TTL = 1 hour, and 1B keys:

  • Evictions per second = 1B / (60×60) ≈ 277K keys/sec

The eviction algorithm must be efficient:

  • An LRU clock algorithm or an asynchronous TTL scanner is needed


✅ Final Summary

  • Total keys: 1 billion
  • Avg size per key: 1 KB
  • Total RAM (with replication overhead): ~2.6 TB
  • Nodes: 25–30 (for HA, QPS, and memory headroom)
  • Read QPS: ~150K/sec
  • Write QPS: ~15–20K/sec
  • Eviction rate: ~250–300K keys/sec
  • Network per node: ~10 MB/s total (well within a 10 Gbps NIC)
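
To sanity-check these figures, here is a small back-of-the-envelope calculator, sketched in Go using only the assumptions stated above (all inputs are illustrative, not measured):

// Back-of-the-envelope sizing calculator for the numbers above.
// Inputs: 1B hot keys, ~1 KB/key, 10 reads/key/day, 64 GB usable RAM
// and 10K QPS per node. These are assumptions, not measurements.
package main

import "fmt"

func main() {
    keys := 1e9               // hot keys kept in memory
    keySizeKB := 1.0          // ~1 KB per key, metadata included
    readsPerKeyPerDay := 10.0
    overhead := 1.3           // fragmentation, GC, bookkeeping
    replication := 2.0        // 1 replica per shard

    primaryRAMTB := keys * keySizeKB / 1e9 * overhead // ~1.3 TB
    totalRAMTB := primaryRAMTB * replication          // ~2.6 TB with replicas
    readQPS := keys * readsPerKeyPerDay / 86400       // ~115K reads/sec
    writeQPS := keys / 86400                          // ~11.5K writes/sec

    nodesByRAM := primaryRAMTB * 1024 / 64 // 64 GB usable per node -> ~20 nodes
    nodesByQPS := readQPS / 10000          // 10K QPS per node -> ~15 nodes

    fmt.Printf("RAM: %.1f TB primary, %.1f TB with replication\n", primaryRAMTB, totalRAMTB)
    fmt.Printf("Read QPS: %.0f  Write QPS: %.0f\n", readQPS, writeQPS)
    fmt.Printf("Nodes: %.0f by RAM, %.0f by QPS (plan 25-30 with headroom)\n",
        nodesByRAM, nodesByQPS)
}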

🎯 Bonus: What Google Might Ask

What would change if you needed to support multi-tenant isolation?
→ Talk about namespacing keys, quota control, per-tenant memory buckets.

What if a single user uploads a 1GB object?
→ Chunk large values or offload to Blob storage and cache pointer.

How would you reduce memory cost?
→ TTL tuning, compression (LZ4), lazy expiration.



🧱 2. High-Level Architecture

                 +------------------------+
                 |  Client Applications   |
                 +------------------------+
                             |
                             v
                    +------------------+
                    |  Coordinator /   |
                    |  Cache Router    | (Optional)
                    +------------------+
                             |
          +------------------+------------------+
          |                                     |
   +-------------+    Gossip/Heartbeat   +-------------+
   |  Cache      | <-------------------> |  Cache      |
   |  Node A     |       Protocol        |  Node B     |
   +-------------+                       +-------------+
          |                                     |
   +-------------+                       +-------------+
   |  Memory DB  |                       |  Memory DB  |
   +-------------+                       +-------------+

🧠 3. Core Components

🔸 a. Data Storage (In-Memory)

  • Use hash maps in memory for key-value store

  • TTLs stored with each key (for expiry eviction)

  • Optionally support data types like list, hash, etc.

store = {
  "foo": { value: "bar", expiry: 1681450500 },
  ...
}
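
A minimal single-node sketch of this layout in Go, assuming string values, a read-write lock, and lazy expiry on reads; the store and entry names are illustrative, not any real Redis API:

package cache

import (
    "sync"
    "time"
)

type entry struct {
    value  string
    expiry time.Time // zero value means "no TTL"
}

type store struct {
    mu   sync.RWMutex
    data map[string]entry
}

func newStore() *store { return &store{data: make(map[string]entry)} }

func (s *store) Set(key, value string, ttl time.Duration) {
    s.mu.Lock()
    defer s.mu.Unlock()
    e := entry{value: value}
    if ttl > 0 {
        e.expiry = time.Now().Add(ttl)
    }
    s.data[key] = e
}

func (s *store) Get(key string) (string, bool) {
    s.mu.RLock()
    e, ok := s.data[key]
    s.mu.RUnlock()
    if !ok {
        return "", false
    }
    // Lazy expiration: treat expired keys as misses and delete them.
    if !e.expiry.IsZero() && time.Now().After(e.expiry) {
        s.Del(key)
        return "", false
    }
    return e.value, true
}

func (s *store) Del(key string) {
    s.mu.Lock()
    defer s.mu.Unlock()
    delete(s.data, key)
}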

🔸 b. Shard & Partition

  • Use consistent hashing to assign keys to nodes

  • Each key K is assigned via hash(K) mod N, where N = number of virtual nodes (vnodes) on the ring

This avoids rehashing all keys when nodes are added/removed
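
A rough sketch of such a ring with virtual nodes; the CRC32 hash and the vnode count are arbitrary choices for illustration:

package cache

import (
    "fmt"
    "hash/crc32"
    "sort"
)

// ring maps hash positions to physical node names via virtual nodes.
type ring struct {
    vnodes []uint32          // sorted hash positions of virtual nodes
    owners map[uint32]string // vnode position -> physical node
}

func newRing(nodes []string, vnodesPerNode int) *ring {
    r := &ring{owners: make(map[uint32]string)}
    for _, n := range nodes {
        for i := 0; i < vnodesPerNode; i++ {
            h := crc32.ChecksumIEEE([]byte(fmt.Sprintf("%s#%d", n, i)))
            r.vnodes = append(r.vnodes, h)
            r.owners[h] = n
        }
    }
    sort.Slice(r.vnodes, func(i, j int) bool { return r.vnodes[i] < r.vnodes[j] })
    return r
}

// nodeFor returns the node owning the first vnode clockwise from hash(key).
func (r *ring) nodeFor(key string) string {
    h := crc32.ChecksumIEEE([]byte(key))
    i := sort.Search(len(r.vnodes), func(i int) bool { return r.vnodes[i] >= h })
    if i == len(r.vnodes) {
        i = 0 // wrap around the ring
    }
    return r.owners[r.vnodes[i]]
}

Usage would look like newRing([]string{"nodeA", "nodeB"}, 100).nodeFor("foo"); adding or removing a node only remaps the keys owned by that node's vnodes.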

🔸 c. Cache Router / Coordinator

  • Clients can compute the hash themselves OR use a proxy router that forwards requests to the correct cache node

  • Think Twemproxy or Envoy as L7 proxy

🔸 d. Replication

  • Master-Replica model

  • Writes go to master → replicate to replica (async or sync)

  • Replicas take over on master failure

Node A (Master)
  └── Replica A1
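
A simplified sketch of asynchronous master-to-replica replication, building on the store sketch above; the replicaClient interface is hypothetical and stands in for a gRPC or TCP transport:

package cache

// replicaClient is a hypothetical transport to a replica node.
type replicaClient interface {
    Replicate(key, value string) error
}

type master struct {
    local *store         // the in-memory store from the earlier sketch
    queue chan [2]string // buffered asynchronous replication queue
    repl  replicaClient
}

func newMaster(local *store, repl replicaClient) *master {
    m := &master{local: local, queue: make(chan [2]string, 1024), repl: repl}
    go m.replicationLoop()
    return m
}

// Set applies the write locally, then replicates asynchronously.
func (m *master) Set(key, value string) {
    m.local.Set(key, value, 0)
    select {
    case m.queue <- [2]string{key, value}:
    default:
        // Queue full: a real system would apply backpressure or record replica lag.
    }
}

func (m *master) replicationLoop() {
    for kv := range m.queue {
        // Best-effort async replication; errors would feed failover/lag metrics.
        _ = m.repl.Replicate(kv[0], kv[1])
    }
}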

🔸 e. Eviction Strategy

  • Support TTL for automatic key expiry

  • Support LRU / LFU / random eviction when memory full

  • Track access counters for eviction ranking
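
A sketch of an asynchronous TTL scanner that complements the lazy expiry on reads, building on the store sketch above; the scan interval and sample size are arbitrary, and LRU/LFU ranking is omitted for brevity:

package cache

import "time"

// startTTLScanner periodically samples keys and deletes expired ones.
func (s *store) startTTLScanner(interval time.Duration, sampleSize int) {
    go func() {
        ticker := time.NewTicker(interval)
        defer ticker.Stop()
        for range ticker.C {
            now := time.Now()
            s.mu.Lock()
            n := 0
            for k, e := range s.data { // map iteration order is random in Go
                if n >= sampleSize {
                    break
                }
                if !e.expiry.IsZero() && now.After(e.expiry) {
                    delete(s.data, k)
                }
                n++
            }
            s.mu.Unlock()
        }
    }()
}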

🔸 f. Concurrency

  • Use read-write locks or CAS for thread-safe operations

  • Alternatively, keep each node single-threaded on an event loop (like Redis) so commands are naturally serialized


🔁 4. Replication and Fault Tolerance

Gossip/Heartbeat

  • Nodes send periodic heartbeats to detect failures

  • Use Raft or Gossip protocols for membership

Failover

  • If a master dies → elected replica takes over

  • Redis uses Redis Sentinel to coordinate failover
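
A minimal heartbeat-style failure detector, sketched over plain TCP; a production system would use gossip or Raft membership as noted above, and the peer list, interval, and timeout are illustrative:

package cache

import (
    "net"
    "sync"
    "time"
)

// failureDetector pings peers over TCP and reports ones that stop responding.
type failureDetector struct {
    mu       sync.Mutex
    lastSeen map[string]time.Time // peer address -> last successful ping
}

func newFailureDetector(peers []string, interval, timeout time.Duration,
    onDown func(peer string)) *failureDetector {
    fd := &failureDetector{lastSeen: make(map[string]time.Time)}
    for _, p := range peers {
        fd.lastSeen[p] = time.Now()
    }
    go func() {
        for range time.Tick(interval) {
            for _, p := range peers {
                conn, err := net.DialTimeout("tcp", p, timeout)
                if err == nil {
                    conn.Close()
                    fd.mu.Lock()
                    fd.lastSeen[p] = time.Now()
                    fd.mu.Unlock()
                    continue
                }
                fd.mu.Lock()
                down := time.Since(fd.lastSeen[p]) > 3*interval
                fd.mu.Unlock()
                if down {
                    onDown(p) // trigger failover / hash-ring update
                }
            }
        }
    }()
    return fd
}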


🧪 5. Optional Features

  • Pub/Sub: add pub/sub channels per key prefix
  • Persistence: periodically write snapshots to disk (RDB) or append-only logs (AOF)
  • Backpressure: queue or reject new writes if memory is full
  • Client library: SDKs that abstract the hash ring + failover

🔁 6. CAP Trade-off

Redis defaults for each CAP property:

  • Consistency: eventual (async replication)
  • Availability: high (failover supported)
  • Partition tolerance: yes

📊 Metrics to Expose

  • Hit/Miss rate

  • Memory usage

  • Eviction count

  • Node replication lag

  • Key distribution balance


🧠 Questions Google May Ask You During This Design

  • How would you handle hot keys (one key accessed by many nodes)?

  • How do you rebalance keys after node failure?

  • Would you support atomic operations (INCR, DECR)?

  • How do you prevent cache stampede (multiple clients recomputing a missing key)?

  • How do you handle large objects that exceed single node memory?


🛠️ Tech Stack (if you had to build it)

  • Cache server: C++ / Rust / Go (like Redis)
  • Hashing & communication: consistent hashing + TCP sockets / gRPC
  • Replication: Raft or a custom pub/sub protocol
  • Persistence: RocksDB / flat files / append-only logs
  • Observability: Prometheus + Grafana



Now let’s walk through each of those Staff-level system design questions, with solid answers, reasoning, trade-offs, and implementation suggestions. You can use these to explain your decisions clearly in the interview.


🔥 1. How would you handle hot keys (one key accessed by many nodes)?

Problem:

  • One key (like "user:token:123") gets disproportionate access → overloads one node.

✅ Solutions:

  1. Key Replication (Read Replicas):

    • Replicate hot key across multiple nodes.

    • Client-side read load balancing (randomly pick from replicas).

    • Write goes to master → async to replicas.

  2. Key Sharding (Virtual Keys):

    • Split the key logically, e.g., "video:views" → "video:views:1", "video:views:2", ... (see the sketch after this list).

    • Clients randomly select a shard for read/write → reduce contention.

    • Aggregate during reads (costly but effective).

  3. Request Deduplication & Caching at Edge:

    • Use edge cache (like CDN or client-side cache) for super-hot keys.

  4. Rate Limiting / Backpressure:

    • Throttle requests to that key, or queue them on overload.
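
To make the virtual-key sharding idea concrete, here is a sketch of a sharded hot counter; the counterClient interface and the shard count of 8 are assumptions for illustration:

package cache

import (
    "fmt"
    "math/rand"
)

// counterClient is a hypothetical cache client exposing INCR/GET by key.
type counterClient interface {
    Incr(key string) error
    Get(key string) (int64, error)
}

const hotKeyShards = 8 // arbitrary fan-out for a known-hot key

// incrSharded spreads writes for a hot counter across N virtual sub-keys,
// so no single node absorbs all the traffic.
func incrSharded(c counterClient, key string) error {
    shard := rand.Intn(hotKeyShards)
    return c.Incr(fmt.Sprintf("%s:%d", key, shard))
}

// getSharded aggregates the sub-keys on read (costlier, but writes stay contention-free).
func getSharded(c counterClient, key string) (int64, error) {
    var total int64
    for i := 0; i < hotKeyShards; i++ {
        v, err := c.Get(fmt.Sprintf("%s:%d", key, i))
        if err != nil {
            return 0, err
        }
        total += v
    }
    return total, nil
}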

Interview Tip:

Emphasize dynamic detection of hot keys (via metrics), and adaptive replication or redirection.


💡 2. How do you rebalance keys after node failure?

Problem:

  • Node failure → key space imbalance.

  • Some nodes overloaded, others underused.

✅ Solutions:

  1. Consistent Hashing + Virtual Nodes:

    • Redistribute virtual nodes (vNodes) from failed node to others.

    • Only keys for those vNodes get rebalanced — minimal movement.

  2. Auto-Failover & Reassignment:

    • Use heartbeat to detect failure.

    • Other nodes take over lost slots or ranges.

  3. Key Migration Tools:

    • Background rebalance workers move keys to even out load.

    • Ensure write consistency during move via locking/versioning.

  4. Client-Side Awareness:

    • Clients get updated ring view and re-route requests accordingly.

Interview Tip:

Talk about graceful degradation during rebalancing and minimizing downtime.


⚙️ 3. Would you support atomic operations (INCR, DECR)?

Yes — atomic operations are essential in a caching layer (e.g., counters, rate limits, tokens).

Implementation:

  1. Single-Threaded Execution Model:

    • Like Redis: handle each command sequentially on single-threaded event loop → natural atomicity.

  2. Compare-And-Swap (CAS):

    • For multi-threaded or multi-process setups.

    • Use version numbers or timestamps to detect stale updates (a CAS-based sketch follows this list).

  3. Locks (Optimistic/Pessimistic):

    • Apply locks on keys for write-modify-write operations.

    • Use with caution to avoid performance degradation.

  4. Use CRDTs (Advanced Option):

    • Conflict-free data types (e.g., GCounter, PNCounter) for distributed atomicity.
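
A sketch of the CAS approach for an atomic INCR; in Go, atomic.AddInt64 would do this directly, but the retry loop shows the general compare-and-swap pattern that also extends to versioned values:

package cache

import "sync/atomic"

// casCounter implements an atomic INCR with compare-and-swap, the approach a
// multi-threaded cache node could use instead of a single-threaded event loop.
type casCounter struct {
    v int64
}

func (c *casCounter) Incr(delta int64) int64 {
    for {
        old := atomic.LoadInt64(&c.v)
        next := old + delta
        // Retry if another goroutine updated the value between load and swap.
        if atomic.CompareAndSwapInt64(&c.v, old, next) {
            return next
        }
    }
}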

Interview Tip:

Highlight that simplicity, speed, and correctness are the priority. Lean toward single-threaded per-key operation for atomicity.


🧊 4. How do you prevent cache stampede (multiple clients recomputing a missing key)?

Problem:

  • TTL expires → 1000 clients query same missing key → backend DDoS.

✅ Solutions:

  1. Lock/SingleFlight:

    • First client computes and sets value.

    • Others wait for value to be written (or reused from intermediate store).

    • Go has golang.org/x/sync/singleflight; Redis can simulate it with Lua-based locks (see the sketch after this list).

  2. Stale-While-Revalidate (SWR):

    • Serve expired value temporarily.

    • In background, refresh the cache asynchronously.

  3. Request Coalescing at API Gateway:

    • Gateway buffers duplicate requests until cache is ready.

  4. Early Refresh Strategy:

    • Monitor popular keys.

    • Proactively refresh before TTL expiry.
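
A sketch of the lock/single-flight approach using golang.org/x/sync/singleflight, building on the store sketch from earlier; the compute callback stands in for the caller's backend loader:

package cache

import "golang.org/x/sync/singleflight"

var group singleflight.Group

// getOrCompute collapses concurrent misses for the same key into a single
// backend call; every waiting caller receives the first caller's result.
func getOrCompute(c *store, key string, compute func() (string, error)) (string, error) {
    if v, ok := c.Get(key); ok {
        return v, nil // cache hit: no backend call
    }
    v, err, _ := group.Do(key, func() (interface{}, error) {
        val, err := compute() // only one goroutine per key runs this
        if err != nil {
            return "", err
        }
        c.Set(key, val, 0) // in practice, set with a TTL
        return val, nil
    })
    if err != nil {
        return "", err
    }
    return v.(string), nil
}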

Interview Tip:

Describe this as a read-heavy resilience pattern. Emphasize proactive + reactive strategies.


📦 5. How do you handle large objects that exceed single node memory?

Problem:

  • A single large key (e.g., serialized ML model, 1GB) doesn't fit in one node.

✅ Solutions:

  1. Key Chunking (Manual Sharding):

    • Split the large value into multiple keys (file:1, file:2, file:3); see the chunking sketch after this list.

    • Store each chunk on different nodes.

    • Reassemble during read.

  2. Redirect to Object Store:

    • If object > X MB → store in Blob/File system (Azure Blob / GCS).

    • Cache a pointer/reference in cache instead.

  3. Use a Tiered Cache:

    • Store large objects in a slower (but scalable) cache (like disk-based).

    • Fast cache for hot small keys; slow cache for bulkier data.

  4. Compression:

    • Use lightweight compression (LZ4, Snappy) before storing.
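
To illustrate option 1, here is a chunking sketch that builds on the store sketch from earlier; the 512 KB chunk size and the ":manifest" suffix are arbitrary conventions. With consistent hashing, the derived chunk keys naturally land on different nodes:

package cache

import "fmt"

const chunkSize = 512 * 1024 // 512 KB per chunk; the threshold is illustrative

// setLarge splits a large value into fixed-size chunks stored under derived
// keys ("key:0", "key:1", ...) plus a manifest recording the chunk count.
func setLarge(c *store, key string, value []byte) {
    n := 0
    for off := 0; off < len(value); off += chunkSize {
        end := off + chunkSize
        if end > len(value) {
            end = len(value)
        }
        c.Set(fmt.Sprintf("%s:%d", key, n), string(value[off:end]), 0)
        n++
    }
    c.Set(key+":manifest", fmt.Sprintf("%d", n), 0)
}

// getLarge reassembles the chunks; a missing chunk invalidates the whole object.
func getLarge(c *store, key string) ([]byte, bool) {
    manifest, ok := c.Get(key + ":manifest")
    if !ok {
        return nil, false
    }
    var n int
    fmt.Sscanf(manifest, "%d", &n)
    out := make([]byte, 0, n*chunkSize)
    for i := 0; i < n; i++ {
        chunk, ok := c.Get(fmt.Sprintf("%s:%d", key, i))
        if !ok {
            return nil, false
        }
        out = append(out, chunk...)
    }
    return out, true
}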

Interview Tip:

Discuss threshold-based offloading and trade-off between latency vs. capacity.

