Slack routes each workspace's messages to a specific database shard so that searching your team's history never touches anyone else's data. Discord shards by guild. Instagram sharded by user ID when a single Postgres node could no longer absorb their write volume. The pattern shows up at every company that outgrows a single database, and it shows up in almost every senior engineering interview for the same reason: it's one of the few scaling decisions that's genuinely hard to undo once you've made it.
The core idea is simple enough. Instead of one machine holding all your data, you split it across many machines, each owning a slice. Interviewers use "partitioning" and "sharding" interchangeably, but if you want to be precise: partitioning is the general concept of dividing data into chunks, while sharding specifically means those chunks live on separate physical nodes. Drop that distinction naturally and you signal you've done more than skim a blog post.
A user writes a new order. That request hits your application layer, which needs to figure out one thing: which shard holds (or should hold) this data? The answer comes from the shard key, a field you've chosen from your data model. Maybe it's user_id, maybe it's order_id. Whatever it is, the routing layer takes that key, runs it through some deterministic logic (a hash, a range lookup, a directory check), and gets back a shard identifier. The request then goes directly to that shard and nowhere else.
Think of it like a mail sorting facility. Every letter has a zip code (the shard key), the sorting machine (routing layer) reads it, checks which truck goes to that region (shard map), and drops the letter on the right truck (shard). No letter needs to visit every truck. That's the whole point.
Here's what that flow looks like:
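As a minimal sketch, the routing logic can be expressed in a few lines. The shard map here is a hard-coded table of hash ranges (real systems load it from a metadata service); the byte-sized keyspace and shard names are illustrative assumptions, not a real API.

```python
import hashlib

# Hypothetical shard map: hash-range boundaries -> shard name.
# A real routing layer would fetch this from a config service.
SHARD_MAP = {
    (0, 85): "shard-a",
    (86, 170): "shard-b",
    (171, 255): "shard-c",
}

def route(shard_key: str) -> str:
    """Deterministically map a shard key to the shard that owns it."""
    # Hash the key into a small, stable keyspace (one byte here).
    digest = hashlib.sha256(shard_key.encode()).digest()[0]
    for (low, high), shard in SHARD_MAP.items():
        if low <= digest <= high:
            return shard
    raise RuntimeError("shard map does not cover the keyspace")

# The same key always routes to the same shard -- no broadcast needed.
assert route("user:42") == route("user:42")
```

The important property is determinism: given the key, the routing layer computes the destination without asking any shard.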

If you remember one thing from this section, make it this: the shard key is the single most important decision in your sharding design. Everything downstream depends on it.
A good shard key distributes writes evenly across shards and ensures that the queries your application runs most frequently can be satisfied by hitting a single shard. A bad shard key funnels disproportionate traffic to one node while the others sit idle. When you tell your interviewer "I'd shard by X," they're immediately evaluating whether X creates hot spots, whether it aligns with your query patterns, and whether it forces cross-shard joins.
You don't get to change your shard key easily after the fact. Picking the wrong one is closer to a schema redesign than a config change.
A strong answer sounds like: "I'd shard by user_id because 90% of our queries are scoped to a single user, so each query hits exactly one shard." That one sentence shows the interviewer you're thinking about access patterns, not just picking a field at random.

Between the routing layer and the actual shards sits a metadata service, often called the shard map (or partition map, or config server). This is the lookup that answers: "given this key or hash value, which physical node owns it?"
The shard map might be a simple in-memory table that says "hash values 0-1000 go to Shard A, 1001-2000 go to Shard B." It might be a dedicated coordination service like ZooKeeper or etcd. In MongoDB, it's the config server replica set. The routing layer consults this map on every request, which means it needs to be fast and highly available. If the shard map goes down, your routing layer can't route anything.
In practice, most systems cache the shard map aggressively at the routing layer and only refresh it when shards are added, removed, or rebalanced. This keeps the hot path (normal reads and writes) from depending on a synchronous metadata lookup every single time.
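That caching pattern can be sketched as a TTL cache wrapped around the metadata lookup. The `fetch_fn` callable stands in for a call to the config service, and the TTL value is an arbitrary assumption for illustration:

```python
import time

class CachedShardMap:
    """Sketch of a routing-layer cache over a shard map service.

    `fetch_fn` stands in for a network call to the metadata service
    (e.g. a config server); the name is a hypothetical placeholder.
    """

    def __init__(self, fetch_fn, ttl_seconds=30.0):
        self._fetch = fetch_fn
        self._ttl = ttl_seconds
        self._cached = None
        self._fetched_at = 0.0

    def lookup(self, hash_value: int) -> str:
        # Hot path: serve from the local cache; only hit the metadata
        # service when the cached copy has expired.
        if self._cached is None or time.monotonic() - self._fetched_at > self._ttl:
            self._cached = self._fetch()
            self._fetched_at = time.monotonic()
        for (low, high), shard in self._cached.items():
            if low <= hash_value <= high:
                return shard
        raise KeyError(hash_value)

shard_map = CachedShardMap(lambda: {(0, 1000): "shard-a", (1001, 2000): "shard-b"})
assert shard_map.lookup(500) == "shard-a"
```

Production systems additionally invalidate the cache on rebalance events rather than relying on TTL expiry alone, but the shape is the same: the hot path stays local.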
Each shard is a fully independent database instance. It has its own tables, its own indexes, its own query engine. It handles its own replication to followers for read scaling and failover. It has its own failure domain, meaning if Shard B's disk dies, Shards A and C keep serving traffic without interruption.
This independence is both the power and the pain of sharding. Power, because a problem on one shard is isolated. Pain, because you now have N independent databases to monitor, back up, upgrade, and keep in sync with your schema changes.
Deterministic routing means no broadcast. The whole point is that given a shard key, you can compute exactly which node to talk to without asking all of them. The moment you lose this property (because a query doesn't include the shard key, for example), you fall back to scatter-gather across every shard, which kills the performance benefit. Interviewers will test whether you understand this by asking about queries that don't include the shard key.
The shard map is a potential bottleneck and single point of failure. If the interviewer asks "what could go wrong with this architecture," the shard map is a strong answer. You should be ready to explain how caching, replication, and versioning of the shard map mitigate this.
Shard independence means no free joins. Data on Shard A can't do a local join with data on Shard B. There's no shared memory, no shared disk. If your interviewer hears you casually propose a JOIN across two entities that live on different shards, they'll push back. That awareness, knowing what you lose when data is split, is exactly what separates a surface-level answer from a strong one.
In an interview, you'll usually need to pick a specific approach. Here are the ones worth knowing.
Each of these patterns answers the same question differently: given a shard key, which node gets the data? Your job is to know the tradeoffs well enough to pick the right one for the scenario and explain why you're picking it.
Take the shard key (say, user_id), run it through a hash function, then mod the result by the number of shards. If you have 4 shards and hash(user_id) % 4 = 2, that record lives on shard 2. That's it. The beauty is simplicity: the distribution is nearly uniform regardless of what the underlying keys look like, because the hash function scrambles them.
The problem shows up the moment your cluster size changes. If you go from 4 shards to 5, almost every key's mod result changes. That means nearly all your data needs to move. For a 100GB database, you're looking at a massive, disruptive migration just to add one node. This is why hash-based sharding works best when your cluster size is fixed or changes very rarely. If the interviewer's scenario involves a predictable, stable workload (an internal analytics system, a fixed-size cache tier), this is a perfectly reasonable choice.
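You can demonstrate the churn directly. This sketch counts how many of 10,000 synthetic keys change shards when the cluster grows from 4 to 5 nodes (the key names and shard counts are made up for illustration):

```python
import hashlib

def shard_for(key: str, n_shards: int) -> int:
    # Stable hash; Python's built-in hash() is salted per process,
    # so a cryptographic digest is used for reproducibility.
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    return h % n_shards

keys = [f"user-{i}" for i in range(10_000)]
moved = sum(shard_for(k, 4) != shard_for(k, 5) for k in keys)
# With mod-N placement, a key stays put only when h % 4 == h % 5,
# which happens for just 4 of every 20 hash values: ~80% of keys move.
print(f"{moved / len(keys):.0%} of keys moved")
```

Roughly four out of five keys relocate just to add one node, which is the migration cost the paragraph above warns about.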
When to reach for this: the interviewer describes a system with a known, stable number of nodes and no expectation of frequent scaling. Or you're early in the discussion and want to start simple before upgrading to consistent hashing.

Instead of hashing, you divide the key space into contiguous ranges and assign each range to a shard. User IDs 1 through 999 go to Shard A, 1000 through 1999 go to Shard B, and so on. The routing layer just needs to know the boundary values to figure out where a request belongs.
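The boundary lookup is a binary search over sorted range starts. A minimal sketch, with made-up boundaries and shard names:

```python
import bisect

# Start of each contiguous range, kept sorted, and the shard owning it.
BOUNDARIES = [1, 1000, 2000]          # user IDs 1-999, 1000-1999, 2000+
SHARDS = ["shard-a", "shard-b", "shard-c"]

def route(user_id: int) -> str:
    # bisect_right finds the rightmost boundary <= user_id,
    # which identifies the range the key falls into.
    idx = bisect.bisect_right(BOUNDARIES, user_id) - 1
    return SHARDS[idx]

assert route(500) == "shard-a"
assert route(1999) == "shard-b"
assert route(5000) == "shard-c"
```

Note that adjacent keys land on the same shard, which is exactly what makes range scans cheap here.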
The killer feature here is range queries. Need all orders from January? If you've partitioned by timestamp, those records sit on one or two shards, and you can scan them efficiently without touching every node in the cluster. Hash-based sharding can't do this; a range scan would scatter across all shards.
The killer flaw is hot spots. If you partition a time-series dataset by month and 90% of your reads and writes target the current month, one shard is on fire while the others sit idle. Same thing happens with alphabetical partitioning if your user base skews heavily toward certain name prefixes. The distribution is only as good as your access patterns, and access patterns are rarely uniform.
When to reach for this: the interviewer's scenario involves time-series data, ordered scans, or any query pattern where "give me everything between X and Y" is common. Think logging systems, analytics pipelines, or event stores.

This is the one interviewers expect you to know cold. Picture a circular number line (a "ring") from 0 to some large number, like 2^32. Each shard gets hashed onto a position on this ring. When a key arrives, you hash it onto the same ring and walk clockwise until you hit the first shard. That shard owns the key.
Why does this matter? When you add a new shard, it lands at one position on the ring and only takes over keys from its immediate clockwise neighbor. Everything else stays put. Removing a shard works the same way in reverse. Instead of rehashing the entire dataset (like hash-based sharding forces you to), you're only moving a fraction of the keys. For a cluster with N shards, adding one node redistributes roughly 1/N of the data.
There's a catch, though. With only a few physical nodes on the ring, the distribution can be wildly uneven. One node might own a huge arc of the ring while another owns a sliver. The fix is virtual nodes: each physical machine gets placed at multiple positions on the ring (say, 150 to 200 virtual positions per node). This smooths out the distribution dramatically. When the interviewer asks "but what about uneven load?", virtual nodes is your answer.
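The ring, virtual nodes, and the "only 1/N moves" claim fit in a short sketch. The vnode count and shard names are illustrative assumptions; real implementations (e.g. in Cassandra) are more elaborate:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (a sketch)."""

    def __init__(self, nodes, vnodes=150):
        self._ring = []  # sorted list of (position, node)
        self._vnodes = vnodes
        for node in nodes:
            self.add(node)

    def _pos(self, key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each physical node occupies many ring positions,
        # which smooths out the distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._pos(f"{node}#{i}"), node))

    def owner(self, key: str) -> str:
        # Walk clockwise: first ring position at or after the key's hash.
        idx = bisect.bisect(self._ring, (self._pos(key), ""))
        return self._ring[idx % len(self._ring)][1]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
keys = [f"user-{i}" for i in range(10_000)]
before = {k: ring.owner(k) for k in keys}
ring.add("shard-d")
moved = sum(before[k] != ring.owner(k) for k in keys)
# Adding a 4th node moves roughly 1/4 of the keys, not all of them.
print(f"{moved / len(keys):.0%} of keys moved")
```

Compare this with the ~80% churn of hash-mod-N: that difference is the entire argument for consistent hashing.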
When to reach for this: any time the interviewer pushes on elasticity, auto-scaling, or "what happens when a node goes down." This is the default answer for systems that need to grow and shrink gracefully. DynamoDB and Cassandra both use variants of this approach, and mentioning that signals you've seen it in practice.

Forget algorithms entirely. Just maintain an explicit lookup table that says "key X lives on shard 3, key Y lives on shard 1." Every request hits this directory service first, gets told where to go, and then proceeds to the right shard.
This gives you maximum flexibility. Want to move a single high-traffic customer to a dedicated shard? Update one row in the directory. Want to rebalance by migrating a batch of keys? Update the mappings. No hash functions, no ring math, no worrying about whether your key distribution is uniform. You control placement directly.
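In its simplest form the directory is just a mutable mapping; everything else (replication, caching, failover) is operational machinery around it. The tenant names here are hypothetical:

```python
# Hypothetical directory: explicit key -> shard mapping, the kind of
# table a real system would keep in a replicated, cached service.
directory = {
    "tenant-acme": "shard-1",
    "tenant-globex": "shard-1",
    "tenant-bigco": "shard-2",   # large tenant pinned to its own shard
}

def route(tenant_id: str) -> str:
    return directory[tenant_id]

# Moving a hot tenant to dedicated hardware is a one-row update --
# no rehashing, no cluster-wide migration math.
directory["tenant-acme"] = "shard-3"
assert route("tenant-acme") == "shard-3"
```

The simplicity of that update is the whole appeal; the cost, as the next paragraph covers, is that the directory itself becomes critical infrastructure.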
The tradeoff is that the directory itself becomes a critical dependency. Every single read and write passes through it, so it needs to be fast (in-memory or heavily cached) and highly available (replicated, with failover). If the directory goes down, nothing can be routed. You've essentially traded algorithmic complexity for operational complexity.
When to reach for this: the interviewer describes a multi-tenant system where different tenants have wildly different sizes and you need fine-grained control over placement. Or when the access patterns are so irregular that no algorithmic approach can distribute load evenly.

| Pattern | Distribution Evenness | Range Query Support | Rebalancing Cost |
|---|---|---|---|
| Hash-based | High (uniform hashing) | None (data is scattered) | Very high (full rehash) |
| Range-based | Depends on access patterns | Excellent | Medium (split/merge ranges) |
| Consistent hashing | High (with virtual nodes) | None | Low (only 1/N keys move) |
| Directory-based | Fully controllable | Possible (if you design for it) | Low (update mappings), but operationally complex |
For most interview problems, you'll default to consistent hashing. It handles the "what about scaling?" follow-up gracefully, it's well-understood, and it shows the interviewer you know how real distributed databases work. Reach for range-based partitioning when the problem is fundamentally about ordered data (time-series, leaderboards, sequential IDs with range scans). And if the interviewer describes a multi-tenant scenario where one customer is 1000x larger than the rest, directory-based sharding lets you say "we can pin that tenant to dedicated infrastructure" without redesigning the whole system.
Here's where candidates lose points, and it's almost always one of these.
"I'd shard by country." Sounds reasonable until you remember that for most global products, the US or India alone might account for 60% of your traffic. You've just put the majority of your write load on a single node while the other shards sit idle. You haven't scaled anything. You've just made one machine work harder with extra routing overhead.
This isn't limited to geographic keys. Sharding by created_date in a time-series system means today's shard absorbs every single write while yesterday's shard does nothing. Sharding by merchant_id in a marketplace means the top 1% of sellers hammer one shard.
The fix depends on the situation. Compound keys help: instead of country, use country + user_id so that within each country, data fans out across multiple shards. Salting is another option: prepend a random prefix (say, 0-9) to the key so that one logical entity's data spreads across 10 shards. The tradeoff is that reads now need to query all 10 and merge results.
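Salting can be sketched in a few lines. The salt count and shard count are arbitrary illustration values; note how the read path has to fan out across every salt variant:

```python
import hashlib
import random

N_SALTS = 10  # one logical key fans out across 10 shard-key variants

def salted_key(user_id: str) -> str:
    # Write path: pick a random salt so a hot user's writes spread out.
    return f"{random.randrange(N_SALTS)}:{user_id}"

def all_salted_keys(user_id: str):
    # Read path: must query every salt variant and merge the results.
    return [f"{salt}:{user_id}" for salt in range(N_SALTS)]

def shard_for(key: str, n_shards: int = 8) -> int:
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    return h % n_shards

# A single celebrity user now lands on several shards instead of one.
shards = {shard_for(k) for k in all_salted_keys("celebrity-1")}
print(sorted(shards))
```

The write-side win and the read-side cost are two halves of the same tradeoff, which is why salting is usually reserved for keys you know are hot.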
"We'll just query all the shards and combine the results."
That sentence makes interviewers wince. Yes, scatter-gather (fan out to all shards, collect responses, merge) is a real pattern. But candidates toss it out like it's free. It's not. If you have 50 shards, you're now issuing 50 parallel queries, waiting for the slowest one (tail latency kills you), and merging results in the routing layer. Sorting, pagination, aggregation across shards? Each of those is its own headache.
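Even the merge step alone is real work. This sketch shows the routing layer answering "latest 3 orders" by merging per-shard result streams; the shard results and timestamps are fabricated for illustration:

```python
import heapq

# Hypothetical per-shard results: (order_id, timestamp) pairs, each
# shard returning its own orders already sorted newest-first.
shard_results = [
    [("order-7", 300), ("order-2", 120)],   # shard A
    [("order-9", 350), ("order-5", 200)],   # shard B
    [("order-8", 310), ("order-1", 100)],   # shard C
]

def top_n(results, n):
    # The routing layer must merge N sorted streams to answer
    # "latest n orders" -- work a single shard would do locally,
    # and it cannot finish until the slowest shard has responded.
    merged = heapq.merge(*results, key=lambda r: r[1], reverse=True)
    return [order for order, _ in list(merged)[:n]]

print(top_n(shard_results, 3))  # -> ['order-9', 'order-8', 'order-7']
```

And this is the easy case: cross-shard pagination and aggregation require the router to hold or recompute far more state than a simple merge.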
The real problem usually starts earlier: the candidate chose a shard key that doesn't align with the query patterns. If you shard an e-commerce orders table by user_id but the most common dashboard query is "show all orders for this product," every single query becomes a scatter-gather across all shards.
What to say instead: acknowledge that cross-shard queries are expensive and explain how you'd design around them. Denormalization is one approach. Store a copy of the data organized by the secondary access pattern. A global secondary index is another, where a separate service indexes the cross-cutting dimension and points back to the right shards. The key insight is that your shard key should match your dominant query pattern, and you should handle the secondary patterns through purpose-built read paths.
"If we need more capacity, we just add more shards."
Just. That word is doing a terrifying amount of heavy lifting. When a candidate says this, the interviewer is going to push: "Okay, walk me through what happens to the data that's already on the existing shards."
With hash-based sharding (hash mod N), changing N means almost every key maps to a different shard. You're not moving 1/Nth of the data. You're potentially moving nearly all of it. Even with consistent hashing, where only ~1/N of keys need to move when you add a node, you still have to physically copy that data to the new shard, handle requests for keys that are mid-migration (do you read from the old shard or the new one?), and update the shard map atomically so no requests get lost.
There's a big difference between offline resharding (take the system down, redistribute, bring it back up) and live migration (move data in the background while serving traffic). Live migration requires dual-reads during the transition, a way to track which keys have been migrated, and careful coordination so writes don't go to the wrong place. MongoDB's balancer does this with chunk migration. It's complex machinery.
Under pressure, candidates blur sharding and replication together. They'll say something like "we'll shard the database for high availability" or "if a shard goes down, the other shards have the data."
No. The other shards do not have the data. That's the whole point.
Sharding splits data: each shard holds a different subset. If Shard B dies, the data on Shard B is gone (unless you've separately set up replication for it). Replication copies data: each replica holds the same data as its primary. Replication gives you fault tolerance and read scaling. Sharding gives you write scaling and larger total storage.
In practice, you almost always use both. Each shard is itself a replica set (a primary and one or more secondaries). So Shard B has its own replicas that can take over if the primary fails. But the data on Shard B is completely independent from the data on Shard A.
If an interviewer asks "what happens when a shard goes down?", the answer isn't "the other shards pick up the slack." The answer is "the shard's replica promotes to primary, and the routing layer redirects traffic to it." If you don't have replication within each shard, you have a single point of failure, and the interviewer will absolutely call that out.
Not every scaling problem is a sharding problem, and interviewers are testing whether you know that. Listen for these cues:
The cue you should not react to: "reads are slow." That's a caching or read-replica problem first. Jumping straight to sharding when the interviewer says "reads are slow" is one of the fastest ways to lose credibility.
"What shard key would you pick, and why?" Name the key and the access pattern in one breath: "user_id. Most write operations, like placing an order or updating a cart, are scoped to a single user. That means each write hits exactly one shard, no cross-shard coordination needed. I'd avoid something like product_id because a flash sale on one popular product would hammer a single shard while the others sit idle."

"How do you serve queries that don't include the shard key?" Acknowledge the cost first: "If we shard by user_id, a query like that becomes a scatter-gather across every shard, which is expensive. I'd handle that with a separate denormalized view or a secondary index. Maybe a dedicated analytics store that receives events asynchronously. You don't want to compromise your primary shard key choice to optimize for a reporting query that can tolerate a few seconds of staleness."

"How do you handle a hot shard?" Compound shard keys or salting. If celebrity_user_id gets disproportionate traffic, you can append a random suffix (salt) to spread that user's data across multiple shards, at the cost of needing to query all salt variants on reads.
"What happens to transactions that span multiple shards?" Two-phase commit works but kills performance. The better answer is to design your shard key so that related data lives on the same shard. When that's impossible, saga patterns with compensating actions are the standard approach.
"How does the routing layer know which shard to hit?" A shard map (sometimes called a configuration server) stores the key-to-shard mapping. The routing layer caches this locally and refreshes it when shards are added or removed. MongoDB's mongos router and its config servers are a concrete example you can reference.
"Can you name a real system that does this?" DynamoDB uses consistent hashing across partitions. PostgreSQL supports declarative range and list partitioning natively (though on a single machine, not distributed). MongoDB's sharded clusters auto-balance chunks across shard replicas. Pick whichever one you've actually worked with.
Close by naming what you've given up: cross-shard joins and aggregations like GROUP BY now require scatter-gather, and backups become a coordination problem across N independent databases. Ending with tradeoffs signals senior-level thinking.