Thomas Wang's blog

From journeyman to master.


Designing Data-Intensive Applications (DDIA), Chapter 6: Partitioning

Buy the book: https://dataintensive.net/

Intro

Partitioning and Replication

Partitioning of Key-Value Data

Partitioning by Key Range
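
A minimal sketch of key-range lookup, assuming each partition owns a contiguous, sorted range of keys. The boundary values and the function names are made up for illustration; real systems choose boundaries to fit the data.

```python
import bisect

# Hypothetical split points. Partition 0 holds keys < "g", partition 1 holds
# "g" <= key < "n", partition 2 holds "n" <= key < "t", partition 3 the rest.
BOUNDARIES = ["g", "n", "t"]

def partition_for(key: str) -> int:
    """Index of the partition whose key range contains `key`."""
    return bisect.bisect_right(BOUNDARIES, key)

def partitions_for_range(start: str, end: str) -> list[int]:
    """Partitions a scan over start <= key <= end has to touch."""
    return list(range(partition_for(start), partition_for(end) + 1))

print(partition_for("apple"), partition_for("mango"), partition_for("zebra"))
print(partitions_for_range("h", "p"))
```

Because keys stay sorted, a range scan only visits the partitions that overlap the range. The flip side is that keys that sort close together (timestamps, for example) all land on the same partition.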

Partitioning by Hash of Key
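
A rough sketch of hash partitioning, assuming each partition owns an equal slice of a 32-bit hash space. MD5 is used only as a convenient stable hash (Python's built-in `hash()` is salted per process); the constants and names are illustrative, not taken from any particular database.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4   # hypothetical
HASH_SPACE = 2**32

def stable_hash(key: str) -> int:
    """Deterministic 32-bit hash of the key."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

def partition_for(key: str) -> int:
    """Each partition owns one contiguous slice of the hash space."""
    return stable_hash(key) * NUM_PARTITIONS // HASH_SPACE

# Keys that are adjacent in sort order get scattered across partitions.
keys = [f"2024-01-01T00:00:{i:04d}" for i in range(1000)]
counts = Counter(partition_for(k) for k in keys)
print(sorted(counts.items()))
```

The load spreads out, but the sort order is gone: a range query over the original keys now has to ask every partition.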

Skewed Workloads and Relieving Hot Spots
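
One way to relieve a hot key is to spread its writes over several sub-keys by appending a random suffix, at the cost of reading all sub-keys back and combining them. A small sketch, assuming the application already knows which keys are hot; `HOT_KEYS`, `FANOUT`, and the `#` separator are made up for illustration.

```python
import random

HOT_KEYS = {"celebrity:42"}  # hypothetical: keys known to be hot
FANOUT = 100                 # two decimal digits -> 100 sub-keys

def write_key(key: str) -> str:
    """Key to write under: hot keys get a random two-digit suffix."""
    if key in HOT_KEYS:
        return f"{key}#{random.randrange(FANOUT):02d}"
    return key

def read_keys(key: str) -> list[str]:
    """Keys to read and combine: hot keys need all of their sub-keys."""
    if key in HOT_KEYS:
        return [f"{key}#{i:02d}" for i in range(FANOUT)]
    return [key]

print(write_key("celebrity:42"), write_key("user:1"))
print(len(read_keys("celebrity:42")), len(read_keys("user:1")))
```

Only the small set of hot keys should get this treatment. For everything else the extra reads are pure overhead.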

Partitioning and Secondary Indexes

Partitioning Secondary Indexes by Document
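
A toy sketch of a document-partitioned (local) index, assuming each partition indexes only the documents it stores. The class, the `color` field, and the modulo placement rule are stand-ins for illustration.

```python
from collections import defaultdict

class Partition:
    def __init__(self):
        self.docs = {}
        # Local index: (field, value) -> ids of documents in THIS partition only.
        self.index = defaultdict(set)

    def put(self, doc_id: int, doc: dict) -> None:
        self.docs[doc_id] = doc
        self.index[("color", doc["color"])].add(doc_id)

    def find(self, field: str, value: str) -> set:
        return self.index.get((field, value), set())

partitions = [Partition() for _ in range(3)]

def put(doc_id: int, doc: dict) -> None:
    # A write touches exactly one partition (placement rule is a stand-in).
    partitions[doc_id % len(partitions)].put(doc_id, doc)

def find_by_color(color: str) -> set:
    # A read has to ask every partition and merge the results (scatter/gather).
    result = set()
    for p in partitions:
        result |= p.find("color", color)
    return result

put(1, {"color": "red"})
put(2, {"color": "red"})
put(3, {"color": "blue"})
print(find_by_color("red"))
```

Writes are cheap because they stay local. Reads on the secondary index pay for it by querying all partitions.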

Partitioning Secondary Indexes by Term
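
A toy sketch of a term-partitioned (global) index, assuming the index itself is split across partitions by a hash of the term. Names and the `field:value` term format are illustrative only.

```python
import hashlib
from collections import defaultdict

NUM_INDEX_PARTITIONS = 3  # hypothetical
# Each index partition maps term -> ids of ALL matching documents, wherever they live.
index_partitions = [defaultdict(set) for _ in range(NUM_INDEX_PARTITIONS)]

def index_partition_for(term: str) -> int:
    digest = hashlib.md5(term.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_INDEX_PARTITIONS

def index_document(doc_id: int, doc: dict) -> set:
    """Index every field of the document; return the index partitions touched."""
    touched = set()
    for field, value in doc.items():
        term = f"{field}:{value}"
        p = index_partition_for(term)
        index_partitions[p][term].add(doc_id)
        touched.add(p)
    return touched

def find(field: str, value: str) -> set:
    """A lookup goes to the single index partition that owns the term."""
    term = f"{field}:{value}"
    return index_partitions[index_partition_for(term)].get(term, set())

index_document(1, {"color": "red", "make": "honda"})
index_document(2, {"color": "red", "make": "ford"})
print(find("color", "red"), find("make", "ford"))
```

Reads hit one index partition instead of all of them. A single document write may now update several index partitions, which is why such indexes are often updated asynchronously.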

Rebalancing Partitions

Strategies for Rebalancing

How not to do it: hash mod N

Problem: move data more than necessary
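
A quick experiment that shows the problem, assuming keys are placed with `hash(key) mod N` and the cluster grows from 10 to 11 nodes. The key names and node counts are made up; MD5 is only there to get a hash that is stable across runs.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def node_for(key: str, num_nodes: int) -> int:
    return stable_hash(key) % num_nodes

def fraction_moved(keys, old_n: int, new_n: int) -> float:
    """Share of keys whose node changes when the node count changes."""
    moved = sum(1 for k in keys if node_for(k, old_n) != node_for(k, new_n))
    return moved / len(keys)

keys = [f"user:{i}" for i in range(10_000)]
frac = fraction_moved(keys, old_n=10, new_n=11)
print(f"{frac:.0%} of keys change nodes when going from 10 to 11 nodes")
```

With a uniform hash, a key keeps its node only when `h mod 10 == h mod 11`, which holds for about 1 key in 11. So roughly 10 out of 11 keys move, even though the new node only needs about 1/11 of the data.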

Fixed number of partitions
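
A sketch of rebalancing with a fixed partition count, assuming many more partitions than nodes and a new node that takes over a few partitions from the existing ones. The partition count and the stealing order are arbitrary choices for illustration.

```python
from collections import Counter

NUM_PARTITIONS = 256  # hypothetical; chosen once, much larger than the node count

def initial_assignment(num_nodes: int) -> dict[int, int]:
    """Partition -> node, handed out round-robin."""
    return {p: p % num_nodes for p in range(NUM_PARTITIONS)}

def add_node(assignment: dict[int, int], new_node: int) -> dict[int, int]:
    """New node takes whole partitions from existing nodes until it has a fair share."""
    new_assignment = dict(assignment)
    load = Counter(assignment.values())
    target = len(assignment) // (len(load) + 1)
    taken = 0
    for partition in sorted(assignment):
        if taken == target:
            break
        owner = assignment[partition]
        if load[owner] > target:
            new_assignment[partition] = new_node
            load[owner] -= 1
            taken += 1
    return new_assignment

before = initial_assignment(4)
after = add_node(before, new_node=4)
moved = [p for p in before if before[p] != after[p]]
print(len(moved), "of", NUM_PARTITIONS, "partitions moved")
```

The key-to-partition mapping never changes; only the partition-to-node assignment does. Going from 4 to 5 nodes moves about a fifth of the partitions, all of them to the new node, and nothing is shuffled between the old nodes.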

Dynamic partitioning

Partitioning proportionally to nodes

Operations: Automatic or Manual Rebalancing

Request Routing

Parallel Query Execution