Great observation! You’ve hit on the exact reason for the Hybrid Model mentioned in the post (often called solving the 'Justin Bieber problem').
To answer your question about overwhelming shards:
The Queue as a Buffer
As the article notes, the Kafka queue decouples the burst from the processing. This allows the Worker Services to consume the 'traffic spike' at a steady, controlled pace rather than trying to process 100M writes instantly, which naturally protects the shards from being overwhelmed.
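To make the buffering idea concrete, here is a minimal sketch. It is not the article's implementation: an in-memory deque stands in for the Kafka topic, and the names BufferedFanout, publish, tick, and max_writes_per_tick are all hypothetical. The point is only that enqueueing is decoupled from the timeline writes, and the worker caps how many writes it performs per tick.

```python
from collections import deque


class BufferedFanout:
    """Toy stand-in for a queue plus a worker (all names are hypothetical)."""

    def __init__(self, max_writes_per_tick):
        self.pending = deque()            # plays the role of the Kafka topic
        self.max_writes_per_tick = max_writes_per_tick
        self.timelines = {}               # follower_id -> list of tweet ids

    def publish(self, tweet_id, follower_ids):
        # Producer side: enqueue one fanout task per follower.
        # Nothing is written to the timeline store here.
        for follower_id in follower_ids:
            self.pending.append((follower_id, tweet_id))

    def tick(self):
        # Worker side: perform at most max_writes_per_tick timeline writes,
        # no matter how large the backlog is.
        written = 0
        while self.pending and written < self.max_writes_per_tick:
            follower_id, tweet_id = self.pending.popleft()
            self.timelines.setdefault(follower_id, []).append(tweet_id)
            written += 1
        return written
```

A burst of 10 fanout tasks with a cap of 3 writes per tick drains over four ticks (3, 3, 3, 1), so the store only ever sees the capped rate while the backlog sits in the queue.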
Switching Strategies
For those specific celebrities with 100M+ followers, the 'hybrid' approach usually means switching to Fanout-on-Read. Instead of writing to 100M separate timelines, the system just saves the tweet once and merges it into followers' feeds when they open the app. This sidesteps the hot shard/write-amplification issue for that small set of very large accounts, at the cost of a bit more work on each read, while everyone else stays on the regular Fanout-on-Write path.
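Here is a rough sketch of how the switch could look, assuming a simple follower-count cutoff. The class HybridTimeline, its methods, and the threshold value are hypothetical and chosen for illustration; a real system would use sharded stores rather than in-memory dicts.

```python
class HybridTimeline:
    """Fanout-on-write for normal accounts, fanout-on-read for huge ones."""

    def __init__(self, threshold=1_000_000):  # assumed cutoff, not from the post
        self.threshold = threshold
        self.followers = {}        # author -> set of follower ids
        self.following = {}        # user -> set of author ids
        self.pushed = {}           # user -> tweets written at post time
        self.celebrity_posts = {}  # author -> tweets stored once

    def follow(self, user, author):
        self.followers.setdefault(author, set()).add(user)
        self.following.setdefault(user, set()).add(author)

    def post(self, author, ts, text):
        tweet = (ts, author, text)
        fans = self.followers.get(author, set())
        if len(fans) >= self.threshold:
            # Fanout-on-read: a single write, regardless of follower count.
            self.celebrity_posts.setdefault(author, []).append(tweet)
        else:
            # Fanout-on-write: one write per follower timeline.
            for fan in fans:
                self.pushed.setdefault(fan, []).append(tweet)

    def read(self, user):
        # Merge the precomputed timeline with posts from followed celebrities.
        merged = list(self.pushed.get(user, []))
        for author in self.following.get(user, set()):
            merged.extend(self.celebrity_posts.get(author, []))
        return sorted(merged, reverse=True)  # newest first
```

With this shape, a celebrity post costs one write, and the merge cost is paid only by followers who actually open their feed.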
Hope I was able to answer your question 🙂