Fabio Manganiello /
npub13uu…pvgs
2025-02-20 11:54:29


A very good article from the Bluesky engineering team.

Often in system design interviews (and even in real-world problems) we try to optimize all the available metrics.

Best possible throughput, best possible memory usage, best possible CPU load, best possible sharding, and if it doesn’t work, we need scaling, MOAR scaling!

This article illustrates quite well why technical solutions should never lose track of the real-world applications, and after a certain threshold it’s ok to embrace lossiness and imperfection.

Bluesky shards the timelines table on user_id. Each user shares their timeline shard with a few thousand other users.

Every time a user shares a new post, that post is fanned out to the timelines of all of their followers.

That is usually fine if each user follows a few hundred accounts, or a couple of thousand at most.

But there may be users on social media that follow tens, or even hundreds, of thousands of accounts.

In most cases they are abusive bots that moderation should weed out - but there may be a few legitimate cases.

Writing hundreds of updates per second to those timelines results in high load and poor performance on the whole shard (a hot-shard/noisy-neighbour situation), impacting not only the hyper-connected user but everyone sharing the same shard.

Only looking at average/median latency and follow counts risks missing this picture too. If 1 user out of 1,000 is statistically “hyper-connected”, then you should look at P99.9. If each shard hosts the timelines of 10,000 users, there’s a good chance that a noisy neighbour is present on each of them.

If you don’t look at these outliers when sharding, you may get the false impression that fanning out posts can easily be parallelized, and that if each write takes on average a few hundred microseconds then you can get away with updating all timelines within a couple of seconds at most.

But what about that user with 20M followers, where 0.1% of those followers follow more than 10k accounts?

Well, that can easily bring the time for a fan-out from a couple of seconds to a few tens of minutes - all while clogging up your shards.
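To put rough numbers on it (my own back-of-envelope arithmetic with assumed values, not figures from the article): even at a few hundred microseconds per timeline write, a 20M-follower fan-out adds up fast once a hot shard forces writes to serialize.

```python
# Back-of-envelope fan-out estimate. All numbers are illustrative
# assumptions, not Bluesky's real parameters.
followers = 20_000_000
write_us = 200            # assumed average cost of one timeline write, in microseconds
shards = 100              # assumed number of timeline shards the writes spread across

# Ideal case: writes spread evenly over all shards and run in parallel.
parallel_s = followers * write_us / 1e6 / shards
print(f"ideal parallel fan-out: ~{parallel_s:.0f} s")

# Hot-shard case: hyper-connected followers cluster writes onto one shard,
# which serializes them, so the slowest shard dominates total fan-out time.
hot_share = 0.5           # assumed fraction of writes landing on the hot shard
hot_s = followers * hot_share * write_us / 1e6
print(f"hot-shard fan-out: ~{hot_s / 60:.0f} min")
```

Under these made-up parameters the ideal case finishes in well under a minute, while the hot shard alone pushes the fan-out past half an hour - the same order-of-magnitude jump the post describes.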

Bluesky’s solution to the problem could actually be embraced (in a configurable way) by Fediverse software too (Mastodon’s approach of delivering lossless, strictly chronological timelines causes performance issues on some instances quite similar to those experienced by Bluesky).

It’s about taking a step back and acknowledging that no human user can ever catch up with their timeline if they follow tens/hundreds of thousands of users. Their timeline is going to move too fast for any human brain to process the information.

So it’s ok to be lossy.

Bluesky’s probabilistic approach is quite reasonable in this regard.

When fanning out a new post to all followers, calculate a loss_rate for each of those users as 1 - min(reasonable_follows_limit / user_follows, 1).

Then for each user generate a random number between 0 and 1.

If your number is less than the loss_rate, then the post won’t be sent to that user’s timeline. Otherwise, it will be sent.
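A minimal sketch of that drop rule in Python - the `loss_rate` formula and the `reasonable_follows_limit` name come from the post itself; the function shape and the zero-follows guard are my own assumptions:

```python
import random

def loss_rate(user_follows: int, reasonable_follows_limit: int = 2500) -> float:
    """Fraction of incoming posts to drop for a user with this many follows."""
    if user_follows <= 0:
        return 0.0  # guard: a user following nobody loses nothing
    return 1 - min(reasonable_follows_limit / user_follows, 1)

def should_deliver(user_follows: int, limit: int = 2500) -> bool:
    """One roll per (post, follower): drop when the roll falls below the loss rate."""
    return random.random() >= loss_rate(user_follows, limit)
```

With the default limit, anyone following 2,500 accounts or fewer has a loss rate of 0, so the system always tries to deliver to them.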

Suppose that, after some experimentation, you’ve set reasonable_follows_limit to 2500 (beyond that, timelines become runaway trains that nobody can catch up with).

If a user follows fewer than that number of profiles, the system will try to deliver all updates to their timeline.

If a user follows 5000 people, then they have a 50% chance of not getting updates to their timeline. If it’s 25000, they’ll only see 10% of the updates.
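A quick self-contained simulation (using the same assumed limit of 2,500) shows those delivery fractions emerging from the per-post coin flips:

```python
import random

def delivered_fraction(user_follows: int, posts: int = 100_000,
                       limit: int = 2500, seed: int = 42) -> float:
    """Simulate `posts` fan-out attempts to one user; return the share delivered."""
    rng = random.Random(seed)  # seeded for reproducibility
    drop = 1 - min(limit / user_follows, 1)
    return sum(rng.random() >= drop for _ in range(posts)) / posts

print(delivered_fraction(2000))    # under the limit: everything delivered (1.0)
print(delivered_fraction(5000))    # roughly half delivered (~0.5)
print(delivered_fraction(25000))   # roughly one in ten delivered (~0.1)
```

Over enough posts the delivered share converges on 1 - loss_rate, which is exactly the knob the limit tunes.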

Of course, like all probabilistic systems, this algorithm isn’t perfect. The number of follows isn’t the only indicator of the traffic on a timeline (you may have users who follow tens of thousands of accounts that barely post once a month, and someone who follows 500 users that just so happen to all be very active), and it probably needs further refining. But it’s definitely a step in the right direction.

When designing a system, always keep the real-world use-case in mind.

If you have to design a system that handles financial transactions, then lossiness is probably not an option (even at the cost of latency or system load). But if you design a social network, then it may just be ok to implement a lossy heuristic - especially if the alternative is a hectic timeline and shards constantly at 100%.

https://jazco.dev/2025/02/19/imperfection/
Author Public Key
npub13uunvh7djw9ep54nswkuxlneyee7ehcpc7e53t68krykrdeg6j4qrdpvgs