
@davidgu: we had an incident because we ...

@davidgu
16 views Jun 03, 2026
1
we had an incident because we migrated traffic to a brand new s3 bucket.

our millions of servers instantly crushed the new bucket’s partitions and started getting slammed with 5xx errors
2
s3 partitions are AWS’s dirty little hack

Last year, S3’s partitioning system took down our production service.

S3 partitions put a hard limit on the amount of traffic your bucket can serve.

they are totally opaque, can change unexpectedly and are impossible to configure (unless you call AWS support!)
3
S3 feels like magic. Bottomless storage, instant retrieval and infinitely scalable.

Turns out that it’s just a massive fleet of servers with real capacity limits and those servers 5xx when overloaded.

We run millions of EC2 instances a day and regularly see S3 errors in our logs.
4
Last year we had downtime because we overwhelmed S3 server capacity.

It was triggered during a migration to a new S3 bucket used for production data.

During the post-mortem we discovered the obscure S3 quirk that brought down our service.
5
Straight from the AWS docs: “Your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned Amazon S3 prefix”.

We shard our data across extremely granular S3 prefixes, so it should have been impossible for us to hit these limits.

Confused by this apparent contradiction, we dug deeper.
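
A rough sketch of the math behind "impossible", in Python. The hashed key layout and the function names are hypothetical, not our real scheme; the only number taken from the docs is the 5,500 GET/HEAD figure. The estimate assumes every prefix gets its own partition, which is exactly the assumption that turned out to be wrong.

```python
import hashlib

# From the AWS docs quoted above: GET/HEAD requests per second per partitioned prefix.
GET_LIMIT_PER_PARTITION = 5500


def sharded_key(object_id: str, shard_chars: int = 4) -> str:
    """Hypothetical key layout: a short hash of the ID becomes the leading prefix."""
    digest = hashlib.sha256(object_id.encode()).hexdigest()
    return f"{digest[:shard_chars]}/{object_id}"


def naive_get_capacity(num_prefixes: int) -> int:
    """Capacity estimate that (wrongly) treats every prefix as its own partition."""
    return num_prefixes * GET_LIMIT_PER_PARTITION


# Four hex characters give 16**4 distinct leading prefixes.
print(sharded_key("user-123/report.json"))
print(naive_get_capacity(16 ** 4))
```

On paper that is hundreds of millions of requests per second, which is why the 5xx errors made no sense at first.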
6
We found that S3 partitioned prefixes are NOT the same as S3 prefixes.

If you create an A/B/C prefix structure, the S3 service can dynamically choose to partition your data by A/B, or even just A!

And their load balancer will distribute load evenly across S3 partitions (not prefixes!).
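
A toy model of the difference, in Python. This is only an illustration of the idea, not how S3 actually works internally: `partition_of`, `overloaded`, the `depth` parameter and the traffic numbers are all made up. The 3,500 figure is the PUT limit from the docs quote.

```python
from collections import Counter

# From the AWS docs quote: PUT/COPY/POST/DELETE requests per second per partitioned prefix.
PUT_LIMIT = 3500


def partition_of(key: str, depth: int) -> str:
    """Toy model: the partition is the first `depth` segments of the key."""
    return "/".join(key.split("/")[:depth])


def overloaded(requests_per_key: dict[str, int], depth: int, limit: int = PUT_LIMIT) -> list[str]:
    """Return the partitions whose combined request rate exceeds the limit."""
    load = Counter()
    for key, rps in requests_per_key.items():
        load[partition_of(key, depth)] += rps
    return sorted(p for p, rps in load.items() if rps > limit)


# Made-up traffic: three keys, each comfortably under the limit on its own.
traffic = {
    "A/B1/C1/obj": 2000,
    "A/B1/C2/obj": 2000,
    "A/B2/C1/obj": 2000,
}

print(overloaded(traffic, depth=3))  # partitioned by A/B/C
print(overloaded(traffic, depth=2))  # partitioned by A/B
print(overloaded(traffic, depth=1))  # partitioned by A only
```

Same keys, same traffic. Only the partition depth changes, and with it whether anything is over the limit.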
7
S3 has an invisible background job that analyzes the structure of your prefixes and your access patterns, then dynamically selects the partition strategy on a per-bucket basis.

Even crazier is that you can call AWS support and tell them to “pre-partition” your bucket!
8
The previous bucket had intelligently adapted its partitioning to our specific traffic pattern.

The new bucket had zero history and its default partitioning was a terrible fit for our workload.

Fun fact: when an S3 server is overloaded, it returns a 503 Slow Down.
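
A common way for a client to react to a 503 Slow Down is to retry with exponential backoff and jitter. A minimal stdlib-only sketch in Python follows. `SlowDown` and `call_with_backoff` are hypothetical names standing in for whatever your S3 client raises and wraps, and this is not a description of what we run in production.

```python
import random
import time


class SlowDown(Exception):
    """Hypothetical stand-in for an S3 503 Slow Down response."""


def call_with_backoff(op, max_attempts=5, base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call op(), retrying on SlowDown with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return op()
        except SlowDown:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The delay ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the ceiling so that
            # many clients do not all retry at the same instant.
            sleep(random.uniform(0, ceiling))
```

Backoff only spreads the load out over time. It does not raise the per-partition limit, so it would not have saved a bucket whose partitioning was a bad fit for the workload.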