Resend – Incident report for February 21st, 2024

Summary (TL;DR)

On February 21st, 2024, Resend experienced an outage that affected all users due to a database migration that went wrong. This prevented users from using the API (including sending emails) and accessing the dashboard from 05:01 to 17:05 UTC (about 12 hours).

The database migration accidentally deleted data from production servers. We immediately began the restoration process from a backup, which completed 6 hours later. Unfortunately, once it finished, we found that it failed to restore the data, so we had to start the restoration process a second time with a different backup.

During this time, no API requests were being accepted and no data being stored. For data created prior to the migration, there was 5 minute of data loss from when the migration started and the database went offline from 04:50:00 to 04:56:27 UTC. We are currently working on re-populating the data from this 5-minute window.

We sincerely apologize for the impact and inconvenience caused by this outage. We place immense importance on reliability, but this week, we fell short of our commitment to you all. It is clear that we have a long way to go in becoming an industry-leading infrastructure provider, but in learning from this incident, we will improve our operations and tooling to avoid outages like this in the future, whatever the cause.

Timeline

All times are in Coordinated Universal Time (UTC)

February 21st, 2024

04:56: Database migration started
04:57: Noticed tables being dropped from the production database
05:01: Began restoring the database from a backup
05:02: Posted on status page, updating every 30-60 minutes until resolution
11:02: First restoration process completed
11:03: Realized the first backup failed and started to investigate
11:33: Found that the backup failed due to a wrong selection of the backup timestamp
11:48: Increased compute to speed up the restoration process – updated database memory from 128GB to 256GB and CPU from 32-core ARM to 64-core ARM
12:05: Began restoring the database from an older backup
17:01: Second restoration process completed
17:02: API began receiving requests
17:05: Dashboard was accessible again, and incident was resolved

What happened

While building a feature, we performed a database migration command locally, but it incorrectly pointed to the production environment instead, which dropped all tables in production.

The first attempt to restore the database took 6 hours but failed due to a wrong selection of the backup timestamp. The second attempt to restore took an extra 5 hours and succeeded, bringing all data back besides a 5-minute window of data loss.

Impact

All users were unable to send emails, use the API, or access the Resend dashboard for 12 hours from 05:01 to 17:05 UTC.

For data created prior to the migration, there was 5 minutes of data loss from when the migration started and the database went offline from 04:50:00 to 04:56:27 UTC.

Next steps and improvements

Re-populate data from the 5-minute window of data loss.
No accessible user role should have write privileges on the production database.
Improve local development to reduce risks related to database migrations.
Create redundancy to preserve sending function even during a database outage.
Increase cadence for disaster recovery tests.
Implement incident banner on Resend dashboard to inform users quickly.

To our customers, we are deeply sorry that this incident occurred and that it prevented you from delivering your mission-critical emails. We know that actions speak louder than words, so we will continue to learn, grow, and improve, starting by implementing the improvements listed above.

News Week
Magazine PRO

Company

Resend – Incident report for February 21st, 2024

Summary (TL;DR)

Timeline

What happened

Impact

Next steps and improvements

LEAVE A REPLY Cancel reply

Subscribe

Zara Black Friday Sale 2024: 10 Early Deals to Shop Right Now

35 Trending Winter Nail Colors to Try in 2024

12 Early Black Friday Shoe Deals Our Editors Are Shopping in 2024

Ariana Grande Wore This Sparkly Highlighter as Glinda in ‘Wicked’

Lucy Liu’s ‘Red One’ Press Tour: See Every Stunning Look

More like this
Related

Zara Black Friday Sale 2024: 10 Early Deals to Shop Right Now

35 Trending Winter Nail Colors to Try in 2024

12 Early Black Friday Shoe Deals Our Editors Are Shopping in 2024

Ariana Grande Wore This Sparkly Highlighter as Glinda in ‘Wicked’

About us

Company

The latest

Zara Black Friday Sale 2024: 10 Early Deals to Shop Right Now

35 Trending Winter Nail Colors to Try in 2024

12 Early Black Friday Shoe Deals Our Editors Are Shopping in 2024

Subscribe

News WeekMagazine PRO

Company

Resend – Incident report for February 21st, 2024

Summary (TL;DR)

Timeline

What happened

Impact

Next steps and improvements

LEAVE A REPLY Cancel reply

Subscribe

More like thisRelated

About us

Company

The latest

Subscribe

News Week
Magazine PRO

More like this
Related