What Is a Tombstone? The Record That Exists to Say Something Is Gone

Deleting something feels like it should be the simplest operation a database performs. You have a record, you don't want it anymore, you remove it. In a single database on a single machine, that's more or less how it works. But in a distributed system, where the same data lives in copies spread across many machines, deletion turns out to be one of the trickier things to get right, and the solution is a concept that sounds almost paradoxical: a record that exists specifically to say that something doesn't.

That record is called a tombstone. It marks the spot where data used to be, and it exists because in a distributed system, simply removing data creates a problem worse than the one it solves.

To see why, picture data that lives in several copies across different machines, an arrangement distributed systems use constantly for reliability and speed. Now suppose you delete a record, and the way you delete it is to genuinely erase it from one of the machines. That machine now has no record. But the other machines still have their copies, and they have no idea anything happened. As these machines sync with each other, comparing what they hold, one of them notices that a record it has is missing from the machine you deleted from. From its point of view, the most reasonable conclusion is that the record is missing by mistake, so it helpfully sends its copy back. The deletion you performed gets undone by the very synchronization that's supposed to keep the system consistent. The data rises from the dead.
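To make the failure concrete, here is a minimal sketch in Python. It assumes two replicas modeled as plain dictionaries and a deliberately naive, hypothetical sync routine (`naive_sync`) that copies over any key one side lacks. Real systems are more sophisticated, but the failure mode has the same shape.

```python
def naive_sync(a: dict, b: dict) -> None:
    """Hypothetical sync: each replica copies over any key it lacks.

    With no record of deletions, "missing" always looks like "lost".
    """
    for key in set(a) | set(b):
        if key not in a:
            a[key] = b[key]
        elif key not in b:
            b[key] = a[key]


# Two replicas that start out holding the same record.
replica_a = {"user:42": "alice"}
replica_b = {"user:42": "alice"}

# A hard delete applied to one replica only.
del replica_a["user:42"]

# The next sync sees the gap on replica_a and "repairs" it.
naive_sync(replica_a, replica_b)

resurrected = "user:42" in replica_a
print(resurrected)
```

After the sync, `replica_a` holds the record again. Nothing in the system ever recorded that the gap was intentional.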

This is the problem a tombstone solves. Instead of erasing the record, the system replaces it with a marker that says, in effect, "this was deleted, at this time." The data is gone in the sense that queries no longer return it, but something remains in its place to record the fact of the deletion. Now when the machines sync, the others don't see a mysteriously missing record. They see an explicit instruction that the record was deleted, and they apply that deletion to their own copies. The absence propagates correctly because the absence was written down.
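A sketch of the same idea in Python follows. It assumes each stored cell carries a timestamp and that replicas reconcile by keeping the newest cell (a last-write-wins rule). That rule is one common approach, not the only one. The names `Cell`, `delete`, `read`, and `sync` are illustrative rather than taken from any particular database.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Cell:
    value: Optional[str]
    timestamp: int
    deleted: bool = False  # True means this cell is a tombstone


def delete(replica: Dict[str, Cell], key: str, ts: int) -> None:
    """Write a tombstone instead of erasing the key."""
    replica[key] = Cell(None, ts, deleted=True)


def read(replica: Dict[str, Cell], key: str) -> Optional[str]:
    """Queries treat a tombstoned key as absent."""
    cell = replica.get(key)
    if cell is None or cell.deleted:
        return None
    return cell.value


def sync(a: Dict[str, Cell], b: Dict[str, Cell]) -> None:
    """Assumed last-write-wins merge: the newest cell wins on both sides."""
    for key in set(a) | set(b):
        ca, cb = a.get(key), b.get(key)
        if ca is None:
            newest = cb
        elif cb is None:
            newest = ca
        else:
            newest = ca if ca.timestamp >= cb.timestamp else cb
        a[key] = b[key] = newest


replica_a = {"user:42": Cell("alice", timestamp=1)}
replica_b = {"user:42": Cell("alice", timestamp=1)}

delete(replica_a, "user:42", ts=2)   # tombstone on one replica
sync(replica_a, replica_b)           # the deletion propagates

print(read(replica_b, "user:42"))
```

Because the tombstone is newer than the live cell on `replica_b`, the sync overwrites the live cell instead of restoring it, and reads on both replicas now treat the key as absent.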

The name is apt. A tombstone in a graveyard marks where someone is buried; it doesn't bring them back, but it records that they existed and are gone. A database tombstone does the same for data. The record isn't returned in results, but its marker persists, carrying the information that a deletion occurred so the rest of the system can act on it.

This solves the resurrection problem, but it introduces a new question: how long do the tombstones stick around? They can't last forever, because a system that has been deleting data for years would accumulate an enormous pile of these markers, each taking up space and adding to the work of every sync and query. A database full of tombstones is carrying the weight of everything it has ever deleted, which defeats much of the point of deleting.

So tombstones are eventually cleaned up, in a process often called compaction or garbage collection, which permanently removes them after they've done their job. But the cleanup has to wait long enough to be safe. The tombstone needs to survive until every copy in the system has seen it and applied the deletion. Remove the tombstone too early, before some lagging or temporarily offline machine has learned about the deletion, and that machine can still resurrect the data, because the marker that would have told it about the deletion is now itself gone. The cleanup has to be patient enough to guarantee the news has reached everyone.
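The sketch below illustrates why the waiting period matters. It assumes cells are `(value, timestamp, deleted)` tuples, a last-write-wins sync, and a hypothetical `grace` parameter that sets how old a tombstone must be before compaction may drop it. The time units and numbers are arbitrary.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[Optional[str], int, bool]  # (value, timestamp, deleted)


def compact(store: Dict[str, Cell], now: int, grace: int) -> int:
    """Drop tombstones at least `grace` old; return how many were removed."""
    expired = [
        key
        for key, (_, ts, deleted) in store.items()
        if deleted and now - ts >= grace
    ]
    for key in expired:
        del store[key]
    return len(expired)


def sync(a: Dict[str, Cell], b: Dict[str, Cell]) -> None:
    """Assumed last-write-wins merge on the timestamp field."""
    for key in set(a) | set(b):
        cells = [c for c in (a.get(key), b.get(key)) if c is not None]
        newest = max(cells, key=lambda c: c[1])
        a[key] = b[key] = newest


def deletion_survives(grace: int) -> bool:
    """Compact at t=110, then let a lagging replica rejoin and sync."""
    online: Dict[str, Cell] = {"k": (None, 100, True)}   # tombstone from t=100
    lagging: Dict[str, Cell] = {"k": ("v", 50, False)}   # missed the delete
    compact(online, now=110, grace=grace)
    sync(online, lagging)
    return online["k"][2]  # True if the key is still marked deleted


impatient = deletion_survives(grace=5)     # tombstone dropped before the sync
patient = deletion_survives(grace=1000)    # tombstone outlives the outage

print(impatient, patient)
```

With the short grace period the tombstone is gone before the lagging replica rejoins, so its stale live copy wins the sync. With the long one the tombstone is still there to overrule it.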

That timing creates a genuine tension that the people who run these systems have to manage. Keep tombstones too long and they pile up, degrading performance and wasting space. Remove them too soon and you risk data coming back from the dead. The right window depends on how the particular system propagates changes and how long a machine might plausibly be out of contact before rejoining. Getting it wrong in either direction causes real problems, which is why tombstone behavior is something distributed database operators pay close attention to.

There's a subtler consequence worth noting: tombstones can cause performance problems even while they're doing their job correctly. A query that scans a range of data may have to read past large numbers of tombstones to find the live records it actually wants. In systems where a lot of data gets deleted, these accumulated markers can slow queries down noticeably, so that deleting data makes the database slower rather than faster, at least until the tombstones are cleaned up. It's a counterintuitive effect that catches people off guard precisely because it runs against the assumption that removing data should always lighten the load.
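A toy range scan shows the effect. This sketch assumes a store modeled as a dictionary in which a sentinel object (`TOMBSTONE`, a name chosen here for illustration) stands in for a deletion marker. It counts how many markers the scan has to step over to return a single live row.

```python
from typing import Dict, List, Tuple

TOMBSTONE = object()  # illustrative sentinel marking a deleted key


def scan(store: Dict[str, object], start: str, end: str) -> Tuple[List[Tuple[str, object]], int]:
    """Return live rows in [start, end) plus the number of tombstones skipped."""
    live: List[Tuple[str, object]] = []
    skipped = 0
    for key in sorted(store):
        if not (start <= key < end):
            continue
        value = store[key]
        if value is TOMBSTONE:
            skipped += 1          # work done that yields no result
        else:
            live.append((key, value))
    return live, skipped


# 1000 keys in the range, all deleted except one.
store: Dict[str, object] = {f"row{i:04d}": TOMBSTONE for i in range(1000)}
store["row0500"] = "live"

rows, skipped = scan(store, "row0000", "row1000")
print(len(rows), skipped)
```

The scan returns one row but examines a thousand entries. Until cleanup removes the markers, the cost of the query tracks how much was deleted, not how much is left.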

The broader lesson tombstones teach is that in distributed systems, absence is not the same as nothing. When data lives in one place, deleting it can be a matter of simply removing it. When data lives in many places that must agree with each other, the fact of a deletion is itself information that has to be recorded, communicated, and eventually retired with care. A tombstone is how a distributed system remembers, for a while, the things it has been told to forget, long enough to make sure everyone forgets them together.