GDPR erasure breaks your foreign keys

The request arrives with a clock attached. A user invokes their right to erasure, or a regulator does it on their behalf, and you have a legal deadline to make their personal data disappear: under Article 12(3), without undue delay and in any event within one month of receipt, extendable only in limited cases. The obvious first move is also the wrong one: DELETE FROM users WHERE id = @id. Run that on any real schema and you do not get compliance. You get a foreign-key violation, or worse, a cascade that quietly shreds records you were legally required to keep.

Erasure is not a delete. It is a per-row decision about data you have spread across half your system, and getting it right is a data-modeling problem that grows every time someone ships a feature. That is the part worth engineering for, and it is the part Slicekit is actually built around.

Why a single DELETE is wrong

A users row is never alone. The same person is referenced by sessions, API keys, audit events, consent records, permission grants, and whatever the last three features added. Their personal data is also replicated well outside the primary table: application logs, caches, search indexes, analytics pipelines, nightly backups, and third-party processors all hold copies, and all of it is in scope.

So the single-DELETE instinct fails on two fronts. Mechanically, it collides with referential integrity: orphan the children or cascade through them, and either way the database is now lying about its own history. Legally, it deletes too much. The right to erasure is not absolute. Article 17(3) carves out data you are obliged to retain under other law (17(3)(b)) and data you need to establish or defend legal claims (17(3)(e)). Tax records, fraud trails, and the consent receipts that prove the user once agreed are exactly the rows a blanket DELETE would destroy.

The resolution the field has settled on is that erasure often means anonymize rather than delete: keep the row for integrity and retention, strip the personal data out of it. But there is a sharp catch. Anonymization only satisfies Article 17 if re-identification is not reasonably likely. Reversible pseudonymization, where you swap a name for a token but keep the lookup table, does not count. If you can put the human back together, you have not erased anything.
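The difference between the two is easiest to see side by side. The following is a minimal sketch, not Slicekit code: Profile, ErasureDemo, and the lookup dictionary are hypothetical names invented for illustration, and a single Email field stands in for all personal data on the row.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical record for illustration; not a Slicekit type.
public sealed class Profile
{
    public Guid Id { get; init; }
    public string? Email { get; set; }
}

public static class ErasureDemo
{
    // Reversible pseudonymization: the email is swapped for a token,
    // but the lookup table can still map the token back to a person.
    public static void Pseudonymize(Profile profile, IDictionary<string, string> lookup)
    {
        if (profile.Email is null) return;
        var token = Guid.NewGuid().ToString("N");
        lookup[token] = profile.Email; // this table is what makes it reversible
        profile.Email = token;
    }

    // Anonymization: the value is cleared and nothing is kept that could restore it.
    public static void Anonymize(Profile profile) => profile.Email = null;

    // True if the stored value can still be mapped back to the original email.
    public static bool CanReidentify(Profile profile, IReadOnlyDictionary<string, string> lookup)
        => profile.Email is not null && lookup.ContainsKey(profile.Email);
}
```

After Pseudonymize, CanReidentify returns true for as long as the lookup table exists, which is exactly why it does not count as erasure. After Anonymize it returns false, and the row is still there for foreign keys to point at.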

Anonymize, delete, or retain, decided per field

Once you accept that there is no single switch, every user-owned field falls into one of three buckets, and you have to choose deliberately for each one.

[Figure: a user's personal data (rows keyed by UserId, directly or via a join) feeds two paths. Article 15, the right of access, assembles one JSON data bundle with secrets excluded. Article 17, the right to erasure, splits per field: hard delete (sessions, API keys, passkeys, permission grants), anonymize (Identity and domain User: clear PII, keep the row), and retain for legal reasons (consent records, Article 7(1) bookkeeping).]

Erasure is a per-field policy, not a single DELETE: referential integrity and legal retention mean some rows are removed, some are anonymized in place, and a few must be kept on purpose.

Slicekit wires all three into one handler, DeleteUserCommandHandler:

Hard delete for data with no value once the user is gone: refresh-token sessions, API keys, passkeys, permission grants. These go out with ExecuteDeleteAsync.
Anonymize in place for rows a downstream aggregate still depends on. The Identity row and the domain User stay so foreign keys remain valid, but every personal field is cleared:

appUser.Email = null;
appUser.NormalizedEmail = null;
appUser.PhoneNumber = null;
appUser.PasswordHash = null;
appUser.SecurityStamp = Guid.NewGuid().ToString();

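Note the rotated SecurityStamp. In ASP.NET Core Identity, changing the stamp causes sign-in cookies and tokens issued against the old value to fail validation, so nothing handed out earlier can still authenticate as this row. With the personal fields cleared and no lookup table kept on the side, there is no token left that reconnects the row to a person, which is what keeps it on the right side of the re-identification test rather than being reversible pseudonymization.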

Retain unchanged only where no personal data remains after the linked user is anonymized and the law requires the row. UserConsent is the precedent: once anonymized it holds a consent type, a version, and a grant timestamp, non-personal bookkeeping pointed at a user who no longer has a name. That is the 17(3)(b) exemption made concrete.

Naming three strategies forces the question at the moment a table is born. “Which one applies here?” becomes a decision a developer makes once, on purpose, instead of a default that silently does the wrong thing six months later.
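To make the three buckets concrete, here is a minimal in-memory sketch. It is not the DeleteUserCommandHandler itself: Session, Consent, Account, InMemoryStore, and ErasurePolicy are hypothetical stand-ins, and plain lists take the place of database tables so the example runs without a database.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public enum ErasureStrategy { HardDelete, AnonymizeInPlace, RetainForLegal }

// Hypothetical in-memory stand-ins for tables.
public sealed class Session { public Guid UserId { get; init; } }
public sealed class Consent { public Guid UserId { get; init; } public string Type { get; init; } = ""; }
public sealed class Account { public Guid Id { get; init; } public string? Email { get; set; } }

public sealed class InMemoryStore
{
    public List<Session> Sessions { get; } = new();
    public List<Consent> Consents { get; } = new();
    public List<Account> Accounts { get; } = new();
}

public static class ErasurePolicy
{
    // Applies one explicit strategy per table and reports which one was used.
    public static IReadOnlyDictionary<string, ErasureStrategy> Erase(InMemoryStore store, Guid userId)
    {
        // Hard delete: rows with no value once the user is gone.
        store.Sessions.RemoveAll(s => s.UserId == userId);

        // Anonymize in place: keep the row so references stay valid, clear the PII.
        foreach (var account in store.Accounts.Where(a => a.Id == userId))
            account.Email = null;

        // Retain for legal: consent rows are deliberately left untouched.

        return new Dictionary<string, ErasureStrategy>
        {
            [nameof(Session)] = ErasureStrategy.HardDelete,
            [nameof(Account)] = ErasureStrategy.AnonymizeInPlace,
            [nameof(Consent)] = ErasureStrategy.RetainForLegal,
        };
    }
}
```

Returning the strategy map is a design choice for this sketch only: it gives the caller something to log as evidence of what was done to each table, without logging any of the personal data itself.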

The access side, the right to a copy of your personal data under Article 15, is the mirror image: GET /api/v1/users/me/export assembles everything the system holds about the caller into one versioned JSON document. Its own invariant is exclusion. The bundle is handed to the user, so it must carry nothing an attacker could weaponize: no password hash, no refresh-token hashes, no API-key material beyond the display hint, no Identity security stamps, no other user’s identifiers from shared resources.
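One way to hold that exclusion invariant is to build the bundle from an explicit allow-list type rather than serializing the stored entity. The sketch below assumes this approach; StoredUser, UserExport, and ExportBuilder are hypothetical names, and the real export schema is not shown in this article.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Hypothetical stored shape, including secrets that must never leave the server.
public sealed record StoredUser(Guid Id, string Email, string PasswordHash, string SecurityStamp);

// Hypothetical export shape: an explicit allow-list, so a new secret column
// cannot leak just because someone added it to StoredUser.
public sealed record UserExport(int SchemaVersion, Guid Id, string Email);

public static class ExportBuilder
{
    // Copies only the allow-listed fields into the export and serializes that.
    public static string Build(StoredUser user)
    {
        var export = new UserExport(SchemaVersion: 1, Id: user.Id, Email: user.Email);
        return JsonSerializer.Serialize(export);
    }
}
```

Because the serializer only ever sees UserExport, adding a new secret to StoredUser changes nothing in the output until someone deliberately adds it to the export type.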

The build that fails on unclassified personal data

Here is the genuinely interesting part. None of the above stays correct through good intentions, because schemas grow and people forget. So Slicekit makes forgetting a build failure rather than a production incident.

PersonalDataCompletenessTests enumerates every domain type that carries a Guid UserId property and asserts that each one is referenced by both the export handler and the delete handler. Add a new user-owned entity, wire up only one side, and the build goes red before the code can ship. A new personal-data field that nobody classified is not a latent compliance gap waiting for an audit. It is a failing test on the pull request that introduced it.
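The core of such a check fits in a few lines of reflection. This is a sketch of the idea, not the actual PersonalDataCompletenessTests: it assumes each handler's coverage is available as a set of types, and PersonalDataScan, ApiKey, Passkey, and Product are hypothetical names.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical domain types: two are user-owned, one is not.
public sealed class ApiKey { public Guid UserId { get; init; } }
public sealed class Passkey { public Guid UserId { get; init; } }
public sealed class Product { public int Id { get; init; } }

public static class PersonalDataScan
{
    // Every concrete class with a public Guid UserId property counts as user-owned.
    public static IReadOnlyList<Type> UserOwnedTypes(Assembly assembly) =>
        assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract)
            .Where(t => t.GetProperty("UserId", BindingFlags.Public | BindingFlags.Instance)
                         ?.PropertyType == typeof(Guid))
            .ToList();

    // User-owned types missing from either handler's declared coverage.
    public static IReadOnlyList<Type> Unclassified(
        IEnumerable<Type> userOwned, ISet<Type> exportCovered, ISet<Type> deleteCovered) =>
        userOwned
            .Where(t => !exportCovered.Contains(t) || !deleteCovered.Contains(t))
            .ToList();
}
```

A test built on this would assert that Unclassified returns an empty list and print the offending type names when it does not, so the failure message tells the author exactly which table still needs a decision.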

There are two exemption lists, one per side, and the only way onto either is to write an inline justification: an Article 15 rationale to skip the export, or a legal-retention reason (the UserConsent precedent) to skip deletion. You cannot quietly drop a table from compliance. You can only state, in code that CI reads, why it is exempt.
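The exemption rule can be enforced the same way. The following sketch assumes an exemption is a type paired with a written reason; Exemption, ExemptionRules, UserConsentRow, and DraftNote are hypothetical names, and the real lists may be shaped differently.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical exemption entry: a type plus the written reason it is exempt.
public sealed record Exemption(Type EntityType, string Justification);

// Hypothetical user-owned types used to exercise the rules.
public sealed class UserConsentRow { public Guid UserId { get; init; } }
public sealed class DraftNote { public Guid UserId { get; init; } }

public static class ExemptionRules
{
    // Entries that should fail the build: exemptions with no stated reason.
    public static IReadOnlyList<Exemption> MissingJustification(IEnumerable<Exemption> exemptions) =>
        exemptions.Where(e => string.IsNullOrWhiteSpace(e.Justification)).ToList();

    // A type passes if the handler covers it or a justified exemption names it.
    public static bool IsAccountedFor(Type type, ISet<Type> covered, IEnumerable<Exemption> exemptions) =>
        covered.Contains(type) ||
        exemptions.Any(e => e.EntityType == type && !string.IsNullOrWhiteSpace(e.Justification));
}
```

An entry such as new Exemption(typeof(UserConsentRow), "Retained to demonstrate consent") passes, while the same entry with an empty string is reported by MissingJustification and does not count toward IsAccountedFor.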

This is the idea worth stealing even if you never touch Slicekit: a compliance invariant that is expensive to verify by hand becomes cheap when you express it as a test over your own schema. The classification decision moves into the same pull request as the table it covers, checked by the same CI run, while the author still remembers why the table exists. When the first real erasure request arrives, there is no archaeology dig. The handler already knows, field by field, what to delete, what to anonymize, and what to keep.

The parts no test can close for you

Build-time enforcement covers your primary store. It does not cover everything in scope, and pretending otherwise would be the same mistake as the naive DELETE.

Backups are the obvious gap. You usually cannot, and need not, surgically purge a single user from last night’s snapshot. The UK ICO’s guidance on the right to erasure accepts that backed-up data can be put “beyond use”, left unrestored to live systems and erased on the normal backup cycle, provided you document the approach and honor the erasure if a restore ever happens. That is an operational policy, not a line of C#.
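The "honor the erasure if a restore ever happens" half of that policy does have a code-shaped piece. One possible approach, sketched here under assumptions, is a ledger of completed erasures that is replayed after any restore. ErasureLedger and RestoreProcedure are hypothetical names, the ledger is assumed to hold only an opaque user ID and a timestamp, and it is assumed to live somewhere the restore does not overwrite.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical ledger of completed erasures. It stores only the opaque user ID
// and a timestamp, never the personal data that was erased.
public sealed class ErasureLedger
{
    private readonly Dictionary<Guid, DateTimeOffset> _erased = new();

    public void Record(Guid userId, DateTimeOffset erasedAt) => _erased[userId] = erasedAt;

    // IDs erased after the snapshot was taken: the restored data still contains them.
    public IReadOnlyList<Guid> ErasedSince(DateTimeOffset snapshotTakenAt) =>
        _erased.Where(kv => kv.Value > snapshotTakenAt).Select(kv => kv.Key).ToList();
}

public static class RestoreProcedure
{
    // After restoring a snapshot, re-run erasure for every user erased since it was taken.
    // Returns the number of erasures that were re-applied.
    public static int ReapplyErasures(
        ErasureLedger ledger, DateTimeOffset snapshotTakenAt, Action<Guid> eraseUser)
    {
        var pending = ledger.ErasedSince(snapshotTakenAt);
        foreach (var userId in pending) eraseUser(userId);
        return pending.Count;
    }
}
```

The eraseUser callback would be whatever already performs erasure in the live system, so a restore goes through the same per-field policy as the original request rather than a second, divergent code path.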

The same is true of the copies that left your database: logs, search indexes, analytics, and any third-party processors holding personal data on your behalf. Each is in scope, and each needs its own plan. The test guarantees you classified every personal-data field in your own model. It is the foundation that makes the rest tractable, not a substitute for it.

For the handler internals, the export schema, the secrets-exclusion table, and the checklist to run before merging anything that touches user data, see the data export and GDPR guide.