GDPR Data Export + Deletion
Category: Security & Compliance ยท ๐ฆ 1 install
This page is generated from the Air Pipe marketplace. Browse it live to install into your organization.
Two endpoints to satisfy GDPR Article 15 (right of access) and Article 17 (right to erasure). Every request is logged to an audit table. Data is kept on your own infrastructure โ nothing leaves your database.
What's includedโ
| File | Purpose |
|---|---|
config.yml | AirPipe config โ seed, export, delete, audit log endpoints |
schema.sql | All table definitions (alternative to using /gdpr/seed) |
Endpointsโ
| Method | Path | Body | Description |
|---|---|---|---|
POST | /gdpr/seed | โ | Create tables and load sample data for testing |
POST | /gdpr/export | {user_id, requested_by?} | Return all data held for a user |
POST | /gdpr/delete | {user_id, requested_by?} | Anonymise user, delete content |
GET | /gdpr/requests | โ | Audit log of all requests |
Setupโ
1. Set managed variableโ
| Name | Value |
|---|---|
DATABASE_URL | your Postgres connection string |
2. Deploy the configโ
Upload config.yml via the AirPipe dashboard.
3. Create tables and seed sample dataโ
curl -X POST https://your-airpipe-host/gdpr/seed
This creates all required tables (gdpr_users, gdpr_posts, gdpr_comments, gdpr_requests) and inserts 3 sample users with posts, comments, and audit log entries. Safe to re-run โ truncates before inserting.
Alternatively, apply schema.sql manually and populate with your own data:
psql $DATABASE_URL -f schema.sql
Testingโ
Seed sample data:
curl -X POST https://your-airpipe-host/gdpr/seed
Export a user's data:
curl -X POST https://your-airpipe-host/gdpr/export \
-H "Content-Type: application/json" \
-d '{"user_id": 1, "requested_by": "[email protected]"}'
Response:
{
"user": { "id": 1, "email": "[email protected]", ... },
"posts": [ ... ],
"comments": [ ... ]
}
Delete a user's data:
curl -X POST https://your-airpipe-host/gdpr/delete \
-H "Content-Type: application/json" \
-d '{"user_id": 1, "requested_by": "[email protected]"}'
View audit log:
curl https://your-airpipe-host/gdpr/requests
Deletion strategyโ
The config uses two strategies depending on the table:
Anonymise โ used for users. The row is kept because other tables may reference the user ID via foreign keys. PII fields (email, username, password_hash) are overwritten with non-identifying values.
Hard delete โ used for posts (which cascades to comments via the ON DELETE CASCADE foreign key). Content created by the user is fully removed.
Tables that must be retained (financial / audit)โ
If your schema has tables that cannot be deleted due to financial regulations or audit requirements (invoices, payments, order history), anonymise those rows instead:
- name: AnonymiseOrders
run_when_succeeded: [AnonymiseUser]
database: main
query: |
UPDATE orders SET
customer_name = 'DELETED',
customer_email = '[email protected]',
shipping_address = NULL
WHERE user_id = $1;
params:
- a|ValidateBody::user_id|
Adding more tablesโ
For each additional table owned by the user, add an action following the same pattern โ either a DELETE WHERE user_id = $1 or an UPDATE ... SET pii_field = 'DELETED' WHERE user_id = $1.
Protecting these endpointsโ
These endpoints should not be publicly accessible. Options:
API key in header โ validate a shared secret before proceeding:
- name: ValidateApiKey
input: a|headers|
hide_data_on_success: true
assert:
http_code_on_error: 403
tests:
- value: x-api-key
is_equal_to: a|ap_var::GDPR_API_KEY|
JWT โ use the is_valid_jwt assertion as shown in the REST API Starter pack.
Notesโ
requested_byis optional but recommended for audit trail completeness โ log who made the request (support agent email, user ID from your auth system, etc.).- The export response returns the user row including
password_hash. Strip this field if you are returning the export bundle directly to the end user. - Response times for the export endpoint scale with the number of rows in related tables. For large datasets, consider adding
LIMITclauses and implementing pagination.
Configurationโ
config.ymlโ
name: GdprExportDeletion
docs: true
# Required managed variables:
# DATABASE_URL โ Postgres connection string
global:
databases:
main:
driver: postgres
conn_string: "a|ap_var::DATABASE_URL|"
interfaces:
# POST /gdpr/seed
# Creates all required tables (idempotent) then loads sample users, posts,
# comments, and audit log entries. Safe to re-run โ truncates before inserting.
gdpr/seed:
output: http
method: POST
actions:
- name: CreateUsersTable
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
username TEXT NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
- name: CreatePostsTable
run_when_succeeded: [CreateUsersTable]
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
- name: CreateCommentsTable
run_when_succeeded: [CreatePostsTable]
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_comments (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
post_id BIGINT NOT NULL REFERENCES gdpr_posts(id) ON DELETE CASCADE,
body TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
- name: CreateGdprRequestsTable
run_when_succeeded: [CreateCommentsTable]
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_requests (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL,
request_type TEXT NOT NULL CHECK (request_type IN ('export', 'deletion')),
requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
requested_by TEXT,
notes TEXT
);
- name: TruncateTables
run_when_succeeded: [CreateGdprRequestsTable]
database: main
hide_data_on_success: true
query: |
TRUNCATE gdpr_requests, gdpr_comments, gdpr_posts, gdpr_users RESTART IDENTITY CASCADE;
- name: InsertUsers
run_when_succeeded: [TruncateTables]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_users (email, username, password_hash) VALUES
('[email protected]', 'alice', 'hashed_pw_1'),
('[email protected]', 'bob', 'hashed_pw_2'),
('[email protected]', 'carol', 'hashed_pw_3');
- name: InsertPosts
run_when_succeeded: [InsertUsers]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_posts (user_id, title, body)
SELECT u.id, v.title, v.body
FROM (VALUES
('[email protected]', 'My first post', 'Hello from Alice!'),
('[email protected]', 'On privacy', 'Why GDPR matters.'),
('[email protected]', 'Bob''s thoughts', 'Just writing stuff.')
) AS v(email, title, body)
JOIN gdpr_users u ON u.email = v.email;
- name: InsertComments
run_when_succeeded: [InsertPosts]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_comments (user_id, post_id, body)
SELECT u.id, p.id, v.body
FROM (VALUES
('[email protected]', 'My first post', 'Great post, Alice!'),
('[email protected]', 'My first post', 'Thanks for sharing!'),
('[email protected]', 'Bob''s thoughts', 'Thanks Bob!')
) AS v(email, post_title, body)
JOIN gdpr_users u ON u.email = v.email
JOIN gdpr_posts p ON p.title = v.post_title;
- name: InsertGdprRequests
run_when_succeeded: [InsertComments]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_requests (user_id, request_type, requested_by)
SELECT u.id, v.request_type, v.requested_by
FROM (VALUES
('[email protected]', 'export', '[email protected]'),
('[email protected]', 'deletion', '[email protected]')
) AS v(email, request_type, requested_by)
JOIN gdpr_users u ON u.email = v.email;
- name: SeedSummary
run_when_succeeded: [InsertGdprRequests]
database: main
query: |
SELECT
(SELECT COUNT(*) FROM gdpr_users) AS gdpr_users,
(SELECT COUNT(*) FROM gdpr_posts) AS gdpr_posts,
(SELECT COUNT(*) FROM gdpr_comments) AS gdpr_comments,
(SELECT COUNT(*) FROM gdpr_requests) AS gdpr_requests;
post_transforms:
- extract_value: "[0]"
# POST /gdpr/export
# Returns a complete JSON bundle of all data held for a given user.
# Body: { "user_id": 1, "requested_by": "[email protected]" }
gdpr/export:
output: http
method: POST
request_example:
user_id: 1
requested_by: [email protected]
actions:
- name: ValidateBody
input: a|body|
hide_data_on_success: true
assert:
http_code_on_error: 400
tests:
- value: user_id
is_not_null: true
- name: ExportUserData
run_when_succeeded:
actions: [ValidateBody]
http_code_on_error: 400
database: main
query: |
SELECT
row_to_json(u.*) AS user,
COALESCE(
json_agg(DISTINCT to_jsonb(p.*)) FILTER (WHERE p.id IS NOT NULL),
'[]'::json
) AS posts,
COALESCE(
json_agg(DISTINCT to_jsonb(c.*)) FILTER (WHERE c.id IS NOT NULL),
'[]'::json
) AS comments
FROM gdpr_users u
LEFT JOIN gdpr_posts p ON p.user_id = u.id
LEFT JOIN gdpr_comments c ON c.user_id = u.id
WHERE u.id = $1
GROUP BY u.id;
params:
- a|ValidateBody::user_id|
assert:
http_code_on_error: 404
error_message: "User not found"
tests:
- value: count()
is_equal_to: 1
post_transforms:
- extract_value: "[0]"
- name: LogExportRequest
run_when_succeeded: [ExportUserData]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_requests (user_id, request_type, requested_by)
VALUES ($1, 'export', $2);
params:
- a|ValidateBody::user_id|
- a|ValidateBody::requested_by|
# POST /gdpr/delete
# Anonymises the user row and hard-deletes owned content.
# Body: { "user_id": 1, "requested_by": "[email protected]" }
#
# Strategy used here:
# - gdpr_users โ anonymised in-place (row kept for FK integrity)
# - gdpr_posts โ hard deleted (cascades to gdpr_comments via FK)
#
# If you have financial or audit tables that must be retained, use the
# anonymise pattern on those tables instead of deleting them.
gdpr/delete:
output: http
method: POST
request_example:
user_id: 2
requested_by: [email protected]
actions:
- name: ValidateBody
input: a|body|
hide_data_on_success: true
assert:
http_code_on_error: 400
tests:
- value: user_id
is_not_null: true
- name: AnonymiseUser
run_when_succeeded:
actions: [ValidateBody]
http_code_on_error: 400
database: main
query: |
UPDATE gdpr_users SET
email = 'deleted_' || id || '@deleted.invalid',
username = 'deleted_' || id,
password_hash = 'DELETED',
updated_at = NOW()
WHERE id = $1
RETURNING id;
params:
- a|ValidateBody::user_id|
assert:
http_code_on_error: 404
error_message: "User not found"
tests:
- value: count()
is_equal_to: 1
post_transforms:
- extract_value: "[0]"
- name: DeletePosts
run_when_succeeded: [AnonymiseUser]
database: main
hide_data_on_success: true
query: |
DELETE FROM gdpr_posts WHERE user_id = $1;
params:
- a|ValidateBody::user_id|
- name: LogDeletionRequest
run_when_succeeded: [DeletePosts]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_requests (user_id, request_type, requested_by)
VALUES ($1, 'deletion', $2);
params:
- a|ValidateBody::user_id|
- a|ValidateBody::requested_by|
# GET /gdpr/requests
# Audit log of all export and deletion requests.
gdpr/requests:
output: http
actions:
- name: ListRequests
database: main
query: |
SELECT id, user_id, request_type, requested_by, requested_at
FROM gdpr_requests
ORDER BY requested_at DESC
LIMIT 500;
schema.sqlโ
-- Sample application tables and GDPR audit log.
-- All tables are prefixed gdpr_ to avoid collisions with other examples.
-- Run this once, or use POST /gdpr/seed which creates and seeds everything automatically.
CREATE TABLE IF NOT EXISTS gdpr_users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
username TEXT NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS gdpr_posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS gdpr_comments (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
post_id BIGINT NOT NULL REFERENCES gdpr_posts(id) ON DELETE CASCADE,
body TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE IF NOT EXISTS gdpr_requests (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL,
request_type TEXT NOT NULL CHECK (request_type IN ('export', 'deletion')),
requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
requested_by TEXT,
notes TEXT
);
CREATE INDEX IF NOT EXISTS idx_gdpr_requests_user_id ON gdpr_requests (user_id);