Skip to main content

GDPR Data Export + Deletion

Category: Security & Compliance ยท ๐Ÿ“ฆ 1 install

Get this pack โ†’

This page is generated from the Air Pipe marketplace. Browse it live to install into your organization.

Two endpoints to satisfy GDPR Article 15 (right of access) and Article 17 (right to erasure). Every request is logged to an audit table. Data is kept on your own infrastructure โ€” nothing leaves your database.


What's includedโ€‹

FilePurpose
config.ymlAirPipe config โ€” seed, export, delete, audit log endpoints
schema.sqlAll table definitions (alternative to using /gdpr/seed)

Endpointsโ€‹

MethodPathBodyDescription
POST/gdpr/seedโ€”Create tables and load sample data for testing
POST/gdpr/export{user_id, requested_by?}Return all data held for a user
POST/gdpr/delete{user_id, requested_by?}Anonymise user, delete content
GET/gdpr/requestsโ€”Audit log of all requests

Setupโ€‹

1. Set managed variableโ€‹

NameValue
DATABASE_URLyour Postgres connection string

2. Deploy the configโ€‹

Upload config.yml via the AirPipe dashboard.

3. Create tables and seed sample dataโ€‹

curl -X POST https://your-airpipe-host/gdpr/seed

This creates all required tables (gdpr_users, gdpr_posts, gdpr_comments, gdpr_requests) and inserts 3 sample users with posts, comments, and audit log entries. Safe to re-run โ€” truncates before inserting.

Alternatively, apply schema.sql manually and populate with your own data:

psql $DATABASE_URL -f schema.sql

Testingโ€‹

Seed sample data:

curl -X POST https://your-airpipe-host/gdpr/seed

Export a user's data:

curl -X POST https://your-airpipe-host/gdpr/export \
-H "Content-Type: application/json" \
-d '{"user_id": 1, "requested_by": "[email protected]"}'

Response:

{
"user": { "id": 1, "email": "[email protected]", ... },
"posts": [ ... ],
"comments": [ ... ]
}

Delete a user's data:

curl -X POST https://your-airpipe-host/gdpr/delete \
-H "Content-Type: application/json" \
-d '{"user_id": 1, "requested_by": "[email protected]"}'

View audit log:

curl https://your-airpipe-host/gdpr/requests

Deletion strategyโ€‹

The config uses two strategies depending on the table:

Anonymise โ€” used for users. The row is kept because other tables may reference the user ID via foreign keys. PII fields (email, username, password_hash) are overwritten with non-identifying values.

Hard delete โ€” used for posts (which cascades to comments via the ON DELETE CASCADE foreign key). Content created by the user is fully removed.

Tables that must be retained (financial / audit)โ€‹

If your schema has tables that cannot be deleted due to financial regulations or audit requirements (invoices, payments, order history), anonymise those rows instead:

- name: AnonymiseOrders
run_when_succeeded: [AnonymiseUser]
database: main
query: |
UPDATE orders SET
customer_name = 'DELETED',
customer_email = '[email protected]',
shipping_address = NULL
WHERE user_id = $1;
params:
- a|ValidateBody::user_id|

Adding more tablesโ€‹

For each additional table owned by the user, add an action following the same pattern โ€” either a DELETE WHERE user_id = $1 or an UPDATE ... SET pii_field = 'DELETED' WHERE user_id = $1.


Protecting these endpointsโ€‹

These endpoints should not be publicly accessible. Options:

API key in header โ€” validate a shared secret before proceeding:

- name: ValidateApiKey
input: a|headers|
hide_data_on_success: true
assert:
http_code_on_error: 403
tests:
- value: x-api-key
is_equal_to: a|ap_var::GDPR_API_KEY|

JWT โ€” use the is_valid_jwt assertion as shown in the REST API Starter pack.


Notesโ€‹

  • requested_by is optional but recommended for audit trail completeness โ€” log who made the request (support agent email, user ID from your auth system, etc.).
  • The export response returns the user row including password_hash. Strip this field if you are returning the export bundle directly to the end user.
  • Response times for the export endpoint scale with the number of rows in related tables. For large datasets, consider adding LIMIT clauses and implementing pagination.

Configurationโ€‹

config.ymlโ€‹

name: GdprExportDeletion

docs: true

# Required managed variables:
# DATABASE_URL โ€” Postgres connection string

global:
databases:
main:
driver: postgres
conn_string: "a|ap_var::DATABASE_URL|"

interfaces:

# POST /gdpr/seed
# Creates all required tables (idempotent) then loads sample users, posts,
# comments, and audit log entries. Safe to re-run โ€” truncates before inserting.
gdpr/seed:
output: http
method: POST

actions:
- name: CreateUsersTable
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
username TEXT NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

- name: CreatePostsTable
run_when_succeeded: [CreateUsersTable]
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

- name: CreateCommentsTable
run_when_succeeded: [CreatePostsTable]
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_comments (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
post_id BIGINT NOT NULL REFERENCES gdpr_posts(id) ON DELETE CASCADE,
body TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

- name: CreateGdprRequestsTable
run_when_succeeded: [CreateCommentsTable]
database: main
hide_data_on_success: true
query: |
CREATE TABLE IF NOT EXISTS gdpr_requests (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL,
request_type TEXT NOT NULL CHECK (request_type IN ('export', 'deletion')),
requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
requested_by TEXT,
notes TEXT
);

- name: TruncateTables
run_when_succeeded: [CreateGdprRequestsTable]
database: main
hide_data_on_success: true
query: |
TRUNCATE gdpr_requests, gdpr_comments, gdpr_posts, gdpr_users RESTART IDENTITY CASCADE;

- name: InsertUsers
run_when_succeeded: [TruncateTables]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_users (email, username, password_hash) VALUES
('[email protected]', 'alice', 'hashed_pw_1'),
('[email protected]', 'bob', 'hashed_pw_2'),
('[email protected]', 'carol', 'hashed_pw_3');

- name: InsertPosts
run_when_succeeded: [InsertUsers]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_posts (user_id, title, body)
SELECT u.id, v.title, v.body
FROM (VALUES
('[email protected]', 'My first post', 'Hello from Alice!'),
('[email protected]', 'On privacy', 'Why GDPR matters.'),
('[email protected]', 'Bob''s thoughts', 'Just writing stuff.')
) AS v(email, title, body)
JOIN gdpr_users u ON u.email = v.email;

- name: InsertComments
run_when_succeeded: [InsertPosts]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_comments (user_id, post_id, body)
SELECT u.id, p.id, v.body
FROM (VALUES
('[email protected]', 'My first post', 'Great post, Alice!'),
('[email protected]', 'My first post', 'Thanks for sharing!'),
('[email protected]', 'Bob''s thoughts', 'Thanks Bob!')
) AS v(email, post_title, body)
JOIN gdpr_users u ON u.email = v.email
JOIN gdpr_posts p ON p.title = v.post_title;

- name: InsertGdprRequests
run_when_succeeded: [InsertComments]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_requests (user_id, request_type, requested_by)
SELECT u.id, v.request_type, v.requested_by
FROM (VALUES
('[email protected]', 'export', '[email protected]'),
('[email protected]', 'deletion', '[email protected]')
) AS v(email, request_type, requested_by)
JOIN gdpr_users u ON u.email = v.email;

- name: SeedSummary
run_when_succeeded: [InsertGdprRequests]
database: main
query: |
SELECT
(SELECT COUNT(*) FROM gdpr_users) AS gdpr_users,
(SELECT COUNT(*) FROM gdpr_posts) AS gdpr_posts,
(SELECT COUNT(*) FROM gdpr_comments) AS gdpr_comments,
(SELECT COUNT(*) FROM gdpr_requests) AS gdpr_requests;
post_transforms:
- extract_value: "[0]"

# POST /gdpr/export
# Returns a complete JSON bundle of all data held for a given user.
# Body: { "user_id": 1, "requested_by": "[email protected]" }
gdpr/export:
output: http
method: POST
request_example:
user_id: 1
requested_by: [email protected]

actions:
- name: ValidateBody
input: a|body|
hide_data_on_success: true
assert:
http_code_on_error: 400
tests:
- value: user_id
is_not_null: true

- name: ExportUserData
run_when_succeeded:
actions: [ValidateBody]
http_code_on_error: 400
database: main
query: |
SELECT
row_to_json(u.*) AS user,
COALESCE(
json_agg(DISTINCT to_jsonb(p.*)) FILTER (WHERE p.id IS NOT NULL),
'[]'::json
) AS posts,
COALESCE(
json_agg(DISTINCT to_jsonb(c.*)) FILTER (WHERE c.id IS NOT NULL),
'[]'::json
) AS comments
FROM gdpr_users u
LEFT JOIN gdpr_posts p ON p.user_id = u.id
LEFT JOIN gdpr_comments c ON c.user_id = u.id
WHERE u.id = $1
GROUP BY u.id;
params:
- a|ValidateBody::user_id|
assert:
http_code_on_error: 404
error_message: "User not found"
tests:
- value: count()
is_equal_to: 1
post_transforms:
- extract_value: "[0]"

- name: LogExportRequest
run_when_succeeded: [ExportUserData]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_requests (user_id, request_type, requested_by)
VALUES ($1, 'export', $2);
params:
- a|ValidateBody::user_id|
- a|ValidateBody::requested_by|

# POST /gdpr/delete
# Anonymises the user row and hard-deletes owned content.
# Body: { "user_id": 1, "requested_by": "[email protected]" }
#
# Strategy used here:
# - gdpr_users โ†’ anonymised in-place (row kept for FK integrity)
# - gdpr_posts โ†’ hard deleted (cascades to gdpr_comments via FK)
#
# If you have financial or audit tables that must be retained, use the
# anonymise pattern on those tables instead of deleting them.
gdpr/delete:
output: http
method: POST
request_example:
user_id: 2
requested_by: [email protected]

actions:
- name: ValidateBody
input: a|body|
hide_data_on_success: true
assert:
http_code_on_error: 400
tests:
- value: user_id
is_not_null: true

- name: AnonymiseUser
run_when_succeeded:
actions: [ValidateBody]
http_code_on_error: 400
database: main
query: |
UPDATE gdpr_users SET
email = 'deleted_' || id || '@deleted.invalid',
username = 'deleted_' || id,
password_hash = 'DELETED',
updated_at = NOW()
WHERE id = $1
RETURNING id;
params:
- a|ValidateBody::user_id|
assert:
http_code_on_error: 404
error_message: "User not found"
tests:
- value: count()
is_equal_to: 1
post_transforms:
- extract_value: "[0]"

- name: DeletePosts
run_when_succeeded: [AnonymiseUser]
database: main
hide_data_on_success: true
query: |
DELETE FROM gdpr_posts WHERE user_id = $1;
params:
- a|ValidateBody::user_id|

- name: LogDeletionRequest
run_when_succeeded: [DeletePosts]
database: main
hide_data_on_success: true
query: |
INSERT INTO gdpr_requests (user_id, request_type, requested_by)
VALUES ($1, 'deletion', $2);
params:
- a|ValidateBody::user_id|
- a|ValidateBody::requested_by|

# GET /gdpr/requests
# Audit log of all export and deletion requests.
gdpr/requests:
output: http

actions:
- name: ListRequests
database: main
query: |
SELECT id, user_id, request_type, requested_by, requested_at
FROM gdpr_requests
ORDER BY requested_at DESC
LIMIT 500;

schema.sqlโ€‹

-- Sample application tables and GDPR audit log.
-- All tables are prefixed gdpr_ to avoid collisions with other examples.
-- Run this once, or use POST /gdpr/seed which creates and seeds everything automatically.

CREATE TABLE IF NOT EXISTS gdpr_users (
id BIGSERIAL PRIMARY KEY,
email TEXT NOT NULL UNIQUE,
username TEXT NOT NULL,
password_hash TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS gdpr_posts (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
title TEXT NOT NULL,
body TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS gdpr_comments (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES gdpr_users(id) ON DELETE CASCADE,
post_id BIGINT NOT NULL REFERENCES gdpr_posts(id) ON DELETE CASCADE,
body TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE IF NOT EXISTS gdpr_requests (
id BIGSERIAL PRIMARY KEY,
user_id BIGINT NOT NULL,
request_type TEXT NOT NULL CHECK (request_type IN ('export', 'deletion')),
requested_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
requested_by TEXT,
notes TEXT
);

CREATE INDEX IF NOT EXISTS idx_gdpr_requests_user_id ON gdpr_requests (user_id);