LLM Gateway
Category: AI & Agents
This page is generated from the Air Pipe marketplace. Browse it live to install into your organization.
A configurable LLM completion and content moderation endpoint. System prompt, model, and API key live in managed variables — swap providers or update prompts without touching the config or redeploying.
What's included
| File | Purpose |
|---|---|
| config.yml | AirPipe config with docs: true |
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ai/complete | Send a message, receive a completion |
| POST | /ai/moderate | Check text against OpenAI's moderation API |
Setup
1. Get an OpenAI API key
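Sign up at platform.openai.com and create an API key. Whether a new account receives free credits or can test without a payment method depends on OpenAI's current terms, so check your account's billing page before you start.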
2. Set managed variables
| Name | Example value |
|---|---|
| OPENAI_API_KEY | sk-proj-... |
| SYSTEM_PROMPT | You are a helpful assistant. |
| LLM_MODEL | gpt-4o-mini |
gpt-4o-mini is the recommended starting model — fast, cheap, and capable for most tasks.
Testing
BASE=https://your-airpipe-host

# Completion
curl -X POST "$BASE/ai/complete" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is AirPipe in one sentence?", "max_tokens": 100}'

# Moderation check
curl -X POST "$BASE/ai/moderate" \
  -H "Content-Type: application/json" \
  -d '{"text": "Some user-generated content to check"}'
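The same two calls can also be made from Python. This is a minimal sketch using only the standard library. `BASE` is the same placeholder host as in the curl examples, and `build_request` and `call` are illustrative helper names, not part of AirPipe.

```python
import json
import urllib.request

BASE = "https://your-airpipe-host"  # placeholder host, as in the curl examples


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a gateway endpoint without sending it."""
    return urllib.request.Request(
        url=f"{BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.load(resp)


# Example usage (requires a reachable deployment):
#   call("/ai/complete", {"message": "What is AirPipe in one sentence?", "max_tokens": 100})
#   call("/ai/moderate", {"text": "Some user-generated content to check"})
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without touching the network.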
Response shape
/ai/complete
The response mirrors the OpenAI API directly:
{
  "id": "chatcmpl-...",
  "choices": [
    {
      "message": { "role": "assistant", "content": "AirPipe is..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42
  }
}
Extract the completion text from choices[0].message.content.
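On the client side, that extraction might look like the following sketch. It assumes the response has already been parsed into a Python dict. The helper names `completion_text` and `was_truncated` are illustrative only.

```python
def completion_text(response: dict) -> str:
    """Return choices[0].message.content from an OpenAI-style chat completion."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


def was_truncated(response: dict) -> bool:
    """True when finish_reason is "length", i.e. the token limit cut the reply short."""
    return response["choices"][0].get("finish_reason") == "length"
```

Checking `finish_reason` is useful when callers pass a small `max_tokens`, since a truncated reply still comes back with a 200 status.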
/ai/moderate
{
  "results": [
    {
      "flagged": false,
      "categories": { "hate": false, "violence": false, ... },
      "category_scores": { "hate": 0.0001, "violence": 0.0003, ... }
    }
  ]
}
Check results[0].flagged for a simple pass/fail.
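A small sketch of how a caller might use this response as a guard, assuming it has been parsed into a Python dict. The function names are illustrative, and the score threshold in `scores_above` is a hypothetical policy choice of the caller, not something the API prescribes.

```python
def is_flagged(response: dict) -> bool:
    """Simple pass/fail: True when results[0].flagged is set."""
    return bool(response["results"][0]["flagged"])


def flagged_categories(response: dict) -> list:
    """Names of the categories marked true, sorted for stable output."""
    categories = response["results"][0].get("categories", {})
    return sorted(name for name, hit in categories.items() if hit)


def scores_above(response: dict, threshold: float) -> dict:
    """Categories whose score exceeds a caller-chosen threshold (hypothetical policy)."""
    scores = response["results"][0].get("category_scores", {})
    return {name: score for name, score in scores.items() if score > threshold}
```

`is_flagged` is enough for a block-or-allow decision. The other two helpers are only needed when you want to log why content was rejected or apply a stricter cutoff than the API's own.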
Changing the system prompt
Update the SYSTEM_PROMPT managed variable in the AirPipe dashboard. The next request will use the new prompt immediately — no redeploy needed. This makes it straightforward to:
- Tune the assistant's persona or knowledge scope
- A/B test different system prompts by maintaining two deployments with different variables
- Run environment-specific prompts (stricter in production, more permissive in staging)
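For the A/B case, one way to split traffic between two deployments is to hash a stable user identifier on the client side. The sketch below assumes two deployments that differ only in their SYSTEM_PROMPT variable. The host names, the `DEPLOYMENTS` mapping, and the 50/50 split are all hypothetical.

```python
import hashlib

# Hypothetical base URLs for two deployments with different SYSTEM_PROMPT values.
DEPLOYMENTS = {
    "A": "https://gateway-a.example.com",
    "B": "https://gateway-b.example.com",
}


def variant_for(user_id: str) -> str:
    """Assign a user to variant A or B deterministically from a hash of their id."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def base_url_for(user_id: str) -> str:
    """Return the deployment URL this user's requests should go to."""
    return DEPLOYMENTS[variant_for(user_id)]
```

Hashing rather than choosing at random keeps each user on the same prompt across requests, which makes the two groups easier to compare.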
Using Anthropic instead of OpenAI
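Add an ANTHROPIC_API_KEY managed variable, set LLM_MODEL to an Anthropic model name, and update the CallLLM action's URL, headers, and body. The Messages API takes the system prompt as a top-level system field and requires max_tokens: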
- name: CallLLM
  http:
    url: https://api.anthropic.com/v1/messages
    headers:
      content-type: application/json
      x-api-key: "a|ap_var::ANTHROPIC_API_KEY|"
      anthropic-version: "2023-06-01"
    body: |
      {
        "model": a|double_quote(ap_var::LLM_MODEL)|,
        "max_tokens": 1024,
        "system": a|double_quote(ap_var::SYSTEM_PROMPT)|,
        "messages": [
          { "role": "user", "content": a|double_quote(ValidateBody::message)| }
        ]
      }
The completion text is then at content[0].text instead of choices[0].message.content, so update any client code that reads the response.
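If clients need to work against either provider, a single helper can hide the difference in response shape. This is a sketch under the assumption that the gateway passes the provider's JSON through unchanged. `extract_text` is an illustrative name.

```python
def extract_text(response: dict) -> str:
    """Return the completion text from an OpenAI-style or Anthropic-style response."""
    if "choices" in response:
        # OpenAI chat completions: choices[0].message.content
        return response["choices"][0]["message"]["content"]
    if "content" in response:
        # Anthropic messages: a list of content blocks; join the text blocks.
        return "".join(
            block.get("text", "")
            for block in response["content"]
            if block.get("type", "text") == "text"
        )
    raise ValueError("unrecognised response shape")
```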
Building a RAG pipeline
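Add a vector search step before CallLLM to retrieve relevant context and inject it into the prompt. The example assumes a Postgres database with the pgvector extension (the <-> operator orders rows by vector distance) and a request body that carries a precomputed embedding field: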
- name: RetrieveContext
  run_when_succeeded: [ValidateBody]
  database: main
  query: |
    SELECT content FROM documents
    ORDER BY embedding <-> $1::vector
    LIMIT 3;
  params:
    - a|ValidateBody::embedding|
Then compose the enriched prompt in the CallLLM body using the retrieved context.
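How the RetrieveContext output is referenced inside the CallLLM body is not shown here, so the following Python sketch only illustrates the target string: what an enriched user message could look like once the retrieved rows are merged with the question. The row shape (a list of dicts with a `content` key) and the prompt wording are assumptions.

```python
def build_rag_message(question: str, rows: list) -> str:
    """Combine retrieved rows and the user's question into one user message."""
    if not rows:
        # Nothing retrieved: fall back to the bare question.
        return question
    context = "\n\n".join(
        f"[{i}] {row['content']}" for i, row in enumerate(rows, start=1)
    )
    return (
        "Use the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the snippets gives the model something to cite and makes it easier to trace an answer back to a stored document.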
Notes
- double_quote() is an AirPipe interpolation wrapper that escapes the resolved value and wraps it in double quotes, so it is safe to use directly inside JSON string positions.
- The moderation endpoint uses OpenAI's free moderation API, which does not count against your token quota.
- max_tokens defaults to null in the request body if omitted, which lets the model use its default maximum.
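To see why the escaping matters, here is a rough Python analogue of how the request body is rendered. It assumes double_quote() behaves like JSON string encoding, which is an assumption since the exact escaping rules are not spelled out here. `json.dumps` stands in for the wrapper, and `render_body` is an illustrative name.

```python
import json


def render_body(model: str, system_prompt: str, message: str, max_tokens=None) -> str:
    """Render a chat completion body by interpolating values into a JSON template."""
    q = json.dumps  # stand-in for double_quote(): escapes the value and adds quotes
    return (
        "{\n"
        f'  "model": {q(model)},\n'
        f'  "max_tokens": {q(max_tokens)},\n'  # None renders as null
        '  "messages": [\n'
        f'    {{ "role": "system", "content": {q(system_prompt)} }},\n'
        f'    {{ "role": "user", "content": {q(message)} }}\n'
        "  ]\n"
        "}"
    )
```

A user message containing quotes or newlines would break the JSON if it were pasted in raw. With the value escaped first, the rendered body still parses, and an omitted `max_tokens` becomes `null`.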
Configuration
config.yml
name: LlmGateway
description: >
  A configurable LLM completion endpoint. System prompt, model, and API key
  are managed variables — swap providers or update prompts without redeploying.
docs: true

# Required managed variables:
#   OPENAI_API_KEY — OpenAI API key (or your provider's equivalent)
#   SYSTEM_PROMPT — System prompt sent to the model on every request
#   LLM_MODEL — Model name, e.g. gpt-4o-mini or gpt-4o

interfaces:

  # POST /ai/complete
  # Send a message and receive a completion.
  # Body: { "message": "Summarise this text: ...", "max_tokens": 500 }
  ai/complete:
    output: http
    method: POST
    summary: LLM completion
    description: >
      Send a message to the configured LLM. The system prompt and model are
      set via managed variables — no redeploy needed to change them.
    tags: [ai]
    request_example:
      message: "Explain what an API is in one sentence"
      max_tokens: 200
    notes: |
      Response shape mirrors the OpenAI API: `choices[0].message.content` contains
      the completion text. `usage` contains token counts.
    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: message
              is_not_null: true
              is_not_empty: true
              description: "The user message to send to the model"

      - name: CallLLM
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        http:
          url: https://api.openai.com/v1/chat/completions
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": a|double_quote(ap_var::LLM_MODEL)|,
              "max_tokens": a|ValidateBody::max_tokens|,
              "messages": [
                { "role": "system", "content": a|double_quote(ap_var::SYSTEM_PROMPT)| },
                { "role": "user", "content": a|double_quote(ValidateBody::message)| }
              ]
            }
        assert:
          http_code_on_error: 502
          error_message: "LLM provider error"
          tests:
            - value: status
              is_equal_to: 200

      - name: ExtractResponse
        run_when_succeeded: [CallLLM]
        input: a|CallLLM|
        post_transforms:
          - extract_value: body

      - name: TrackCompletion
        run_when_succeeded: [ExtractResponse]
        hide_data_on_success: true
        emit_metric:
          name: app_llm_completions_total
          type: counter
          labels:
            model: a|ap_var::LLM_MODEL|

  # POST /ai/moderate
  # Check whether a piece of text violates OpenAI's content policy.
  # Useful as a guard before storing user-generated content.
  # Body: { "text": "..." }
  ai/moderate:
    output: http
    method: POST
    summary: Content moderation
    description: >
      Check text against OpenAI's moderation API. Returns `flagged: true` if
      the content violates usage policies, along with per-category scores.
    tags: [ai]
    request_example:
      text: "Some user-generated content to check"
    notes: |
      Uses the OpenAI Moderation API which is free regardless of your plan.
    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: text
              is_not_null: true
              is_not_empty: true
              description: "Text to moderate"

      - name: CallModeration
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        http:
          url: https://api.openai.com/v1/moderations
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "input": a|double_quote(ValidateBody::text)|
            }
        assert:
          http_code_on_error: 502
          error_message: "Moderation API error"
          tests:
            - value: status
              is_equal_to: 200

      - name: ExtractResult
        run_when_succeeded: [CallModeration]
        input: a|CallModeration|
        post_transforms:
          - extract_value: body

      - name: TrackModeration
        run_when_succeeded: [ExtractResult]
        hide_data_on_success: true
        emit_metric:
          name: app_llm_moderations_total
          type: counter
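As the config reads, both endpoints answer with 400 when ValidateBody fails and 502 when the upstream provider returns a non-200 status. A client can branch on those codes, as in the sketch below. The function names are illustrative, the shape of the error body is not assumed, and retrying on 502 is a suggested policy rather than something the config prescribes.

```python
def classify(status: int) -> str:
    """Map a gateway HTTP status code to a coarse outcome."""
    if status == 200:
        return "ok"
    if status == 400:
        # ValidateBody failed: a required field is missing or empty.
        return "bad_request"
    if status == 502:
        # The provider call did not return 200.
        return "provider_error"
    return "unknown"


def should_retry(status: int) -> bool:
    """Retry only provider-side failures; a 400 will fail again unchanged."""
    return classify(status) == "provider_error"
```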