LLM Gateway

Category: AI & Agents

This page is generated from the Air Pipe marketplace. Browse it live to install into your organization.

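A configurable LLM completion and content moderation endpoint. System prompt, model, and API key live in managed variables, so you can change models, rotate keys, or update prompts without touching the config or redeploying. Switching to a provider with a different API, such as Anthropic, also requires editing the CallLLM action.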


What's included

File        Purpose
config.yml  AirPipe config with docs: true

Endpoints

Method  Path          Description
POST    /ai/complete  Send a message, receive a completion
POST    /ai/moderate  Check text against OpenAI's moderation API

Setup

1. Get an OpenAI API key

Sign up at platform.openai.com and create an API key. Free trial credits and billing requirements vary by account and over time, so check your account's billing page before testing.

2. Set managed variables

Name            Example value
OPENAI_API_KEY  sk-proj-...
SYSTEM_PROMPT   You are a helpful assistant.
LLM_MODEL       gpt-4o-mini

gpt-4o-mini is the recommended starting model — fast, cheap, and capable for most tasks.


Testing

BASE=https://your-airpipe-host

# Completion
curl -X POST "$BASE/ai/complete" \
  -H "Content-Type: application/json" \
  -d '{"message": "What is AirPipe in one sentence?", "max_tokens": 100}'

# Moderation check
curl -X POST "$BASE/ai/moderate" \
  -H "Content-Type: application/json" \
  -d '{"text": "Some user-generated content to check"}'
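The same two calls can be made from Python. The following is a minimal sketch using only the standard library; BASE is the placeholder host from the curl example, and the helper names build_request and post_json are illustrative, not part of AirPipe.

```python
import json
import urllib.request

BASE = "https://your-airpipe-host"  # placeholder host, as in the curl example


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for a gateway endpoint (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example calls (these need a reachable gateway, so they are left commented out):
# post_json("/ai/complete", {"message": "What is AirPipe in one sentence?", "max_tokens": 100})
# post_json("/ai/moderate", {"text": "Some user-generated content to check"})
```

Building the request separately from sending it keeps the payload easy to inspect before anything goes over the network.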

Response shape

/ai/complete

The response mirrors the OpenAI API directly:

{
  "id": "chatcmpl-...",
  "choices": [
    {
      "message": { "role": "assistant", "content": "AirPipe is..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42
  }
}

Extract the completion text from choices[0].message.content.
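On the client side, that lookup might be wrapped as in the sketch below. The function names are hypothetical; the field names come from the response shape shown above. In the OpenAI API, a finish_reason of "length" indicates that the output was cut off by the token limit.

```python
def completion_text(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content


def was_truncated(response: dict) -> bool:
    """True when the first choice stopped because it hit the token limit."""
    choices = response.get("choices") or []
    return bool(choices) and choices[0].get("finish_reason") == "length"
```

Raising on a missing field, rather than returning an empty string, makes an unexpected upstream response visible instead of silently producing blank output.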

/ai/moderate

{
  "results": [
    {
      "flagged": false,
      "categories": { "hate": false, "violence": false, ... },
      "category_scores": { "hate": 0.0001, "violence": 0.0003, ... }
    }
  ]
}

Check results[0].flagged for a simple pass/fail.
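A client might combine the pass/fail flag with its own stricter cut-off on the per-category scores. The sketch below assumes the response shape shown above; the function names and the threshold value are illustrative choices, not something the moderation API prescribes.

```python
from typing import List


def is_flagged(response: dict) -> bool:
    """Simple pass/fail: the provider's own verdict for the first result."""
    results = response.get("results") or []
    if not results:
        raise ValueError("response contains no results")
    return bool(results[0].get("flagged", False))


def categories_over(response: dict, threshold: float) -> List[str]:
    """Names of categories whose score meets a caller-chosen threshold."""
    results = response.get("results") or []
    if not results:
        raise ValueError("response contains no results")
    scores = results[0].get("category_scores", {})
    return sorted(name for name, score in scores.items() if score >= threshold)
```

For example, a site with a low tolerance for violent content could reject text when categories_over(response, 0.2) includes "violence", even if flagged is false.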


Changing the system prompt

Update the SYSTEM_PROMPT managed variable in the AirPipe dashboard. The next request will use the new prompt immediately — no redeploy needed. This makes it straightforward to:

  • Tune the assistant's persona or knowledge scope
  • A/B test different system prompts by maintaining two deployments with different variables
  • Run environment-specific prompts (stricter in production, more permissive in staging)
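For the A/B testing case, callers need a stable way to decide which deployment a given user hits. One possible approach is hash-based bucketing, sketched below; the two base URLs and the function names are hypothetical placeholders for two deployments that differ only in their SYSTEM_PROMPT variable.

```python
import hashlib

# Hypothetical base URLs for two deployments with different SYSTEM_PROMPT values.
DEPLOYMENTS = {
    "A": "https://gateway-a.example.com",
    "B": "https://gateway-b.example.com",
}


def assign_variant(user_id: str, percent_b: int = 50) -> str:
    """Deterministically map a user to variant A or B using a hash bucket (0-99)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "B" if bucket < percent_b else "A"


def base_url_for(user_id: str, percent_b: int = 50) -> str:
    """Return the deployment base URL that this user's requests should go to."""
    return DEPLOYMENTS[assign_variant(user_id, percent_b)]
```

Hashing the user ID rather than picking at random keeps each user on the same prompt across requests, which makes the two groups comparable.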

Using Anthropic instead of OpenAI

Add an ANTHROPIC_API_KEY managed variable, set LLM_MODEL to an Anthropic model name, and update the CallLLM action's URL, headers, and body. Anthropic's Messages API takes the system prompt as a top-level field and requires max_tokens:

- name: CallLLM
  http:
    url: https://api.anthropic.com/v1/messages
    method: POST
    headers:
      content-type: application/json
      x-api-key: "a|ap_var::ANTHROPIC_API_KEY|"
      anthropic-version: "2023-06-01"
    body: |
      {
        "model": a|double_quote(ap_var::LLM_MODEL)|,
        "max_tokens": 1024,
        "system": a|double_quote(ap_var::SYSTEM_PROMPT)|,
        "messages": [
          { "role": "user", "content": a|double_quote(ValidateBody::message)| }
        ]
      }

The completion text is then at content[0].text instead of choices[0].message.content.
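Client code that has to work with either provider can branch on the response shape. The sketch below is one way to do that; extract_text is a hypothetical helper, and joining all text blocks (rather than reading only the first) is a small generalisation that relies on Anthropic content blocks carrying a "type" field.

```python
def extract_text(response: dict) -> str:
    """Return the completion text from an OpenAI-style or Anthropic-style body."""
    if "choices" in response:
        # OpenAI chat completions: choices[0].message.content
        return response["choices"][0]["message"]["content"]
    if "content" in response:
        # Anthropic messages: a list of content blocks; keep the text ones
        return "".join(
            block.get("text", "")
            for block in response["content"]
            if block.get("type") == "text"
        )
    raise ValueError("unrecognised response shape")
```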


Building a RAG pipeline

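Add a vector search step before CallLLM to retrieve relevant context and inject it into the prompt. The example assumes a PostgreSQL database named main with the pgvector extension (the <-> operator orders rows by distance to the query vector) and a request body that carries a precomputed embedding field:

- name: RetrieveContext
  run_when_succeeded: [ValidateBody]
  database: main
  query: |
    SELECT content FROM documents
    ORDER BY embedding <-> $1::vector
    LIMIT 3;
  params:
    - a|ValidateBody::embedding|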

Then make CallLLM run after RetrieveContext and compose the enriched prompt in the CallLLM body using the retrieved context.
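The shape of an enriched prompt can be illustrated outside AirPipe. The Python sketch below shows one possible way to combine retrieved chunks with the user's question; build_rag_message and the instruction wording are assumptions for illustration, and in the pack itself the equivalent text would be assembled inside the CallLLM body.

```python
from typing import List


def build_rag_message(question: str, chunks: List[str]) -> str:
    """Join retrieved chunks into a numbered context block ahead of the question."""
    context = "\n\n".join(
        f"[{index}] {chunk.strip()}" for index, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Numbering the chunks gives the model something to cite, and placing the question last keeps it closest to where generation begins.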


Notes

  • double_quote() is an AirPipe interpolation wrapper that escapes the resolved value and wraps it in double quotes — safe to use directly inside JSON string positions.
  • The moderation endpoint uses OpenAI's free moderation API, which does not count against your token quota.
  • max_tokens defaults to null in the request body if omitted, which lets the model use its default maximum.
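To see why the escaping described in the first note matters, compare naive string interpolation with proper JSON string encoding. Python's json.dumps is used here only as an analogy for the behaviour the note describes; it is not AirPipe's implementation, and both function names are illustrative.

```python
import json


def naive_body(message: str) -> str:
    """Interpolate the raw value between quotes (breaks on quotes and newlines)."""
    return '{"content": "' + message + '"}'


def escaped_body(message: str) -> str:
    """Escape the value and wrap it in double quotes before interpolating."""
    return '{"content": ' + json.dumps(message) + "}"


tricky = 'She said "hi"\nand left'

# The escaped body round-trips; the naive body is not valid JSON.
assert json.loads(escaped_body(tricky))["content"] == tricky
try:
    json.loads(naive_body(tricky))
    naive_is_valid = True
except json.JSONDecodeError:
    naive_is_valid = False
```

A user message containing a quote or a newline is enough to corrupt the naive body, which is why user-supplied values should always pass through an escaping wrapper before they are placed in a JSON template.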

Configuration

config.yml

name: LlmGateway
description: >
  A configurable LLM completion endpoint. System prompt, model, and API key
  are managed variables — swap providers or update prompts without redeploying.

docs: true

# Required managed variables:
#   OPENAI_API_KEY — OpenAI API key (or your provider's equivalent)
#   SYSTEM_PROMPT — System prompt sent to the model on every request
#   LLM_MODEL — Model name, e.g. gpt-4o-mini or gpt-4o

interfaces:

  # POST /ai/complete
  # Send a message and receive a completion.
  # Body: { "message": "Summarise this text: ...", "max_tokens": 500 }
  ai/complete:
    output: http
    method: POST
    summary: LLM completion
    description: >
      Send a message to the configured LLM. The system prompt and model are
      set via managed variables — no redeploy needed to change them.
    tags: [ai]
    request_example:
      message: "Explain what an API is in one sentence"
      max_tokens: 200
    notes: |
      Response shape mirrors the OpenAI API: `choices[0].message.content` contains
      the completion text. `usage` contains token counts.

    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: message
              is_not_null: true
              is_not_empty: true
              description: "The user message to send to the model"

      - name: CallLLM
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        http:
          url: https://api.openai.com/v1/chat/completions
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": a|double_quote(ap_var::LLM_MODEL)|,
              "max_tokens": a|ValidateBody::max_tokens|,
              "messages": [
                { "role": "system", "content": a|double_quote(ap_var::SYSTEM_PROMPT)| },
                { "role": "user", "content": a|double_quote(ValidateBody::message)| }
              ]
            }
        assert:
          http_code_on_error: 502
          error_message: "LLM provider error"
          tests:
            - value: status
              is_equal_to: 200

      - name: ExtractResponse
        run_when_succeeded: [CallLLM]
        input: a|CallLLM|
        post_transforms:
          - extract_value: body

      - name: TrackCompletion
        run_when_succeeded: [ExtractResponse]
        hide_data_on_success: true
        emit_metric:
          name: app_llm_completions_total
          type: counter
          labels:
            model: a|ap_var::LLM_MODEL|

  # POST /ai/moderate
  # Check whether a piece of text violates OpenAI's content policy.
  # Useful as a guard before storing user-generated content.
  # Body: { "text": "..." }
  ai/moderate:
    output: http
    method: POST
    summary: Content moderation
    description: >
      Check text against OpenAI's moderation API. Returns `flagged: true` if
      the content violates usage policies, along with per-category scores.
    tags: [ai]
    request_example:
      text: "Some user-generated content to check"
    notes: |
      Uses the OpenAI Moderation API which is free regardless of your plan.

    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: text
              is_not_null: true
              is_not_empty: true
              description: "Text to moderate"

      - name: CallModeration
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        http:
          url: https://api.openai.com/v1/moderations
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "input": a|double_quote(ValidateBody::text)|
            }
        assert:
          http_code_on_error: 502
          error_message: "Moderation API error"
          tests:
            - value: status
              is_equal_to: 200

      - name: ExtractResult
        run_when_succeeded: [CallModeration]
        input: a|CallModeration|
        post_transforms:
          - extract_value: body

      - name: TrackModeration
        run_when_succeeded: [ExtractResult]
        hide_data_on_success: true
        emit_metric:
          name: app_llm_moderations_total
          type: counter