# MongoDB Atlas Vector Search RAG
Category: AI & Agents
A Retrieval-Augmented Generation (RAG) chatbot backed by MongoDB Atlas Vector Search. Ingest your own documents, embed them with OpenAI, and answer questions with $vectorSearch — your documents and their embeddings live in one database, with no separate vector store to run.
Already on MongoDB Atlas? Keep your data and your embeddings together. This pack turns your Atlas cluster into a RAG backend — semantic retrieval plus a grounded LLM answer, served as plain HTTP routes.
This is the Atlas-native companion to the AI RAG Chatbot pack (which uses Postgres + pgvector). Same workflow, different database — pick the one your data already lives in.
## What's included
| File | Purpose |
|---|---|
| config.yml | All endpoints: seed, ingest, list, delete, chat, history |
| vector-index.json | The Atlas Vector Search index definition (one-time setup artifact) |
There is no schema.sql. MongoDB is schemaless: the rag_documents and rag_chat_history collections are created on first insert. The one piece of setup that isn't a document write is the Atlas Vector Search index — see Setup step 2.
## Endpoints
| Method | Path | Body | Description |
|---|---|---|---|
| POST | /rag/documents/seed | - | Clear and load three sample docs with embeddings |
| GET | /rag/documents | - | List ingested documents (embeddings omitted) |
| POST | /rag/documents/ingest | {title, content, source?} | Embed and store a document |
| POST | /rag/documents/delete | {docId} | Delete a document by docId |
| POST | /rag/chat | {question, session_id?, top_k?} | Retrieve + answer with RAG |
| POST | /rag/history | {session_id} | Conversation history for a session |
## How it works
question ──▶ OpenAI embeddings ──▶ Atlas $vectorSearch ──▶ context ──▶ GPT ──▶ grounded answer
- Embed the question with text-embedding-3-small (1536 dimensions).
- Retrieve the top_k most similar documents with a $vectorSearch aggregation stage. Air Pipe interpolates the query embedding straight into the pipeline as a JSON array, with no round-trip through application code.
- Assemble a single grounded context string inside the aggregation ($group + $reduce), so the database does the work.
- Answer with gpt-4o-mini, instructed to use only the retrieved context.
- Persist the turn in rag_chat_history when a session_id is supplied.
The retrieval and context assembly happen in one aggregation pipeline — the thing a workflow tool (Zapier, n8n) fundamentally cannot do.
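For readers who want to see the retrieval stage outside Air Pipe, here is a minimal Python sketch that builds the first stages of the aggregation as plain data. The function name `build_retrieval_pipeline` is hypothetical; the stage fields (`index`, `path`, `queryVector`, `numCandidates`, `limit`) mirror the values used in this pack's config.yml. It only constructs the pipeline and does not talk to a database.

```python
from typing import Any, Dict, List


def build_retrieval_pipeline(
    query_vector: List[float],
    top_k: int = 5,
    num_candidates: int = 100,
) -> List[Dict[str, Any]]:
    """Build the $vectorSearch, $project and $group stages as plain dicts."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if num_candidates < top_k:
        raise ValueError("num_candidates must be >= top_k")
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",   # index name used by this pack
                "path": "embedding",       # field holding the stored vectors
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": top_k,
            }
        },
        {"$project": {"_id": 0, "title": 1, "content": 1}},
        {
            # Collapse all matches into one document and count them.
            "$group": {
                "_id": None,
                "docs": {"$push": {"title": "$title", "content": "$content"}},
                "sources_found": {"$sum": 1},
            }
        },
    ]
```

With a driver such as PyMongo, the returned list could be passed to `collection.aggregate(...)`; that call is not exercised in this sketch.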
## Setup
### 1. Create a MongoDB Atlas cluster
- Sign up at mongodb.com/cloud/atlas and create a cluster. The free M0 tier supports Atlas Vector Search.
- Under Database Access, create a database user with a password.
- Under Network Access, allow the IP of wherever Air Pipe runs. For hosted Air Pipe, whitelist its published egress ranges from ip-ranges.airpipe.io rather than opening the database to everything.
- Click Connect → Drivers and copy the SRV connection string:
mongodb+srv://<user>:<pass>@cluster0.xxxxx.mongodb.net/?retryWrites=true&w=majority
This pack uses a database named ragdb and a collection named rag_documents. Both are created automatically on first insert — change them throughout config.yml if you prefer different names.
### 2. Create the Vector Search index (one-time)
$vectorSearch needs a vector index, which can't be created from a config file. Create it once on the ragdb.rag_documents collection, named vector_index, using the definition in vector-index.json:
{
"fields": [
{ "type": "vector", "path": "embedding", "numDimensions": 1536, "similarity": "cosine" }
]
}
Atlas UI: Atlas Search → Create Search Index → JSON Editor → index type Vector Search → select ragdb.rag_documents, name it vector_index, paste the JSON above.
mongosh:
use ragdb
db.rag_documents.createSearchIndex(
"vector_index",
"vectorSearch",
{ fields: [ { type: "vector", path: "embedding", numDimensions: 1536, similarity: "cosine" } ] }
)
The index dimension (1536) must match the embedding model: text-embedding-3-small produces 1536-dim vectors. If you switch models, update both the index and the embedding model name in config.yml. The index builds asynchronously, so give it a few seconds before your first query.
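A dimension mismatch is easy to catch before anything is written to the collection. The following is a small sketch, assuming the index definition from vector-index.json is available as a Python dict; the helper names `index_dimensions` and `check_embedding` are hypothetical and not part of Air Pipe or Atlas.

```python
import json

# Same definition as vector-index.json, inlined so the sketch is self-contained.
INDEX_DEFINITION = json.loads("""
{
  "fields": [
    { "type": "vector", "path": "embedding", "numDimensions": 1536, "similarity": "cosine" }
  ]
}
""")


def index_dimensions(index_definition: dict, path: str = "embedding") -> int:
    """Return numDimensions for the vector field at the given path."""
    for field in index_definition["fields"]:
        if field.get("type") == "vector" and field.get("path") == path:
            return field["numDimensions"]
    raise KeyError(f"no vector field at path {path!r}")


def check_embedding(embedding, index_definition: dict = INDEX_DEFINITION) -> None:
    """Raise ValueError if the embedding length differs from the index dimension."""
    expected = index_dimensions(index_definition)
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, index expects {expected}"
        )
```

Calling `check_embedding` on a 3072-dim vector from text-embedding-3-large against the 1536-dim index raises immediately, which makes the mismatch visible before any query is attempted.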
### 3. Set the managed variables
In the Air Pipe dashboard, add:
| Name | Value |
|---|---|
| MONGODB_URI | your Atlas SRV connection string |
| OPENAI_API_KEY | your OpenAI API key |
### 4. Deploy and seed
curl -X POST https://your-airpipe-host/rag/documents/seed
# → { "documents_seeded": 3 }
The Vector Search index ingests the new documents asynchronously — allow a few seconds after seeding before the first /rag/chat call returns sources.
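Rather than sleeping for a fixed time, a script can poll until the index has caught up. This is a generic sketch: `wait_until` is a hypothetical helper, and the readiness check itself (for example, calling /rag/chat and inspecting the response) is left to the caller, since the exact response shape before the index is ready is not documented here.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    attempts: int = 10,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() up to `attempts` times; return True as soon as it passes."""
    for attempt in range(attempts):
        if check():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

The `sleep` parameter exists only so the delay can be replaced in tests; in normal use the default `time.sleep` applies.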
## Quick start: curl walkthrough
BASE=https://your-airpipe-host
# Load three sample Air Pipe documents (with embeddings)
curl -X POST $BASE/rag/documents/seed
# Ask a question — retrieval + grounded answer
curl -X POST $BASE/rag/chat \
-H "Content-Type: application/json" \
-d '{"question": "How do I keep secrets like API keys out of my config?", "session_id": "sess_demo", "top_k": 3}'
# → { "sources_found": 2, "answer": "Use managed variables — encrypted key-value pairs referenced with ap_var::NAME ..." }
# Ingest your own document
curl -X POST $BASE/rag/documents/ingest \
-H "Content-Type: application/json" \
-d '{"title": "Refund policy", "content": "Customers may request a full refund within 30 days of purchase.", "source": "support-docs"}'
# List what is stored (embeddings omitted)
curl $BASE/rag/documents
# Replay a conversation
curl -X POST $BASE/rag/history \
-H "Content-Type: application/json" \
-d '{"session_id": "sess_demo"}'
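The same /rag/chat call can be made from Python using only the standard library. This is a sketch under the same assumptions as the curl examples: the host is a placeholder, and the function names `build_chat_request` and `ask` are hypothetical. Building the request is separated from sending it so the payload can be inspected without network access.

```python
import json
import urllib.request


def build_chat_request(base_url, question, session_id=None, top_k=None):
    """Build a POST request for /rag/chat; optional fields are omitted when unset."""
    payload = {"question": question}
    if session_id is not None:
        payload["session_id"] = session_id
    if top_k is not None:
        payload["top_k"] = top_k
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rag/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, question, **kwargs):
    """Send the question and return the decoded JSON response."""
    request = build_chat_request(base_url, question, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `ask("https://your-airpipe-host", "How do I keep secrets out of my config?", session_id="sess_demo", top_k=3)` would send the same body as the curl call in the walkthrough.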
## Why this is an Atlas Vector Search pack, not "the pgvector RAG with a different driver"
- One database for documents and vectors. Your source content and its embeddings live in the same collection. No second system to provision, sync, or pay for.
- Retrieval is an aggregation. $vectorSearch is a normal pipeline stage, so the same query that finds similar documents also $groups and $reduces them into a ready-to-prompt context string, all server-side.
- The query embedding is interpolated into the pipeline. Air Pipe drops the 1536-float query vector straight into queryVector as JSON, so there's no glue code between "embed the question" and "search the database".
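To make the server-side context assembly concrete, here is a Python equivalent of the $reduce expression used in the chat pipeline. It is an illustration only (the pack does this inside MongoDB, not in application code), and the function name `assemble_context` is hypothetical. The separator and the "Document: " prefix are taken from config.yml.

```python
SEPARATOR = "\n\n---\n\n"


def assemble_context(docs):
    """Fold retrieved documents into one prompt-ready string, like $reduce does."""
    context = ""
    for doc in docs:
        # Mirror the $cond: no separator before the first document.
        if context:
            context += SEPARATOR
        context += "Document: " + doc["title"] + "\n" + doc["content"]
    return context
```

Doing this fold in the aggregation means the action that calls the chat model receives a single string and a count, rather than a list it would have to post-process.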
## Notes & customisation
- Embedding model. Defaults to text-embedding-3-small (1536 dims). To use text-embedding-3-large (3072 dims) or another model, change the model name in every embedding action and the numDimensions in the Vector Search index.
- top_k and candidates. /rag/chat retrieves top_k documents (default 5). $vectorSearch scans numCandidates: 100 first; raise it for larger corpora to improve recall, keeping numCandidates comfortably above top_k.
- Document keys. Documents are addressed by a human-friendly docId (the seed uses readable slugs; ingest generates a UUID) rather than the raw _id ObjectId, keeping the API readable.
- Grounding. The chat prompt instructs the model to answer only from retrieved context and to say so when the answer isn't covered, reducing hallucination over your corpus.
- Sessions. Pass a session_id to persist each turn in rag_chat_history; omit it for stateless one-shot queries.
- Switching to pgvector? The AI RAG Chatbot pack is the same workflow on Postgres + pgvector. Use whichever database your data already lives in.
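One way to keep numCandidates comfortably above top_k is to derive it from top_k instead of hard-coding it. The sketch below is an assumption-laden illustration: `choose_num_candidates` is a hypothetical helper, the floor of 100 matches this pack's default, and the 20x multiplier and 10,000 ceiling reflect commonly cited Atlas guidance that should be checked against the current Atlas documentation.

```python
def choose_num_candidates(
    top_k: int,
    multiplier: int = 20,    # assumed rule of thumb, not defined by this pack
    floor: int = 100,        # this pack's default numCandidates
    ceiling: int = 10_000,   # assumed upper limit; verify against Atlas docs
) -> int:
    """Pick a numCandidates value that scales with top_k within fixed bounds."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if top_k > ceiling:
        raise ValueError("top_k exceeds the numCandidates ceiling")
    return min(max(top_k * multiplier, floor), ceiling)
```

With the defaults, a top_k of 5 keeps the pack's value of 100, while larger top_k values scale the candidate pool up until the ceiling is reached.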
## Configuration
### config.yml
```yaml
name: MongoAtlasVectorRag
description: >
  Retrieval-Augmented Generation chatbot backed by MongoDB Atlas Vector Search.
  Ingest your own documents, embed them with OpenAI, and answer questions using
  $vectorSearch — your data and your embeddings live in one database, no separate
  vector store required.
docs: true

# Required managed variables:
#   OPENAI_API_KEY — OpenAI API key (embeddings + chat completions)
#   MONGODB_URI — MongoDB Atlas SRV connection string
#
# Models used (override in the embed / chat actions as needed):
#   Embeddings : text-embedding-3-small (1536 dimensions — MUST match the index)
#   Chat : gpt-4o-mini (swap to gpt-4o for higher quality)
#
# One-time setup (cannot be done from a config — see README):
#   Create an Atlas Vector Search index named "vector_index" on the
#   ragdb.rag_documents collection, path "embedding", 1536 dims, cosine.
#   The definition is shipped as vector-index.json.

global:
  databases:
    mongo:
      driver: mongodb
      conn_string: "a|ap_var::MONGODB_URI|"

interfaces:
  # POST /rag/documents/seed
  # Clears rag_documents and loads three sample Air Pipe docs with real OpenAI
  # embeddings so /rag/chat works immediately. Safe to re-run.
  #
  # The Atlas Vector Search index ingests new documents asynchronously — allow a
  # few seconds after seeding before the first /rag/chat call returns sources.
  rag/documents/seed:
    output: http
    method: POST
    summary: Seed documents
    description: >
      Clears the rag_documents collection and loads three sample documents with
      OpenAI embeddings. Requires the Atlas Vector Search index to already exist.
    tags: [rag, documents]
    actions:
      - name: ClearDocuments
        database: mongo
        hide_data_on_success: true
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: deleteMany
          delete: {}
      # Document 1 — embed
      - name: EmbedDoc1
        run_when_succeeded: [ClearDocuments]
        http:
          url: https://api.openai.com/v1/embeddings
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": "text-embedding-3-small",
              "input": "Getting started with Air Pipe: Air Pipe is a no-code backend platform that lets you build REST APIs, webhook handlers, and scheduled jobs using YAML configuration files. Connect it to your database and deploy in minutes with no servers to manage."
            }
        assert:
          http_code_on_error: 502
          error_message: "OpenAI embeddings API error"
          tests:
            - value: status
              is_equal_to: 200
      # Document 2 — embed
      - name: EmbedDoc2
        run_when_succeeded: [EmbedDoc1]
        http:
          url: https://api.openai.com/v1/embeddings
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": "text-embedding-3-small",
              "input": "Managed variables in Air Pipe: Managed variables are encrypted key-value pairs stored securely in the Air Pipe platform. Reference them in your config using the ap_var::VARIABLE_NAME interpolation syntax. They are never logged and are injected at runtime. Use them for API keys, database URLs, and other secrets."
            }
        assert:
          http_code_on_error: 502
          error_message: "OpenAI embeddings API error"
          tests:
            - value: status
              is_equal_to: 200
      # Document 3 — embed
      - name: EmbedDoc3
        run_when_succeeded: [EmbedDoc2]
        http:
          url: https://api.openai.com/v1/embeddings
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": "text-embedding-3-small",
              "input": "Vector search in Air Pipe: store OpenAI embeddings alongside your documents in MongoDB Atlas and retrieve the most semantically similar ones with a $vectorSearch aggregation stage. Air Pipe interpolates the query embedding directly into the pipeline, so retrieval-augmented generation runs without a separate vector database."
            }
        assert:
          http_code_on_error: 502
          error_message: "OpenAI embeddings API error"
          tests:
            - value: status
              is_equal_to: 200
      # Store all three with their embeddings in one insert. Each "embedding"
      # value is a pure interpolation marker, so the resolved float array is
      # stored as a real array (not a stringified one).
      - name: StoreDocuments
        run_when_succeeded:
          actions: [EmbedDoc3]
          http_code_on_error: 500
        database: mongo
        hide_data_on_success: true
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: insertMany
          insert:
            - docId: doc-getting-started
              title: Getting started with Air Pipe
              content: "Air Pipe is a no-code backend platform that lets you build REST APIs, webhook handlers, and scheduled jobs using YAML configuration files. Connect it to your database and deploy in minutes with no servers to manage."
              source: docs
              embedding: "a|EmbedDoc1::body.data|[0].embedding|"
              createdAt: "a|timestamp:datetimeutctz|"
            - docId: doc-managed-variables
              title: Managed variables
              content: "Managed variables are encrypted key-value pairs stored securely in the Air Pipe platform. Reference them in your config using the ap_var::VARIABLE_NAME interpolation syntax. They are never logged and are injected at runtime. Use them for API keys, database URLs, and other secrets."
              source: docs
              embedding: "a|EmbedDoc2::body.data|[0].embedding|"
              createdAt: "a|timestamp:datetimeutctz|"
            - docId: doc-vector-search
              title: Vector search
              content: "Store OpenAI embeddings alongside your documents in MongoDB Atlas and retrieve the most semantically similar ones with a $vectorSearch aggregation stage. Air Pipe interpolates the query embedding directly into the pipeline, so retrieval-augmented generation runs without a separate vector database."
              source: docs
              embedding: "a|EmbedDoc3::body.data|[0].embedding|"
              createdAt: "a|timestamp:datetimeutctz|"
      - name: SeedSummary
        run_when_succeeded: [StoreDocuments]
        database: mongo
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: aggregate
          pipeline: |
            [
              { "$count": "documents_seeded" }
            ]
        post_transforms:
          - extract_value: "[0]"

  # GET /rag/documents
  # List ingested documents. The embedding vector is dropped — it is large and
  # not useful for display ($project, since find has no projection).
  rag/documents:
    output: http
    summary: List documents
    description: Returns all ingested documents without their embedding vectors.
    tags: [rag, documents]
    response_example:
      - docId: doc-getting-started
        title: Getting started with Air Pipe
        source: docs
        createdAt: "2026-01-01T12:00:00Z"
    actions:
      - name: ListDocuments
        database: mongo
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: aggregate
          pipeline: |
            [
              { "$project": { "_id": 0, "embedding": 0, "content": 0 } },
              { "$sort": { "createdAt": -1 } }
            ]

  # POST /rag/documents/ingest
  # Embed the provided content and store the document.
  # Body: { "title": "...", "content": "...", "source": "..." }
  rag/documents/ingest:
    output: http
    method: POST
    summary: Ingest document
    description: >
      Generate an OpenAI embedding for the provided content and store the
      document in MongoDB Atlas for retrieval. The source field is optional and
      useful for filtering (e.g. "support-docs", "product-manual").
    tags: [rag, documents]
    request_example:
      title: Refund policy
      content: "Customers may request a full refund within 30 days of purchase. To initiate a refund, contact [email protected] with your order number."
      source: support-docs
    response_example:
      docId: 0192f0a1-...-...
      title: Refund policy
      source: support-docs
    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: title
              is_not_null: true
              is_not_empty: true
              description: "Document title"
            - value: content
              is_not_null: true
              is_not_empty: true
              description: "Document text to embed and store"
      # Assign a stable docId up front so it can be both stored and returned.
      - name: PrepDoc
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        input: a|ValidateBody|
        hide_data_on_success: true
        post_transforms:
          - add_attribute:
              docId: a|uuid|
      - name: GenerateEmbedding
        run_when_succeeded:
          actions: [PrepDoc]
          http_code_on_error: 400
        http:
          url: https://api.openai.com/v1/embeddings
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": "text-embedding-3-small",
              "input": a|double_quote(ValidateBody::content)|
            }
        assert:
          http_code_on_error: 502
          error_message: "Embedding generation failed"
          tests:
            - value: status
              is_equal_to: 200
      - name: StoreDocument
        run_when_succeeded:
          actions: [GenerateEmbedding]
          http_code_on_error: 500
        database: mongo
        hide_data_on_success: true
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: insertOne
          insert:
            docId: "a|PrepDoc::docId|"
            title: "a|ValidateBody::title|"
            content: "a|ValidateBody::content|"
            source: "a|ValidateBody::source->default(null)|"
            embedding: "a|GenerateEmbedding::body.data|[0].embedding|"
            createdAt: "a|timestamp:datetimeutctz|"
      - name: TrackIngest
        run_when_succeeded: [StoreDocument]
        hide_data_on_success: true
        emit_metric:
          name: app_rag_documents_ingested_total
          type: counter
          labels:
            source: a|ValidateBody::source->default(unknown)|
      - name: Confirm
        run_when_succeeded: [StoreDocument]
        input: a|ValidateBody|
        post_transforms:
          - remove_keys:
              - content
          - add_attribute:
              docId: a|PrepDoc::docId|

  # POST /rag/documents/delete
  # Remove a document by docId.
  # Body: { "docId": "doc-getting-started" }
  rag/documents/delete:
    output: http
    method: POST
    summary: Delete document
    description: Permanently remove an ingested document by its docId.
    tags: [rag, documents]
    request_example:
      docId: doc-getting-started
    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: docId
              is_not_null: true
              is_not_empty: true
              description: "Document ID"
      - name: DeleteDocument
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        database: mongo
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: deleteOne
          delete:
            docId: "a|ValidateBody::docId|"
        assert:
          http_code_on_error: 404
          error_message: "Document not found"
          tests:
            - value: deleted
              is_equal_to: 1

  # POST /rag/chat
  # Answer a question using retrieval-augmented generation over Atlas Vector Search.
  # Body: { "question": "...", "session_id": "...", "top_k": 5 }
  #
  # Pipeline:
  #   1. Embed the question with text-embedding-3-small
  #   2. $vectorSearch the top_k most similar documents in MongoDB Atlas
  #   3. Build a single grounded context string inside the aggregation
  #   4. Call gpt-4o-mini with that context as the system prompt
  #   5. Store Q+A in rag_chat_history (when session_id is provided)
  #   6. Return a clean { answer, sources_found } response
  rag/chat:
    output: http
    method: POST
    summary: Chat
    description: >
      Embed the question, retrieve the most semantically similar documents with
      Atlas $vectorSearch, and generate a grounded answer using GPT. Only uses
      context from your ingested documents — the model is told not to guess.
    tags: [rag, chat]
    request_example:
      question: "How do I keep secrets like API keys out of my config?"
      session_id: sess_abc123
      top_k: 5
    response_example:
      answer: "Use managed variables — encrypted key-value pairs referenced with ap_var::NAME..."
      sources_found: 2
    notes: |
      `session_id` stores each turn in `rag_chat_history`. Omit it for stateless
      one-shot queries. `top_k` defaults to 5 if not provided.
    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: question
              is_not_null: true
              is_not_empty: true
              description: "The question to answer"
      - name: EmbedQuestion
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        http:
          url: https://api.openai.com/v1/embeddings
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": "text-embedding-3-small",
              "input": a|double_quote(ValidateBody::question)|
            }
        assert:
          http_code_on_error: 502
          error_message: "Failed to embed question"
          tests:
            - value: status
              is_equal_to: 200
      # Retrieve the top_k most similar documents with $vectorSearch and fold
      # them into a single context string inside the database. The query
      # embedding interpolates straight into the pipeline as a JSON array.
      - name: RetrieveContext
        run_when_succeeded: [EmbedQuestion]
        database: mongo
        document_operation:
          database: ragdb
          collection: rag_documents
          operation: aggregate
          pipeline: |
            [
              {
                "$vectorSearch": {
                  "index": "vector_index",
                  "path": "embedding",
                  "queryVector": a|EmbedQuestion::body.data|[0].embedding|,
                  "numCandidates": 100,
                  "limit": a|ValidateBody::top_k->default(5)|
                }
              },
              { "$project": { "_id": 0, "title": 1, "content": 1, "score": { "$meta": "vectorSearchScore" } } },
              {
                "$group": {
                  "_id": null,
                  "docs": { "$push": { "title": "$title", "content": "$content" } },
                  "sources_found": { "$sum": 1 }
                }
              },
              {
                "$project": {
                  "_id": 0,
                  "sources_found": 1,
                  "context": {
                    "$reduce": {
                      "input": "$docs",
                      "initialValue": "",
                      "in": {
                        "$concat": [
                          "$$value",
                          { "$cond": [ { "$eq": [ "$$value", "" ] }, "", "\n\n---\n\n" ] },
                          "Document: ", "$$this.title", "\n", "$$this.content"
                        ]
                      }
                    }
                  }
                }
              }
            ]
        post_transforms:
          - extract_value: "[0]"
      - name: GenerateAnswer
        run_when_succeeded: [RetrieveContext]
        http:
          url: https://api.openai.com/v1/chat/completions
          method: POST
          headers:
            content-type: application/json
            authorization: "Bearer a|ap_var::OPENAI_API_KEY|"
          body: |
            {
              "model": "gpt-4o-mini",
              "messages": [
                {
                  "role": "system",
                  "content": "You are a helpful assistant. Answer the user's question using ONLY the context provided below. If the answer is not covered by the context, say so clearly — do not guess or draw on outside knowledge.\n\nContext:\na|RetrieveContext::context->json_escape|"
                },
                {
                  "role": "user",
                  "content": a|double_quote(ValidateBody::question)|
                }
              ]
            }
        assert:
          http_code_on_error: 502
          error_message: "Answer generation failed"
          tests:
            - value: status
              is_equal_to: 200
      # Store both sides of the turn when a session_id is provided.
      - name: StoreHistory
        run_when_succeeded: [GenerateAnswer]
        run_on_assertion:
          tests:
            - action: ValidateBody
              value: session_id
              is_not_null: true
        database: mongo
        hide_data_on_success: true
        document_operation:
          database: ragdb
          collection: rag_chat_history
          operation: insertMany
          insert:
            - session_id: "a|ValidateBody::session_id|"
              role: user
              content: "a|ValidateBody::question|"
              createdAt: "a|timestamp:datetimeutctz|"
            - session_id: "a|ValidateBody::session_id|"
              role: assistant
              content: "a|GenerateAnswer::body.choices|[0].message.content|"
              createdAt: "a|timestamp:datetimeutctz|"
      - name: TrackQuery
        run_when_succeeded: [GenerateAnswer]
        hide_data_on_success: true
        emit_metric:
          name: app_rag_queries_total
          type: counter
      # Build a clean { sources_found, answer } response from the retrieved
      # context object plus the model's answer.
      - name: BuildResponse
        run_when_succeeded: [GenerateAnswer]
        input: a|RetrieveContext|
        post_transforms:
          - add_attribute:
              answer: a|GenerateAnswer::body.choices|[0].message.content|
          - remove_keys:
              - context

  # POST /rag/history
  # Retrieve conversation history for a session, oldest first.
  # Body: { "session_id": "sess_abc123" }
  rag/history:
    output: http
    method: POST
    summary: Chat history
    description: Retrieve all messages for a session ID, oldest first.
    tags: [rag, chat]
    request_example:
      session_id: sess_abc123
    response_example:
      - session_id: sess_abc123
        role: user
        content: "How do I keep secrets out of my config?"
        createdAt: "2026-01-01T12:00:00Z"
      - session_id: sess_abc123
        role: assistant
        content: "Use managed variables..."
        createdAt: "2026-01-01T12:00:01Z"
    actions:
      - name: ValidateBody
        input: a|body|
        hide_data_on_success: true
        assert:
          http_code_on_error: 400
          tests:
            - value: session_id
              is_not_null: true
              is_not_empty: true
              description: "Session ID"
      - name: GetHistory
        run_when_succeeded:
          actions: [ValidateBody]
          http_code_on_error: 400
        database: mongo
        document_operation:
          database: ragdb
          collection: rag_chat_history
          operation: aggregate
          pipeline: |
            [
              { "$match": { "session_id": a|double_quote(ValidateBody::session_id)| } },
              { "$sort": { "createdAt": 1 } },
              { "$project": { "_id": 0, "session_id": 1, "role": 1, "content": 1, "createdAt": 1 } }
            ]
```

### vector-index.json
```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}
```