Knowledge Base
Upload documents, trigger semantic search, and connect knowledge bases to agents for RAG-powered conversations.
The Knowledge Base API lets you upload documents that agents can query during conversations using retrieval-augmented generation (RAG). When a user's question matches content in the knowledge base, the agent retrieves and cites the relevant excerpts before generating its response.
Base path: /api/v1/knowledge-base
How it works
Upload Document → Chunking → Embedding → Vector Index
↓
User Question → Semantic Search → Top-K Chunks → LLM Context → Response
Documents are split into overlapping chunks of roughly 512 tokens each, converted to vectors by an embedding model, and stored in a vector index. At query time, the user's utterance is embedded with the same model, and the top-K most semantically similar chunks are retrieved and injected into the LLM context.
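The pipeline above can be illustrated with a small, self-contained sketch. It is not the service's implementation: the 64-token overlap, the use of cosine similarity, and the helper names `chunk_tokens`, `cosine`, and `top_k` are all assumptions made for illustration.

```python
import math

def chunk_tokens(tokens, size=512, overlap=64):
    """Split a token list into windows of `size` tokens that share `overlap` tokens.

    The overlap value is an assumption; the docs only say chunks overlap.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=5):
    """Return (chunk_index, score) pairs for the k chunks most similar to the query."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

chunks = chunk_tokens(list(range(1000)))
print(len(chunks))  # 3
```

With a 1,000-token document, this yields three chunks starting at tokens 0, 448, and 896, so each neighboring pair shares 64 tokens.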
Document object
{
"id": "kb_01HXABC123DEF",
"name": "Acme Product Catalog Q2 2025",
"filename": "acme-catalog-q2-2025.pdf",
"mimeType": "application/pdf",
"sizeBytes": 2847392,
"status": "indexed",
"chunkCount": 284,
"embeddingModel": "text-embedding-3-large",
"assignedAgentIds": ["agt_01HXK8Z3MNPQRS"],
"uploadedAt": "2025-06-15T09:00:00Z",
"indexedAt": "2025-06-15T09:02:14Z",
"metadata": {
"department": "product",
"version": "2025-Q2"
}
}
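As a small illustration of working with the document object, the sketch below computes how long indexing took from the `uploadedAt` and `indexedAt` timestamps. The helper names `parse_ts` and `indexing_seconds` are hypothetical, and the code assumes timestamps always use the ISO 8601 form shown in the example.

```python
from datetime import datetime

def parse_ts(value):
    # Replace the trailing "Z" so fromisoformat also works on Python versions before 3.11
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def indexing_seconds(doc):
    """Seconds between upload and indexing, or None if the document is not indexed yet."""
    if not doc.get("indexedAt"):
        return None
    return (parse_ts(doc["indexedAt"]) - parse_ts(doc["uploadedAt"])).total_seconds()

doc = {"uploadedAt": "2025-06-15T09:00:00Z", "indexedAt": "2025-06-15T09:02:14Z"}
print(indexing_seconds(doc))  # 134.0
```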
Supported file types:
| Type | MIME Type | Max Size |
|---|---|---|
| PDF | application/pdf | 50 MB |
| Plain text | text/plain | 50 MB |
| CSV | text/csv | 50 MB |
| Word (docx) | application/vnd.openxmlformats-officedocument.wordprocessingml.document | 50 MB |
| Excel (xlsx) | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 50 MB |
| Markdown | text/markdown | 50 MB |
| JSON | application/json | 50 MB |
| XML | application/xml | 50 MB |
| HTML | text/html | 50 MB |
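A client can check a file against the table above before uploading it. The following is a minimal sketch, not part of the API or SDK: the extension-to-MIME mapping, the helper name `check_upload`, and the reading of "50 MB" as 50 × 1024 × 1024 bytes are assumptions.

```python
import os

# Extension-to-MIME map mirroring the supported-types table; the extensions are an assumption
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".txt": "text/plain",
    ".csv": "text/csv",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".md": "text/markdown",
    ".json": "application/json",
    ".xml": "application/xml",
    ".html": "text/html",
}
MAX_SIZE_BYTES = 50 * 1024 * 1024  # assumes "50 MB" means 50 MiB

def check_upload(filename, size_bytes):
    """Return the MIME type for an acceptable file, or raise ValueError."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_SIZE_BYTES}")
    return SUPPORTED_TYPES[ext]

print(check_upload("acme-catalog-q2-2025.pdf", 2847392))  # application/pdf
```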
Upload a document
POST /api/v1/knowledge-base/documents
Content-Type: multipart/form-data
Form fields:
| Field | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | The document file |
| name | string | No | Display name (defaults to filename) |
| agentIds | string | No | Comma-separated agent IDs to assign immediately |
| metadata | string | No | JSON string of key-value metadata |
curl -X POST https://api.voisnap.ai/api/v1/knowledge-base/documents \
-H "Authorization: Bearer vsnp_live_..." \
-F "file=@acme-catalog-q2-2025.pdf" \
-F "name=Acme Product Catalog Q2 2025" \
-F 'agentIds=agt_01HXK8Z3MNPQRS' \
-F 'metadata={"department":"product","version":"2025-Q2"}'
import time

# Assumes `client` is an already-initialized SDK client.
with open("acme-catalog-q2-2025.pdf", "rb") as f:
    doc = client.knowledge_base.upload_document(
        file=f,
        name="Acme Product Catalog Q2 2025",
        agent_ids=["agt_01HXK8Z3MNPQRS"],
        metadata={"department": "product", "version": "2025-Q2"},
    )
print(f"Uploaded: {doc.id}, status: {doc.status}")

# Poll until indexing reaches a terminal state (indexed or failed)
while doc.status in ("pending", "processing"):
    time.sleep(2)
    doc = client.knowledge_base.get(doc.id)

if doc.status == "indexed":
    print(f"Indexed: {doc.chunk_count} chunks")
else:
    print(f"Indexing ended with status: {doc.status}")
import * as fs from 'node:fs/promises';

// Assumes `client` is an already-initialized SDK client and a runtime with a global File (e.g. Node 20+)
const file = new File([await fs.readFile('./acme-catalog-q2-2025.pdf')], 'acme-catalog-q2-2025.pdf');
const doc = await client.knowledgeBase.uploadDocument({
  file,
  name: 'Acme Product Catalog Q2 2025',
  agentIds: ['agt_01HXK8Z3MNPQRS'],
  metadata: { department: 'product', version: '2025-Q2' },
});
Response:
{
"id": "kb_01HXABC123DEF",
"name": "Acme Product Catalog Q2 2025",
"filename": "acme-catalog-q2-2025.pdf",
"status": "processing",
"uploadedAt": "2025-06-15T09:00:00Z"
}
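When calling the upload endpoint without the SDK, the optional form fields have to be serialized as strings: `agentIds` as a comma-separated list and `metadata` as a JSON string. The helper below is a hypothetical sketch of that serialization (the name `build_form_fields` is not part of any SDK).

```python
import json

def build_form_fields(name=None, agent_ids=None, metadata=None):
    """Serialize the optional upload fields as the form-field table describes."""
    fields = {}
    if name:
        fields["name"] = name
    if agent_ids:
        # agentIds is a single string of comma-separated IDs
        fields["agentIds"] = ",".join(agent_ids)
    if metadata:
        # metadata is sent as a JSON string, not as nested form fields
        fields["metadata"] = json.dumps(metadata, separators=(",", ":"))
    return fields

print(build_form_fields(
    name="Acme Product Catalog Q2 2025",
    agent_ids=["agt_01HXK8Z3MNPQRS"],
    metadata={"department": "product", "version": "2025-Q2"},
))
```

The resulting dictionary can be passed as the non-file part of a multipart request in whichever HTTP client you use, alongside the `file` field.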
List documents
GET /api/v1/knowledge-base/documents
curl https://api.voisnap.ai/api/v1/knowledge-base/documents \
-H "Authorization: Bearer vsnp_live_..."
Get a document
GET /api/v1/knowledge-base/documents/{documentId}
Delete a document
DELETE /api/v1/knowledge-base/documents/{documentId}
Deletes the document, all its chunks, and the associated vector embeddings.
Reindex a document
Triggers re-chunking and re-embedding. Useful after updating the document content.
POST /api/v1/knowledge-base/documents/{documentId}/reindex
curl -X POST https://api.voisnap.ai/api/v1/knowledge-base/documents/kb_01HXABC123DEF/reindex \
-H "Authorization: Bearer vsnp_live_..."
Check indexing status
GET /api/v1/knowledge-base/documents/{documentId}/status
Response:
{
"id": "kb_01HXABC123DEF",
"status": "indexed",
"progress": 100,
"chunkCount": 284,
"indexedAt": "2025-06-15T09:02:14Z",
"error": null
}
Status values: pending, processing, indexed, failed
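A polling loop over the status endpoint should stop on either terminal status and give up after a deadline. The sketch below assumes `pending` and `processing` are the only non-terminal values and takes the HTTP call as a caller-supplied function; `wait_until_indexed` and `fetch_status` are hypothetical names, and the timeout and interval defaults are arbitrary.

```python
import time

def wait_until_indexed(fetch_status, timeout=300.0, interval=2.0, sleep=time.sleep):
    """Poll fetch_status() until the document is indexed, fails, or the timeout elapses.

    fetch_status is any callable returning the status response as a dict,
    for example a wrapper around GET .../documents/{documentId}/status.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["status"] == "indexed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"indexing failed: {status.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status['status']} after {timeout}s")
        sleep(interval)

# Usage with canned responses standing in for real API calls
responses = iter([
    {"status": "pending", "progress": 0},
    {"status": "processing", "progress": 60},
    {"status": "indexed", "progress": 100, "chunkCount": 284},
])
final = wait_until_indexed(lambda: next(responses), sleep=lambda seconds: None)
print(final["chunkCount"])  # 284
```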
Semantic search
Search the knowledge base directly to test retrieval quality before deploying to an agent.
POST /api/v1/knowledge-base/search
curl -X POST https://api.voisnap.ai/api/v1/knowledge-base/search \
-H "Authorization: Bearer vsnp_live_..." \
-H "Content-Type: application/json" \
-d '{
"query": "What is the return policy for electronics?",
"documentIds": ["kb_01HXABC123DEF"],
"topK": 5,
"minScore": 0.7
}'
results = client.knowledge_base.search(
    query="What is the return policy for electronics?",
    document_ids=["kb_01HXABC123DEF"],
    top_k=5,
    min_score=0.7,
)
for r in results.matches:
    print(f"Score: {r.score:.3f} | {r.excerpt[:120]}...")
const results = await client.knowledgeBase.search({
query: 'What is the return policy for electronics?',
documentIds: ['kb_01HXABC123DEF'],
topK: 5,
minScore: 0.7,
});
Response:
{
"query": "What is the return policy for electronics?",
"matches": [
{
"documentId": "kb_01HXABC123DEF",
"documentName": "Acme Product Catalog Q2 2025",
"chunkId": "chunk_00142",
"score": 0.924,
"excerpt": "Electronics purchased at Acme may be returned within 30 days of purchase with original receipt. Items must be in original packaging and unused condition. Opened software and digital downloads are non-refundable.",
"pageNumber": 47,
"metadata": {}
},
{
"documentId": "kb_01HXABC123DEF",
"documentName": "Acme Product Catalog Q2 2025",
"chunkId": "chunk_00143",
"score": 0.871,
"excerpt": "For defective electronics, Acme provides a 90-day replacement warranty. Contact support@acme.com or call 1-800-ACME to initiate a warranty claim.",
"pageNumber": 47,
"metadata": {}
}
],
"totalMatches": 2
}
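To see roughly what an agent might receive as context, a search response can be flattened into numbered, cited excerpts. This is an illustrative sketch only: the platform's actual prompt format is not documented here, and `build_context` is a hypothetical helper operating on the JSON shape shown above.

```python
def build_context(response, min_score=0.7):
    """Format matches at or above min_score as numbered excerpts with their source."""
    kept = [m for m in response["matches"] if m["score"] >= min_score]
    kept.sort(key=lambda m: m["score"], reverse=True)  # best match first
    lines = []
    for n, match in enumerate(kept, start=1):
        source = match["documentName"]
        if match.get("pageNumber") is not None:
            source += f", p. {match['pageNumber']}"
        lines.append(f"[{n}] {match['excerpt']} (Source: {source})")
    return "\n".join(lines)

response = {
    "query": "What is the return policy for electronics?",
    "matches": [
        {"documentName": "Acme Product Catalog Q2 2025", "score": 0.871,
         "excerpt": "For defective electronics, Acme provides a 90-day replacement warranty.",
         "pageNumber": 47},
        {"documentName": "Acme Product Catalog Q2 2025", "score": 0.924,
         "excerpt": "Electronics purchased at Acme may be returned within 30 days of purchase.",
         "pageNumber": 47},
    ],
}
print(build_context(response))
```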
:::tip
Run semantic search queries before assigning a knowledge base to a production agent. If scores are consistently below 0.7, consider splitting your document into smaller, more focused files.
:::
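One way to apply this check is to run a set of representative questions through the search endpoint and flag those whose best match falls below the threshold. The sketch below works on already-fetched search responses; `weak_queries` is a hypothetical helper, and it assumes the searches were run with a low or no `minScore` so that weak matches are still returned.

```python
def weak_queries(responses, threshold=0.7):
    """Return (query, best_score) pairs whose top match scores below threshold.

    A query with no matches at all is reported with a best score of 0.0.
    """
    weak = []
    for response in responses:
        best = max((m["score"] for m in response["matches"]), default=0.0)
        if best < threshold:
            weak.append((response["query"], best))
    return weak

responses = [
    {"query": "What is the return policy for electronics?", "matches": [{"score": 0.924}, {"score": 0.871}]},
    {"query": "Do you ship internationally?", "matches": [{"score": 0.52}]},
    {"query": "Who founded Acme?", "matches": []},
]
print(weak_queries(responses))
# [('Do you ship internationally?', 0.52), ('Who founded Acme?', 0.0)]
```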