Deploying ChromaDB

Introduction

ChromaDB is an open-source embedding database designed for AI applications. Built with large language models (LLMs) in mind, it offers a simple, powerful way to store and retrieve vector embeddings along with their associated metadata.

ChromaDB excels at:

  • Vector Storage: Efficiently stores and queries high-dimensional vector embeddings
  • Semantic Search: Enables similarity search across documents, images, and other data types
  • LLM Integration: Seamless integration with popular LLM frameworks like LangChain and LlamaIndex
  • Metadata Filtering: Powerful filtering capabilities combined with vector similarity search
  • Multiple Distance Metrics: Supports cosine similarity, L2 distance, and inner product
  • Scalability: Designed to scale from prototype to production workloads
  • Easy-to-Use API: Python and JavaScript clients with intuitive interfaces
  • Persistence: Reliable data persistence for production deployments

Common use cases include building RAG (Retrieval Augmented Generation) systems, semantic search engines, recommendation systems, document similarity matching, chatbots with memory, content moderation, and any AI application requiring efficient vector storage and retrieval.
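
To get a feel for the API before deploying anything, here is a minimal sketch using the in-memory Python client (install with pip install chromadb); it stores a few documents and runs a similarity search. The collection name and documents are illustrative only:

import chromadb

# In-memory client - no server needed for quick experiments
client = chromadb.Client()

collection = client.get_or_create_collection(name="demo")
collection.add(
    ids=["a", "b"],
    documents=["Cats are small felines", "Docker packages applications"],
    metadatas=[{"topic": "animals"}, {"topic": "devops"}]
)

results = collection.query(query_texts=["container tooling"], n_results=1)
print(results["documents"][0])  # expect the Docker document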

This comprehensive guide walks you through deploying ChromaDB on Klutch.sh using Docker, including detailed installation steps, sample configurations, code examples for getting started, and production-ready best practices for persistent storage and security.

Prerequisites

Before you begin, ensure you have the following:

  • A Klutch.sh account
  • A GitHub account and a repository for your deployment files
  • Git installed on your local machine
  • Docker installed locally (optional, for testing before you deploy)
  • Basic familiarity with Docker and vector embeddings

Installation and Setup

Step 1: Create Your Project Directory

First, create a new directory for your ChromaDB deployment project:

Terminal window
mkdir chromadb-klutch
cd chromadb-klutch
git init

Step 2: Create the Dockerfile

Create a Dockerfile in your project root directory. This will define your ChromaDB container configuration:

FROM chromadb/chroma:latest
# Set environment variables for ChromaDB configuration
ENV CHROMA_SERVER_HOST=0.0.0.0
ENV CHROMA_SERVER_HTTP_PORT=8000
ENV IS_PERSISTENT=TRUE
ENV PERSIST_DIRECTORY=/chroma/chroma
# Expose the ChromaDB port
EXPOSE 8000
# The default command starts the ChromaDB server
# CMD is inherited from the base image

Note: ChromaDB uses port 8000 by default for its HTTP API. The IS_PERSISTENT environment variable ensures data is saved to disk.

Step 3: (Optional) Create a Custom Configuration File

For advanced configurations, you can create a custom settings file. Create a file named chroma-config.yaml:

# chroma-config.yaml - ChromaDB Configuration
# Server settings
chroma_server_host: "0.0.0.0"
chroma_server_http_port: 8000
# Persistence settings
is_persistent: true
persist_directory: "/chroma/chroma"
# Authentication (optional - for production)
chroma_server_auth_credentials_provider: "chromadb.auth.token.TokenAuthCredentialsProvider"
chroma_server_auth_credentials: "your-secure-token-here"
chroma_server_auth_provider: "chromadb.auth.token.TokenAuthServerProvider"
# Telemetry (can be disabled)
anonymized_telemetry: false
# CORS settings (if needed)
chroma_server_cors_allow_origins: ["*"]

To use a custom configuration, update your Dockerfile:

FROM chromadb/chroma:latest
# Copy custom configuration
COPY chroma-config.yaml /chroma/chroma-config.yaml
ENV CHROMA_SERVER_HOST=0.0.0.0
ENV CHROMA_SERVER_HTTP_PORT=8000
ENV IS_PERSISTENT=TRUE
ENV PERSIST_DIRECTORY=/chroma/chroma
EXPOSE 8000

Step 4: Test Locally (Optional)

Before deploying to Klutch.sh, you can test your ChromaDB setup locally:

Terminal window
# Build the Docker image
docker build -t my-chromadb .
# Run the container
docker run -d \
  --name chromadb-test \
  -p 8000:8000 \
  -v $(pwd)/chroma-data:/chroma/chroma \
  my-chromadb
# Wait a moment for ChromaDB to start
sleep 5
# Test the API endpoint
curl http://localhost:8000/api/v1/heartbeat
# You should see: {"nanosecond heartbeat": <timestamp>}
# Stop and remove the test container when done
docker stop chromadb-test
docker rm chromadb-test

Step 5: Create a .dockerignore File

Create a .dockerignore file to exclude unnecessary files from your Docker build:

.git
.gitignore
README.md
*.md
.DS_Store
node_modules
chroma-data
*.log

Step 6: Push to GitHub

Commit your Dockerfile and configuration files to your GitHub repository:

Terminal window
git add Dockerfile chroma-config.yaml .dockerignore
git commit -m "Add ChromaDB Dockerfile and configuration"
git remote add origin https://github.com/yourusername/chromadb-klutch.git
git push -u origin main

Connecting to ChromaDB

Once deployed, you can connect to your ChromaDB instance from any application using the ChromaDB client libraries. Since Klutch.sh routes HTTP traffic through your app’s URL, you can connect directly.

Connection Details

  • URL: http://example-app.klutch.sh (or https://example-app.klutch.sh if SSL is configured)
  • Port: Default HTTP/HTTPS ports (80/443)
  • API Endpoint: /api/v1
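
A quick way to sanity-check the endpoint from Python (the requests library here is purely for illustration; the official client examples follow below):

import requests

# The heartbeat endpoint returns a nanosecond timestamp when the server is up
resp = requests.get("https://example-app.klutch.sh/api/v1/heartbeat", timeout=10)
resp.raise_for_status()
print(resp.json())  # e.g. {"nanosecond heartbeat": 1700000000000000000}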

Example Connection Code

Python (using chromadb client):

import chromadb
from chromadb.config import Settings

# Connect to ChromaDB on Klutch.sh
client = chromadb.HttpClient(
    host="example-app.klutch.sh",
    port=443,   # use 443 for HTTPS, 80 for HTTP
    ssl=True,   # set to True if using HTTPS
    settings=Settings(anonymized_telemetry=False)
)

# Test the connection
heartbeat = client.heartbeat()
print(f"ChromaDB is alive! Heartbeat: {heartbeat}")

# Create or get a collection
collection = client.get_or_create_collection(
    name="my_collection",
    metadata={"description": "My first collection"}
)

# Add some documents with embeddings
collection.add(
    documents=[
        "This is a document about cats",
        "This is a document about dogs",
        "This is a document about birds"
    ],
    metadatas=[
        {"category": "pets", "type": "mammal"},
        {"category": "pets", "type": "mammal"},
        {"category": "pets", "type": "bird"}
    ],
    ids=["id1", "id2", "id3"]
)

# Query the collection
results = collection.query(
    query_texts=["tell me about pets"],
    n_results=2
)

print("Query results:")
for doc, metadata, distance in zip(
    results['documents'][0],
    results['metadatas'][0],
    results['distances'][0]
):
    print(f"Document: {doc}")
    print(f"Metadata: {metadata}")
    print(f"Distance: {distance}\n")

Python with LangChain:

from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader
import chromadb

# Initialize ChromaDB client
chroma_client = chromadb.HttpClient(
    host="example-app.klutch.sh",
    port=443,
    ssl=True
)

# Initialize embeddings (using OpenAI as an example)
embeddings = OpenAIEmbeddings(openai_api_key="your-api-key")

# Load and split documents
loader = TextLoader("your-document.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Create the vector store
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=embeddings,
    client=chroma_client,
    collection_name="langchain_collection"
)

# Perform a similarity search
query = "What is the main topic?"
results = vectorstore.similarity_search(query, k=3)
for result in results:
    print(result.page_content)
    print("---")

Python with LlamaIndex:

from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores import ChromaVectorStore
from llama_index.storage.storage_context import StorageContext
import chromadb

# Initialize ChromaDB client
chroma_client = chromadb.HttpClient(
    host="example-app.klutch.sh",
    port=443,
    ssl=True
)

# Get or create a collection
chroma_collection = chroma_client.get_or_create_collection("llama_collection")

# Set up ChromaDB as the vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Load documents and build the index
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context
)

# Query the index
query_engine = index.as_query_engine()
response = query_engine.query("What is the content about?")
print(response)

JavaScript/TypeScript (using chromadb client):

const { ChromaClient } = require('chromadb');

// Connect to ChromaDB on Klutch.sh
const client = new ChromaClient({
    path: "https://example-app.klutch.sh"
});

async function main() {
    // Test the connection
    const heartbeat = await client.heartbeat();
    console.log('ChromaDB is alive! Heartbeat:', heartbeat);

    // Create or get a collection
    const collection = await client.getOrCreateCollection({
        name: "my_collection",
        metadata: { description: "My first collection" }
    });

    // Add documents
    await collection.add({
        ids: ["id1", "id2", "id3"],
        documents: [
            "This is a document about cats",
            "This is a document about dogs",
            "This is a document about birds"
        ],
        metadatas: [
            { category: "pets", type: "mammal" },
            { category: "pets", type: "mammal" },
            { category: "pets", type: "bird" }
        ]
    });

    // Query the collection
    const results = await collection.query({
        queryTexts: ["tell me about pets"],
        nResults: 2
    });
    console.log('Query results:', results);
}

main().catch(console.error);

Go (using HTTP API directly):

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

type AddRequest struct {
    IDs       []string                 `json:"ids"`
    Documents []string                 `json:"documents"`
    Metadatas []map[string]interface{} `json:"metadatas"`
}

type QueryRequest struct {
    QueryTexts []string `json:"query_texts"`
    NResults   int      `json:"n_results"`
}

func main() {
    baseURL := "https://example-app.klutch.sh/api/v1"

    // Test heartbeat
    resp, err := http.Get(baseURL + "/heartbeat")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    body, _ := io.ReadAll(resp.Body)
    fmt.Println("Heartbeat:", string(body))

    // Create a collection
    collectionData := map[string]interface{}{
        "name": "my_collection",
        "metadata": map[string]string{
            "description": "My first collection",
        },
    }
    jsonData, _ := json.Marshal(collectionData)
    resp, err = http.Post(
        baseURL+"/collections",
        "application/json",
        bytes.NewBuffer(jsonData),
    )
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Add documents to the collection.
    // NOTE: depending on your ChromaDB version, these per-collection endpoints
    // may expect the collection's UUID (returned by the create call above)
    // rather than its name - check the API reference for your version.
    addData := AddRequest{
        IDs: []string{"id1", "id2", "id3"},
        Documents: []string{
            "This is a document about cats",
            "This is a document about dogs",
            "This is a document about birds",
        },
        Metadatas: []map[string]interface{}{
            {"category": "pets", "type": "mammal"},
            {"category": "pets", "type": "mammal"},
            {"category": "pets", "type": "bird"},
        },
    }
    jsonData, _ = json.Marshal(addData)
    resp, err = http.Post(
        baseURL+"/collections/my_collection/add",
        "application/json",
        bytes.NewBuffer(jsonData),
    )
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println("Documents added successfully")

    // Query the collection
    queryData := QueryRequest{
        QueryTexts: []string{"tell me about pets"},
        NResults:   2,
    }
    jsonData, _ = json.Marshal(queryData)
    resp, err = http.Post(
        baseURL+"/collections/my_collection/query",
        "application/json",
        bytes.NewBuffer(jsonData),
    )
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    body, _ = io.ReadAll(resp.Body)
    fmt.Println("Query results:", string(body))
}

Deploying to Klutch.sh

Now that your ChromaDB project is ready and pushed to GitHub, follow these steps to deploy it on Klutch.sh with persistent storage.

Deployment Steps

    1. Log in to Klutch.sh

      Navigate to klutch.sh/app and sign in to your account.

    2. Create a New Project

      Go to Create Project and give your project a meaningful name (e.g., “ChromaDB Vector Database”).

    3. Create a New App

      Navigate to Create App and configure the following settings:

    4. Select Your Repository

      • Choose GitHub as your Git source
      • Select the repository containing your Dockerfile
      • Choose the branch you want to deploy (usually main or master)
    5. Configure Traffic Type

      • Traffic Type: Select HTTP (ChromaDB uses HTTP for its API)
      • Internal Port: Set to 8000 (the default ChromaDB port that your container listens on)
    6. Set Environment Variables

      Add the following environment variables for your ChromaDB configuration:

      • CHROMA_SERVER_HOST: Set to 0.0.0.0 (allows external connections)
      • CHROMA_SERVER_HTTP_PORT: Set to 8000 (ChromaDB’s default port)
      • IS_PERSISTENT: Set to TRUE (enables data persistence)
      • PERSIST_DIRECTORY: Set to /chroma/chroma (where ChromaDB stores data)
      • ANONYMIZED_TELEMETRY: Set to FALSE (optional, disables telemetry)

      Optional - For Authentication (Recommended for Production):

      • CHROMA_SERVER_AUTH_CREDENTIALS: A secure token for authentication
      • CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER: Set to chromadb.auth.token.TokenAuthCredentialsProvider
      • CHROMA_SERVER_AUTH_PROVIDER: Set to chromadb.auth.token.TokenAuthServerProvider
    7. Attach a Persistent Volume

      This is critical for ensuring your vector embeddings and data persist across deployments and restarts:

      • In the Volumes section, click “Add Volume”
      • Mount Path: Enter /chroma/chroma (this is where ChromaDB stores all vector data and indexes)
      • Size: Choose an appropriate size based on your expected data volume (e.g., 10GB, 20GB, 50GB)

      Important: ChromaDB requires persistent storage to maintain your embeddings and collections between container restarts. Without a volume, all data will be lost when the container restarts.

    8. Configure Additional Settings

      • Region: Select the region closest to your users for optimal latency
      • Compute Resources: Choose CPU and memory based on your workload (minimum 512MB RAM recommended, 1GB+ for production workloads with large collections)
      • Instances: Start with 1 instance (single instance deployment)
    9. Deploy Your Vector Database

      Click “Create” to start the deployment. Klutch.sh will:

      • Automatically detect your Dockerfile in the repository root
      • Build the Docker image
      • Attach the persistent volume
      • Start your ChromaDB container
      • Assign a URL for external connections
    10. Access Your Database

      Once deployment is complete, you’ll receive a URL like example-app.klutch.sh. You can connect to your ChromaDB instance using this URL:

      import chromadb

      client = chromadb.HttpClient(
          host="example-app.klutch.sh",
          port=443,
          ssl=True
      )

Production Best Practices

Security Recommendations

  • Enable Authentication: Always enable token-based authentication for production deployments
  • Use HTTPS: Ensure SSL/TLS is configured for encrypted connections
  • Secure Tokens: Generate strong, random authentication tokens and store them securely
  • Environment Variables: Never hardcode credentials in your Dockerfile or code
  • Network Security: ChromaDB is accessible only through Klutch.sh’s secure network by default
  • Regular Updates: Keep your ChromaDB version up to date with security patches
  • Access Control: Implement application-level access control for multi-tenant scenarios
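
For the "Secure Tokens" point above, a quick sketch for generating a strong random token (using Python's standard secrets module) that you can use as the CHROMA_SERVER_AUTH_CREDENTIALS value:

import secrets

# 32 random bytes, hex-encoded -> a 64-character token
print(secrets.token_hex(32))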

Example authentication setup in Python:

import chromadb
from chromadb.config import Settings

client = chromadb.HttpClient(
    host="example-app.klutch.sh",
    port=443,
    ssl=True,
    headers={"Authorization": f"Bearer {your_secure_token}"},
    settings=Settings(
        chroma_client_auth_provider="chromadb.auth.token.TokenAuthClientProvider",
        chroma_client_auth_credentials=your_secure_token
    )
)

Performance Optimization

  • Batch Operations: Use batch operations when adding multiple documents to reduce API calls
  • Embedding Caching: Cache frequently used embeddings to reduce computation (see the caching sketch below)
  • Collection Organization: Organize data into multiple collections based on use cases
  • Index Optimization: ChromaDB automatically optimizes HNSW indexes, but be aware of memory usage
  • Resource Allocation: Allocate sufficient memory for vector operations (rule of thumb: 2-4GB RAM per million vectors)
  • Query Optimization: Use metadata filtering to reduce search space before vector similarity search

Efficient batch insertion:

# Instead of adding one document at a time:
for doc in documents:
    collection.add(ids=[doc.id], documents=[doc.text])

# Use a single batch operation:
collection.add(
    ids=[doc.id for doc in documents],
    documents=[doc.text for doc in documents],
    metadatas=[doc.metadata for doc in documents]
)
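
For the embedding-caching recommendation above, an in-process memoization sketch; the sentence-transformers model here is an assumption, and any embedding function can be wrapped the same way:

from functools import lru_cache
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

@lru_cache(maxsize=10_000)
def embed(text: str) -> tuple:
    # lru_cache requires hashable return values, so use a tuple, not a list
    return tuple(model.encode(text).tolist())

# Repeated texts (common queries, boilerplate passages) are embedded only once
vector = list(embed("frequently asked question"))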

Data Modeling Best Practices

Collection Design:

# Organize by use case or data type
user_docs_collection = client.get_or_create_collection(
    name="user_documents",
    metadata={"type": "user_content"}
)
product_docs_collection = client.get_or_create_collection(
    name="product_catalog",
    metadata={"type": "products"}
)

# Use meaningful metadata for filtering
collection.add(
    ids=["doc1"],
    documents=["Product description..."],
    metadatas=[{
        "category": "electronics",
        "price_range": "500-1000",
        "brand": "Apple",
        "date_added": "2024-01-15"
    }]
)

Effective Metadata Filtering:

# Combine vector search with metadata filters
results = collection.query(
    query_texts=["laptop for gaming"],
    n_results=5,
    where={
        "$and": [
            {"category": "electronics"},
            {"price_range": {"$in": ["500-1000", "1000-1500"]}}
        ]
    }
)

Embedding Strategy

Choose the Right Embedding Model:

# For general text (using sentence-transformers)
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = model.encode(documents)
collection.add(
    ids=ids,
    embeddings=embeddings.tolist(),
    documents=documents
)

# For OpenAI embeddings (legacy openai<1.0 client API shown here)
import openai

def get_embedding(text, model="text-embedding-3-small"):
    response = openai.Embedding.create(input=text, model=model)
    return response['data'][0]['embedding']

embeddings = [get_embedding(doc) for doc in documents]
collection.add(
    ids=ids,
    embeddings=embeddings,
    documents=documents
)

Backup Strategy

Since ChromaDB stores data in the persistent volume, implement a comprehensive backup strategy:

Manual Backup Approach:

import chromadb
import json

# Export collection data
client = chromadb.HttpClient(host="example-app.klutch.sh", port=443, ssl=True)
collection = client.get_collection("my_collection")

# Get all data (embeddings are not returned by default, so request them explicitly)
data = collection.get(include=["documents", "embeddings", "metadatas"])

# Save to file
with open('backup.json', 'w') as f:
    json.dump(data, f)

# Restore from backup
with open('backup.json', 'r') as f:
    backup_data = json.load(f)

new_collection = client.get_or_create_collection("restored_collection")
new_collection.add(
    ids=backup_data['ids'],
    embeddings=backup_data['embeddings'],
    documents=backup_data['documents'],
    metadatas=backup_data['metadatas']
)
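
For larger collections, a single get() call can exhaust memory; recent chromadb clients accept limit and offset for paging, which allows a streaming export (a sketch, assuming those parameters exist in your client version):

import json

batch_size, offset = 1000, 0
with open('backup.jsonl', 'w') as f:
    while True:
        page = collection.get(
            limit=batch_size,
            offset=offset,
            include=["documents", "embeddings", "metadatas"]
        )
        if not page['ids']:
            break
        # Coerce embeddings to plain floats in case the client returns arrays
        page['embeddings'] = [[float(x) for x in e] for e in page['embeddings']]
        record = {k: page[k] for k in ('ids', 'documents', 'embeddings', 'metadatas')}
        f.write(json.dumps(record) + "\n")
        offset += batch_size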

Consider:

  • Regular automated backups of the persistent volume
  • Multiple retention periods (daily, weekly, monthly)
  • Offsite backup storage for disaster recovery
  • Regular restore testing to verify backup integrity
  • Version control for collection schemas and configurations

Monitoring

Monitor your ChromaDB deployment for:

  • API response times and latency
  • Query throughput and success rates
  • Collection sizes and number of vectors
  • Memory usage and disk I/O
  • CPU utilization during embedding operations
  • Error rates and failed operations
  • Connection pool utilization

Health Check Script:

import chromadb
import time

def health_check():
    try:
        client = chromadb.HttpClient(
            host="example-app.klutch.sh",
            port=443,
            ssl=True
        )
        start = time.time()
        heartbeat = client.heartbeat()
        latency = time.time() - start
        print("✓ ChromaDB is healthy")
        print(f"  Latency: {latency*1000:.2f}ms")
        print(f"  Heartbeat: {heartbeat}")

        # Check collections
        collections = client.list_collections()
        print(f"  Collections: {len(collections)}")
        for collection in collections:
            count = collection.count()
            print(f"  - {collection.name}: {count} vectors")
        return True
    except Exception as e:
        print(f"✗ Health check failed: {e}")
        return False

if __name__ == "__main__":
    health_check()

Troubleshooting

Cannot Connect to ChromaDB

  • Verify the URL is correct (should be your Klutch.sh app URL)
  • Ensure HTTP traffic type is selected in Klutch.sh configuration
  • Check that the internal port is set to 8000
  • Test the heartbeat endpoint: curl https://example-app.klutch.sh/api/v1/heartbeat
  • Verify SSL/TLS settings match your deployment (HTTP vs HTTPS)

Authentication Errors

  • Verify authentication token is correct if auth is enabled
  • Ensure CHROMA_SERVER_AUTH_CREDENTIALS environment variable is set
  • Check that client authentication headers match server configuration
  • Review container logs for authentication-related errors

Data Not Persisting

  • Verify the persistent volume is correctly attached at /chroma/chroma
  • Ensure IS_PERSISTENT environment variable is set to TRUE
  • Check PERSIST_DIRECTORY matches your volume mount path
  • Verify the volume has sufficient space allocated
  • Ensure the container has write permissions to the volume

Slow Query Performance

  • Review embedding model size and complexity
  • Check if you’re querying large collections without metadata filtering
  • Consider splitting large collections into smaller, domain-specific ones
  • Ensure adequate memory is allocated (2-4GB RAM per million vectors)
  • Monitor CPU usage during vector operations
  • Use metadata filtering to reduce search space

Out of Memory Errors

  • Increase memory allocation in Klutch.sh compute resources
  • Reduce batch size when adding embeddings (see the helper sketch below)
  • Split large collections into smaller ones
  • Monitor memory usage and scale resources accordingly
  • Consider using lighter embedding models
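
A minimal helper for the batch-size point above (batch_size=500 is an arbitrary starting value; tune it for your document sizes and memory limits):

def add_in_batches(collection, ids, documents, metadatas, batch_size=500):
    # Smaller batches bound peak memory on both the client and the server
    for i in range(0, len(ids), batch_size):
        collection.add(
            ids=ids[i:i + batch_size],
            documents=documents[i:i + batch_size],
            metadatas=metadatas[i:i + batch_size]
        )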

Collection Not Found

  • Verify collection name is spelled correctly (case-sensitive)
  • Check if collection was created successfully
  • List all collections to verify: client.list_collections()
  • Ensure persistent storage is working correctly

Embedding Dimension Mismatch

# Error: Embedding dimension mismatch
# Solution: Ensure all embeddings in a collection have the same dimension
# Check your embedding model's dimension
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
print(f"Embedding dimension: {model.get_sentence_embedding_dimension()}")
# All embeddings added to a collection must match this dimension

Advanced ChromaDB Features

Distance Metrics

ChromaDB supports multiple distance metrics for similarity search:

# Create a collection with a specific distance metric
collection = client.create_collection(
    name="my_collection",
    metadata={"hnsw:space": "cosine"}  # options: cosine, l2, ip (inner product)
)

# Cosine distance - good for normalized vectors
#   Range: 0 (most similar) to 2 (least similar)
# L2 (Euclidean) distance - ChromaDB's default; good for absolute distances
#   Range: 0 (identical) to infinity
# Inner product - good for dot-product similarity
#   Range: -infinity to +infinity

Metadata Filtering

Powerful filtering with MongoDB-like query syntax:

# Complex queries with $and, $or, $in, $nin, $gt, $gte, $lt, $lte
results = collection.query(
    query_texts=["search query"],
    n_results=10,
    where={
        "$and": [
            {"category": "technology"},
            {"year": {"$gte": 2020}},
            {"tags": {"$in": ["AI", "ML", "deep learning"]}}
        ]
    }
)

# Not equal
results = collection.query(
    query_texts=["search query"],
    where={"status": {"$ne": "draft"}}
)

Document Filtering

Filter by document content:

results = collection.query(
    query_texts=["machine learning"],
    n_results=5,
    where_document={"$contains": "neural network"}
)
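
Metadata and document filters can also be combined in a single query:

results = collection.query(
    query_texts=["machine learning"],
    n_results=5,
    where={"category": "technology"},
    where_document={"$contains": "neural network"}
)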

Getting Specific Items

Retrieve items by ID:

# Get specific documents (note: "distances" only applies to query(), not get())
items = collection.get(
    ids=["id1", "id2"],
    include=["documents", "embeddings", "metadatas"]
)

# Get with filtering
items = collection.get(
    where={"category": "technology"},
    limit=10
)

Updating and Deleting

# Update documents
collection.update(
    ids=["id1"],
    documents=["Updated document text"],
    metadatas=[{"category": "updated", "status": "active"}]
)

# Delete specific documents
collection.delete(ids=["id1", "id2"])

# Delete with a filter
collection.delete(where={"status": "archived"})

# Delete an entire collection
client.delete_collection("collection_name")

Collections Management

# List all collections
collections = client.list_collections()
for col in collections:
    print(f"Collection: {col.name}, Count: {col.count()}")

# Get collection metadata
metadata = collection.metadata
print(metadata)

# Modify collection metadata
collection.modify(metadata={"description": "Updated description"})

# Peek at collection data
peek_data = collection.peek(limit=5)
print(peek_data)

Conclusion

Deploying ChromaDB to Klutch.sh with Docker gives you a powerful, scalable vector database for AI and machine learning applications. By following this guide, you've set up a production-ready ChromaDB instance with persistent storage, proper configuration, and best practices for security and performance. Your vector database is now ready to power AI applications with efficient embedding storage and semantic search, enabling RAG systems, chatbots with memory, and intelligent document retrieval.