Deploying Zep

Introduction

Zep is an open-source memory server for LLM applications: it stores embeddings, chat history, and long-term memory for conversational agents. Deploying Zep on Klutch.sh gives you a scalable, secure storage layer for embeddings and a reliable place to persist conversational state.

This guide covers deploying Zep with and without a Dockerfile, persisting data with volumes, securing secrets, and production recommendations. Where relevant, links point to existing Klutch.sh guides (Quick Start, Volumes, Builds) for consistency.


Prerequisites

  • A Klutch.sh account (sign up here)
  • A GitHub repository for your deployment code (or a small wrapper to start Zep)
  • Basic knowledge of Docker and environment variables
  • Optionally: object storage credentials (S3-compatible) if you plan to offload backups

1. Prepare your Zep project

Create a small repo that either runs the official Zep server or a wrapper that configures it. Keep secrets out of the repo — use environment variables in Klutch.sh.

Refer to the Quick Start Guide for repository and project setup.


2. Sample non-Docker deployment (Klutch.sh build)

If you want Klutch.sh to build the app from your repo without a Dockerfile:

  1. Push your repo to GitHub. Include a start script (for example start.sh) that launches the Zep server or your entrypoint.
  2. In Klutch.sh, create a new project and app and connect your repository.
  3. Set the start command to your launcher (for example: ./start.sh or zep serve --host 0.0.0.0 --port 8000 depending on how you run Zep).
  4. Attach a persistent volume for Zep data (see Volumes Guide).
  5. Set the app port to 8000 (or the port Zep listens on).
  6. Click “Create” to deploy.
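
Step 1 mentions a start script. Here is a minimal sketch of one; the `zep serve` invocation mirrors the example start command above, and the flag and variable names are assumptions to adapt to your Zep version:

```shell
#!/usr/bin/env sh
# Hypothetical start.sh for the non-Docker deployment described above.
set -u

ZEP_DATA_DIR="${ZEP_DATA_DIR:-/data/zep}"  # path of the mounted volume
PORT="${PORT:-8000}"                       # must match the app port in Klutch.sh

mkdir -p "$ZEP_DATA_DIR"
echo "starting Zep on :$PORT (data dir: $ZEP_DATA_DIR)"

# exec replaces the shell so Zep receives termination signals directly,
# allowing clean shutdown on redeploys.
if command -v zep >/dev/null 2>&1; then
  exec zep serve --host 0.0.0.0 --port "$PORT"
fi
echo "zep binary not found on PATH" >&2
```

Make the script executable (`chmod +x start.sh`) and commit it to the repo so Klutch.sh can run it as the start command.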

Notes:

  • Configure runtime secrets (database keys, S3 creds) as Klutch.sh environment variables.
  • If you plan to scale out, consider using a storage backend or cluster configuration supported by Zep.

3. Deploying with a Dockerfile

Using a Dockerfile gives reproducibility and tighter control over dependencies. Example Dockerfile (simple CPU-focused Zep server):

FROM python:3.10-slim
WORKDIR /app
# Install system dependencies needed to build Python packages
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential git \
    && rm -rf /var/lib/apt/lists/*
# Install Python dependencies first to take advantage of layer caching
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["zep", "serve", "--host", "0.0.0.0", "--port", "8000"]
  • requirements.txt should include zep and any connectors (for example zep[postgres] or zep[redis] if using external backends).
  • For production use, pin package versions and use multi-stage builds to reduce image size.
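
As a sketch of the multi-stage approach mentioned above, the variant below builds dependencies in one stage and copies only the installed packages into a slim runtime image; the stage layout is an assumption, not an official Zep image:

```dockerfile
# Stage 1: build Python dependencies with compilers available
FROM python:3.10-slim AS builder
WORKDIR /app
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential git \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt ./
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt

# Stage 2: slim runtime image without build tooling
FROM python:3.10-slim
WORKDIR /app
# Copy only the installed packages, leaving compilers behind
COPY --from=builder /install /usr/local
COPY . .
EXPOSE 8000
CMD ["zep", "serve", "--host", "0.0.0.0", "--port", "8000"]
```

The runtime image ends up smaller because `build-essential` and `git` never reach the final stage.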

4. Persistence and backups

Zep stores vector indexes and metadata — attach a persistent volume to ensure data survives redeploys:

  • Create a Klutch.sh persistent volume and mount it to the path Zep uses for storage (e.g., /data/zep).
  • Use environment variables to configure Zep to write data to the mounted path.

If you prefer object storage backups, configure a scheduled job to snapshot and upload to S3-compatible storage, with credentials stored in Klutch.sh environment variables.
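
A scheduled backup job could be sketched as follows; the data path, bucket variable, and use of the AWS CLI are assumptions to adapt to your setup, and credentials are expected to come from Klutch.sh environment variables:

```shell
#!/usr/bin/env sh
# Hypothetical backup script: snapshot the Zep data volume and optionally
# upload the archive to S3-compatible storage.
set -eu

ZEP_DATA_DIR="${ZEP_DATA_DIR:-/data/zep}"     # mounted volume path
BACKUP_DIR="${BACKUP_DIR:-/tmp/zep-backups}"  # local staging area
S3_BUCKET="${S3_BUCKET:-}"                    # e.g. s3://my-zep-backups

STAMP="$(date +%Y%m%d-%H%M%S)"
ARCHIVE="$BACKUP_DIR/zep-$STAMP.tar.gz"

# Ensure the paths exist (the data dir normally already does as a mounted volume)
mkdir -p "$ZEP_DATA_DIR" "$BACKUP_DIR"
tar -czf "$ARCHIVE" -C "$(dirname "$ZEP_DATA_DIR")" "$(basename "$ZEP_DATA_DIR")"

# Upload only when a bucket is configured; AWS credentials come from env vars
if [ -n "$S3_BUCKET" ]; then
  aws s3 cp "$ARCHIVE" "$S3_BUCKET/"
fi
```

Run it from a cron-style scheduled job and prune old archives periodically so the staging volume does not fill up.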


5. Environment variables and secrets

  • Store API keys, DB connection strings, and S3 credentials in the Klutch.sh UI as environment variables.
  • Avoid checking secrets into source control.
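
A cheap safety net is to have your start script fail fast when a required variable is missing rather than letting Zep start half-configured. A minimal sketch (the variable names passed in are illustrative, not Zep's canonical names):

```shell
# require_env VAR...: report any unset or empty variables, return non-zero
# if at least one is missing.
require_env() {
  missing=0
  for var in "$@"; do
    # Indirect expansion via eval, POSIX-sh compatible
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing required environment variable: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage in an entrypoint might look like `require_env ZEP_POSTGRES_DSN ZEP_API_SECRET || exit 1`.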

6. Scaling and production recommendations

  • Monitor CPU, memory, and storage usage; vector DBs can be I/O and memory intensive.
  • Use autoscaling or horizontal scaling where your architecture and Zep backend support it.
  • Use health checks and readiness probes if your deployment workflow needs them.
  • Pin dependency versions in requirements.txt and use CI to build and publish images for stable releases.
  • Consider using an external vector index backend (if supported) for very large datasets.
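
For the health-check point above, a probe can be as small as a `curl` wrapper; the `/healthz` path on port 8000 is an assumption, so confirm the endpoint your Zep version actually exposes:

```shell
# probe URL: succeed silently if the endpoint responds, fail otherwise.
# Suitable as a container HEALTHCHECK or platform readiness command.
probe() {
  curl -fsS "$1" >/dev/null
}
```

For example: `probe "http://localhost:8000/healthz" || exit 1`.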

7. Example requirements.txt

zep
# optionally: zep[postgres], zep[redis], or other extras as needed


Deploying Zep on Klutch.sh provides a managed, persistent place to store embeddings and memory for LLM applications. From here, natural next steps include adding a CI pipeline to build and publish your images, scheduling backups to object storage, and extending the setup for GPU-accelerated vector search workloads.