Deploying AnythingLLM
Introduction
AnythingLLM is a powerful, full-stack, open-source application that turns any document, resource, or piece of content into context that any LLM can draw on while chatting. This comprehensive AI document management platform lets you run a private ChatGPT alternative with full control over your data, supporting multiple LLM providers, vector databases, and embedding models.
Deploying AnythingLLM on Klutch.sh gives you a scalable, secure platform for running your AI-powered document chat application, with persistent storage for your documents and embeddings, automatic HTTPS, and simple environment configuration for connecting to LLM providers such as OpenAI, Anthropic, Azure OpenAI, or locally hosted models.
This guide walks you through deploying AnythingLLM using a Dockerfile on Klutch.sh, configuring persistent volumes for data retention, setting up environment variables for LLM integrations, and best practices for production deployments.
What You’ll Learn
- How to deploy AnythingLLM with a Dockerfile on Klutch.sh
- Setting up persistent storage for documents, embeddings, and vector databases
- Configuring environment variables for different LLM providers
- Customizing AnythingLLM settings for production use
- Best practices for security and performance
Prerequisites
Before you begin, ensure you have:
- A Klutch.sh account
- A GitHub repository (can be a new empty repo or a fork of the AnythingLLM repository)
- Basic familiarity with Docker and environment variables
- (Optional) API keys for your preferred LLM provider (OpenAI, Anthropic, etc.)
Understanding AnythingLLM Architecture
AnythingLLM consists of:
- Frontend: React-based user interface for document management and chat
- Backend: Node.js server handling embeddings, vector storage, and LLM connections
- Storage Layer: Persistent storage for documents, vector embeddings, and user data
- Vector Database: Built-in LanceDB or optional Pinecone, Chroma, Weaviate integration
The application runs as a single container on port 3001 by default, making it straightforward to deploy on Klutch.sh.
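For a quick local sanity check of this single-container layout before deploying, you can run the published image directly; a minimal sketch, assuming Docker is installed and port 3001 is free:

```bash
# Run the official image; the UI and API both listen on port 3001
docker run -d \
  -p 3001:3001 \
  -v "$(pwd)/storage:/app/server/storage" \
  mintplexlabs/anythingllm:latest

# Then open http://localhost:3001 in your browser
```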
Step 1: Prepare Your GitHub Repository
1. Create a new GitHub repository or fork the official AnythingLLM repository.

2. Create a `Dockerfile` in the root of your repository with the following content:

```dockerfile
FROM mintplexlabs/anythingllm:latest

# Set working directory
WORKDIR /app/server

# Expose the application port
EXPOSE 3001

# The base image already contains the necessary startup commands
# Data will be persisted to /app/server/storage
CMD ["node", "/app/server/index.js"]
```

3. (Optional) Create a `.dockerignore` file to exclude unnecessary files:

```
.git
.github
node_modules
*.md
.env
.env.local
```

4. Commit and push your changes to GitHub:

```bash
git add Dockerfile .dockerignore
git commit -m "Add Dockerfile for Klutch.sh deployment"
git push origin main
```

Step 2: Create Your App on Klutch.sh
1. Log in to Klutch.sh and navigate to the dashboard.

2. Create a new project (if you don’t have one already) by clicking “New Project” and providing a project name.

3. Create a new app within your project by clicking “New App”.

4. Connect your GitHub repository by selecting it from the list of available repositories.

5. Configure the build settings:
   - Klutch.sh will automatically detect the Dockerfile in your repository root
   - The build will use this Dockerfile automatically

6. Set the internal port to `3001` (AnythingLLM’s default port). This is the port that traffic will be routed to within the container.

7. Select HTTP as the app’s traffic type.
Step 3: Configure Persistent Storage
AnythingLLM requires persistent storage to retain your documents, embeddings, user data, and vector database across deployments.
1. In your app settings, navigate to the “Volumes” section.

2. Add a persistent volume with the following configuration:
   - Mount Path: `/app/server/storage`
   - Size: Start with at least 10 GB (adjust based on expected document volume)

3. Save the volume configuration. This ensures all your data persists even when the container is restarted or redeployed.
The /app/server/storage directory contains:
- Document embeddings and metadata
- Vector database files (if using built-in LanceDB)
- User accounts and workspace configurations
- Chat history and conversation data
For more details on managing persistent storage, see the Volumes Guide.
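If you can open a shell into the running container, you can verify the mount and watch usage grow over time; a quick sketch, assuming the standard `df` and `du` utilities are available in the image:

```bash
# Confirm the volume is mounted where AnythingLLM expects it
df -h /app/server/storage

# Break down usage by subdirectory (documents, vector data, etc.)
du -sh /app/server/storage/* | sort -rh
```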
Step 4: Configure Environment Variables
AnythingLLM can be customized using environment variables. Here are the most important ones:
1. In your app settings, navigate to the “Environment Variables” section.

2. Add the following required variables:

```bash
# Server configuration
SERVER_PORT=3001

# Storage configuration (already set by volume mount)
STORAGE_DIR=/app/server/storage

# JWT secret for authentication (generate a random string)
JWT_SECRET=your-random-jwt-secret-here

# (Optional) Set the application URL
APP_URL=https://example-app.klutch.sh
```

3. Configure your LLM provider (choose one):

For OpenAI:

```bash
LLM_PROVIDER=openai
OPEN_AI_KEY=sk-your-openai-api-key
OPEN_AI_MODEL_PREF=gpt-4
```

For Anthropic (Claude):

```bash
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
ANTHROPIC_MODEL_PREF=claude-3-opus-20240229
```

For Azure OpenAI:

```bash
LLM_PROVIDER=azure
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
AZURE_OPENAI_KEY=your-azure-key
AZURE_OPENAI_DEPLOYMENT=your-deployment-name
```

For Local LLM (Ollama, LM Studio, etc.):

```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://your-ollama-server:11434
OLLAMA_MODEL_PREF=llama2
```

4. Configure your embedding provider (optional, uses the same as LLM by default):

```bash
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL_PREF=text-embedding-ada-002
```

5. Configure your vector database (optional, uses built-in LanceDB by default):

For Pinecone:

```bash
VECTOR_DB=pinecone
PINECONE_API_KEY=your-pinecone-key
PINECONE_ENVIRONMENT=us-west1-gcp
PINECONE_INDEX=anythingllm
```

For Chroma:

```bash
VECTOR_DB=chroma
CHROMA_ENDPOINT=http://your-chroma-server:8000
```

6. Additional optional settings:

```bash
# Disable telemetry
DISABLE_TELEMETRY=true

# Set authentication mode (default is multi-user)
AUTH_TOKEN=your-single-user-token

# Enable/disable user registration
ENABLE_REGISTRATION=false
```

7. Mark sensitive values as secrets in the Klutch.sh UI to prevent them from appearing in logs.

Important Security Notes:
- Never commit API keys or secrets to your repository
- Always use Klutch.sh environment variables for sensitive data
- Generate a strong, random JWT_SECRET for production use (see the sketch after this list)
- Consider disabling registration (ENABLE_REGISTRATION=false) after creating your admin account
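One way to generate a suitable JWT_SECRET is with OpenSSL, which most systems ship with; any cryptographically secure random string of similar length works just as well:

```bash
# Produces a 64-character hex string you can paste into JWT_SECRET
openssl rand -hex 32
```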
Step 5: Deploy Your Application
1. Review your configuration to ensure all settings are correct:
   - Dockerfile is detected
   - Internal port is set to `3001`
   - Persistent volume is mounted to `/app/server/storage`
   - Environment variables are configured
   - Traffic type is set to HTTP

2. Click “Deploy” to start the build and deployment process.

3. Monitor the build logs to ensure the deployment completes successfully. The build typically takes 2-5 minutes depending on your image size.

4. Wait for the deployment to complete. Once done, you’ll see your app URL (e.g., `https://example-app.klutch.sh`).
Step 6: Initial Setup and Configuration
1. Access your AnythingLLM instance by navigating to your app URL (e.g., `https://example-app.klutch.sh`).

2. Create your admin account on the first login screen:
   - Enter a username and password
   - Complete the initial setup wizard

3. Configure your workspace:
   - Set your preferred LLM provider (if not already configured via environment variables)
   - Choose your embedding model
   - Select your vector database

4. Upload your first documents:
   - Click on “Documents” in the sidebar
   - Upload PDF, TXT, DOCX, or other supported file formats
   - Wait for the documents to be processed and embedded

5. Start chatting:
   - Navigate to a workspace
   - Ask questions about your uploaded documents
   - The AI will use the document context to provide accurate answers
Getting Started: Sample Usage
Here are some common tasks you can perform with AnythingLLM:
Uploading and Processing Documents
```bash
# You can upload documents through the UI or via API
curl -X POST https://example-app.klutch.sh/api/v1/document/upload \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "file=@document.pdf"
```

Creating a Workspace

```bash
# Create a workspace via API
curl -X POST https://example-app.klutch.sh/api/v1/workspace/new \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My First Workspace",
    "description": "A workspace for my documents"
  }'
```

Chatting with Documents

```bash
# Send a chat message
curl -X POST https://example-app.klutch.sh/api/v1/workspace/my-workspace/chat \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is the main topic of the uploaded document?",
    "mode": "chat"
  }'
```

Advanced Configuration
Custom Dockerfile for Additional Dependencies
If you need to add custom dependencies or configurations, you can extend the base image:
```dockerfile
FROM mintplexlabs/anythingllm:latest

# Install additional system packages
USER root
RUN apt-get update && apt-get install -y \
    # Add any additional packages here
    && rm -rf /var/lib/apt/lists/*

# Switch back to the app user
USER anythingllm

# Set working directory
WORKDIR /app/server

# Expose the application port
EXPOSE 3001

# Start the application
CMD ["node", "/app/server/index.js"]
```

Using Environment Variables with Nixpacks
If you’re not using a Dockerfile and want Klutch.sh to use Nixpacks to build your application, you can customize the build and start commands using environment variables:
Build-time environment variables:
```bash
NIXPACKS_BUILD_CMD=npm run build
```

Runtime environment variables:

```bash
NIXPACKS_START_CMD=node /app/server/index.js
```

However, for AnythingLLM, using a Dockerfile is the recommended approach as it provides better control and reproducibility.
Production Best Practices
Security
- Use strong JWT secrets: Generate a cryptographically secure random string for JWT_SECRET
- Disable public registration: Set `ENABLE_REGISTRATION=false` after creating your accounts
- Use HTTPS only: Klutch.sh provides automatic HTTPS for all apps
- Rotate API keys regularly: Periodically update your LLM provider API keys
- Implement rate limiting: Monitor usage and consider implementing rate limits on the API
Performance
- Choose appropriate instance size: Monitor CPU and memory usage and scale accordingly
- Optimize vector database: If using external vector databases, ensure they’re geographically close to your app
- Enable caching: AnythingLLM caches embeddings by default; ensure persistent storage is configured so the cache survives redeploys
- Monitor storage usage: Regularly check your volume usage and scale as needed
Backups
- Backup your persistent volume: Regularly back up the `/app/server/storage` directory (a backup sketch follows this list)
- Export workspace data: Use the built-in export feature to back up individual workspaces
- Document your configuration: Keep a record of all environment variables and settings
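As a starting point for volume backups, here is a minimal sketch; it assumes you can reach the storage directory (for example from a container shell or a local Docker Compose test), and the upload destination is purely illustrative:

```bash
# Archive the storage directory with a date-stamped filename
tar -czf "anythingllm-backup-$(date +%Y%m%d).tar.gz" -C /app/server storage

# Ship the archive somewhere durable (destination below is hypothetical)
# aws s3 cp anythingllm-backup-20240101.tar.gz s3://your-backup-bucket/
```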
Monitoring
- Watch application logs: Monitor logs for errors or performance issues through the Klutch.sh dashboard
- Set up health checks: Implement monitoring to detect when the application is down (a minimal check follows this list)
- Track API usage: Monitor your LLM provider API usage and costs
- Monitor vector database size: Keep an eye on embedding storage growth
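The health-check sketch mentioned above can be as simple as a cron-driven script that tests HTTP reachability of your app URL; it does not test deeper application health:

```bash
#!/usr/bin/env bash
# Exit non-zero (and print a message) if the app stops responding
if ! curl -fsS --max-time 10 https://example-app.klutch.sh > /dev/null; then
  echo "AnythingLLM did not respond within 10 seconds" >&2
  exit 1
fi
```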
Troubleshooting
Application Won’t Start
Issue: Container starts but application doesn’t respond
Solutions:
- Verify the internal port is set to `3001`
- Check that environment variables are properly set
- Review application logs for startup errors
- Ensure JWT_SECRET is configured
Out of Storage Space
Issue: Cannot upload new documents or create embeddings
Solutions:
- Increase your persistent volume size in Klutch.sh
- Clean up old or unused documents
- Consider using an external vector database to offload storage
LLM Provider Connection Errors
Issue: Cannot connect to OpenAI, Anthropic, or other LLM providers
Solutions:
- Verify API keys are correct and not expired
- Check that the LLM_PROVIDER variable matches your provider
- Ensure your LLM provider account has sufficient credits
- Test API connectivity from the application logs
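To rule out a bad key independently of AnythingLLM, you can call the provider directly; a sketch for OpenAI using its standard models-listing endpoint (adapt for other providers):

```bash
# A valid key returns a JSON list of models; an invalid key returns a 401 error
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPEN_AI_KEY"
```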
Slow Document Processing
Issue: Document uploads and embeddings take too long
Solutions:
- Increase instance CPU and memory resources
- Use faster embedding models (smaller dimensions)
- Process large documents in batches
- Consider using a more powerful embedding provider
Data Loss After Redeployment
Issue: Documents and conversations disappear after redeploying
Solutions:
- Verify persistent volume is properly attached to `/app/server/storage`
- Check volume mount path matches exactly
- Ensure volume wasn’t accidentally deleted
- Restore from backups if available
Scaling and Performance Optimization
Vertical Scaling
For increased performance with large document collections:
- Increase CPU cores for faster embedding generation
- Add more RAM for larger vector database operations
- Expand storage volume as document collection grows
Embedding Optimization
```bash
# Use smaller embedding models for faster processing
EMBEDDING_MODEL_PREF=text-embedding-3-small

# Or use local embedding models to reduce API costs
EMBEDDING_PROVIDER=local
LOCAL_EMBEDDING_MODEL=all-MiniLM-L6-v2
```

Vector Database Considerations
For large-scale deployments (>10,000 documents):
- Consider migrating to Pinecone or Weaviate for better performance
- Use dedicated vector database instances
- Implement proper indexing strategies
Cost Optimization
Reducing LLM Costs
- Use cheaper models for embeddings (text-embedding-3-small instead of ada-002)
- Consider local LLM hosting with Ollama for privacy and cost savings
- Implement response caching to avoid redundant API calls
- Set token limits on responses
Storage Optimization
- Regularly clean up unused documents
- Compress documents before upload when possible
- Use appropriate vector database settings for your use case
Alternative Deployment Options
Using Docker Compose for Local Testing
While Docker Compose is not supported on Klutch.sh for deployment, you can use it locally to test your configuration before deploying:
```yaml
version: '3.8'

services:
  anythingllm:
    image: mintplexlabs/anythingllm:latest
    container_name: anythingllm
    ports:
      - "3001:3001"
    volumes:
      - ./storage:/app/server/storage
    environment:
      - SERVER_PORT=3001
      - JWT_SECRET=my-dev-secret
      - LLM_PROVIDER=openai
      - OPEN_AI_KEY=sk-your-key
    restart: unless-stopped
```

Run locally with:

```bash
docker-compose up -d
```

This allows you to test configurations before deploying to Klutch.sh.
Updating AnythingLLM
To update to the latest version of AnythingLLM:
1. Update your Dockerfile to use the latest tag or a specific version:

```dockerfile
FROM mintplexlabs/anythingllm:latest # or specify a version like :1.2.3
```

2. Commit and push the changes to your GitHub repository:

```bash
git add Dockerfile
git commit -m "Update AnythingLLM to latest version"
git push origin main
```

3. Redeploy your app through the Klutch.sh dashboard. Your persistent storage will be retained.

4. Verify the update by checking the application version in the UI or logs.

Migrating from Other Platforms
If you’re migrating from another hosting platform:
1. Export your data from your current AnythingLLM instance using the built-in export feature.

2. Deploy on Klutch.sh following the steps in this guide.

3. Copy your storage data to the new persistent volume:
   - Download your old `/app/server/storage` directory
   - Upload it to the new instance or mount it during deployment

4. Update environment variables to match your new setup.

5. Test thoroughly before decommissioning the old instance.
Integration Examples
Embedding AnythingLLM in Your Website
```html
<!DOCTYPE html>
<html>
  <head>
    <title>My Website with AI Chat</title>
  </head>
  <body>
    <h1>Welcome to My Website</h1>

    <!-- Embed AnythingLLM chat widget -->
    <script
      src="https://example-app.klutch.sh/embed.js"
      data-workspace-id="your-workspace-id">
    </script>
  </body>
</html>
```

Using the REST API

```javascript
// Node.js example
const axios = require('axios');

const API_URL = 'https://example-app.klutch.sh/api/v1';
const API_TOKEN = 'your-api-token';

async function chatWithWorkspace(workspaceSlug, message) {
  const response = await axios.post(
    `${API_URL}/workspace/${workspaceSlug}/chat`,
    { message, mode: 'chat' },
    { headers: { 'Authorization': `Bearer ${API_TOKEN}` } }
  );

  return response.data;
}

// Usage
chatWithWorkspace('my-workspace', 'What is AI?')
  .then(response => console.log(response))
  .catch(error => console.error(error));
```

Python Integration

```python
import requests

API_URL = "https://example-app.klutch.sh/api/v1"
API_TOKEN = "your-api-token"

def chat_with_workspace(workspace_slug: str, message: str):
    response = requests.post(
        f"{API_URL}/workspace/{workspace_slug}/chat",
        json={"message": message, "mode": "chat"},
        headers={"Authorization": f"Bearer {API_TOKEN}"}
    )
    return response.json()

# Usage
result = chat_with_workspace("my-workspace", "Explain machine learning")
print(result)
```

Resources
- AnythingLLM GitHub Repository
- AnythingLLM Official Documentation
- Klutch.sh Quick Start Guide
- Klutch.sh Volumes Guide
- Klutch.sh Builds Guide
- Klutch.sh Deployments Guide
Conclusion
You now have a fully functional AnythingLLM deployment running on Klutch.sh with persistent storage, configured LLM providers, and production-ready settings. This setup allows you to:
- Chat with your documents using AI
- Maintain full data privacy and control
- Scale as your document collection grows
- Integrate with multiple LLM providers
- Access your AI assistant from anywhere
For questions or support, refer to the AnythingLLM community discussions or the Klutch.sh documentation.