
Deploying Portkey

Introduction

Portkey is an open-source AI gateway that provides a unified interface for managing calls to multiple Large Language Model (LLM) providers. Acting as a proxy between your applications and AI services like OpenAI, Anthropic, Azure, and others, Portkey offers intelligent routing, caching, load balancing, and comprehensive observability for AI operations.

The platform addresses the growing complexity of managing multiple LLM integrations by providing a single API endpoint that handles failover, retries, and cost optimization automatically. With built-in analytics and logging, teams gain visibility into AI usage patterns, performance metrics, and spending.

Key highlights of Portkey:

  • Unified API: Single endpoint for OpenAI, Anthropic, Azure, Cohere, and 100+ LLM providers
  • Intelligent Routing: Load balance requests across multiple providers and models
  • Automatic Failover: Seamlessly switch providers when one fails
  • Response Caching: Reduce costs and latency with semantic caching
  • Rate Limiting: Protect against quota exhaustion and cost overruns
  • Retry Logic: Automatic retries with exponential backoff
  • Request Logging: Detailed logs of all LLM interactions
  • Analytics Dashboard: Visualize usage, costs, and performance metrics
  • Prompt Management: Version and manage prompts across environments
  • Guardrails: Apply safety filters and content moderation
  • OpenAI-Compatible: Drop-in replacement using OpenAI SDK format

This guide walks through deploying Portkey on Klutch.sh using Docker, configuring the AI gateway for your LLM providers, and setting up observability for your AI applications.

Why Deploy Portkey on Klutch.sh

Deploying Portkey on Klutch.sh provides several advantages:

Simplified Deployment: Klutch.sh automatically detects your Dockerfile and builds Portkey without complex configuration. Push to GitHub, and your AI gateway deploys automatically.

Persistent Storage: Attach persistent volumes for logs, cache data, and analytics. Your usage history survives container restarts.

HTTPS by Default: Klutch.sh provides automatic SSL certificates for secure API communication.

GitHub Integration: Connect your configuration repository directly from GitHub for version-controlled gateway management.

Scalable Resources: Allocate CPU and memory based on expected API throughput and caching requirements.

Environment Variable Management: Securely store API keys for multiple LLM providers through Klutch.sh’s environment variable system.

Custom Domains: Assign a custom domain for a professional API endpoint.

Always-On Availability: Your AI gateway remains accessible 24/7 for consistent LLM access.

Prerequisites

Before deploying Portkey on Klutch.sh, ensure you have:

  • A Klutch.sh account
  • A GitHub account with a repository for your Portkey configuration
  • Basic familiarity with Docker and containerization concepts
  • API keys for your LLM providers (OpenAI, Anthropic, etc.)
  • (Optional) A Redis instance for caching
  • (Optional) A custom domain for your API gateway

Understanding Portkey Architecture

Portkey uses a gateway architecture optimized for AI workloads:

Gateway Service: The core proxy that receives API requests, applies routing logic, and forwards to appropriate LLM providers. Built for low-latency processing.

Provider Connectors: Adapters for each LLM provider that translate between Portkey’s unified format and provider-specific APIs.

Cache Layer: Semantic caching system that stores and retrieves responses based on request similarity, reducing costs and latency.

Analytics Engine: Collects and processes metrics on usage, latency, costs, and errors for dashboard visualization.

Config Store: Manages routing rules, provider configurations, and prompt templates.

Preparing Your Repository

To deploy Portkey on Klutch.sh, create a GitHub repository with your Dockerfile and configuration.

Repository Structure

portkey-deploy/
├── Dockerfile
├── README.md
├── .dockerignore
└── config/
    └── config.yaml

Creating the Dockerfile

Create a Dockerfile in the root of your repository:

FROM portkeyai/gateway:latest

# Copy custom configuration
COPY config/config.yaml /app/config/config.yaml

# Defaults (override at runtime via Klutch.sh environment variables)
ENV PORTKEY_PORT=8080
ENV PORTKEY_LOG_LEVEL=info
ENV PORTKEY_CACHE_ENABLED=true

# Provider secrets (OPENAI_API_KEY, ANTHROPIC_API_KEY, AZURE_OPENAI_API_KEY,
# AZURE_OPENAI_ENDPOINT) and PORTKEY_REDIS_URL should be injected at runtime
# through Klutch.sh's environment variable system rather than baked into the
# image. Note that `ENV FOO=${FOO:-default}` in a Dockerfile only expands
# build-time values, so it cannot provide runtime defaults.

# Expose port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:8080/health || exit 1

CMD ["portkey-gateway"]

Creating the Configuration File

Create config/config.yaml:

server:
  port: 8080
  cors:
    enabled: true
    origins: ["*"]

logging:
  level: info
  format: json

cache:
  enabled: true
  ttl: 3600
  max_size: 1000
  redis:
    enabled: true
    url: "${PORTKEY_REDIS_URL}"

providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    models:
      - gpt-4
      - gpt-3.5-turbo
    rate_limit:
      requests_per_minute: 60
      tokens_per_minute: 90000
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    models:
      - claude-3-opus-20240229
      - claude-3-sonnet-20240229
    rate_limit:
      requests_per_minute: 50
  azure:
    api_key: "${AZURE_OPENAI_API_KEY}"
    endpoint: "${AZURE_OPENAI_ENDPOINT}"
    deployments:
      - name: gpt-4
        deployment_id: "gpt-4-deployment"

routing:
  default_provider: openai
  fallback:
    - anthropic
    - azure
  load_balancing:
    strategy: round_robin

retry:
  max_attempts: 3
  backoff:
    type: exponential
    initial_delay: 1000
    max_delay: 30000

analytics:
  enabled: true
  retention_days: 30
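The retry settings (3 attempts, exponential backoff starting at 1000 ms and capped at 30000 ms) correspond to a standard retry loop. A sketch of the equivalent logic in Python, not Portkey's internals:

```python
import time

def retry_with_backoff(call, max_attempts=3, initial_delay=1000, max_delay=30000):
    """Retry `call` with exponential backoff; delays are in milliseconds."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(delay / 1000)
            delay = min(delay * 2, max_delay)  # 1000 -> 2000 -> ... capped at 30000

# Example: a call that fails twice, then succeeds on the third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient provider error")
    return "ok"

result = retry_with_backoff(flaky, initial_delay=10)  # short delay for the demo
```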

Creating the .dockerignore File

.git
.github
*.md
LICENSE
.gitignore
*.log
.DS_Store
.env

Environment Variables Reference

Variable               Required  Default  Description
PORTKEY_PORT           No        8080     Gateway port
PORTKEY_LOG_LEVEL      No        info     Logging level
PORTKEY_CACHE_ENABLED  No        true     Enable response caching
PORTKEY_REDIS_URL      No        -        Redis URL for distributed caching
OPENAI_API_KEY         No        -        OpenAI API key
ANTHROPIC_API_KEY      No        -        Anthropic API key
AZURE_OPENAI_API_KEY   No        -        Azure OpenAI API key
AZURE_OPENAI_ENDPOINT  No        -        Azure OpenAI endpoint

Deploying Portkey on Klutch.sh

    Gather Your API Keys

    Collect an API key from each LLM provider you plan to route through: OpenAI, Anthropic, and, if applicable, Azure OpenAI (key and endpoint from the Azure portal).

    Push Your Repository to GitHub

    Initialize your repository and push to GitHub:

    Terminal window
    git init
    git add Dockerfile .dockerignore config/ README.md
    git commit -m "Initial Portkey deployment configuration"
    git remote add origin https://github.com/yourusername/portkey-deploy.git
    git push -u origin main

    Create a New Project on Klutch.sh

    Navigate to the Klutch.sh dashboard and create a new project. Give it a descriptive name like “portkey” or “ai-gateway”.

    Create a New App

    Within your project, create a new app. Connect your GitHub account if you haven’t already, then select the repository containing your Portkey Dockerfile.

    Configure HTTP Traffic

    In the deployment settings:

    • Select HTTP as the traffic type
    • Set the internal port to 8080

    Set Environment Variables

    In the environment variables section, add:

    Variable               Value
    OPENAI_API_KEY         Your OpenAI API key
    ANTHROPIC_API_KEY      Your Anthropic API key
    AZURE_OPENAI_API_KEY   Your Azure API key (if using)
    AZURE_OPENAI_ENDPOINT  Your Azure endpoint (if using)
    PORTKEY_REDIS_URL      Your Redis URL (optional)
    PORTKEY_CACHE_ENABLED  true

    Attach Persistent Volumes

    Add the following volumes:

    Mount Path   Recommended Size  Purpose
    /data/logs   10 GB             Request logs and analytics
    /data/cache  5 GB              Local cache storage

    Deploy Your Application

    Click Deploy to start the build process. Klutch.sh will build and deploy your Portkey instance.

    Access Portkey

    Once deployment completes, your AI gateway is available at https://your-app-name.klutch.sh.

Using Portkey

Making API Requests

Use Portkey as a drop-in replacement for OpenAI:

import openai

client = openai.OpenAI(
    base_url="https://your-portkey.klutch.sh/v1",
    api_key="your-portkey-api-key"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

Specifying Providers

Route to specific providers:

Terminal window
curl https://your-portkey.klutch.sh/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -H "x-portkey-provider: anthropic" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
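The same provider-pinned request can be assembled from Python with only the standard library. In this sketch the URL and API key are placeholders, and the request object is built but not sent:

```python
import json
import urllib.request

payload = {
    "model": "claude-3-sonnet-20240229",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    "https://your-portkey.klutch.sh/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",
        "x-portkey-provider": "anthropic",  # pin this request to Anthropic
    },
    method="POST",
)

# urllib.request.urlopen(req) would send the request and return the response.
```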

Load Balancing

Distribute requests across providers:

Terminal window
curl https://your-portkey.klutch.sh/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -H "x-portkey-config: load-balance" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
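The round_robin strategy from the configuration simply cycles through the configured providers in order, so successive requests land on different backends. A minimal sketch of the idea:

```python
import itertools

providers = ["openai", "anthropic", "azure"]
rotation = itertools.cycle(providers)

# Each incoming request is assigned the next provider in the rotation.
assigned = [next(rotation) for _ in range(5)]
print(assigned)  # ['openai', 'anthropic', 'azure', 'openai', 'anthropic']
```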

Monitoring and Analytics

Viewing Logs

Access request logs through the analytics endpoint or log files:

  • Request/response payloads
  • Latency metrics
  • Token usage
  • Cost tracking
  • Error rates

Key Metrics

Monitor important metrics:

  • Requests per minute: API throughput
  • Average latency: Response time
  • Cache hit rate: Cost savings from caching
  • Error rate: Provider reliability
  • Token usage: Consumption tracking
  • Cost per request: Financial metrics
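Cache hit rate maps directly to cost savings, since every hit is a request that never reaches a paid provider. A back-of-the-envelope calculation with hypothetical numbers:

```python
total_requests = 10_000
cache_hits = 3_500
cost_per_llm_request = 0.02  # USD, hypothetical average

hit_rate = cache_hits / total_requests
savings = cache_hits * cost_per_llm_request
print(f"hit rate: {hit_rate:.0%}, estimated savings: ${savings:.2f}")
# hit rate: 35%, estimated savings: $70.00
```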

Troubleshooting Common Issues

Provider Errors

  • Verify API keys are correct
  • Check provider rate limits
  • Ensure model names match provider specifications

High Latency

  • Enable caching for repeated queries
  • Check provider status pages
  • Consider geographic proximity

Cache Not Working

  • Verify Redis connection (if using)
  • Check cache configuration
  • Review cache hit/miss logs

Conclusion

Deploying Portkey on Klutch.sh provides a powerful AI gateway that simplifies managing multiple LLM providers. The combination of Portkey’s intelligent routing, caching, and observability features with Klutch.sh’s deployment simplicity enables production-ready AI infrastructure without complex setup.

Whether you’re building applications with a single LLM provider or orchestrating across multiple AI services, Portkey on Klutch.sh delivers the unified access, reliability, and visibility needed for production AI applications.