
Deploying Portkey

Introduction

Portkey is an open-source AI gateway that provides a unified interface for managing calls to multiple Large Language Model (LLM) providers. Acting as a proxy between your applications and AI services like OpenAI, Anthropic, Azure, and others, Portkey offers intelligent routing, caching, load balancing, and comprehensive observability for AI operations.

The platform addresses the growing complexity of managing multiple LLM integrations by providing a single API endpoint that handles failover, retries, and cost optimization automatically. With built-in analytics and logging, teams gain visibility into AI usage patterns, performance metrics, and spending.

Key highlights of Portkey:

  • Unified API: Single endpoint for OpenAI, Anthropic, Azure, Cohere, and 100+ LLM providers
  • Intelligent Routing: Load balance requests across multiple providers and models
  • Automatic Failover: Seamlessly switch providers when one fails
  • Response Caching: Reduce costs and latency with semantic caching
  • Rate Limiting: Protect against quota exhaustion and cost overruns
  • Retry Logic: Automatic retries with exponential backoff
  • Request Logging: Detailed logs of all LLM interactions
  • Analytics Dashboard: Visualize usage, costs, and performance metrics
  • Prompt Management: Version and manage prompts across environments
  • Guardrails: Apply safety filters and content moderation
  • OpenAI-Compatible: Drop-in replacement using OpenAI SDK format

This guide walks through deploying Portkey on Klutch.sh using Docker, configuring the AI gateway for your LLM providers, and setting up observability for your AI applications.

Why Deploy Portkey on Klutch.sh

Deploying Portkey on Klutch.sh provides several advantages:

Simplified Deployment: Klutch.sh automatically detects your Dockerfile and builds Portkey without complex configuration. Push to GitHub, and your AI gateway deploys automatically.

Persistent Storage: Attach persistent volumes for logs, cache data, and analytics. Your usage history survives container restarts.

HTTPS by Default: Klutch.sh provides automatic SSL certificates for secure API communication.

GitHub Integration: Connect your configuration repository directly from GitHub for version-controlled gateway management.

Scalable Resources: Allocate CPU and memory based on expected API throughput and caching requirements.

Environment Variable Management: Securely store API keys for multiple LLM providers through Klutch.sh’s environment variable system.

Custom Domains: Assign a custom domain for a professional API endpoint.

Always-On Availability: Your AI gateway remains accessible 24/7 for consistent LLM access.

Prerequisites

Before deploying Portkey on Klutch.sh, ensure you have:

  • A Klutch.sh account
  • A GitHub account with a repository for your Portkey configuration
  • Basic familiarity with Docker and containerization concepts
  • API keys for your LLM providers (OpenAI, Anthropic, etc.)
  • (Optional) A Redis instance for caching
  • (Optional) A custom domain for your API gateway

Understanding Portkey Architecture

Portkey uses a gateway architecture optimized for AI workloads:

Gateway Service: The core proxy that receives API requests, applies routing logic, and forwards to appropriate LLM providers. Built for low-latency processing.

Provider Connectors: Adapters for each LLM provider that translate between Portkey’s unified format and provider-specific APIs.

Cache Layer: Semantic caching system that stores and retrieves responses based on request similarity, reducing costs and latency.

Analytics Engine: Collects and processes metrics on usage, latency, costs, and errors for dashboard visualization.

Config Store: Manages routing rules, provider configurations, and prompt templates.

Preparing Your Repository

To deploy Portkey on Klutch.sh, create a GitHub repository with your Dockerfile and configuration.

Repository Structure

portkey-deploy/
├── Dockerfile
├── README.md
├── .dockerignore
└── config/
    └── config.yaml

Creating the Dockerfile

Create a Dockerfile in the root of your repository:

FROM portkeyai/gateway:latest

# Copy custom configuration
COPY config/config.yaml /app/config/config.yaml

# Defaults (override at runtime via Klutch.sh environment variables)
ENV PORTKEY_PORT=8080
ENV PORTKEY_LOG_LEVEL=info
ENV PORTKEY_CACHE_ENABLED=true

# Provider secrets (OPENAI_API_KEY, ANTHROPIC_API_KEY, AZURE_OPENAI_API_KEY,
# AZURE_OPENAI_ENDPOINT) and PORTKEY_REDIS_URL should be injected at runtime
# through Klutch.sh's environment variable system rather than baked into the
# image. Note that `ENV FOO=${FOO:-default}` in a Dockerfile only expands
# build-time values, so it cannot provide runtime defaults.

# Expose port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:8080/health || exit 1

CMD ["portkey-gateway"]

Creating the Configuration File

Create config/config.yaml:

server:
  port: 8080
  cors:
    enabled: true
    origins: ["*"]

logging:
  level: info
  format: json

cache:
  enabled: true
  ttl: 3600
  max_size: 1000
  redis:
    enabled: true
    url: "${PORTKEY_REDIS_URL}"

providers:
  openai:
    api_key: "${OPENAI_API_KEY}"
    models:
      - gpt-4
      - gpt-3.5-turbo
    rate_limit:
      requests_per_minute: 60
      tokens_per_minute: 90000
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    models:
      - claude-3-opus-20240229
      - claude-3-sonnet-20240229
    rate_limit:
      requests_per_minute: 50
  azure:
    api_key: "${AZURE_OPENAI_API_KEY}"
    endpoint: "${AZURE_OPENAI_ENDPOINT}"
    deployments:
      - name: gpt-4
        deployment_id: "gpt-4-deployment"

routing:
  default_provider: openai
  fallback:
    - anthropic
    - azure
  load_balancing:
    strategy: round_robin

retry:
  max_attempts: 3
  backoff:
    type: exponential
    initial_delay: 1000
    max_delay: 30000

analytics:
  enabled: true
  retention_days: 30
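The retry settings (3 attempts, exponential backoff starting at 1000 ms and capped at 30000 ms) correspond to a standard retry loop. A sketch of the equivalent logic in Python, not Portkey's internals:

```python
import time

def retry_with_backoff(call, max_attempts=3, initial_delay=1000, max_delay=30000):
    """Retry `call` with exponential backoff; delays are in milliseconds."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise
            time.sleep(delay / 1000)
            delay = min(delay * 2, max_delay)  # 1000 -> 2000 -> ... capped at 30000

# Example: a call that fails twice, then succeeds on the third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient provider error")
    return "ok"

result = retry_with_backoff(flaky, initial_delay=10)  # short delay for the demo
```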

Creating the .dockerignore File

.git
.github
*.md
LICENSE
.gitignore
*.log
.DS_Store
.env

Environment Variables Reference

Variable               Required  Default  Description
PORTKEY_PORT           No        8080     Gateway port
PORTKEY_LOG_LEVEL      No        info     Logging level
PORTKEY_CACHE_ENABLED  No        true     Enable response caching
PORTKEY_REDIS_URL      No        -        Redis URL for distributed caching
OPENAI_API_KEY         No        -        OpenAI API key
ANTHROPIC_API_KEY      No        -        Anthropic API key
AZURE_OPENAI_API_KEY   No        -        Azure OpenAI API key
AZURE_OPENAI_ENDPOINT  No        -        Azure OpenAI endpoint

Deploying Portkey on Klutch.sh

    Gather Your API Keys

    Collect an API key from each LLM provider you plan to route through: OpenAI, Anthropic, and, if applicable, Azure OpenAI (key and endpoint from the Azure portal).

    Push Your Repository to GitHub

    Initialize your repository and push to GitHub:

    Terminal window
    git init
    git add Dockerfile .dockerignore config/ README.md
    git commit -m "Initial Portkey deployment configuration"
    git remote add origin https://github.com/yourusername/portkey-deploy.git
    git push -u origin main

    Create a New Project on Klutch.sh

    Navigate to the Klutch.sh dashboard and create a new project. Give it a descriptive name like “portkey” or “ai-gateway”.

    Create a New App

    Within your project, create a new app. Connect your GitHub account if you haven’t already, then select the repository containing your Portkey Dockerfile.

    Configure HTTP Traffic

    In the deployment settings:

    • Select HTTP as the traffic type
    • Set the internal port to 8080

    Set Environment Variables

    In the environment variables section, add:

    Variable               Value
    OPENAI_API_KEY         Your OpenAI API key
    ANTHROPIC_API_KEY      Your Anthropic API key
    AZURE_OPENAI_API_KEY   Your Azure API key (if using)
    AZURE_OPENAI_ENDPOINT  Your Azure endpoint (if using)
    PORTKEY_REDIS_URL      Your Redis URL (optional)
    PORTKEY_CACHE_ENABLED  true

    Attach Persistent Volumes

    Add the following volumes:

    Mount Path   Recommended Size  Purpose
    /data/logs   10 GB             Request logs and analytics
    /data/cache  5 GB              Local cache storage

    Deploy Your Application

    Click Deploy to start the build process. Klutch.sh will build and deploy your Portkey instance.

    Access Portkey

    Once deployment completes, your AI gateway is available at https://your-app-name.klutch.sh.

Using Portkey

Making API Requests

Use Portkey as a drop-in replacement for OpenAI:

import openai

client = openai.OpenAI(
    base_url="https://your-portkey.klutch.sh/v1",
    api_key="your-portkey-api-key"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)

Specifying Providers

Route to specific providers:

Terminal window
curl https://your-portkey.klutch.sh/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -H "x-portkey-provider: anthropic" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
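The same provider-pinned request can be assembled from Python with only the standard library. In this sketch the URL and API key are placeholders, and the request object is built but not sent:

```python
import json
import urllib.request

payload = {
    "model": "claude-3-sonnet-20240229",
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    "https://your-portkey.klutch.sh/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your-api-key",
        "x-portkey-provider": "anthropic",  # pin this request to Anthropic
    },
    method="POST",
)

# urllib.request.urlopen(req) would send the request and return the response.
```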

Load Balancing

Distribute requests across providers:

Terminal window
curl https://your-portkey.klutch.sh/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your-api-key" \
  -H "x-portkey-config: load-balance" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
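The round_robin strategy from the configuration simply cycles through the configured providers in order, so successive requests land on different backends. A minimal sketch of the idea:

```python
import itertools

providers = ["openai", "anthropic", "azure"]
rotation = itertools.cycle(providers)

# Each incoming request is assigned the next provider in the rotation.
assigned = [next(rotation) for _ in range(5)]
print(assigned)  # ['openai', 'anthropic', 'azure', 'openai', 'anthropic']
```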

Monitoring and Analytics

Viewing Logs

Access request logs through the analytics endpoint or log files:

  • Request/response payloads
  • Latency metrics
  • Token usage
  • Cost tracking
  • Error rates

Key Metrics

Monitor important metrics:

  • Requests per minute: API throughput
  • Average latency: Response time
  • Cache hit rate: Cost savings from caching
  • Error rate: Provider reliability
  • Token usage: Consumption tracking
  • Cost per request: Financial metrics
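Cache hit rate maps directly to cost savings, since every hit is a request that never reaches a paid provider. A back-of-the-envelope calculation with hypothetical numbers:

```python
total_requests = 10_000
cache_hits = 3_500
cost_per_llm_request = 0.02  # USD, hypothetical average

hit_rate = cache_hits / total_requests
savings = cache_hits * cost_per_llm_request
print(f"hit rate: {hit_rate:.0%}, estimated savings: ${savings:.2f}")
# hit rate: 35%, estimated savings: $70.00
```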

Troubleshooting Common Issues

Provider Errors

  • Verify API keys are correct
  • Check provider rate limits
  • Ensure model names match provider specifications

High Latency

  • Enable caching for repeated queries
  • Check provider status pages
  • Consider geographic proximity

Cache Not Working

  • Verify Redis connection (if using)
  • Check cache configuration
  • Review cache hit/miss logs

Conclusion

Deploying Portkey on Klutch.sh provides a powerful AI gateway that simplifies managing multiple LLM providers. The combination of Portkey’s intelligent routing, caching, and observability features with Klutch.sh’s deployment simplicity enables production-ready AI infrastructure without complex setup.

Whether you’re building applications with a single LLM provider or orchestrating across multiple AI services, Portkey on Klutch.sh delivers the unified access, reliability, and visibility needed for production AI applications.