Deploying a PyTorch App
Introduction
PyTorch powers production inference for computer vision, NLP, and recommendation workloads. Deploying a PyTorch model server with a Dockerfile on Klutch.sh gives you reproducible builds, managed secrets, and persistent storage for models—all configured from klutch.sh/app. This guide uses a lightweight FastAPI/uvicorn app to serve models over HTTP.
Prerequisites
- A Klutch.sh account (sign up)
- A GitHub repository containing your Dockerfile, model code, and weights (GitHub is the only supported git source)
- (Optional) External object storage if you fetch large models at startup
For onboarding, see the Quick Start.
Architecture and ports
- The sample FastAPI server listens on internal port `8080`. Choose HTTP traffic and set the internal port to `8080`.
- Attach storage if you need to persist downloaded weights or cache files between deployments.
Repository layout
```
pytorch-app/
├── Dockerfile          # Must be at repo root for auto-detection
├── app.py              # FastAPI/uvicorn entrypoint
├── requirements.txt    # Python deps (torch, torchvision, fastapi, uvicorn, pillow)
└── models/             # Optional bundled weights
```

Keep secrets out of Git; store them in Klutch.sh environment variables.
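A minimal `.gitignore` along these lines keeps local secrets and caches out of the repository (illustrative; adjust to your layout):

```text
.env
__pycache__/
*.pyc
```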
Installation (local) and starter commands
Test locally before pushing:
```bash
docker build -t pytorch-local .
docker run -p 8080:8080 \
  -e MODEL_NAME=resnet18 \
  pytorch-local
```
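With the container running, you can hit the sample app's health endpoint to confirm the server is up:

```bash
curl -s http://localhost:8080/health
```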
Dockerfile for PyTorch inference (production-ready)
Place this at the repo root; Klutch.sh auto-detects Dockerfiles.
```dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

ENV PORT=8080
EXPOSE 8080

CMD ["bash", "-lc", "uvicorn app:app --host 0.0.0.0 --port ${PORT}"]
```

Notes:
- Switch to a CUDA base image only if GPUs are available (Klutch.sh runs CPU-only unless GPUs are provided).
- Pin torch and model-specific libraries in `requirements.txt` for reproducibility; see the example below.
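A pinned `requirements.txt` might look like this (version numbers are illustrative; pin whichever combination you have tested):

```text
torch==2.3.1
torchvision==0.18.1
fastapi==0.111.0
uvicorn[standard]==0.30.1
pillow==10.3.0
```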
Environment variables (Klutch.sh)
Set these before deploying:
- `PORT=8080`
- `MODEL_NAME=resnet18` (example selector in your code)
- Optional: paths or URLs for model weights, e.g., `MODEL_WEIGHTS_URL`, and auth tokens if pulling from private storage.
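If you fetch weights at startup, the server can read these variables and cache the download on the persistent volume. A minimal sketch, assuming `MODEL_WEIGHTS_URL` plus a hypothetical `WEIGHTS_AUTH_TOKEN` secret (neither is a Klutch.sh built-in):

```python
import os
import pathlib
import urllib.request

WEIGHTS_DIR = pathlib.Path("/app/models")  # must match the attached volume path


def fetch_weights():
    """Download weights once if MODEL_WEIGHTS_URL is set; reuse the cached copy otherwise."""
    url = os.environ.get("MODEL_WEIGHTS_URL")
    if not url:
        return None  # fall back to bundled or torchvision-provided weights
    dest = WEIGHTS_DIR / pathlib.Path(url).name
    if dest.exists():
        return dest  # already cached on the persistent volume
    WEIGHTS_DIR.mkdir(parents=True, exist_ok=True)
    req = urllib.request.Request(url)
    token = os.environ.get("WEIGHTS_AUTH_TOKEN")  # hypothetical token for private storage
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
        out.write(resp.read())
    return dest
```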
If deploying without the Dockerfile and relying on Nixpacks:
```text
NIXPACKS_PYTHON_VERSION=3.11
NIXPACKS_START_CMD=uvicorn app:app --host 0.0.0.0 --port $PORT
```
Attach persistent volumes
If you cache or store models locally, add storage in Klutch.sh (path and size only):
- `/app/models`: cached or bundled weights.
- `/root/.cache/torch`: torch/hub cache (optional).
Ensure the paths align with your code.
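For example, to make sure the torch hub cache lands on the mounted volume, you can point it there explicitly at startup (a small sketch; setting `TORCH_HOME=/root/.cache/torch` in the environment achieves the same thing):

```python
import torch

# Keep downloaded hub checkpoints on the persistent volume so they
# survive redeploys; the default already resolves under ~/.cache/torch.
torch.hub.set_dir("/root/.cache/torch/hub")
```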
Deploy PyTorch on Klutch.sh (Dockerfile workflow)
- Push your repository, with the Dockerfile at the root, to GitHub.
- Open klutch.sh/app, create a project, and add an app.
- Select HTTP traffic and set the internal port to `8080`.
- Add the environment variables above (model selector and any weight URLs/secrets).
- Attach volumes at `/app/models` (and `/root/.cache/torch` if needed) sized for your models.
- Deploy. Your API will be reachable at `https://example-app.klutch.sh`.
Sample FastAPI app (app.py)
```python
from fastapi import FastAPI
from pydantic import BaseModel
import torch
from torchvision import models, transforms
from PIL import Image
import io
import base64

app = FastAPI()

MODEL_NAME = "resnet18"
model = models.__dict__[MODEL_NAME](weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])


class PredictRequest(BaseModel):
    """JSON body for /predict."""
    image_b64: str


@app.get("/health")
def health():
    return {"status": "ok"}


@app.post("/predict")
def predict(req: PredictRequest):
    image_bytes = base64.b64decode(req.image_b64)
    img = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    with torch.no_grad():
        inp = preprocess(img).unsqueeze(0)
        outputs = model(inp)
        pred = outputs.argmax(dim=1).item()
    return {"prediction": int(pred)}
```

The request body is wrapped in a small Pydantic model so FastAPI parses the JSON payload used in the sample requests below (a bare `str` parameter would be read from the query string instead).
Sample requests
Health check:
```bash
curl -I https://example-app.klutch.sh/health
```

Inference (send a base64 image string):
```bash
curl -X POST https://example-app.klutch.sh/predict \
  -H "Content-Type: application/json" \
  -d '{"image_b64":"<BASE64_IMAGE>"}'
```
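To produce the `<BASE64_IMAGE>` value, encode a local file; `input.jpg` is just a placeholder name:

```python
import base64

# Encode an image file as the base64 string expected by /predict.
with open("input.jpg", "rb") as f:
    print(base64.b64encode(f.read()).decode())
```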
Health checks and production tips
- Add readiness/liveness probes to `/health`.
- Pin torch and torchvision versions; test upgrades in staging.
- Store weight URLs and tokens in Klutch.sh secrets; avoid embedding secrets in the image.
- Monitor storage usage on model/cache volumes; resize before they fill.
PyTorch on Klutch.sh delivers reproducible Docker builds, managed secrets, and optional model storage—without extra YAML or CI steps. Configure ports, env vars, and volumes, then ship your inference API.