Deploying a TensorFlow App
Introduction
TensorFlow Serving is a flexible, high-performance serving system for machine learning models. Deploying TensorFlow Serving with a Dockerfile on Klutch.sh provides reproducible builds, managed secrets, and persistent storage for your SavedModels—all configured from klutch.sh/app. This guide covers installation, Dockerfile setup, environment variables, storage, Nixpacks overrides, and sample inference calls.
Prerequisites
- Klutch.sh account (sign up)
- GitHub repository containing your Dockerfile and model assets (GitHub is the only supported git source)
- A SavedModel directory for the model you want to serve
Architecture and ports
- TensorFlow Serving exposes REST on internal port `8501` (and gRPC on `8500`, but only one port is supported per app in this guide). Choose HTTP traffic and set the internal port to `8501`.
- Persistent storage is recommended for models; attach a volume to mount SavedModels at runtime.
Repository layout
```
tensorflow-serving/
├── Dockerfile           # Must be at repo root for auto-detection
├── models/my_model/     # SavedModel directory (versioned: 1/)
└── README.md
```

Keep secrets out of Git; store them in Klutch.sh environment variables.
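If TensorFlow is installed locally, the `saved_model_cli` tool that ships with it can confirm the SavedModel in this layout exposes a serving signature before you commit and deploy (the version directory `1/` is assumed here):

```bash
# Print the inputs/outputs of the default serving signature for version 1
saved_model_cli show --dir models/my_model/1 \
  --tag_set serve --signature_def serving_default
```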
Installation (local) and starter commands
Build and run locally (serving a sample model):
```bash
docker build -t tfserving-local .
docker run -p 8501:8501 \
  -e MODEL_NAME=my_model \
  tfserving-local
```
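With the container from the previous step running, a status request against the local REST endpoint is a quick smoke test; the model name matches the `MODEL_NAME` value passed above:

```bash
# The model status endpoint reports "state": "AVAILABLE" once loading finishes
curl http://localhost:8501/v1/models/my_model
```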
Dockerfile for TensorFlow Serving (production-ready)
Place this at the repo root; Klutch.sh auto-detects Dockerfiles.
```dockerfile
FROM tensorflow/serving:latest

ENV MODEL_NAME=my_model
ENV MODEL_BASE_PATH=/models

COPY models /models

EXPOSE 8501

# Shell form (overriding the base image's entrypoint) so the environment
# variables expand at runtime; an exec-form CMD would pass the literal
# strings "${MODEL_NAME}" and "${MODEL_BASE_PATH}" to the server.
ENTRYPOINT tensorflow_model_server --rest_api_port=8501 \
  --model_name=${MODEL_NAME} \
  --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME}
```

Notes:
- Pin to a specific tag (e.g., `tensorflow/serving:2.14.1`) for stability.
- REST is exposed on 8501; gRPC (8500) is not used here to align with single-port routing.
Environment variables (Klutch.sh)
Set these before deploying:
```
PORT=8501
MODEL_NAME=my_model
MODEL_BASE_PATH=/models
```
If deploying without the Dockerfile and relying on Nixpacks:
```
NIXPACKS_START_CMD=tensorflow_model_server --rest_api_port=8501 --model_name=$MODEL_NAME --model_base_path=$MODEL_BASE_PATH/$MODEL_NAME
```
Attach persistent volumes
Add storage in Klutch.sh (path and size only):
- `/models` — SavedModel directories (versioned).
Ensure the path is writable inside the container. Note that, depending on how the platform initializes volumes, a volume mounted at `/models` may start empty rather than inheriting the files copied into the image at build time; if the model is not found after attaching storage, upload the SavedModel to the volume or sync it at startup (see the production tips below).
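TensorFlow Serving treats each numeric subdirectory under the model's base path as a version and serves the highest number by default, so the volume contents might look like the sketch below (the version numbers are illustrative):

```
/models/my_model/
├── 1/                 # older version, kept for rollback
└── 2/                 # highest number wins and is served
    ├── saved_model.pb
    └── variables/
```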
Deploy TensorFlow Serving on Klutch.sh (Dockerfile workflow)
- Push your repository—with the Dockerfile and `models/` directory at the root—to GitHub.
- Open klutch.sh/app, create a project, and add an app.
- Select HTTP traffic and set the internal port to `8501`.
- Add the environment variables above (model name/base path) and any app-specific secrets.
- Attach a volume at `/models` sized for your SavedModels.
- Deploy. Your REST endpoint will be reachable at `https://example-app.klutch.sh/v1/models/my_model:predict`.
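Once the deploy completes, a status request against the public URL (the example hostname used throughout this guide) confirms the model loaded:

```bash
# Expect a JSON body listing the served version with "state": "AVAILABLE"
curl https://example-app.klutch.sh/v1/models/my_model
```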
Sample inference request
```bash
curl -X POST https://example-app.klutch.sh/v1/models/my_model:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [[1.0, 2.0, 5.0]]}'
```
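A successful predict call returns the standard TensorFlow Serving response envelope; the numbers below are only placeholders for whatever your model actually outputs:

```json
{
  "predictions": [[0.0]]
}
```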
Health checks and production tips
- Add an HTTP readiness probe to `/v1/models/${MODEL_NAME}` for basic status.
- Pin image versions and test upgrades in staging before production rollout.
- Monitor `/models` volume usage; resize proactively as you add versions.
- Keep large models in object storage and sync them into `/models` during build or startup if needed.
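One way to implement the last tip is a small entrypoint script that pulls models from object storage before starting the server. This is only a sketch: the bucket name is a placeholder, and it assumes the AWS CLI and credentials are available in the image.

```bash
#!/bin/sh
set -e

# Placeholder bucket/path; substitute your own object storage location
aws s3 sync "s3://example-bucket/models/${MODEL_NAME}" \
  "${MODEL_BASE_PATH}/${MODEL_NAME}"

# Hand off to TensorFlow Serving on the REST port used throughout this guide
exec tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name="${MODEL_NAME}" \
  --model_base_path="${MODEL_BASE_PATH}/${MODEL_NAME}"
```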
TensorFlow Serving on Klutch.sh delivers reproducible Docker builds, managed secrets, and persistent model storage—without extra YAML or CI steps. Configure ports, env vars, and storage, then start serving predictions.