Deploying a TensorFlow App

Introduction

TensorFlow Serving is a flexible, high-performance serving system for machine learning models. Deploying TensorFlow Serving with a Dockerfile on Klutch.sh provides reproducible builds, managed secrets, and persistent storage for your SavedModels—all configured from klutch.sh/app. This guide covers installation, Dockerfile setup, environment variables, storage, Nixpacks overrides, and sample inference calls.


Prerequisites

  • Klutch.sh account (sign up)
  • GitHub repository containing your Dockerfile and model assets (GitHub is the only supported git source)
  • A SavedModel directory for the model you want to serve
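
If you want to sanity-check a SavedModel before deploying, the saved_model_cli tool that ships with TensorFlow can print its serving signature (the models/my_model/1 path below assumes the repository layout used later in this guide):

Terminal window
saved_model_cli show --dir models/my_model/1 --tag_set serve --signature_def serving_default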

Architecture and ports

  • TensorFlow Serving exposes its REST API on port 8501 and gRPC on 8500; only one port is routed per app in this guide, so choose HTTP traffic and set the internal port to 8501.
  • Persistent storage is recommended for models; attach a volume to mount SavedModels at runtime.

Repository layout

tensorflow-serving/
├── Dockerfile # Must be at repo root for auto-detection
├── models/my_model/ # SavedModel directory (versioned: 1/)
└── README.md
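
TensorFlow Serving loads models from numbered version subdirectories, so models/my_model/ should contain at least one version folder holding the exported SavedModel. A typical export looks like:

models/my_model/
└── 1/
    ├── saved_model.pb
    └── variables/
        ├── variables.data-00000-of-00001
        └── variables.index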

Keep secrets out of Git; store them in Klutch.sh environment variables.


Installation (local) and starter commands

Build and run locally (serving a sample model):

Terminal window
docker build -t tfserving-local .
docker run -p 8501:8501 \
-e MODEL_NAME=my_model \
tfserving-local
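
Once the container is running, confirm the model loaded by querying the model status endpoint (my_model matches the MODEL_NAME set above):

Terminal window
curl http://localhost:8501/v1/models/my_model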

Dockerfile for TensorFlow Serving (production-ready)

Place this at the repo root; Klutch.sh auto-detects Dockerfiles.

FROM tensorflow/serving:latest
ENV MODEL_NAME=my_model
ENV MODEL_BASE_PATH=/models
COPY models /models
EXPOSE 8501
CMD ["tensorflow_model_server", "--rest_api_port=8501", "--model_name=${MODEL_NAME}", "--model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME}"]

Notes:

  • Pin to a specific tag (e.g., tensorflow/serving:2.14.1) for stability.
  • REST is exposed on 8501; gRPC (8500) is not used here to align with single-port routing.

Environment variables (Klutch.sh)

Set these before deploying:

  • PORT=8501
  • MODEL_NAME=my_model
  • MODEL_BASE_PATH=/models

If deploying without the Dockerfile and relying on Nixpacks:

  • NIXPACKS_START_CMD=tensorflow_model_server --rest_api_port=8501 --model_name=$MODEL_NAME --model_base_path=$MODEL_BASE_PATH/$MODEL_NAME

Attach persistent volumes

Add storage in Klutch.sh (path and size only):

  • /models — SavedModel directories (versioned).

Ensure the path is writable inside the container.
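
To approximate the attached volume locally, you can bind-mount your models directory at the same path when running the image built earlier; this is only a local sketch, not the Klutch.sh mount itself:

Terminal window
docker run -p 8501:8501 \
  -v "$(pwd)/models:/models" \
  -e MODEL_NAME=my_model \
  tfserving-local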


Deploy TensorFlow Serving on Klutch.sh (Dockerfile workflow)

  1. Push your repository—with the Dockerfile and models/ directory at the root—to GitHub.
  2. Open klutch.sh/app, create a project, and add an app.
  3. Select HTTP traffic and set the internal port to 8501.
  4. Add the environment variables above (model name/base path) and any app-specific secrets.
  5. Attach a volume at /models sized for your SavedModels.
  6. Deploy. Your REST endpoint will be reachable at https://example-app.klutch.sh/v1/models/my_model:predict.

Sample inference request

Terminal window
curl -X POST https://example-app.klutch.sh/v1/models/my_model:predict \
-H "Content-Type: application/json" \
-d '{"instances": [[1.0, 2.0, 5.0]]}'

Health checks and production tips

  • Add an HTTP readiness probe to /v1/models/${MODEL_NAME} for basic status.
  • Pin image versions and test upgrades in staging before production rollout.
  • Monitor /models volume usage; resize proactively as you add versions.
  • Keep large models in object storage and sync them into /models during build or startup if needed.
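
As a sketch of that startup-sync approach, a custom entrypoint could pull SavedModels from an object store before starting the server (the bucket name and the AWS CLI are assumptions; substitute your own storage tooling):

Terminal window
# Hypothetical bucket; requires the AWS CLI and credentials provided via env vars
aws s3 sync s3://example-model-bucket/my_model /models/my_model
tensorflow_model_server --rest_api_port=8501 \
  --model_name=$MODEL_NAME \
  --model_base_path=$MODEL_BASE_PATH/$MODEL_NAME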

TensorFlow Serving on Klutch.sh delivers reproducible Docker builds, managed secrets, and persistent model storage—without extra YAML or CI steps. Configure ports, env vars, and storage, then start serving predictions.