
How to Deploy Gradio

Introduction

Gradio makes it easy to build interactive machine learning demos and UIs in Python. This guide shows how to deploy a Gradio app on Klutch.sh, either by letting Klutch.sh build from your repo (no Dockerfile) or by providing a Dockerfile for full control. Where possible, it links to the existing Klutch.sh docs (Quick Start, Builds, Volumes) so workflows stay consistent.


Prerequisites

  • A Klutch.sh account
  • A GitHub repository for your Gradio app
  • Python 3.8+ knowledge
  • Docker installed (for Dockerfile deployments)

1. Prepare a local Gradio app

Create a project directory and virtualenv for local development:

Terminal window
mkdir gradio-klutch && cd gradio-klutch
python3 -m venv venv
source venv/bin/activate
pip install gradio

Refer to the Quick Start Guide for repository and project preparation.


2. Sample Gradio app (app.py)

Create a simple Gradio demo that accepts text and returns the length of the input.

import gradio as gr


def length_of_text(s: str):
    return {"length": len(s)}


iface = gr.Interface(
    fn=length_of_text,
    inputs=gr.Textbox(lines=2, placeholder="Type something..."),
    outputs=gr.JSON(),
    title="Text Length Demo",
    description="A minimal Gradio app to show deployment on Klutch.sh",
)

if __name__ == "__main__":
    # Bind to 0.0.0.0 and port 8000 so Klutch.sh can route traffic
    iface.launch(server_name="0.0.0.0", server_port=8000, share=False)

Notes:

  • Gradio’s built-in server (started by launch()) is fine for demos and light traffic. If you expect high concurrency, serve the app behind a production ASGI server such as gunicorn with uvicorn workers; see the Dockerfile section below for a production-ready approach.
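Once the app is running locally (python app.py), you can smoke-test it from a second terminal with the gradio_client package (install it with pip install gradio_client; /predict is the default API name a gr.Interface exposes):

from gradio_client import Client

# Connect to the locally running Gradio app
client = Client("http://127.0.0.1:8000/")

# Call the Interface's default endpoint
result = client.predict("hello world", api_name="/predict")
print(result)  # expected: {"length": 11}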

3. requirements.txt

Create requirements.txt:

gradio

Add additional ML libs (torch, tensorflow, transformers) as needed.
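To pin versions, capture exactly what is installed in your virtualenv:

Terminal window
pip freeze > requirements.txt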


4. Deploying without a Dockerfile

  1. Push your code to GitHub (include app.py and requirements.txt).
  2. Log in to Klutch.sh.
  3. Create a new project and app.
  4. Connect your GitHub repository and branch.
  5. Set the start command to python app.py.
  6. Set the service port to 8000.
  7. Add environment variables as needed (see the sketch below).
  8. Click “Create” to build and deploy.

For repository connection and build behavior, see the Quick Start Guide and Builds guide.
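If you pass configuration through environment variables, read them in app.py rather than hard-coding values; a small sketch, assuming a PORT variable (the name is illustrative, not something Klutch.sh sets automatically):

import os

# Fall back to 8000 when PORT is not set (PORT is an assumed variable name)
port = int(os.environ.get("PORT", "8000"))
iface.launch(server_name="0.0.0.0", server_port=port, share=False)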


5. Deploying with a Dockerfile

Use a Dockerfile to control the runtime, pin dependencies, and run behind a production server. Modern Gradio apps are ASGI applications built on FastAPI, so they can be served by uvicorn directly or by gunicorn with uvicorn workers. A simple, compatible Dockerfile is shown first, with a production-oriented setup sketched afterwards.

Basic CPU-focused Dockerfile (simple and compatible):

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
# Expose the port Gradio will run on
EXPOSE 8000
# Default start command - uses Gradio's CLI launch
CMD ["python", "app.py"]

For higher scale, serve the Gradio app through a production ASGI server, as sketched below. See the Builds guide for Dockerfile best practices and multi-stage builds.
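A minimal sketch of that setup, assuming the gr.Interface in app.py is named iface (the file name asgi.py is illustrative): Gradio apps are ASGI applications, so they can be mounted onto a FastAPI app with gr.mount_gradio_app and served by gunicorn with uvicorn workers.

# asgi.py - wrap the Gradio Interface in a FastAPI app (sketch)
import gradio as gr
from fastapi import FastAPI

from app import iface  # the gr.Interface defined in app.py

app = FastAPI()
# Mount the Gradio UI at the root of the FastAPI app
app = gr.mount_gradio_app(app, iface, path="/")

The Dockerfile's CMD then becomes the following (add fastapi, gunicorn, and uvicorn to requirements.txt):

CMD ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8000", "asgi:app"]

Gradio's event queue keeps state in process memory, so running a single worker per container and scaling out containers is usually safer than adding gunicorn workers.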


6. Persistent storage and assets

If your app loads large models or stores uploaded files, attach a persistent volume and mount it to the path your app expects (e.g., /app/models or /app/data). See the Volumes Guide for step-by-step instructions.
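For example, a minimal sketch of resolving a model path against a mounted volume (the MODEL_DIR variable and file name are illustrative):

import os

# Path where the Klutch.sh volume is mounted (assumed env var with a default)
MODEL_DIR = os.environ.get("MODEL_DIR", "/app/models")
MODEL_PATH = os.path.join(MODEL_DIR, "model.bin")  # hypothetical file name

if not os.path.exists(MODEL_PATH):
    raise FileNotFoundError(f"Expected model at {MODEL_PATH}; is the volume mounted?")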


7. Production best practices

  • Pin dependency versions in requirements.txt.
  • Use environment variables for secrets and configuration.
  • Serve large ML models from persistent volumes or object storage.
  • Add health checks and readiness endpoints (see the sketch after this list).
  • Use a proper production server (gunicorn/uvicorn) if you need high concurrency.
  • Monitor resource usage and scale as needed.
  • Use CI/CD to build and publish container images.
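Building on the hypothetical asgi.py sketch above, a lightweight readiness endpoint can be registered before the Gradio app is mounted (the /healthz path is a convention, not something Klutch.sh requires):

# asgi.py (continued sketch) - readiness probe alongside the Gradio UI
import gradio as gr
from fastapi import FastAPI

from app import iface

app = FastAPI()

@app.get("/healthz")
def healthz():
    # Trivial payload so the platform can verify the process is up
    return {"status": "ok"}

# Register routes before mounting Gradio at "/" so they take precedence
app = gr.mount_gradio_app(app, iface, path="/")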

Resources

  • Klutch.sh Quick Start Guide
  • Klutch.sh Builds guide
  • Klutch.sh Volumes Guide
  • Gradio documentation

Deploying Gradio on Klutch.sh is straightforward: for quick demos, let Klutch.sh build directly from your repo; for production, use a Dockerfile with pinned dependencies and a production ASGI server.