Deploying a Presidio App

Introduction

Presidio is an open-source toolkit for PII detection and anonymization, typically composed of two services: Analyzer (detects PII) and Anonymizer (masks/redacts). Deploying Presidio with a Dockerfile on Klutch.sh gives you reproducible builds, managed secrets, and optional persistent storage—all configured from klutch.sh/app. This guide shows how to run the analyzer and anonymizer as separate Klutch.sh apps, set environment variables, and test with sample API calls.

Prerequisites

A Klutch.sh account (sign up)
A GitHub repository containing your Presidio Dockerfile(s) (GitHub is the only supported git source)
Optional: external Redis if you add caching, and model assets if you extend Presidio’s recognizers

For onboarding, see the Quick Start.

Architecture and ports

Klutch.sh supports one port per app, so deploy two apps from the same repo and image, one per service:
- Analyzer: select HTTP traffic and set the internal port to 3000.
- Anonymizer: select HTTP traffic and set the internal port to 3001.
Send detection requests to the analyzer app's URL and anonymization requests to the anonymizer app's URL.

Repository layout

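presidio/
├── Dockerfile              # Must be at repo root for auto-detection
├── analyzer_app.py         # FastAPI app served as app.analyzer_app:app
├── anonymizer_app.py       # FastAPI app served as app.anonymizer_app:app
└── README.md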

Keep secrets out of Git; store them in Klutch.sh environment variables.

Installation (local) and starter commands

Test locally by running both services in one container:

docker build -t presidio-local .
docker run -p 3000:3000 -p 3001:3001 \
  -e ANALYZER_PORT=3000 \
  -e ANONYMIZER_PORT=3001 \
  -e LOG_LEVEL=info \
  presidio-local

Dockerfile for Presidio (production-ready)

Place this at the repo root; Klutch.sh auto-detects Dockerfiles.

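# Base image; pin a patch version or digest for stricter reproducibility.
FROM python:3.10-slim
WORKDIR /app

# Quote the extras specifier so the shell never treats the brackets as a glob.
RUN pip install --no-cache-dir presidio-analyzer presidio-anonymizer fastapi "uvicorn[standard]"

# Copy the repo into /app/app so modules import as app.analyzer_app and app.anonymizer_app.
COPY ./ ./app

ENV ANALYZER_PORT=3000
ENV ANONYMIZER_PORT=3001

EXPOSE 3000 3001

# Start both services in the background; "wait -n" returns as soon as either one exits,
# so the container stops instead of leaving one service silently down.
CMD ["bash", "-c", "uvicorn app.analyzer_app:app --host 0.0.0.0 --port ${ANALYZER_PORT} & uvicorn app.anonymizer_app:app --host 0.0.0.0 --port ${ANONYMIZER_PORT} & wait -n"]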

Notes:

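The image starts both services in every container. Each Klutch.sh app routes traffic only to the internal port you configure (see the deployment steps), so the analyzer app serves port 3000 and the anonymizer app serves port 3001. If you prefer separate images, build analyzer-only and anonymizer-only Dockerfiles; both approaches work.
Pin the Python image tag and the package versions so rebuilds stay reproducible.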

Environment variables (Klutch.sh)

Configure these before deploying:

For the analyzer app:
- ANALYZER_PORT=3000
- LOG_LEVEL=info
- Optional custom recognizers or model paths if you extend Presidio
For the anonymizer app:
- ANONYMIZER_PORT=3001
- LOG_LEVEL=info
- Optional policy configuration variables if you template anonymization behaviors
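
Inside the app code, these variables would typically be read once at startup. The following is a minimal sketch, assuming a hypothetical `load_settings` helper and `ServiceSettings` type (neither is part of Presidio or Klutch.sh) and the defaults used in this guide. The accepted log levels mirror the ones uvicorn accepts.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Log levels accepted by uvicorn's --log-level option.
VALID_LOG_LEVELS = {"critical", "error", "warning", "info", "debug", "trace"}


@dataclass(frozen=True)
class ServiceSettings:
    """Hypothetical container for the settings of one service."""
    port: int
    log_level: str


def load_settings(port_var: str, default_port: int,
                  env: Optional[Mapping[str, str]] = None) -> ServiceSettings:
    """Read the port and log level for one service from environment variables."""
    env = os.environ if env is None else env

    raw_port = env.get(port_var, str(default_port))
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"{port_var} must be an integer, got {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"{port_var} must be between 1 and 65535, got {port}")

    log_level = env.get("LOG_LEVEL", "info").lower()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {sorted(VALID_LOG_LEVELS)}")

    return ServiceSettings(port=port, log_level=log_level)
```

For example, the analyzer app could call `load_settings("ANALYZER_PORT", 3000)` and the anonymizer app `load_settings("ANONYMIZER_PORT", 3001)`. Failing fast on a malformed value makes a misconfigured deployment visible in the startup logs.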

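If you deploy without the Dockerfile and rely on Nixpacks, set the Python version on both apps and one start command per app:

NIXPACKS_PYTHON_VERSION=3.10

Analyzer app:
NIXPACKS_START_CMD=uvicorn app.analyzer_app:app --host 0.0.0.0 --port $ANALYZER_PORT

Anonymizer app:
NIXPACKS_START_CMD=uvicorn app.anonymizer_app:app --host 0.0.0.0 --port $ANONYMIZER_PORT

These module paths assume your code is importable as the app package, as it is in the Dockerfile build. If your modules sit at the top level of the working directory instead, use analyzer_app:app and anonymizer_app:app.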

Attach persistent volumes

If you store local custom models or configuration files, add storage in Klutch.sh (path and size only):

/app/models — optional custom models/resources.
/app/config — optional policy/config files.

Ensure the paths align with your Presidio configuration.
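
A startup check can confirm that the mounted paths are actually present before the service tries to load from them. This is a small sketch, assuming a hypothetical `missing_mounts` helper and the two mount paths suggested above; adapt the list to the volumes you attach.

```python
import os
from typing import Iterable, List


def missing_mounts(paths: Iterable[str]) -> List[str]:
    """Return the paths that are not existing, writable directories."""
    return [
        path for path in paths
        if not (os.path.isdir(path) and os.access(path, os.W_OK))
    ]


if __name__ == "__main__":
    # Mount paths from this guide; both are optional.
    problems = missing_mounts(["/app/models", "/app/config"])
    for path in problems:
        print(f"warning: expected volume at {path} is missing or not writable")
```

Printing a warning rather than exiting keeps the check compatible with deployments that attach no volumes at all.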

Deploy Presidio on Klutch.sh (split-port workflow)

Push your repository—with the Dockerfile at the root—to GitHub.
Create the analyzer app: choose HTTP traffic, set the internal port to 3000, add analyzer env vars, and attach volumes if needed.
Deploy the analyzer app and note its URL (e.g., https://example-app.klutch.sh).
Create the anonymizer app: choose HTTP traffic, set the internal port to 3001, add anonymizer env vars, and attach the same volumes if both need shared models/config.
Deploy the anonymizer app and note its URL; it will differ from the analyzer app's URL.
Configure your client to call the analyzer endpoint first, then pass results to the anonymizer endpoint.
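
The call order in the last step can be sketched as a small client that uses only the Python standard library. Several things here are assumptions rather than facts about your deployment: both URLs are placeholders, `/analyze` is assumed to return a JSON list of findings, and `/anonymize` is assumed to accept `text`, `analyzer_results`, and an optional `anonymizers` map, as Presidio's reference REST API does. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

# Placeholder URLs: replace each with the URL of the matching Klutch.sh app.
ANALYZER_URL = "https://example-app.klutch.sh"
ANONYMIZER_URL = "https://example-app.klutch.sh"


def post_json(url: str, payload: Dict[str, Any], timeout: float = 10.0) -> Any:
    """POST a JSON body and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def build_anonymize_payload(
    text: str,
    analyzer_results: List[Dict[str, Any]],
    anonymizers: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Combine the original text with the analyzer findings (assumed schema)."""
    payload: Dict[str, Any] = {"text": text, "analyzer_results": analyzer_results}
    if anonymizers:
        payload["anonymizers"] = anonymizers
    return payload


def redact(text: str, language: str = "en") -> Any:
    """Detect PII with the analyzer app, then anonymize it with the anonymizer app."""
    findings = post_json(f"{ANALYZER_URL}/analyze", {"text": text, "language": language})
    return post_json(f"{ANONYMIZER_URL}/anonymize", build_anonymize_payload(text, findings))
```

Keeping the payload construction separate from the HTTP calls makes it straightforward to unit-test the request shape without a running deployment.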

Sample API usage

Analyzer request:

curl -X POST https://example-app.klutch.sh/analyze \
  -H "Content-Type: application/json" \
  -d '{"text":"My phone is 212-555-1234 and my email is jane@example.com","language":"en"}'
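
The analyzer is expected to answer with a list of findings, each carrying an entity type, character offsets, and a confidence score. The sketch below assumes the field names of Presidio's `RecognizerResult` (`entity_type`, `start`, `end`, `score`); the scores are illustrative, and `extract_spans` is a hypothetical helper that maps the offsets back to the text.

```python
from typing import Any, Dict, List, Tuple

text = "My phone is 212-555-1234 and my email is jane@example.com"

# Illustrative findings; real scores and any extra fields depend on your analyzer.
findings = [
    {"entity_type": "PHONE_NUMBER", "start": 12, "end": 24, "score": 0.75},
    {"entity_type": "EMAIL_ADDRESS", "start": 41, "end": 57, "score": 1.0},
]


def extract_spans(text: str, findings: List[Dict[str, Any]],
                  min_score: float = 0.0) -> List[Tuple[str, str]]:
    """Return (entity_type, matched_text) pairs for findings at or above min_score."""
    return [
        (f["entity_type"], text[f["start"]:f["end"]])
        for f in findings
        if f["score"] >= min_score
    ]


print(extract_spans(text, findings))
```

The offsets are half-open, so `start` is inclusive and `end` is exclusive, matching Python slice semantics.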

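Anonymizer request (passing the analyzer findings as analyzer_results; this follows the request shape of Presidio's reference REST API, so adjust the field names if your anonymizer app defines its own schema, and note that the scores are illustrative):

curl -X POST https://example-app.klutch.sh/anonymize \
  -H "Content-Type: application/json" \
  -d '{"text":"My phone is 212-555-1234 and my email is jane@example.com","analyzer_results":[{"entity_type":"PHONE_NUMBER","start":12,"end":24,"score":0.75},{"entity_type":"EMAIL_ADDRESS","start":41,"end":57,"score":1.0}],"anonymizers":{"PHONE_NUMBER":{"type":"mask","masking_char":"*","chars_to_mask":4,"from_end":true}}}'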

Replace URLs with your analyzer/anonymizer app URLs.
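
To make the effect of a mask operator concrete, here is a plain-Python illustration of masking the trailing characters of a detected span. It is not Presidio's implementation; the parameter names mirror Presidio's mask operator (`masking_char`, `chars_to_mask`, `from_end`), while `mask_value` and `mask_spans` are hypothetical helpers and the finding is illustrative.

```python
from typing import Any, Dict, List


def mask_value(value: str, masking_char: str = "*",
               chars_to_mask: int = 4, from_end: bool = True) -> str:
    """Replace up to chars_to_mask characters at one end of value."""
    count = max(0, min(len(value), chars_to_mask))
    if from_end:
        return value[: len(value) - count] + masking_char * count
    return masking_char * count + value[count:]


def mask_spans(text: str, findings: List[Dict[str, Any]],
               entity_type: str, **mask_options: Any) -> str:
    """Mask every finding of the given entity type inside text."""
    # Work right to left so earlier offsets stay valid after each replacement.
    for f in sorted(findings, key=lambda item: item["start"], reverse=True):
        if f["entity_type"] == entity_type:
            masked = mask_value(text[f["start"]:f["end"]], **mask_options)
            text = text[: f["start"]] + masked + text[f["end"]:]
    return text


sample = "My phone is 212-555-1234 and my email is jane@example.com"
sample_findings = [{"entity_type": "PHONE_NUMBER", "start": 12, "end": 24, "score": 0.75}]
print(mask_spans(sample, sample_findings, "PHONE_NUMBER"))
# prints: My phone is 212-555-**** and my email is jane@example.com
```

Comparing this local result with the anonymizer app's response is a quick way to confirm that the offsets you pass line up with the text you send.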

Health checks and production tips

Add HTTP readiness probes to /health (or the root) for each app.
Keep any API keys or custom model credentials in Klutch.sh secrets; rotate regularly.
Pin Python and Presidio package versions; test upgrades in staging before production.
Monitor storage usage on /app/models or /app/config if mounted; resize volumes proactively.
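
After a deploy, a short script can poll each app until it reports healthy. This is a sketch under stated assumptions: the apps expose a `/health` route that returns a 2xx status (you have to add that route to your own FastAPI code), the URL in the usage note is a placeholder, and `http_ok` and `wait_until_healthy` are hypothetical helpers.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET to url answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False


def wait_until_healthy(probe: Callable[[], bool],
                       attempts: int = 10, delay: float = 3.0) -> bool:
    """Call probe until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # No sleep after the final failed attempt.
    return False
```

A typical call would be `wait_until_healthy(lambda: http_ok("https://example-app.klutch.sh/health"))`, run once per app. Passing the probe as a callable keeps the retry loop testable without network access.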

Presidio on Klutch.sh offers reproducible Docker deployments, split analyzer/anonymizer services with one port per app, managed secrets, and optional model storage—without extra YAML or CI steps. Configure ports, env vars, and storage, then start detecting and anonymizing PII.