Deploying DevLake

Introduction

Apache DevLake is a powerful open-source dev data platform that helps engineering teams collect, analyze, and visualize development metrics from multiple data sources. It integrates with popular tools like GitHub, GitLab, Jira, Jenkins, and more to provide comprehensive insights into your engineering processes, DORA metrics, sprint analytics, and code quality.

Deploying DevLake on Klutch.sh gives you a managed, scalable platform for running your engineering metrics dashboard with support for persistent storage, secure environment variables, and production-grade reliability. This guide walks you through deploying DevLake using a Dockerfile, configuring databases, setting up persistent volumes, and following production best practices.


Prerequisites

  • A Klutch.sh account
  • A GitHub repository for your DevLake deployment
  • Basic familiarity with Docker and environment variables
  • A MySQL or PostgreSQL database (DevLake requires an external database for production)
  • API tokens/credentials for the data sources you want to connect (GitHub, GitLab, Jira, etc.)

Understanding DevLake Architecture

DevLake consists of several components:

  • Config UI: Web interface for configuring data sources and connections
  • API Server: Backend API that handles data collection and transformations
  • Database: MySQL or PostgreSQL for storing collected metrics and configurations
  • Grafana: Visualization dashboard for viewing metrics and analytics (optional but recommended)

For production deployments, you’ll need:

  1. A DevLake instance (API + Config UI)
  2. A MySQL or PostgreSQL database
  3. Persistent storage for configuration and logs
  4. (Optional) Grafana for advanced visualizations

1. Prepare Your Repository

Create a new GitHub repository for your DevLake deployment or use an existing one. Your repository should contain:

devlake-deployment/
├── Dockerfile
├── .env.example
└── README.md

Important: Never commit secrets or credentials to your repository. Use Klutch.sh environment variables for all sensitive data.
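As a rough sketch of what `.env.example` could contain, the snippet below writes a placeholder-only file listing the variable names this guide uses. The helper name `write_env_example` is hypothetical, and the values are deliberately fake: real credentials belong in Klutch.sh environment variables, not in the repository.

```shell
# Hypothetical helper: write a placeholder-only .env.example into a directory.
# Real credentials belong in Klutch.sh environment variables, never in this file.
write_env_example() {
  target_dir="${1:-.}"
  cat > "$target_dir/.env.example" <<'EOF'
# Copy these names into the Klutch.sh app settings and fill in real values there.
DB_URL=mysql://username:password@host:3306/dbname?charset=utf8mb4&parseTime=True&loc=UTC
ENCRYPTION_SECRET=replace-with-generated-secret
PORT=8080
EOF
}
```

Running `write_env_example .` in the repository root creates the file next to the Dockerfile.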


2. Sample Dockerfile

Create a Dockerfile in your repository root. Klutch.sh will automatically detect and use it for deployment.

FROM apache/devlake:latest
# Set working directory
WORKDIR /app
# Expose ports
# 4000 - Config UI
# 8080 - API Server
EXPOSE 4000 8080
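# Health check (assumes curl exists in the image and that /api/ping
# answers on port 4000; adjust the port or path if your image differs)
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
  CMD curl -f http://localhost:4000/api/ping || exit 1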
# The base image already includes the entrypoint
# No need to override CMD unless customizing startup

Notes:

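  • In the upstream Docker Compose setup, apache/devlake is the API server image (port 8080) and the Config UI ships separately as apache/devlake-config-ui (port 4000); check which of these your chosen tag actually serves before relying on the ports below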
  • Port 4000 is for the Config UI (web interface)
  • Port 8080 is for the API server
  • Pin to a specific version tag (e.g., apache/devlake:v0.20.0) for production to ensure reproducible builds

3. Deploying to Klutch.sh

Follow these steps to deploy DevLake on Klutch.sh:

    Push Your Repository

    Push your repository (including the Dockerfile) to GitHub.

    git add .
    git commit -m "Add DevLake Dockerfile"
    git push origin main

    Create a New Project

    Log in to the Klutch.sh dashboard and create a new project.

    Set Up Database

    Before deploying DevLake, set up a MySQL or PostgreSQL database. You can use a managed database service or deploy one on Klutch.sh. Note the following connection details:

    • Database host
    • Database port (usually 3306 for MySQL, 5432 for PostgreSQL)
    • Database name
    • Database user
    • Database password

    Create the DevLake App

    Create a new app in your project:

    • Select Repository: Choose your GitHub repository with the Dockerfile
    • Select Branch: Choose the branch you want to deploy (e.g., main)
    • Traffic Type: Select HTTP
    • Internal Port: Set to 4000 (this is the Config UI port)
    • Region: Choose your preferred region
    • Compute: Select appropriate resources (minimum 2GB RAM recommended)

    Configure Environment Variables

    Add the following environment variables in the Klutch.sh app settings (mark sensitive values as secrets):

    Database Configuration (MySQL example):

    DB_URL=mysql://username:password@host:3306/dbname?charset=utf8mb4&parseTime=True&loc=UTC

    Database Configuration (PostgreSQL example; switch sslmode to require when the database supports TLS):

    DB_URL=postgres://username:password@host:5432/dbname?sslmode=disable

    API Configuration:

    PORT=8080
    ENABLE_GRAFANA=true
    GRAFANA_ENDPOINT=https://your-grafana-instance.klutch.sh

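    Encryption Key (generate a random 128-character string once and keep it unchanged; DevLake uses it to encrypt stored credentials, so replacing it later makes existing connection secrets unreadable):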

    ENCRYPTION_SECRET=your-random-128-character-encryption-key
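One way to produce such a value is sketched below. It assumes the upstream convention of a secret made of 128 uppercase letters (confirm this against the DevLake documentation for your version) and that `/dev/urandom` is available. The function name `generate_encryption_secret` is hypothetical.

```shell
# Hypothetical helper: print 128 random uppercase letters for ENCRYPTION_SECRET.
# Reads 4096 random bytes, keeps only A-Z, and truncates to 128 characters.
generate_encryption_secret() {
  head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Z' | cut -c1-128
}
```

Run it once, paste the output into the Klutch.sh secret, and store a copy somewhere safe rather than regenerating it on each deploy.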

    Attach Persistent Volume

    DevLake stores configuration files and logs that should persist across deployments.

    In your app settings, attach a persistent volume:

    • Mount Path: /app/.config
    • Size: At least 5GB (adjust based on your data volume)

    This ensures your data source configurations and logs are preserved during restarts and updates.

    Deploy the App

    Click “Create” to deploy. Klutch.sh will:

    1. Build the Docker image from your Dockerfile
    2. Deploy the container with your configuration
    3. Mount the persistent volume
    4. Make the app available at example-app.klutch.sh

4. Initial Configuration

Once deployed, access your DevLake instance at your app URL (e.g., https://example-app.klutch.sh).
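The URL may not answer until the container has started and the first database migrations have finished. A small retry loop is one way to wait for it. The sketch below is a generic helper whose name and arguments are illustrative, not part of DevLake or Klutch.sh; the `/api/ping` path in the usage comment is the one the Dockerfile health check in this guide relies on.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry <max_attempts> <delay_seconds> <command...>
# Example: retry 20 15 curl -fsS https://example-app.klutch.sh/api/ping
retry() {
  max="$1" delay="$2"
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

The helper returns 0 as soon as the command succeeds, so it can gate later steps in a deploy script.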

    Access the Config UI

    Open your DevLake app URL in a browser. You’ll see the DevLake Config UI.

    Set Up Database Connection

    On first launch, DevLake will automatically use the DB_URL environment variable to connect to your database and run migrations.

    Configure Data Sources

    Add connections to your data sources:

    1. Click “Connections” in the sidebar
    2. Click “Add Connection”
    3. Select your data source type (GitHub, GitLab, Jira, Jenkins, etc.)
    4. Enter the required credentials:
      • GitHub: Personal Access Token with repo access
      • GitLab: Personal Access Token with API access
      • Jira: Username and API token
      • Jenkins: Username and API token

    Security Tip: Use tokens with minimal required permissions for each data source.

    Create a Project

    1. Navigate to “Projects” in the Config UI
    2. Click “Create Project”
    3. Add your repositories and boards
    4. Configure the metrics you want to track

    Run Data Collection

    1. Go to “Blueprints” (data collection pipelines)
    2. Create a new blueprint or use a template
    3. Configure the collection frequency
    4. Run the blueprint to start collecting data

5. Setting Up Grafana (Optional)

For advanced visualizations, you can deploy Grafana alongside DevLake:

    Deploy Grafana on Klutch.sh

    Create a separate Grafana app on Klutch.sh using the official Grafana image.

    Configure Grafana Connection

    Set the GRAFANA_ENDPOINT environment variable in your DevLake app to point to your Grafana instance.

    Import DevLake Dashboards

    DevLake provides pre-built Grafana dashboards for DORA metrics, sprint analytics, and more. Import these dashboards from the DevLake GitHub repository.


6. Environment Variables Reference

The environment variables referenced in this guide are listed below:

Required:

DB_URL=mysql://user:pass@host:3306/dbname?charset=utf8mb4&parseTime=True&loc=UTC
ENCRYPTION_SECRET=your-128-character-secret

Optional:

PORT=8080
API_TIMEOUT=120s
ENABLE_GRAFANA=true
GRAFANA_ENDPOINT=https://grafana.example.com
LOG_LEVEL=info

For PostgreSQL:

DB_URL=postgres://user:pass@host:5432/dbname?sslmode=disable
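Assembling `DB_URL` by hand is error-prone, so a small helper can build it from the connection details noted during database setup. This is a minimal sketch: the function name `build_db_url` is hypothetical, the query strings are copied from the two examples in this guide, and it assumes the credentials contain only URL-safe characters.

```shell
# Hypothetical helper: assemble DB_URL from the connection details noted earlier.
# Assumes user, password, and database name contain only URL-safe characters;
# percent-encode them first if they include characters such as @ : / ? #.
build_db_url() {
  engine="$1" user="$2" pass="$3" host="$4" port="$5" name="$6"
  case "$engine" in
    mysql)
      printf 'mysql://%s:%s@%s:%s/%s?charset=utf8mb4&parseTime=True&loc=UTC\n' \
        "$user" "$pass" "$host" "$port" "$name" ;;
    postgres)
      printf 'postgres://%s:%s@%s:%s/%s?sslmode=disable\n' \
        "$user" "$pass" "$host" "$port" "$name" ;;
    *)
      echo "unsupported engine: $engine" >&2
      return 1 ;;
  esac
}
```

For example, `build_db_url mysql username password host 3306 dbname` prints the MySQL form shown under Required.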

7. Sample Code: Getting Started with DevLake API

DevLake exposes a REST API that you can use to programmatically manage connections and trigger data collection.

Create a Connection (Example: GitHub)

curl -X POST https://example-app.klutch.sh/api/plugins/github/connections \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-github",
    "endpoint": "https://api.github.com/",
    "token": "ghp_your_token_here",
    "rateLimitPerHour": 5000
  }'
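Typing a token directly into a command leaves it in shell history. One alternative, sketched here, is to build the request body from an environment variable and pipe it to the curl call above with `-d @-` in place of the inline body. The function name `github_connection_payload` and the `GITHUB_TOKEN` variable are assumptions for illustration, and the sketch does no JSON escaping.

```shell
# Hypothetical helper: print the JSON body for a GitHub connection, reading
# the token from the GITHUB_TOKEN environment variable instead of the command line.
# Assumes the name and token contain no characters that need JSON escaping.
github_connection_payload() {
  name="$1"
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN must be set" >&2
    return 1
  fi
  printf '{"name":"%s","endpoint":"https://api.github.com/","token":"%s","rateLimitPerHour":5000}\n' \
    "$name" "$GITHUB_TOKEN"
}
```

The helper fails early when the variable is missing, which avoids sending a request with an empty token.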

Trigger a Data Collection

# Replace :blueprintId with the numeric ID of the blueprint to run
curl -X POST https://example-app.klutch.sh/api/blueprints/:blueprintId/trigger

Query Collected Metrics

# This path may not exist in every version; verify it against your instance's API reference
curl -X GET https://example-app.klutch.sh/api/metrics/dora

8. Persistent Storage Best Practices

DevLake requires persistent storage for:

  • Configuration files: Data source connections and settings
  • Logs: Application and collection logs
  • Temporary data: Cache and intermediate processing files

Recommended mount paths:

  • /app/.config - Main configuration directory (required)
  • /app/logs - Log files (optional but recommended)

Volume sizing:

  • Minimum: 5GB for basic usage
  • Recommended: 20GB+ for production with multiple data sources
  • Large deployments: 50GB+ depending on data volume
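To check a mounted volume against these tiers from inside the container, something like the following could be used. It is a sketch built on the POSIX `df -Pk` output format; the function name `has_free_gb` is hypothetical, and it assumes the filesystem name contains no spaces.

```shell
# Hypothetical helper: succeed if the filesystem holding a path has at least
# the requested number of free gigabytes (1 GB counted as 1024 * 1024 KB).
has_free_gb() {
  path="$1" want_gb="$2"
  # With -P, line 2 of df output holds the figures; column 4 is available KB.
  free_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
  [ "$free_kb" -ge $((want_gb * 1024 * 1024)) ]
}
```

For example, `has_free_gb /app/.config 5 || echo "volume is below the 5GB minimum"` flags a volume that is running low.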

9. Production Best Practices

    Use External Database

    Always use a dedicated MySQL or PostgreSQL database for production. Do not use SQLite (which is only suitable for testing).

    Database recommendations:

    • MySQL 8.0+ or PostgreSQL 13+
    • At least 2GB RAM for the database
    • Regular backups (daily recommended)
    • Connection pooling enabled

    Secure Your Deployment

    • Use strong encryption keys (128+ characters)
    • Rotate API tokens regularly
    • Restrict database access to DevLake’s IP only
    • Enable SSL/TLS for database connections
    • Use Klutch.sh’s secret management for all credentials

    Monitor Resource Usage

    DevLake can be resource-intensive when collecting large amounts of data:

    • Monitor CPU and memory usage
    • Scale vertically (more RAM/CPU) if needed
    • Adjust collection frequency based on your needs
    • Use incremental collection instead of full syncs when possible

    Backup Strategy

    1. Database: Schedule regular database backups (automated recommended)
    2. Configuration: Backup the /app/.config volume regularly
    3. Logs: Archive logs to object storage for long-term retention
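The configuration backup in step 2 can be as simple as a date-stamped tarball. The sketch below assumes `tar` with gzip support is available where the volume is mounted; the function name `backup_config_dir` and the archive naming scheme are illustrative choices, not a DevLake or Klutch.sh feature.

```shell
# Hypothetical helper: archive a configuration directory into a date-stamped
# tarball and print the archive path. Paths are passed in by the caller.
backup_config_dir() {
  src="$1" dest_dir="$2"
  archive="$dest_dir/devlake-config-$(date -u +%Y%m%dT%H%M%SZ).tar.gz"
  tar -czf "$archive" -C "$src" . && printf '%s\n' "$archive"
}
```

A call such as `backup_config_dir /app/.config /backups` prints the path of the new archive, which can then be copied to object storage.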

    Performance Optimization

    • Set appropriate API_TIMEOUT values for large data collections
    • Use webhooks instead of polling where supported
    • Schedule data collections during off-peak hours
    • Limit the number of concurrent collections

10. Troubleshooting

DevLake Won’t Start

Check database connectivity:

# Test database connection
mysql -h your-db-host -u username -p

Verify environment variables:

  • Ensure DB_URL is correctly formatted
  • Check that ENCRYPTION_SECRET is set
  • Verify database exists and user has proper permissions
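The first two checks can be automated with a small preflight script run wherever the variables are set. This is a sketch only: the function name `check_devlake_env` is hypothetical, and the accepted `DB_URL` shapes are simply the two formats shown in this guide, not a full validation.

```shell
# Hypothetical preflight check: report missing or malformed settings before
# a deploy. The accepted DB_URL prefixes match the two examples in this guide.
check_devlake_env() {
  rc=0
  case "${DB_URL:-}" in
    mysql://*@*/* | postgres://*@*/*) ;;
    "")
      echo "DB_URL is not set" >&2
      rc=1 ;;
    *)
      echo "DB_URL does not look like mysql://user:pass@host:port/dbname" >&2
      rc=1 ;;
  esac
  if [ -z "${ENCRYPTION_SECRET:-}" ]; then
    echo "ENCRYPTION_SECRET is not set" >&2
    rc=1
  fi
  return "$rc"
}
```

It reports every problem it finds before returning, so a single run shows everything that needs fixing.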

Data Collection Failing

Common issues:

  • Invalid or expired API tokens
  • Rate limiting from data sources
  • Network connectivity issues
  • Insufficient permissions on tokens

Solutions:

  • Regenerate API tokens with proper scopes
  • Lower rateLimitPerHour to stay within the source’s quota, or reduce collection frequency
  • Check Klutch.sh network logs
  • Verify token permissions in the data source settings

High Memory Usage

DevLake can consume significant memory during data collection:

  • Reduce the number of concurrent collections
  • Increase the app’s memory allocation
  • Use incremental collection instead of full syncs
  • Schedule collections during off-peak hours

Database Migration Errors

If you see database migration errors:

  • Check the migration status in the DevLake logs; migrations run automatically on startup
  • If a migration appears stuck, back up the database before resetting anything manually

11. Upgrading DevLake

To upgrade DevLake to a newer version:

    Update Dockerfile

    Edit your Dockerfile to use the new version tag:

    # Update the version tag here (Dockerfile comments must be on their own line)
    FROM apache/devlake:v0.21.0
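If upgrades are frequent, the tag change can be scripted. The sketch below rewrites the `FROM` line in place; the function name `bump_devlake_tag` is hypothetical, and it assumes the Dockerfile has a single `FROM apache/devlake:<tag>` line and that the new tag contains no `|`, `&`, or backslash characters.

```shell
# Hypothetical helper: rewrite the apache/devlake tag on the FROM line of a
# Dockerfile. Writes to a temporary file first so a failed sed leaves the
# original untouched.
bump_devlake_tag() {
  dockerfile="$1" new_tag="$2"
  tmp_file="$dockerfile.tmp"
  sed "s|^FROM apache/devlake:.*|FROM apache/devlake:$new_tag|" "$dockerfile" > "$tmp_file" \
    && mv "$tmp_file" "$dockerfile"
}
```

For example, `bump_devlake_tag Dockerfile v0.21.0` updates the file before the commit step.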

    Commit and Push

    git add Dockerfile
    git commit -m "Upgrade DevLake to v0.21.0"
    git push origin main

    Deploy Update

    Klutch.sh will automatically rebuild and redeploy with the new version. The database migrations will run automatically on startup.

    Verify Upgrade

    1. Check the DevLake UI to confirm the new version
    2. Verify all data sources are still connected
    3. Run a test data collection
    4. Check Grafana dashboards (if using)


Summary

Deploying Apache DevLake on Klutch.sh provides a powerful platform for engineering metrics and analytics. With Dockerfile-based deployment, persistent storage, and production-ready configurations, you can track DORA metrics, sprint performance, code quality, and more across all your development tools.

Key takeaways:

  • Use the official apache/devlake Docker image
  • Configure external MySQL/PostgreSQL for production
  • Attach persistent volumes for configuration and logs
  • Secure all credentials using Klutch.sh environment variables
  • Monitor resource usage and scale as needed
  • Back up the database and configuration regularly

For questions or issues, refer to the DevLake community or Klutch.sh documentation.