
Deploying Druid

Introduction

Apache Druid is a high-performance, real-time analytics database designed for fast aggregations and exploratory queries on event-driven data. Built to power interactive dashboards and high-concurrency analytics applications, Druid delivers sub-second query latencies even when analyzing trillions of events. With its column-oriented storage, distributed architecture, and support for streaming and batch ingestion, Druid has become the go-to choice for organizations that need to analyze large-scale time-series data in real time.

Druid combines the best aspects of data warehouses, time-series databases, and search systems into a unified platform. It supports both SQL and native queries, making it accessible to analysts while providing the performance that engineers demand. Companies like Netflix, Airbnb, and Reddit rely on Druid to power their analytics at massive scale.

Key Features:

  • Lightning-Fast Queries: Optimized column-oriented storage with bitmap indexes enables sub-second aggregations over billions of rows
  • Real-Time Streaming: Native support for Apache Kafka and Amazon Kinesis allows continuous data ingestion with immediate query availability
  • Scalable Architecture: Horizontally scalable with separate compute for ingestion, storage, and queries
  • Time-Optimized: Built specifically for time-series analytics with advanced time-based partitioning and rollup capabilities
  • Flexible Schema: Support for nested JSON data and schema evolution without downtime
  • High Availability: Automatic failover, replication, and self-healing capabilities ensure continuous operation
  • SQL Support: A familiar SQL layer (Druid SQL) that covers most analytical query patterns, with extensions for time-series operations
  • Approximate Algorithms: Built-in support for HyperLogLog, theta sketches, and other probabilistic data structures
  • Rich Ecosystem: Native integrations with Kafka, Hadoop, S3, and modern data infrastructure

This comprehensive guide walks you through deploying Apache Druid on Klutch.sh using Docker, including single-server and clustered configurations, metadata storage setup, deep storage integration, and production-ready best practices for operating Druid at scale.

Why Deploy Druid on Klutch.sh

Deploying Apache Druid on Klutch.sh provides several advantages for real-time analytics workloads:

  • Simplified Infrastructure: Klutch.sh automatically detects your Dockerfile and handles container orchestration, letting you focus on Druid configuration rather than infrastructure management
  • TCP Traffic Support: Native TCP routing exposes your app on port 8000 and forwards it to the internal Druid port you choose, such as the router (8888), broker (8082), or coordinator (8081)
  • Persistent Storage: Attach persistent volumes for local segment cache and metadata, ensuring data durability across deployments and enabling fast query performance
  • Environment Management: Securely configure Druid properties through environment variables without exposing sensitive credentials in your repository
  • Vertical Scaling: Easily adjust CPU and memory resources to match your query concurrency and data volume requirements
  • GitHub Integration: Deploy directly from GitHub with automatic rebuilds when you update Druid configuration or dependencies
  • Cost Efficiency: Start with single-server Druid deployments for development and testing, then scale to clustered configurations as your data grows
  • Multi-Region Support: Deploy Druid instances in regions close to your data sources and users for optimal latency
  • HTTP and TCP: Support both HTTP REST API access and native query protocols on the same deployment

Prerequisites

Before deploying Druid on Klutch.sh, ensure you have:

  • A Klutch.sh account
  • A GitHub account for repository hosting
  • Docker installed locally for testing (optional but recommended)
  • Basic understanding of data warehousing and analytics concepts
  • Familiarity with SQL and time-series data
  • (Recommended) A PostgreSQL database for metadata storage in production
  • (Recommended) Object storage (S3-compatible) for deep storage in production
  • (Optional) A Kafka cluster for streaming ingestion

Understanding Druid Architecture

Apache Druid uses a distributed, microservices-based architecture with specialized processes for different functions:

Core Processes

Master Server Processes:

  • Coordinator: Manages data availability and segment balancing across Historical nodes
  • Overlord: Manages data ingestion workloads and task distribution to MiddleManager nodes

Query Server Processes:

  • Broker: Routes queries to appropriate data nodes and merges results
  • Router: Optional API gateway that provides a unified endpoint for the Druid cluster

Data Server Processes:

  • Historical: Serves queries on immutable, historical data segments
  • MiddleManager: Ingests data and creates new segments
  • Indexer: Alternative to MiddleManager for simplified deployment

External Dependencies

  • Deep Storage: Long-term storage for segments (S3, HDFS, local filesystem, etc.)
  • Metadata Storage: Stores segment metadata and system configuration (PostgreSQL, MySQL, or Derby)
  • ZooKeeper: Coordinates internal service discovery and leader election

Deployment Modes

Micro-Quickstart (Development): All processes in a single JVM with Derby metadata storage

Single-Server (Small Production): Multiple JVM processes on one machine with external metadata storage

Clustered (Large Production): Distributed processes across multiple machines for high availability

For Klutch.sh deployments, we’ll focus on single-server configurations that can scale vertically, with external metadata and deep storage for production use.

Preparing Your Repository

To deploy Druid on Klutch.sh, you’ll create a GitHub repository with a Dockerfile and configuration files.

Step 1: Create Repository Structure

Create a new directory for your Druid deployment:

Terminal window
mkdir druid-klutch
cd druid-klutch
git init

Create the following directory structure:

druid-klutch/
├── Dockerfile
├── docker-compose.yml # For local testing only
├── conf/
│ ├── druid/
│ │ └── cluster/
│ │ └── _common/
│ │ ├── common.runtime.properties
│ │ └── log4j2.xml
│ └── supervise/
│ └── single-server/
│ └── micro-quickstart.conf
├── scripts/
│ └── start-druid.sh
└── README.md

Step 2: Create the Dockerfile

Create a production-ready Dockerfile in your repository root:

# Use official Apache Druid image as base
FROM apache/druid:28.0.0
# Set environment variables for Java heap settings
# These will be overridden by Klutch.sh environment variables
ENV DRUID_XMX=1g
ENV DRUID_XMS=1g
ENV DRUID_MAXNEWSIZE=250m
ENV DRUID_NEWSIZE=250m
ENV DRUID_MAXDIRECTMEMORYSIZE=400m
# Set Druid service to start (single-server mode)
# Options: micro-quickstart, small, medium, large, xlarge
ENV DRUID_SINGLE_SERVER_TYPE=micro-quickstart
# Set working directory
WORKDIR /opt/druid
# Copy custom configurations
# (COPY does not support shell redirection, so copy the whole directory)
COPY conf/druid/cluster/_common/ conf/druid/cluster/_common/
# Copy custom startup script
COPY scripts/start-druid.sh /opt/druid/scripts/
RUN chmod +x /opt/druid/scripts/start-druid.sh
# Create directories for persistent storage
RUN mkdir -p /opt/druid/var/druid/segments \
&& mkdir -p /opt/druid/var/druid/segment-cache \
&& mkdir -p /opt/druid/var/druid/task \
&& mkdir -p /opt/druid/var/tmp
# Expose Druid ports
# 8081: Coordinator
# 8082: Broker
# 8083: Historical
# 8090: Overlord
# 8091: MiddleManager
# 8888: Router (unified API endpoint)
EXPOSE 8081 8082 8083 8090 8091 8888
# Health check on router endpoint
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
CMD curl -f http://localhost:8888/status/health || exit 1
# Use custom start script
CMD ["/opt/druid/scripts/start-druid.sh"]
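
If you have Docker installed locally, it's worth a quick sanity check of the image before pushing. The commands below assume you run them from the repository root and that your machine has enough free RAM for the default heap settings.

Terminal window
# Build the image locally
docker build -t druid-klutch .
# Run it and expose the router port
docker run --rm -p 8888:8888 druid-klutch
# In another terminal, confirm the router responds
curl http://localhost:8888/status/health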

Step 3: Create Common Configuration

Create conf/druid/cluster/_common/common.runtime.properties:

# Extensions
druid.extensions.loadList=["druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-kafka-indexing-service", "druid-s3-extensions"]
# Logging
druid.startup.logging.logProperties=true
# Zookeeper
# For single-server, use embedded ZooKeeper
druid.zk.service.host=localhost
druid.zk.paths.base=/druid
# Metadata storage (Derby for quickstart, PostgreSQL for production)
# Override these with environment variables in Klutch.sh
druid.metadata.storage.type=derby
druid.metadata.storage.connector.connectURI=jdbc:derby://localhost:1527/var/druid/metadata.db;create=true
druid.metadata.storage.connector.host=localhost
druid.metadata.storage.connector.port=1527
druid.metadata.storage.connector.createTables=true
# Deep storage (local for quickstart, S3 for production)
druid.storage.type=local
druid.storage.storageDirectory=/opt/druid/var/druid/segments
# Indexing service logs
druid.indexer.logs.type=file
druid.indexer.logs.directory=/opt/druid/var/druid/indexing-logs
# Service discovery
druid.selectors.indexing.serviceName=druid/overlord
druid.selectors.coordinator.serviceName=druid/coordinator
# Monitoring
druid.monitoring.monitors=["org.apache.druid.java.util.metrics.JvmMonitor"]
druid.emitter=noop
# Storage type of double columns
druid.indexing.doubleStorage=double
# SQL
druid.sql.enable=true
# Lookups
druid.lookup.enableLookupSyncOnStartup=false

Step 4: Create Startup Script

Create scripts/start-druid.sh:

#!/bin/bash
set -e

echo "Starting Apache Druid in single-server mode: ${DRUID_SINGLE_SERVER_TYPE}"

# Override metadata storage if PostgreSQL credentials provided
if [ -n "$POSTGRES_HOST" ]; then
  echo "Configuring PostgreSQL metadata storage..."
  cat >> /opt/druid/conf/druid/cluster/_common/common.runtime.properties <<EOF
# PostgreSQL metadata storage (production)
druid.metadata.storage.type=postgresql
druid.metadata.storage.connector.connectURI=jdbc:postgresql://${POSTGRES_HOST}:${POSTGRES_PORT:-5432}/${POSTGRES_DB:-druid}
druid.metadata.storage.connector.user=${POSTGRES_USER:-druid}
druid.metadata.storage.connector.password=${POSTGRES_PASSWORD}
druid.metadata.storage.connector.createTables=true
EOF
fi

# Override deep storage if S3 credentials provided
if [ -n "$S3_BUCKET" ]; then
  echo "Configuring S3 deep storage..."
  cat >> /opt/druid/conf/druid/cluster/_common/common.runtime.properties <<EOF
# S3 deep storage (production)
druid.storage.type=s3
druid.storage.bucket=${S3_BUCKET}
druid.storage.baseKey=${S3_BASE_KEY:-druid/segments}
druid.s3.accessKey=${S3_ACCESS_KEY}
druid.s3.secretKey=${S3_SECRET_KEY}
druid.s3.endpoint.url=${S3_ENDPOINT:-}
EOF
fi

# Set Java heap sizes from environment variables (fall back to defaults)
export DRUID_XMX=${DRUID_XMX:-1g}
export DRUID_XMS=${DRUID_XMS:-1g}
export DRUID_MAXNEWSIZE=${DRUID_MAXNEWSIZE:-250m}
export DRUID_NEWSIZE=${DRUID_NEWSIZE:-250m}
export DRUID_MAXDIRECTMEMORYSIZE=${DRUID_MAXDIRECTMEMORYSIZE:-400m}

# Start Druid in single-server mode
exec /opt/druid/bin/start-${DRUID_SINGLE_SERVER_TYPE}

Step 5: Create Docker Compose for Local Testing

Create docker-compose.yml for local development and testing:

version: "3.8"

services:
  postgres:
    image: postgres:16-alpine
    container_name: druid-postgres
    environment:
      POSTGRES_DB: druid
      POSTGRES_USER: druid
      POSTGRES_PASSWORD: druid_password
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U druid"]
      interval: 10s
      timeout: 5s
      retries: 5

  druid:
    build: .
    container_name: druid-single
    ports:
      - "8888:8888" # Router
      - "8081:8081" # Coordinator
      - "8082:8082" # Broker
      - "8083:8083" # Historical
      - "8090:8090" # Overlord
      - "8091:8091" # MiddleManager
    environment:
      # Java heap settings
      DRUID_XMX: 2g
      DRUID_XMS: 2g
      DRUID_MAXNEWSIZE: 500m
      DRUID_NEWSIZE: 500m
      DRUID_MAXDIRECTMEMORYSIZE: 1g
      # Server type
      DRUID_SINGLE_SERVER_TYPE: micro-quickstart
      # PostgreSQL metadata storage
      POSTGRES_HOST: postgres
      POSTGRES_PORT: 5432
      POSTGRES_DB: druid
      POSTGRES_USER: druid
      POSTGRES_PASSWORD: druid_password
    volumes:
      - druid_data:/opt/druid/var
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8888/status/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

volumes:
  postgres_data:
  druid_data:

Step 6: Create Documentation

Create a README.md:

# Apache Druid on Klutch.sh
Real-time analytics database for fast queries at scale.
## Features
- Sub-second query latencies on large datasets
- Real-time streaming ingestion from Kafka
- Column-oriented storage with bitmap indexes
- SQL and native query support
- Horizontal scalability
- High availability with automatic failover
## Local Development
Test locally with Docker Compose:
```bash
docker-compose up -d
```

Access Druid web console at: http://localhost:8888

Production Deployment on Klutch.sh

Required Environment Variables

Set these in the Klutch.sh dashboard:

Java Heap Configuration:

  • DRUID_XMX: Maximum heap size (e.g., 4g)
  • DRUID_XMS: Initial heap size (e.g., 4g)
  • DRUID_MAXNEWSIZE: Max new generation size (e.g., 1g)
  • DRUID_NEWSIZE: Initial new generation size (e.g., 1g)
  • DRUID_MAXDIRECTMEMORYSIZE: Max direct memory (e.g., 2g)

PostgreSQL Metadata Storage (Recommended for production):

  • POSTGRES_HOST: PostgreSQL host
  • POSTGRES_PORT: PostgreSQL port (default: 5432)
  • POSTGRES_DB: Database name (e.g., druid)
  • POSTGRES_USER: Database user
  • POSTGRES_PASSWORD: Database password

S3 Deep Storage (Recommended for production):

  • S3_BUCKET: S3 bucket name
  • S3_BASE_KEY: Base path in bucket (default: druid/segments)
  • S3_ACCESS_KEY: AWS access key
  • S3_SECRET_KEY: AWS secret key
  • S3_ENDPOINT: S3 endpoint (optional, for S3-compatible storage)

Persistent Volumes

Attach a persistent volume for local caching and temporary storage:

  • Mount Path: /opt/druid/var
  • Recommended Size: 50GB-200GB depending on query volume

Traffic Configuration

  • Traffic Type: Select HTTP for web console and API access
  • Internal Port: 8888 (Router endpoint)

Alternative for programmatic access:

  • Traffic Type: TCP for native Druid client connections
  • Internal Port: 8082 (Broker endpoint)

Usage

Web Console

Access the Druid web console to:

  • Load data through the data loader wizard
  • Execute SQL queries
  • Monitor ingestion tasks
  • View datasources and segments

SQL Queries

Query Druid using standard SQL:

SELECT
TIME_FLOOR(__time, 'PT1H') AS hour,
COUNT(*) AS event_count,
SUM(bytes_sent) AS total_bytes
FROM events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
GROUP BY 1
ORDER BY 1 DESC

Streaming Ingestion

Ingest data from Kafka using supervisor specs or the web console data loader.

License

Apache License 2.0

Step 7: Initialize Git and Push to GitHub

Terminal window
# Add all files
git add .
# Commit
git commit -m "Initial Druid deployment configuration"
# Add GitHub remote (replace with your repository URL)
git remote add origin https://github.com/yourusername/druid-klutch.git
# Push to GitHub
git branch -M main
git push -u origin main

Deploying to Klutch.sh

Now that your repository is prepared, follow these steps to deploy Apache Druid on Klutch.sh.

Deployment Steps

  1. **Navigate to Klutch.sh Dashboard**

    Visit klutch.sh/app and log in to your account.

  2. **Create a New Project**

    Click “New Project” and give it a name like “Druid Analytics” to organize your deployment.

  3. **Create a New App**

    Click “New App” or “Create App” and select GitHub as your source.

  4. **Connect Your Repository**
    • Authenticate with GitHub if not already connected
    • Select your Druid repository from the list
    • Choose the main branch for deployment
  5. **Configure Application Settings**
    • App Name: Choose a unique name (e.g., druid-analytics)
    • Traffic Type: Select HTTP for web console access
    • Internal Port: Set to 8888 (Druid Router endpoint)

    For programmatic access via native Druid protocol, you can alternatively use:

    • Traffic Type: TCP
    • Internal Port: 8082 (Broker endpoint)
  6. **Set Environment Variables**

    Configure these environment variables in the Klutch.sh dashboard:

    Required - Java Heap Configuration:

    DRUID_XMX=4g
    DRUID_XMS=4g
    DRUID_MAXNEWSIZE=1g
    DRUID_NEWSIZE=1g
    DRUID_MAXDIRECTMEMORYSIZE=2g

    Recommended - PostgreSQL Metadata Storage:

    First, deploy PostgreSQL using our PostgreSQL guide, then configure:

    POSTGRES_HOST=your-postgres-app.klutch.sh
    POSTGRES_PORT=8000
    POSTGRES_DB=druid
    POSTGRES_USER=druid
    POSTGRES_PASSWORD=your-secure-password

    Recommended - S3 Deep Storage:

    S3_BUCKET=your-druid-bucket
    S3_BASE_KEY=druid/segments
    S3_ACCESS_KEY=your-access-key
    S3_SECRET_KEY=your-secret-key

    For S3-compatible storage (MinIO, Wasabi, etc.):

    S3_ENDPOINT=https://s3.your-provider.com

    Optional - Server Sizing:

    DRUID_SINGLE_SERVER_TYPE=small

    Options: micro-quickstart, small, medium, large, xlarge

  7. **Attach Persistent Volume**

    Critical for local segment caching and temporary storage:

    • Click “Add Volume” in the Volumes section
    • Mount Path: /opt/druid/var
    • Size: 50GB minimum, 100-200GB recommended for production

    This volume stores:

    • Segment cache for fast query performance
    • Task logs and temporary ingestion files
    • ZooKeeper data for embedded mode
  8. **Deploy Application**

    Click “Create” or “Deploy” to start the deployment. Klutch.sh will:

    • Automatically detect your Dockerfile
    • Build the Docker image with your Druid configuration
    • Attach the persistent volume
    • Start your Druid container
    • Assign a URL for external access

    The first deployment takes 3-5 minutes as Druid initializes metadata tables and starts all services.

  9. **Verify Deployment**

    Once deployed, your Druid instance will be available at:

    https://your-app-name.klutch.sh

    Access the Druid web console by visiting this URL in your browser. You should see:

    • The Druid console home page
    • Available datasources (empty on first deployment)
    • Status indicators showing all services running
  10. **Test Database Connection**

    Verify Druid is running properly:

    Via Web Console:

    • Navigate to the Query view
    • Execute a test query: SELECT 1
    • Verify successful execution

    Via HTTP API:

    Terminal window
    curl https://your-app-name.klutch.sh/status/health

    Expected response: {"status":"healthy"}

    Via SQL endpoint:

    Terminal window
    curl -X POST \
    -H 'Content-Type: application/json' \
    https://your-app-name.klutch.sh/druid/v2/sql \
    -d '{"query": "SELECT 1"}'

Connecting to Druid

Once deployed, you can connect to Druid from your applications using various methods and client libraries.

Connection URL Formats

HTTP REST API:

https://example-app.klutch.sh/druid/v2

SQL endpoint:

https://example-app.klutch.sh/druid/v2/sql

Web Console:

https://example-app.klutch.sh

Native Query (TCP traffic):

If deployed with TCP traffic on broker port:

example-app.klutch.sh:8000
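
Native (JSON) queries also travel over HTTP, posted to the /druid/v2 endpoint. As a rough sketch, a minimal timeseries query against a hypothetical events datasource looks like this:

Terminal window
curl -X POST \
  -H 'Content-Type: application/json' \
  https://example-app.klutch.sh/druid/v2 \
  -d '{
    "queryType": "timeseries",
    "dataSource": "events",
    "granularity": "hour",
    "intervals": ["2024-01-01/2024-01-02"],
    "aggregations": [{"type": "count", "name": "count"}]
  }'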

Example Connection Code

Python (using pydruid)

from pydruid.client import PyDruid
from pydruid.utils.aggregators import doublesum
from pydruid.utils.filters import Dimension

# Connect to Druid
druid = PyDruid(
    'https://example-app.klutch.sh',
    'druid/v2/'
)

# Execute a native Druid timeseries query
result = druid.timeseries(
    datasource='events',
    granularity='hour',
    intervals='2024-01-01/2024-01-02',
    aggregations={'count': doublesum('count')},
    filter=Dimension('country') == 'US'
)
print(result)

Python (using SQL with requests)

import requests
import json

# SQL query endpoint
url = 'https://example-app.klutch.sh/druid/v2/sql'

# Execute SQL query
query = {
    "query": """
        SELECT
          TIME_FLOOR(__time, 'PT1H') AS hour,
          COUNT(*) AS event_count,
          SUM(bytes_sent) AS total_bytes
        FROM events
        WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
        GROUP BY 1
        ORDER BY 1 DESC
    """
}

response = requests.post(
    url,
    headers={'Content-Type': 'application/json'},
    data=json.dumps(query)
)

results = response.json()
for row in results:
    print(f"Hour: {row['hour']}, Events: {row['event_count']}, Bytes: {row['total_bytes']}")

Node.js (using axios)

const axios = require('axios');

// Druid SQL endpoint
const druidUrl = 'https://example-app.klutch.sh/druid/v2/sql';

// Execute SQL query
async function queryDruid() {
  const query = {
    query: `
      SELECT
        TIME_FLOOR(__time, 'PT1H') AS hour,
        COUNT(*) AS event_count
      FROM events
      WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
      GROUP BY 1
      ORDER BY 1 DESC
    `
  };

  try {
    const response = await axios.post(druidUrl, query, {
      headers: { 'Content-Type': 'application/json' }
    });
    console.log('Query results:', response.data);
    return response.data;
  } catch (error) {
    console.error('Query failed:', error.message);
    throw error;
  }
}

// Run query
queryDruid();

Java (using Druid SQL JDBC)

import java.sql.*;

public class DruidExample {
    public static void main(String[] args) {
        String url = "jdbc:avatica:remote:url=https://example-app.klutch.sh/druid/v2/sql/avatica/";

        try (Connection conn = DriverManager.getConnection(url)) {
            String sql = """
                SELECT
                  TIME_FLOOR(__time, 'PT1H') AS hour,
                  COUNT(*) AS event_count
                FROM events
                WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
                GROUP BY 1
                ORDER BY 1 DESC
                """;

            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(
                        "Hour: " + rs.getTimestamp("hour") +
                        ", Events: " + rs.getLong("event_count")
                    );
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

Go (using HTTP client)

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
)

type SQLQuery struct {
    Query string `json:"query"`
}

type QueryResult struct {
    Hour       string `json:"hour"`
    EventCount int64  `json:"event_count"`
}

func main() {
    druidURL := "https://example-app.klutch.sh/druid/v2/sql"

    query := SQLQuery{
        Query: `
            SELECT
              TIME_FLOOR(__time, 'PT1H') AS hour,
              COUNT(*) AS event_count
            FROM events
            WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
            GROUP BY 1
            ORDER BY 1 DESC
        `,
    }

    jsonData, _ := json.Marshal(query)

    resp, err := http.Post(
        druidURL,
        "application/json",
        bytes.NewBuffer(jsonData),
    )
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, _ := ioutil.ReadAll(resp.Body)

    var results []QueryResult
    json.Unmarshal(body, &results)

    for _, r := range results {
        fmt.Printf("Hour: %s, Events: %d\n", r.Hour, r.EventCount)
    }
}

Ruby (using HTTP client)

require 'net/http'
require 'json'
require 'uri'

# Druid SQL endpoint
uri = URI('https://example-app.klutch.sh/druid/v2/sql')

# SQL query
query = {
  query: <<~SQL
    SELECT
      TIME_FLOOR(__time, 'PT1H') AS hour,
      COUNT(*) AS event_count
    FROM events
    WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
    GROUP BY 1
    ORDER BY 1 DESC
  SQL
}

# Execute query
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

request = Net::HTTP::Post.new(uri.path)
request['Content-Type'] = 'application/json'
request.body = query.to_json

response = http.request(request)
results = JSON.parse(response.body)

results.each do |row|
  puts "Hour: #{row['hour']}, Events: #{row['event_count']}"
end

PHP (using cURL)

<?php
// Druid SQL endpoint
$druidUrl = 'https://example-app.klutch.sh/druid/v2/sql';

// SQL query
$query = [
    'query' => "
        SELECT
          TIME_FLOOR(__time, 'PT1H') AS hour,
          COUNT(*) AS event_count
        FROM events
        WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
        GROUP BY 1
        ORDER BY 1 DESC
    "
];

// Execute query using cURL
$ch = curl_init($druidUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($query));
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json'
]);

$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($httpCode === 200) {
    $results = json_decode($response, true);
    foreach ($results as $row) {
        echo sprintf(
            "Hour: %s, Events: %d\n",
            $row['hour'],
            $row['event_count']
        );
    }
} else {
    echo "Query failed with HTTP code: $httpCode\n";
}

Getting Started with Druid

After deploying Druid on Klutch.sh, follow these steps to load data and run your first queries.

Loading Sample Data

The easiest way to get started is through the Druid web console’s data loader:

  1. **Access Web Console**

    Navigate to https://your-app-name.klutch.sh

  2. **Open Data Loader**

    Click “Load data” from the home page or navigate to the Ingestion view.

  3. **Choose Data Source**

    Select from various options:

    • Local disk: Upload a file directly
    • HTTP: Load data from a URL
    • Inline: Paste data directly
    • Kafka: Connect to a Kafka topic
    • Amazon Kinesis: Stream from Kinesis

    For testing, choose “Example data” to load a sample dataset.

  4. **Configure Ingestion**

    Follow the wizard to:

    • Parse your data format (JSON, CSV, etc.)
    • Define time column and parsing format
    • Configure dimensions and metrics
    • Set rollup and partitioning options
    • Review and submit ingestion task
  5. **Monitor Ingestion**

    Watch the task progress in the Ingestion view. Once complete, your data will be queryable immediately.

Running Your First Query

Execute SQL queries through the web console:

  1. **Navigate to Query View**

    Click “Query” in the top navigation.

  2. **Write Your SQL**

    Enter a SQL query:

    SELECT
    TIME_FLOOR(__time, 'PT1H') AS hour,
    COUNT(*) AS events
    FROM wikipedia
    GROUP BY 1
    ORDER BY 1 DESC
    LIMIT 24
  3. **Execute Query**

    Click “Run” or press Ctrl+Enter (Cmd+Enter on Mac).

  4. **View Results**

    Results appear in a table below the query editor. You can:

    • Export results to CSV or JSON
    • Visualize data with built-in charts
    • Save queries for later use

Streaming Ingestion from Kafka

To ingest real-time data from Kafka, you’ll need a Kafka cluster. Deploy one using our Kafka deployment guide.

Create a supervisor spec for streaming ingestion:

{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "events",
      "timestampSpec": {
        "column": "timestamp",
        "format": "iso"
      },
      "dimensionsSpec": {
        "dimensions": [
          "user_id",
          "event_type",
          "country",
          "device"
        ]
      },
      "metricsSpec": [
        {
          "type": "count",
          "name": "count"
        },
        {
          "type": "longSum",
          "name": "bytes_sent",
          "fieldName": "bytes"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "HOUR",
        "queryGranularity": "MINUTE",
        "rollup": true
      }
    },
    "ioConfig": {
      "topic": "events",
      "consumerProperties": {
        "bootstrap.servers": "your-kafka-app.klutch.sh:8000"
      },
      "taskCount": 1,
      "replicas": 1,
      "taskDuration": "PT1H"
    },
    "tuningConfig": {
      "type": "kafka",
      "maxRowsPerSegment": 5000000
    }
  }
}

Submit this spec through the web console or API:

Terminal window
curl -X POST \
-H 'Content-Type: application/json' \
https://your-app-name.klutch.sh/druid/indexer/v1/supervisor \
-d @supervisor-spec.json
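
Once the supervisor is submitted, you can manage it through the Overlord API; the supervisor ID defaults to the datasource name (events in the example spec above).

Terminal window
# List running supervisors
curl https://your-app-name.klutch.sh/druid/indexer/v1/supervisor
# Check the status of the events supervisor
curl https://your-app-name.klutch.sh/druid/indexer/v1/supervisor/events/status
# Pause ingestion without deleting the supervisor
curl -X POST https://your-app-name.klutch.sh/druid/indexer/v1/supervisor/events/suspend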

Batch Ingestion from Files

Ingest data from local files or cloud storage:

{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "events",
      "timestampSpec": {
        "column": "timestamp",
        "format": "iso"
      },
      "dimensionsSpec": {
        "dimensions": ["user_id", "event_type"]
      },
      "metricsSpec": [
        {"type": "count", "name": "count"}
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "HOUR"
      }
    },
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "http",
        "uris": ["https://example.com/data.json"]
      },
      "inputFormat": {
        "type": "json"
      }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "maxRowsPerSegment": 5000000
    }
  }
}
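
Save the spec to a file (batch-spec.json here is just a placeholder name) and submit it to the Overlord task API; the response includes a task ID you can poll for status.

Terminal window
# Submit the batch ingestion task
curl -X POST \
  -H 'Content-Type: application/json' \
  https://your-app-name.klutch.sh/druid/indexer/v1/task \
  -d @batch-spec.json
# Poll the task status using the ID from the response
curl https://your-app-name.klutch.sh/druid/indexer/v1/task/{taskId}/status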

Advanced Configuration

Metadata Storage Configuration

For production deployments, use PostgreSQL for metadata storage instead of embedded Derby.

First, deploy PostgreSQL following our PostgreSQL guide. Then configure Druid to use it:

Environment Variables:

POSTGRES_HOST=your-postgres-app.klutch.sh
POSTGRES_PORT=8000
POSTGRES_DB=druid
POSTGRES_USER=druid
POSTGRES_PASSWORD=your-secure-password

The startup script automatically configures Druid to use PostgreSQL when these variables are set.

Manual Configuration (in common.runtime.properties):

druid.metadata.storage.type=postgresql
druid.metadata.storage.connector.connectURI=jdbc:postgresql://your-postgres-app.klutch.sh:8000/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=your-secure-password
druid.metadata.storage.connector.createTables=true
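
To confirm Druid is really writing to PostgreSQL rather than falling back to Derby, you can look for the Druid metadata tables after the first startup (druid_segments, druid_tasks, and so on, assuming the default table prefix):

Terminal window
psql -h your-postgres-app.klutch.sh -p 8000 -U druid -d druid -c "\dt druid_*"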

Deep Storage Configuration

Configure S3 or S3-compatible storage for segment archival:

AWS S3:

S3_BUCKET=my-druid-segments
S3_BASE_KEY=production/segments
S3_ACCESS_KEY=AKIAIOSFODNN7EXAMPLE
S3_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

S3-Compatible Storage (MinIO, Wasabi, etc.):

S3_BUCKET=druid-segments
S3_BASE_KEY=segments
S3_ACCESS_KEY=minioadmin
S3_SECRET_KEY=minioadmin
S3_ENDPOINT=https://minio.example.com

Manual Configuration (in common.runtime.properties):

druid.storage.type=s3
druid.storage.bucket=my-druid-segments
druid.storage.baseKey=production/segments
druid.s3.accessKey=AKIAIOSFODNN7EXAMPLE
druid.s3.secretKey=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
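
Before relying on deep storage, it's worth checking that the credentials can actually reach the bucket; one way is the AWS CLI (add --endpoint-url for S3-compatible providers):

Terminal window
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
aws s3 ls s3://my-druid-segments/production/segments/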

Java Heap Tuning

Adjust Java heap sizes based on your workload:

Small Workload (2-4GB RAM available):

DRUID_XMX=2g
DRUID_XMS=2g
DRUID_MAXNEWSIZE=500m
DRUID_NEWSIZE=500m
DRUID_MAXDIRECTMEMORYSIZE=1g

Medium Workload (8-16GB RAM available):

DRUID_XMX=8g
DRUID_XMS=8g
DRUID_MAXNEWSIZE=2g
DRUID_NEWSIZE=2g
DRUID_MAXDIRECTMEMORYSIZE=4g

Large Workload (32GB+ RAM available):

DRUID_XMX=16g
DRUID_XMS=16g
DRUID_MAXNEWSIZE=4g
DRUID_NEWSIZE=4g
DRUID_MAXDIRECTMEMORYSIZE=8g

Guidelines:

  • Set DRUID_XMX and DRUID_XMS to the same value to avoid heap resizing
  • Allocate 50-75% of available RAM to heap memory
  • Reserve RAM for direct memory and OS cache
  • New generation size should be 25-30% of max heap

Query Performance Tuning

Optimize query performance with these settings:

Enable Query Caching:

Add to common.runtime.properties:

# Enable caching
druid.cache.type=caffeine
druid.cache.sizeInBytes=256000000
druid.cache.expireAfter=3600000
# Broker cache config
druid.broker.cache.useCache=true
druid.broker.cache.populateCache=true
# Historical cache config
druid.historical.cache.useCache=true
druid.historical.cache.populateCache=true

Segment Cache Size:

Adjust how much of the segment cache Historical nodes keep in memory:

# In Historical node config
druid.segmentCache.locations=[{"path":"/opt/druid/var/druid/segment-cache","maxSize":10737418240}]
druid.server.maxSize=10737418240

Parallel Query Processing:

# Enable parallel query processing
druid.processing.buffer.sizeBytes=134217728
druid.processing.numThreads=7
druid.processing.numMergeBuffers=2

Security Configuration

Enable authentication and authorization:

Basic Authentication:

# Enable basic auth
# (requires the druid-basic-security extension in druid.extensions.loadList,
# plus an escalator configuration for internal service-to-service requests)
druid.auth.authenticatorChain=["basic"]
druid.auth.authenticator.basic.type=basic
druid.auth.authenticator.basic.initialAdminPassword=admin123
druid.auth.authenticator.basic.initialInternalClientPassword=internal123
druid.auth.authenticator.basic.credentialsValidator.type=metadata
druid.auth.authenticator.basic.skipOnFailure=false
# Enable authorization
druid.auth.authorizers=["basic"]
druid.auth.authorizer.basic.type=basic
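
With basic auth enabled, every HTTP request must carry credentials. For example, querying the SQL endpoint as the admin user defined above:

Terminal window
curl -u admin:admin123 \
  -H 'Content-Type: application/json' \
  https://your-app-name.klutch.sh/druid/v2/sql \
  -d '{"query": "SELECT 1"}'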

TLS/SSL:

To enable HTTPS for Druid endpoints:

# Enable TLS
druid.enablePlaintextPort=false
druid.enableTlsPort=true
druid.tls.keyStorePath=/path/to/keystore.jks
druid.tls.keyStorePassword=keystorePassword
druid.tls.certAlias=druid

Note: When deployed on Klutch.sh, HTTPS is provided automatically by the platform. Internal Druid communication can use plaintext.

Monitoring and Metrics

Druid emits metrics that can be consumed by monitoring systems:

Enable Prometheus Metrics:

Add to common.runtime.properties:

# Requires the prometheus-emitter extension in druid.extensions.loadList
druid.emitter=composing
druid.emitter.composing.emitters=["prometheus"]
druid.emitter.prometheus.strategy=exporter
druid.emitter.prometheus.port=9090
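
With the exporter strategy, the extension serves metrics over HTTP on the configured port, conventionally at the standard Prometheus /metrics path; a quick check from inside the container might look like:

Terminal window
curl http://localhost:9090/metrics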

Key Metrics to Monitor:

  • query/time: Query execution time
  • query/bytes: Bytes processed per query
  • segment/scan/pending: Pending segment scans
  • jvm/mem/used: JVM memory usage
  • ingest/events/processed: Events ingested
  • segment/count: Total segments
  • segment/size: Total segment size

Health Check Endpoints:

Terminal window
# Overall cluster health
curl https://your-app-name.klutch.sh/status/health
# Coordinator status
curl https://your-app-name.klutch.sh/druid/coordinator/v1/leader
# Datasources
curl https://your-app-name.klutch.sh/druid/coordinator/v1/datasources

Production Best Practices

Resource Allocation

CPU Requirements:

  • Minimum: 2 CPU cores for micro-quickstart
  • Recommended: 4-8 CPU cores for production workloads
  • Druid scales well with CPU - more cores enable better query parallelism

Memory Requirements:

  • Minimum: 4GB RAM for testing
  • Small production: 8-16GB RAM
  • Medium production: 32-64GB RAM
  • Large production: 128GB+ RAM

Storage Requirements:

  • Persistent volume for segment cache: 50-200GB
  • Deep storage (S3): Based on data retention policy
  • Metadata storage (PostgreSQL): 10-50GB depending on datasources

Sizing Formula:

Required RAM = (Heap Memory + Direct Memory + OS Cache)
Heap Memory ≈ 50-60% of total RAM
Direct Memory ≈ 20-30% of total RAM
OS Cache ≈ 20-30% of total RAM
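
As a worked example, an instance with 8GB of RAM could be split roughly into 4GB of heap, 2GB of direct memory, and ~2GB left for the OS page cache:

DRUID_XMX=4g
DRUID_XMS=4g
DRUID_MAXNEWSIZE=1g
DRUID_NEWSIZE=1g
DRUID_MAXDIRECTMEMORYSIZE=2g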

High Availability Setup

For production deployments requiring high availability:

Multiple Druid Instances:

  • Deploy multiple single-server Druid instances behind a load balancer
  • Each instance can serve queries independently
  • Share the same metadata storage and deep storage

External Dependencies:

  • Use managed PostgreSQL with replication for metadata
  • Use cloud object storage (S3) for deep storage with built-in redundancy
  • Deploy external ZooKeeper cluster for coordination (advanced)

Health Checks:

  • Configure load balancer health checks on /status/health
  • Set up monitoring alerts for service availability
  • Implement automatic failover for coordinator/overlord roles

Backup and Recovery

Metadata Backup:

Regular backups of PostgreSQL metadata database:

Terminal window
# Backup metadata
pg_dump -h your-postgres-app.klutch.sh -p 8000 -U druid druid > druid_metadata_backup.sql
# Restore metadata
psql -h your-postgres-app.klutch.sh -p 8000 -U druid druid < druid_metadata_backup.sql
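
To keep backups current, you can schedule the dump from any host with PostgreSQL client tools installed; a minimal cron sketch (the backup path and schedule are placeholders):

Terminal window
# Nightly metadata dump at 02:00 (add via crontab -e)
0 2 * * * PGPASSWORD=your-secure-password pg_dump -h your-postgres-app.klutch.sh -p 8000 -U druid druid > /backups/druid_metadata_$(date +\%F).sql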

Segment Backup:

Deep storage automatically serves as segment backup. Segments are immutable and safe in S3.

Disaster Recovery Plan:

  1. Maintain regular metadata database backups
  2. Ensure deep storage has versioning enabled
  3. Document configuration in version control (your GitHub repository)
  4. Test recovery procedures regularly
  5. Keep runbooks for common failure scenarios

Segment Retention:

Configure retention rules to automatically drop old data. Rules are managed per datasource through the coordinator (web console or API) rather than as runtime properties, and the coordinator re-applies them on each run (controlled by druid.coordinator.period). For example, to keep the last 90 days with two replicas and drop everything older:

[
  {"type": "loadByPeriod", "period": "P90D", "tieredReplicants": {"_default_tier": 2}},
  {"type": "dropForever"}
]
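
If you prefer the API over the web console, a sketch of applying these rules to a datasource named events:

Terminal window
curl -X POST \
  -H 'Content-Type: application/json' \
  https://your-app-name.klutch.sh/druid/coordinator/v1/rules/events \
  -d '[
    {"type": "loadByPeriod", "period": "P90D", "tieredReplicants": {"_default_tier": 2}},
    {"type": "dropForever"}
  ]'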

Security Hardening

Authentication:

  • Enable basic authentication with strong passwords
  • Rotate credentials regularly
  • Use separate credentials for internal and external access

Authorization:

  • Implement role-based access control (RBAC)
  • Restrict datasource access by user role
  • Audit query and ingestion operations

Network Security:

  • Use HTTPS for all external communications (provided by Klutch.sh)
  • Restrict database access to Druid’s IP address
  • Use VPC peering for cloud resources when possible

Secrets Management:

  • Store sensitive credentials in environment variables
  • Never commit secrets to version control
  • Rotate database and S3 credentials periodically

Query Limits:

Prevent resource exhaustion from expensive queries:

# Cap per-query timeout (milliseconds)
druid.server.http.maxQueryTimeout=300000
# Size of the Broker's connection pool to data servers
druid.broker.http.numConnections=20
# Limit rows materialized by subqueries
druid.server.http.maxSubqueryRows=100000
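
Clients can also request a shorter timeout per query through the SQL query context (capped by the server-side maximum above), for example:

Terminal window
curl -X POST \
  -H 'Content-Type: application/json' \
  https://your-app-name.klutch.sh/druid/v2/sql \
  -d '{"query": "SELECT COUNT(*) FROM events", "context": {"timeout": 30000}}'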

Performance Optimization

Segment Optimization:

  • Use appropriate segment granularity (HOUR, DAY, WEEK)
  • Smaller segments improve parallelism
  • Larger segments reduce metadata overhead
  • Aim for 5-10 million rows per segment (see the auto-compaction sketch below)
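
One way to keep segments near that target is the coordinator's auto-compaction; a minimal sketch of enabling it for a datasource named events via the coordinator API (tuning fields can be added, and exact options vary by Druid version):

Terminal window
curl -X POST \
  -H 'Content-Type: application/json' \
  https://your-app-name.klutch.sh/druid/coordinator/v1/config/compaction \
  -d '{"dataSource": "events"}'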

Query Optimization:

  • Use filters to reduce data scanned
  • Leverage rollup for pre-aggregation
  • Create appropriate indexes on filter columns
  • Avoid SELECT * queries

Ingestion Optimization:

  • Batch ingestion: Use parallel ingestion for large datasets
  • Streaming ingestion: Tune taskCount and taskDuration
  • Enable rollup to reduce segment size
  • Use appropriate queryGranularity for your use case

Caching Strategy:

  • Enable caching on Broker and Historical nodes
  • Set appropriate cache expiration times
  • Monitor cache hit rates
  • Size cache based on working set

Monitoring and Alerting

Key Metrics to Track:

  1. Query Performance:

    • Average query time
    • 95th/99th percentile latency
    • Query failures
  2. Ingestion Health:

    • Ingestion task success rate
    • Lag for streaming ingestion
    • Segment creation rate
  3. Resource Utilization:

    • JVM heap usage
    • Direct memory usage
    • CPU utilization
    • Disk I/O
  4. Cluster Health:

    • Service availability
    • Segment availability
    • Failed tasks

Alerting Thresholds:

Critical:
- Any service down > 1 minute
- Query failure rate > 5%
- JVM heap usage > 90%
- Disk usage > 85%
Warning:
- Query latency p95 > 5 seconds
- Ingestion lag > 10 minutes
- Heap usage > 75%
- Cache hit rate < 50%

Scaling Strategies

Vertical Scaling:

Start with vertical scaling for simplicity:

  • Increase CPU cores for better query parallelism
  • Add RAM for larger segment cache
  • Adjust Java heap sizes proportionally
  • Monitor resource utilization to identify bottlenecks

Horizontal Scaling:

When vertical scaling is insufficient:

  • Deploy dedicated Historical nodes for queries
  • Deploy dedicated MiddleManager nodes for ingestion
  • Separate Coordinator and Overlord from data nodes
  • Use external ZooKeeper cluster

Data Tiering:

Optimize costs with hot/cold data tiers:

  • Recent data on fast SSD storage (hot tier)
  • Historical data on cheaper storage (cold tier)
  • Configure rules for automatic tier movement
  • Use different replica counts per tier

Troubleshooting

Issue: Druid Fails to Start

Symptoms: Container starts but Druid processes don’t initialize

Possible Causes and Solutions:

  1. Insufficient Memory:

Check the Java heap configuration and review the container logs in the Klutch.sh dashboard for "OutOfMemoryError" or "Cannot reserve enough space for object heap".

Solution: Increase heap size or container memory:

DRUID_XMX=4g
DRUID_XMS=4g
  2. Metadata Storage Connection Failed:

Check PostgreSQL connectivity:

Terminal window
# Test PostgreSQL reachability (requires PostgreSQL client tools)
pg_isready -h your-postgres-app.klutch.sh -p 8000

Solution: Verify PostgreSQL credentials and network connectivity:

POSTGRES_HOST=your-postgres-app.klutch.sh
POSTGRES_PORT=8000
POSTGRES_DB=druid
POSTGRES_USER=druid
POSTGRES_PASSWORD=correct-password
  3. Port Conflicts:

Check if ports are already in use:

Solution: Ensure no other services are using Druid’s ports (8081-8091, 8888).

  4. Persistent Volume Issues:

Verify volume is mounted correctly:

Terminal window
# Check if volume is accessible
ls -la /opt/druid/var

Solution: Ensure persistent volume is attached at /opt/druid/var.

Issue: Slow Query Performance

Symptoms: Queries take longer than expected

Troubleshooting Steps:

  1. Check Query Plan:

Use EXPLAIN PLAN to understand query execution:

EXPLAIN PLAN FOR
SELECT COUNT(*) FROM events WHERE country = 'US'
  2. Verify Segment Pruning:

Ensure time filters enable segment pruning:

-- Good: Uses time filter
SELECT COUNT(*) FROM events
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR
-- Bad: Scans all segments
SELECT COUNT(*) FROM events
  3. Check Segment Cache:

Verify Historical nodes are caching segments:

Terminal window
curl https://your-app-name.klutch.sh/druid/coordinator/v1/servers?simple
  4. Monitor Resource Usage:

Check CPU and memory in Klutch.sh dashboard:

  • High CPU: Increase cores or optimize query
  • High memory: Increase heap size or reduce cache size
  5. Optimize Segment Size:

Merge small segments:

Terminal window
curl -X POST https://your-app-name.klutch.sh/druid/coordinator/v1/compact/tasks \
-H 'Content-Type: application/json' \
-d '{"dataSource": "events"}'

Issue: Ingestion Task Fails

Symptoms: Data not appearing in datasource, failed tasks in console

Common Causes:

  1. Invalid Data Format:

Check task logs for parsing errors:

Terminal window
curl https://your-app-name.klutch.sh/druid/indexer/v1/task/{taskId}/log

Solution: Verify input format matches data:

{
  "inputFormat": {
    "type": "json",
    "flattenSpec": {
      "useFieldDiscovery": true
    }
  }
}
  2. Timestamp Parsing Failed:

Ensure timestamp format is correct:

{
  "timestampSpec": {
    "column": "timestamp",
    "format": "iso"
  }
}

Common formats:

  • iso: ISO 8601 (e.g., 2024-01-01T12:00:00Z)
  • millis: Unix milliseconds
  • auto: Auto-detect format
  • yyyy-MM-dd HH:mm:ss: Custom format
  3. Insufficient Resources:

Check MiddleManager capacity:

Terminal window
curl https://your-app-name.klutch.sh/druid/indexer/v1/workers

Solution: Adjust worker capacity or increase container resources.

  4. Kafka Connection Issues:

For Kafka ingestion, verify connectivity:

Terminal window
# Kafka is not an HTTP service, so test TCP reachability of the broker port
nc -zv your-kafka-app.klutch.sh 8000

Solution: Check Kafka broker configuration and network access.

Issue: High Memory Usage

Symptoms: Container running out of memory, JVM crashes

Solutions:

  1. Reduce Heap Size:

If heap is too large, reduce it:

DRUID_XMX=4g
DRUID_XMS=4g

Ensure: Heap + Direct Memory + OS < Total Container RAM

  2. Reduce Segment Cache:

Limit segment cache size in common.runtime.properties:

druid.segmentCache.locations=[{"path":"/opt/druid/var/druid/segment-cache","maxSize":5368709120}]
  3. Reduce Processing Buffer:

Lower processing buffer size:

druid.processing.buffer.sizeBytes=67108864
druid.processing.numThreads=4
  4. Enable Caching Limits:

Configure cache eviction:

druid.cache.type=caffeine
druid.cache.sizeInBytes=128000000
  5. Increase Container Resources:

Scale up container in Klutch.sh dashboard to provide more RAM.

Issue: Cannot Connect to Druid

Symptoms: Unable to access web console or API

Troubleshooting Steps:

  1. Verify Deployment Status:

Check Klutch.sh dashboard for deployment status and logs.

  2. Check Port Configuration:

Ensure internal port is set correctly:

  • HTTP traffic: Port 8888 (Router)
  • TCP traffic: Port 8082 (Broker)
  3. Test Health Endpoint:
Terminal window
curl https://your-app-name.klutch.sh/status/health

Expected: {"status":"healthy"}

  4. Check Firewall Rules:

Ensure no firewall blocking traffic to Klutch.sh domain.

  5. Verify Service Status:

Check if all Druid services started:

Terminal window
curl https://your-app-name.klutch.sh/druid/coordinator/v1/servers

Issue: Segments Not Loading

Symptoms: Data ingested but not queryable

Troubleshooting Steps:

  1. Check Segment Availability:
Terminal window
curl https://your-app-name.klutch.sh/druid/coordinator/v1/datasources/{datasource}/loadstatus
  2. Verify Deep Storage Access:

Ensure S3 credentials are correct and bucket is accessible.

  3. Check Historical Node Capacity:
Terminal window
curl https://your-app-name.klutch.sh/druid/coordinator/v1/servers?simple
  4. Review Coordinator Logs:

Check for segment assignment errors in logs.

  5. Force Segment Load:

Manually trigger segment loading:

Terminal window
curl -X POST https://your-app-name.klutch.sh/druid/coordinator/v1/datasources/{datasource}

Additional Resources

  • PostgreSQL - Deploy PostgreSQL for Druid metadata storage
  • Kafka - Stream real-time data into Druid
  • ClickHouse - Alternative analytics database
  • Metabase - Visualize Druid data with dashboards

You now have Apache Druid running on Klutch.sh! Your real-time analytics database is ready to ingest streaming data, serve fast queries, and power interactive dashboards. Start loading data through the web console, configure metadata and deep storage for production, and scale your deployment as your analytics needs grow.