Deploying a Cassandra Database

Introduction

Apache Cassandra is a highly scalable, distributed NoSQL database designed to handle massive amounts of data across multiple commodity servers with no single point of failure. Originally developed at Facebook and released as an open-source project in 2008, Cassandra is now an Apache Software Foundation project trusted by thousands of companies worldwide for mission-critical applications.

Cassandra combines the distributed systems architecture of Amazon’s Dynamo with the data model of Google’s BigTable, creating a powerful database that excels at handling large volumes of data with high availability and linear scalability.

Cassandra is renowned for its:

  • Linear Scalability: Add nodes to increase capacity without downtime or performance degradation
  • High Availability: No single point of failure with multi-datacenter replication
  • Fault Tolerance: Automatic data replication across multiple nodes and data centers
  • Tunable Consistency: Choose between eventual and strong consistency per operation
  • Partitioned Row Storage: Rows are grouped into partitions and stored sorted by clustering columns, which makes partition-local reads efficient
  • High Write Throughput: Optimized for write-heavy workloads with commitlog and memtable architecture
  • Flexible Data Model: Wide-column store supporting complex data structures
  • CQL (Cassandra Query Language): SQL-like query language that’s familiar and powerful
  • Masterless Architecture: Peer-to-peer design eliminates single points of failure
  • Time-Series Data: Excellent for time-series data, IoT sensor data, and event logging

Common use cases include recommendation engines, fraud detection systems, IoT applications, time-series data storage, messaging platforms, product catalogs, user profile management, and any application requiring high availability and massive scalability.

This comprehensive guide walks you through deploying Apache Cassandra on Klutch.sh using Docker, including detailed installation steps, sample configurations, and production-ready best practices for persistent storage and cluster optimization.

Prerequisites

Before you begin, ensure you have the following:

  • A Klutch.sh account
  • A GitHub account with a repository for your Cassandra project
  • Docker installed locally for testing (optional but recommended)
  • Basic understanding of Docker and distributed databases
  • Familiarity with CQL (Cassandra Query Language) is helpful but not required

Installation and Setup

Step 1: Create Your Project Directory

First, create a new directory for your Cassandra deployment project:

mkdir cassandra-klutch
cd cassandra-klutch
git init

Step 2: Create the Dockerfile

Create a Dockerfile in your project root directory. This will define your Cassandra container configuration:

FROM cassandra:5.0
# Set default environment variables
# These can be overridden in the Klutch.sh dashboard
ENV CASSANDRA_CLUSTER_NAME="KlutchCluster"
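# CASSANDRA_DC and CASSANDRA_RACK are only applied by the official image when
# CASSANDRA_ENDPOINT_SNITCH is GossipingPropertyFileSnitch
ENV CASSANDRA_DC="datacenter1"
ENV CASSANDRA_RACK="rack1"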
ENV CASSANDRA_ENDPOINT_SNITCH="SimpleSnitch"
ENV CASSANDRA_SEEDS="127.0.0.1"
# Configure memory settings for optimal performance
# Adjust these based on your container resources
ENV MAX_HEAP_SIZE="512M"
ENV HEAP_NEWSIZE="128M"
# Expose Cassandra ports
# 9042: CQL native transport port (client connections)
# 7000: Inter-node cluster communication
# 7001: TLS inter-node cluster communication
# 7199: JMX monitoring port
# (9160, the legacy Thrift port, is omitted: Thrift was removed in Cassandra 4.0)
EXPOSE 9042 7000 7001 7199
# Optional: Copy custom Cassandra configuration
# COPY ./cassandra.yaml /etc/cassandra/cassandra.yaml
# Optional: Copy initialization CQL scripts
# The official cassandra image does not run these automatically;
# execute them with cqlsh after startup (see Step 3)
# COPY ./init.cql /docker-entrypoint-initdb.d/

Note: At the time of writing, Cassandra 5.0 is the latest major release. It adds storage-attached indexes (SAI), trie-based memtables and SSTables, the Unified Compaction Strategy, and JDK 17 support.

Step 3: (Optional) Create Initialization Scripts

You can create CQL scripts to set up your initial keyspaces, tables, and data. Create a file named init.cql:

-- init.cql - Cassandra initialization script
-- This script should be executed after Cassandra starts using cqlsh

-- Create a keyspace with replication
CREATE KEYSPACE IF NOT EXISTS myapp
WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 1
};

-- Use the keyspace
USE myapp;

-- Create a users table
CREATE TABLE IF NOT EXISTS users (
  user_id UUID PRIMARY KEY,
  username TEXT,
  email TEXT,
  created_at TIMESTAMP,
  last_login TIMESTAMP
);

-- Create an index on email for faster lookups
CREATE INDEX IF NOT EXISTS idx_users_email ON users (email);

-- Create a time-series events table
CREATE TABLE IF NOT EXISTS events (
  event_id TIMEUUID,
  user_id UUID,
  event_type TEXT,
  event_data TEXT,
  created_at TIMESTAMP,
  PRIMARY KEY (user_id, created_at, event_id)
) WITH CLUSTERING ORDER BY (created_at DESC);

-- Insert sample data
INSERT INTO users (user_id, username, email, created_at)
VALUES (uuid(), 'admin', 'admin@example.com', toTimestamp(now()));

INSERT INTO users (user_id, username, email, created_at)
VALUES (uuid(), 'testuser', 'test@example.com', toTimestamp(now()));

Note: Unlike some other databases, Cassandra initialization scripts need to be executed manually after the database starts. You can execute them by connecting to the database with cqlsh and running SOURCE '/path/to/init.cql';.
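
The manual step above can also be scripted. The following is a minimal sketch, assuming cqlsh is installed on the machine running it; the function names cqlsh_command and run_init_script are hypothetical helpers, and only cqlsh's -f, -u, and -p options are used.

```python
import subprocess


def cqlsh_command(script_path, host="localhost", port=9042,
                  username=None, password=None):
    """Build the argument list for running a CQL file through cqlsh."""
    argv = ["cqlsh", "-f", script_path]
    if username is not None:
        argv += ["-u", username]
    if password is not None:
        argv += ["-p", password]
    # cqlsh takes the host and port as trailing positional arguments.
    return argv + [host, str(port)]


def run_init_script(script_path, **connection):
    """Run the script; raises CalledProcessError if cqlsh exits non-zero."""
    subprocess.run(cqlsh_command(script_path, **connection), check=True)
```

Calling run_init_script("init.cql") targets a local container on localhost:9042. For a deployed instance, pass host="example-app.klutch.sh" and port=8000, plus credentials if authentication is enabled.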

Step 4: (Optional) Create Custom Configuration

For advanced configurations, you can create a custom cassandra.yaml file. Here’s a basic example with common settings:

# cassandra.yaml - Custom Cassandra configuration
# Parameter names use the pre-4.1 style (for example *_in_ms), which
# Cassandra 4.1 and later still accept for backward compatibility.
# Compare against the cassandra.yaml shipped in your image before use.
cluster_name: 'KlutchCluster'
num_tokens: 256
hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000
hinted_handoff_throttle_in_kb: 1024
max_hints_delivery_threads: 2
hints_directory: /var/lib/cassandra/hints
batchlog_replay_throttle_in_kb: 1024
authenticator: PasswordAuthenticator
authorizer: CassandraAuthorizer
role_manager: CassandraRoleManager
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
credentials_validity_in_ms: 2000
partitioner: org.apache.cassandra.dht.Murmur3Partitioner
data_file_directories:
  - /var/lib/cassandra/data
commitlog_directory: /var/lib/cassandra/commitlog
disk_failure_policy: stop
commit_failure_policy: stop
key_cache_size_in_mb:
key_cache_save_period: 14400
row_cache_size_in_mb: 0
row_cache_save_period: 0
counter_cache_size_in_mb:
counter_cache_save_period: 7200
saved_caches_directory: /var/lib/cassandra/saved_caches
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "127.0.0.1"
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32
memtable_allocation_type: heap_buffers
index_summary_capacity_in_mb:
index_summary_resize_interval_in_minutes: 60
trickle_fsync: false
trickle_fsync_interval_in_kb: 10240
storage_port: 7000
ssl_storage_port: 7001
# 0.0.0.0 is not a valid listen_address. The official image's entrypoint
# normally replaces this value with the container's IP address at startup.
listen_address: localhost
start_native_transport: true
native_transport_port: 9042
rpc_address: 0.0.0.0
# Required whenever rpc_address is 0.0.0.0; the entrypoint normally rewrites it
broadcast_rpc_address: 127.0.0.1
rpc_keepalive: true
# Thrift options (start_rpc, rpc_port, thrift_framed_transport_size_in_mb)
# are omitted because Thrift was removed in Cassandra 4.0
incremental_backups: false
snapshot_before_compaction: false
auto_snapshot: true
column_index_size_in_kb: 64
column_index_cache_size_in_kb: 2
compaction_throughput_mb_per_sec: 16
sstable_preemptive_open_interval_in_mb: 50
read_request_timeout_in_ms: 5000
range_request_timeout_in_ms: 10000
write_request_timeout_in_ms: 2000
counter_write_request_timeout_in_ms: 5000
cas_contention_timeout_in_ms: 1000
truncate_request_timeout_in_ms: 60000
request_timeout_in_ms: 10000
cross_node_timeout: false
endpoint_snitch: SimpleSnitch
dynamic_snitch_update_interval_in_ms: 100
dynamic_snitch_reset_interval_in_ms: 600000
dynamic_snitch_badness_threshold: 0.1
# request_scheduler is omitted (it was removed together with Thrift in 4.0)
server_encryption_options:
  internode_encryption: none
  keystore: conf/.keystore
  keystore_password: cassandra
  truststore: conf/.truststore
  truststore_password: cassandra
client_encryption_options:
  enabled: false
  optional: false
  keystore: conf/.keystore
  keystore_password: cassandra
internode_compression: dc
inter_dc_tcp_nodelay: false
tracetype_query_ttl: 86400
tracetype_repair_ttl: 604800
gc_warn_threshold_in_ms: 1000
# windows_timer_interval is omitted (Windows-only setting)

If you create a custom configuration file, uncomment the COPY line in your Dockerfile to include it.

Step 5: Test Locally (Optional)

Before deploying to Klutch.sh, you can test your Cassandra setup locally:

# Build the Docker image
docker build -t my-cassandra .
# Run the container
docker run -d \
  --name cassandra-test \
  -p 9042:9042 \
  -p 7000:7000 \
  -e CASSANDRA_CLUSTER_NAME="TestCluster" \
  my-cassandra
# Wait for Cassandra to start (this can take 30-60 seconds)
echo "Waiting for Cassandra to start..."
sleep 60
# Check if Cassandra is ready
docker exec cassandra-test nodetool status
# Connect to Cassandra using cqlsh
docker exec -it cassandra-test cqlsh
# Inside cqlsh, you can test with:
# DESCRIBE KEYSPACES;
# CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
# USE test;
# CREATE TABLE users (id UUID PRIMARY KEY, name TEXT);
# INSERT INTO users (id, name) VALUES (uuid(), 'Test User');
# SELECT * FROM users;
# Exit cqlsh with CTRL+D or 'exit'
# Stop and remove the test container when done
docker stop cassandra-test
docker rm cassandra-test
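
Startup time varies with hardware, so a fixed sleep can be too short or needlessly long. As a sketch of an alternative, the helper below re-runs a probe command until it succeeds. The default probe assumes the cassandra-test container name used above and that cqlsh inside the container can query system.local; wait_until_ready is a hypothetical name.

```python
import subprocess
import time

# Assumed readiness probe: a trivial query through cqlsh inside the
# container started by the docker run command above.
DEFAULT_CHECK = (
    "docker", "exec", "cassandra-test",
    "cqlsh", "-e", "SELECT release_version FROM system.local",
)


def wait_until_ready(check_cmd=DEFAULT_CHECK, timeout=180.0, interval=5.0):
    """Re-run check_cmd until it exits with status 0.

    Returns True on success and False if the timeout elapses first.
    Raises FileNotFoundError if the probe executable does not exist.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = subprocess.run(list(check_cmd), capture_output=True)
        if result.returncode == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Probing with a real query is deliberate: when a port is published with -p, the host side may accept TCP connections before Cassandra itself is listening, so a bare port check can report success too early.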

Step 6: Push to GitHub

Commit your Dockerfile and any initialization scripts to your GitHub repository:

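git add Dockerfile
# Only if you created the optional files in Steps 3 and 4:
# git add init.cql cassandra.yaml
git commit -m "Add Cassandra Dockerfile and initialization scripts"
# Rename the local branch in case git init created "master"
git branch -M main
git remote add origin https://github.com/yourusername/cassandra-klutch.git
git push -u origin main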

Connecting to Cassandra

Once deployed, you can connect to your Cassandra database from any application using the native CQL protocol. Since Klutch.sh routes TCP traffic through port 8000, use the following connection configuration:

Connection Details

  • Host: example-app.klutch.sh (replace with your actual Klutch.sh app URL)
  • Port: 8000 (Klutch.sh’s external TCP port)
  • Internal Port: 9042 (Cassandra’s native CQL port)
  • Default Username: cassandra (exists only when PasswordAuthenticator is enabled; the image's stock configuration accepts connections without authentication)
  • Default Password: cassandra (must be changed for production)

Example Connection Code

Node.js (using cassandra-driver):

const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['example-app.klutch.sh'],
  localDataCenter: 'datacenter1',
  protocolOptions: { port: 8000 },
  keyspace: 'myapp',
  credentials: {
    username: 'cassandra',
    password: 'cassandra'
  }
});

client.connect()
  .then(() => {
    console.log('Connected to Cassandra');
    return client.execute('SELECT cluster_name, release_version FROM system.local');
  })
  .then(result => {
    console.log('Cluster:', result.rows[0].cluster_name);
    console.log('Version:', result.rows[0].release_version);
  })
  .catch(err => console.error('Connection error', err));

Python (using cassandra-driver):

from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# Configure authentication
auth_provider = PlainTextAuthProvider(
    username='cassandra',
    password='cassandra'
)

# Connect to cluster
cluster = Cluster(
    contact_points=['example-app.klutch.sh'],
    port=8000,
    auth_provider=auth_provider
)
session = cluster.connect('myapp')

# Execute a query
rows = session.execute('SELECT * FROM users LIMIT 10')
for row in rows:
    print(f"User: {row.username}, Email: {row.email}")

# Close connection
cluster.shutdown()
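
The examples in this section hard-code credentials for brevity. A minimal sketch of reading them from the environment instead is shown below. The variable names (CASSANDRA_HOST, CASSANDRA_PORT, CASSANDRA_KEYSPACE, CASSANDRA_USERNAME, CASSANDRA_PASSWORD) are assumptions chosen for this example and are not defined by Klutch.sh or the driver.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CassandraSettings:
    host: str
    port: int
    keyspace: str
    username: str
    password: str


def load_settings(env=os.environ):
    """Read connection settings from a mapping of environment variables.

    Raises KeyError if a required variable is missing.
    """
    return CassandraSettings(
        host=env["CASSANDRA_HOST"],                   # e.g. example-app.klutch.sh
        port=int(env.get("CASSANDRA_PORT", "8000")),  # Klutch.sh external TCP port
        keyspace=env.get("CASSANDRA_KEYSPACE", "myapp"),
        username=env["CASSANDRA_USERNAME"],
        password=env["CASSANDRA_PASSWORD"],
    )


def connect(settings):
    """Open a session using the cassandra-driver package."""
    # Imported lazily so load_settings() works without the driver installed.
    from cassandra.auth import PlainTextAuthProvider
    from cassandra.cluster import Cluster

    cluster = Cluster(
        contact_points=[settings.host],
        port=settings.port,
        auth_provider=PlainTextAuthProvider(settings.username, settings.password),
    )
    return cluster, cluster.connect(settings.keyspace)
```

Typical usage is cluster, session = connect(load_settings()), followed by cluster.shutdown() when the application exits.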

Java (using DataStax Java Driver):

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import java.net.InetSocketAddress;

public class CassandraConnection {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("example-app.klutch.sh", 8000))
                .withLocalDatacenter("datacenter1")
                .withKeyspace("myapp")
                .withAuthCredentials("cassandra", "cassandra")
                .build()) {
            ResultSet rs = session.execute("SELECT * FROM users LIMIT 10");
            for (Row row : rs) {
                System.out.println("User: " + row.getString("username"));
            }
        }
    }
}

Go (using gocql):

package main

import (
	"fmt"
	"log"

	"github.com/gocql/gocql"
)

func main() {
	cluster := gocql.NewCluster("example-app.klutch.sh")
	cluster.Port = 8000
	cluster.Keyspace = "myapp"
	cluster.Consistency = gocql.Quorum
	cluster.Authenticator = gocql.PasswordAuthenticator{
		Username: "cassandra",
		Password: "cassandra",
	}

	session, err := cluster.CreateSession()
	if err != nil {
		log.Fatal(err)
	}
	defer session.Close()

	var username, email string
	iter := session.Query("SELECT username, email FROM users LIMIT 10").Iter()
	for iter.Scan(&username, &email) {
		fmt.Printf("User: %s, Email: %s\n", username, email)
	}
	if err := iter.Close(); err != nil {
		log.Fatal(err)
	}
}

PHP (using cassandra-php-driver):

<?php
$cluster = Cassandra::cluster()
    ->withContactPoints('example-app.klutch.sh')
    ->withPort(8000)
    ->withCredentials('cassandra', 'cassandra')
    ->build();

$session = $cluster->connect('myapp');

$statement = new Cassandra\SimpleStatement('SELECT * FROM users LIMIT 10');
$result = $session->execute($statement);

foreach ($result as $row) {
    printf("User: %s, Email: %s\n", $row['username'], $row['email']);
}
?>

Deploying to Klutch.sh

Now that your Cassandra project is ready and pushed to GitHub, follow these steps to deploy it on Klutch.sh with persistent storage.

Deployment Steps

    1. Log in to Klutch.sh

      Navigate to klutch.sh/app and sign in to your account.

    2. Create a New Project

      Go to Create Project and give your project a meaningful name (e.g., “Cassandra Database”).

    3. Create a New App

      Navigate to Create App and configure the following settings:

    4. Select Your Repository

      • Choose GitHub as your Git source
      • Select the repository containing your Dockerfile
      • Choose the branch you want to deploy (usually main or master)

      Klutch.sh will automatically detect the Dockerfile in your repository root and use it for deployment.

    5. Configure Traffic Type

      • Traffic Type: Select TCP (Cassandra requires TCP traffic for database connections)
      • Internal Port: Set to 9042 (the default Cassandra CQL native transport port that your container listens on)

    6. Set Environment Variables

      Add the following environment variables for your Cassandra configuration:

      • CASSANDRA_CLUSTER_NAME: Name for your cluster (e.g., ProductionCluster)
      • CASSANDRA_DC: Data center name (default: datacenter1; applied by the official image only with GossipingPropertyFileSnitch)
      • CASSANDRA_RACK: Rack name for topology (default: rack1; applied by the official image only with GossipingPropertyFileSnitch)
      • CASSANDRA_ENDPOINT_SNITCH: Snitch strategy (default: SimpleSnitch)
      • MAX_HEAP_SIZE: Maximum JVM heap size (e.g., 1G for 1GB, adjust based on your container memory)
      • HEAP_NEWSIZE: JVM new generation heap size (typically 1/4 of MAX_HEAP_SIZE, e.g., 256M)
      • Authentication and authorization: The official cassandra image's entrypoint does not map CASSANDRA_AUTHENTICATOR or CASSANDRA_AUTHORIZER into cassandra.yaml. To enable them (recommended for production), set authenticator: PasswordAuthenticator and authorizer: CassandraAuthorizer in the custom cassandra.yaml from Step 4.

      Security Note: For production deployments, always enable authentication and use strong passwords. After deployment, connect to Cassandra and change the default cassandra user password immediately.

    7. Attach Persistent Volumes

      This is critical for ensuring your database data persists across deployments and restarts. Cassandra requires multiple persistent volumes for optimal performance:

      Primary Data Volume:

      • Mount Path: Enter /var/lib/cassandra/data (where Cassandra stores data files)
      • Size: Choose based on your expected data volume (start with 20GB minimum, 50GB+ recommended for production)

      Commitlog Volume (Optional but Recommended):

      • Mount Path: Enter /var/lib/cassandra/commitlog (where Cassandra stores transaction logs)
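      • Size: 5-10GB (commitlog segments are deleted or recycled once the memtables they cover have been flushed to SSTables)

      Note: Separating the commitlog onto its own volume can improve write performance because commitlog appends no longer compete with SSTable I/O. For development environments, a single volume is sufficient; mounting it at /var/lib/cassandra also persists the commitlog, hints, and saved caches.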

    8. Configure Additional Settings

      • Region: Select the region closest to your users for optimal latency
      • Compute Resources: Cassandra is resource-intensive; allocate at least:
        • CPU: 2+ cores recommended
        • Memory: 2GB minimum (4GB+ recommended for production workloads)
      • Instances: Start with 1 instance for development (Cassandra clustering requires additional configuration)

    9. Deploy Your Database

      Click “Create” to start the deployment. Klutch.sh will:

      • Automatically detect your Dockerfile in the repository root
      • Build the Docker image
      • Attach the persistent volume(s)
      • Start your Cassandra container
      • Assign a URL for external connections

      Note: Cassandra can take 60-90 seconds to fully initialize. Monitor the deployment logs to confirm when it’s ready to accept connections.

    10. Verify Deployment and Change Default Credentials

      Once deployment is complete, you’ll receive a URL like example-app.klutch.sh.

      Important Security Step: With PasswordAuthenticator enabled, Cassandra creates a default superuser with the credentials cassandra/cassandra. You must change these immediately:

      a. Connect to your Cassandra instance using cqlsh or your preferred client

      b. Create a new superuser:

      CREATE ROLE admin WITH PASSWORD = 'strong_password_here'
      AND SUPERUSER = true
      AND LOGIN = true;

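      c. Log in as the new role, grant the permissions your applications need, and then disable or drop the default cassandra role (a role cannot remove its own superuser status, so this must be done from the new account)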

    11. Access Your Database

      Connect to your Cassandra database using:

      • Host: example-app.klutch.sh (your actual Klutch.sh URL)
      • Port: 8000
      • Username: admin (or your created user)
      • Password: Your secure password
      • Keyspace: myapp (or your created keyspace)

Production Best Practices

Security Recommendations

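  • Enable Authentication: Always enable authentication in production by setting authenticator: PasswordAuthenticator in cassandra.yaml (the official cassandra image does not map a CASSANDRA_AUTHENTICATOR variable into the configuration)
  • Enable Authorization: Set authorizer: CassandraAuthorizer in cassandra.yaml to control user permissions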
  • Change Default Credentials: Immediately change the default cassandra/cassandra credentials
  • Use Strong Passwords: Generate complex, random passwords using a password manager
  • Environment Variables: Store all sensitive credentials as environment variables in Klutch.sh
  • Network Security: Use Cassandra’s client-to-node and node-to-node encryption for sensitive data (requires additional configuration)
  • Regular Security Updates: Keep your Cassandra version updated with the latest security patches
  • Principle of Least Privilege: Create application-specific users with minimal necessary permissions
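
The credential recommendations above can be sketched as a small generator for the CQL statements involved. This is an illustration under assumptions, not a required procedure: the helper names are hypothetical, the role name admin is only an example, and the output is meant to be pasted into cqlsh or executed through a driver.

```python
import secrets


def generate_password(nbytes=24):
    """Return a URL-safe random password built from nbytes random bytes."""
    return secrets.token_urlsafe(nbytes)


def cql_quote(value):
    """Quote a CQL string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def credential_rotation_cql(new_role, password):
    """Return CQL that creates a superuser and disables the default role."""
    if not new_role.isidentifier():
        raise ValueError("role name must be a plain identifier")
    return [
        f"CREATE ROLE IF NOT EXISTS {new_role} "
        f"WITH PASSWORD = {cql_quote(password)} "
        "AND SUPERUSER = true AND LOGIN = true;",
        # Run this while logged in as the new role: a role cannot
        # remove its own superuser status.
        "ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;",
    ]
```

For example, credential_rotation_cql("admin", generate_password()) yields two statements: run the first as the default cassandra user and the second after reconnecting as admin.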

Performance Optimization

  • Memory Allocation: Allocate sufficient heap memory (typically 1/4 to 1/2 of total container memory)
  • Proper Data Modeling: Design tables based on query patterns, not normalized relational models
  • Partitioning Strategy: Use appropriate partition keys to distribute data evenly
  • Avoid Large Partitions: Keep partitions under 100MB for optimal performance
  • Compression: Enable compression on tables to reduce storage and I/O
  • Compaction Strategy: Choose the right compaction strategy (STCS, LCS, TWCS) based on workload
  • Read/Write Optimization: Tune concurrent_reads and concurrent_writes based on workload
  • Connection Pooling: Use connection pooling in application clients
  • Prepared Statements: Use prepared statements to reduce query parsing overhead
  • Batch Operations: Use batch operations carefully (only for same partition key)
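
The memory allocation guideline above can be turned into concrete values for MAX_HEAP_SIZE and HEAP_NEWSIZE. The sketch below simply applies this guide's rules of thumb (heap between 1/4 and 1/2 of container memory, new generation at 1/4 of the heap); heap_settings is a hypothetical helper, and real workloads may need different values.

```python
def heap_settings(container_memory_mb, heap_fraction=0.5):
    """Derive MAX_HEAP_SIZE and HEAP_NEWSIZE from the container memory limit."""
    if not 0.25 <= heap_fraction <= 0.5:
        raise ValueError("heap_fraction should stay between 0.25 and 0.5")
    max_heap_mb = int(container_memory_mb * heap_fraction)
    new_size_mb = max_heap_mb // 4  # this guide's rule of thumb: 1/4 of the heap
    return {
        "MAX_HEAP_SIZE": f"{max_heap_mb}M",
        "HEAP_NEWSIZE": f"{new_size_mb}M",
    }
```

For a 2GB container, heap_settings(2048) gives a 1024M heap with a 256M new generation, which matches the example values in the environment variable list. The remaining memory is left for off-heap structures and the operating system page cache.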

Data Modeling Best Practices

  • Query-Driven Design: Design tables around specific queries, not entities
  • Denormalization: Embrace denormalization to optimize read performance
  • Partition Keys: Choose partition keys that distribute data evenly
  • Clustering Columns: Use clustering columns to sort data within partitions
  • Collections: Use collections (sets, lists, maps) appropriately but avoid growing them unbounded
  • Time-Series Data: Use time-based clustering keys with TWCS compaction strategy
  • Secondary Indexes: Use sparingly; denormalized tables are often better. Materialized views are another option, but they are marked experimental and disabled by default since Cassandra 4.0
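
One common way to combine even partitioning with bounded partition size for time-series data is to add a time bucket to the partition key. The sketch below assumes a hypothetical table keyed by ((sensor_id, day), timestamp), where day is a text column holding the UTC date; the row counts and sizes are illustrative inputs, not measurements.

```python
from datetime import datetime, timezone

MAX_PARTITION_BYTES = 100 * 1024 * 1024  # the 100MB guideline from this guide


def day_bucket(ts):
    """Return the UTC calendar day of a timezone-aware datetime as YYYY-MM-DD."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def estimated_partition_bytes(rows_per_day, avg_row_bytes, days_per_bucket=1):
    """Rough partition size: rows written per bucket times average row size."""
    return rows_per_day * avg_row_bytes * days_per_bucket


def fits_guideline(rows_per_day, avg_row_bytes, days_per_bucket=1):
    """True if the estimated partition stays under MAX_PARTITION_BYTES."""
    size = estimated_partition_bytes(rows_per_day, avg_row_bytes, days_per_bucket)
    return size < MAX_PARTITION_BYTES
```

A sensor writing one 64-byte row per second produces about 5.5MB per daily bucket, comfortably under the guideline, while a 30-day bucket at the same rate would exceed it. Pass timezone-aware datetimes to day_bucket so the bucket does not depend on the server's local time zone.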

Monitoring and Maintenance

Monitor your Cassandra database for:

  • Node Health: Use nodetool status to check cluster health
  • Disk Usage: Monitor data directory size and commitlog volume
  • Memory Usage: Track heap and off-heap memory consumption
  • Read/Write Latency: Monitor p95 and p99 latencies
  • Compaction: Monitor compaction progress and pending tasks
  • Garbage Collection: Track GC frequency and pause times
  • Connection Count: Monitor active client connections
  • Thread Pool Stats: Monitor rejected tasks and blocked threads

Regular maintenance tasks:

  • Run Repairs: Regular repairs ensure data consistency (especially for multi-node clusters)
  • Manage Snapshots: Clean up old snapshots to free disk space
  • Monitor Tombstones: Excessive tombstones can impact read performance
  • Analyze Slow Queries: Use tracing to identify and optimize slow queries
  • Update Statistics: Run nodetool tablestats to monitor table-level metrics
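
The node health check can be automated by parsing the text that nodetool status prints. This sketch relies on the usual layout in which each node line begins with a two-letter code (U or D for up/down, then N, L, J, or M for the state) followed by the node address; the function names are hypothetical.

```python
import re

# Node lines start with a two-letter code: U/D (up/down), then N/L/J/M
# (normal/leaving/joining/moving), followed by the node address.
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def parse_nodetool_status(output):
    """Return {address: status_code} for each node line in the output."""
    nodes = {}
    for line in output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            status, address = match.groups()
            nodes[address] = status
    return nodes


def unhealthy_nodes(output):
    """Return sorted addresses whose status is anything other than UN."""
    statuses = parse_nodetool_status(output)
    return sorted(addr for addr, status in statuses.items() if status != "UN")
```

Feed it the captured output of nodetool status; an empty list from unhealthy_nodes means every listed node is Up/Normal.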

Backup and Recovery

  • Automated Snapshots: Enable auto_snapshot for automatic backups before truncate/drop operations
  • Manual Snapshots: Take regular snapshots using nodetool snapshot
  • Export Data: Use the cqlsh COPY TO command or the sstabledump tool (which replaced sstable2json in Cassandra 3.0) for data exports
  • Volume Backups: Leverage Klutch.sh’s persistent volume backup capabilities
  • Test Restores: Regularly test your backup restoration process
  • Incremental Backups: Enable incremental backups to capture SSTables flushed since the last snapshot; true point-in-time recovery additionally requires commitlog archiving
  • Off-Site Storage: Store critical backups off-site for disaster recovery

Troubleshooting

Cannot Connect to Database

  • Verify that you’re using the correct connection details (host, port 8000)
  • Ensure your environment variables are set correctly in Klutch.sh
  • Check that the internal port is set to 9042 in your app configuration
  • Verify that TCP traffic is selected (not HTTP)
  • Check if Cassandra has fully started (it can take 60-90 seconds)
  • Review deployment logs for startup errors

Database Not Persisting Data

  • Verify that the persistent volume is correctly attached at /var/lib/cassandra/data
  • Check that the volume has sufficient space allocated
  • Ensure the container has write permissions to the volume
  • Verify that the volume path matches Cassandra’s data directory configuration

Performance Issues

  • Slow Reads: Check for wide partitions, missing indexes, or inefficient query patterns
  • Slow Writes: Verify commitlog settings, check for compaction backlog
  • High Memory Usage: Adjust heap size settings, check for memory leaks
  • Compaction Lag: Increase compaction throughput or adjust compaction strategy
  • Resource Constraints: Consider increasing CPU/memory allocation in Klutch.sh
  • Connection Timeouts: Increase timeout values in client configuration

Cassandra Won’t Start

  • Check memory allocation (ensure MAX_HEAP_SIZE is appropriate)
  • Review startup logs for configuration errors
  • Verify that persistent volumes are properly mounted
  • Ensure sufficient disk space is available
  • Check for port conflicts (9042, 7000, etc.)

Data Consistency Issues

  • Run nodetool repair to reconcile data across nodes (if using multi-node setup)
  • Check replication factor matches your cluster size
  • Verify read/write consistency levels in application code
  • Monitor for hinted handoff activity
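
When checking consistency levels against the replication factor, the rule is that a read is guaranteed to see the latest acknowledged write when the replicas contacted for the read plus those contacted for the write exceed the replication factor. The sketch below covers only the ONE, TWO, THREE, QUORUM, and ALL levels in a single datacenter; the function names are hypothetical.

```python
def quorum(replication_factor):
    """Number of replicas in a quorum: a strict majority."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replicas that must respond for the given consistency level."""
    required = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def is_strongly_consistent(read_level, write_level, replication_factor):
    """True when read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, QUORUM reads and writes overlap (2 + 2 > 3), whereas ONE/ONE does not (1 + 1 is not greater than 3) and only offers eventual consistency.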

Advanced Topics

CQL Best Practices

Creating Keyspaces:

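-- Development (single instance)
CREATE KEYSPACE IF NOT EXISTS myapp
WITH replication = {
  'class': 'SimpleStrategy',
  'replication_factor': 1
};

-- Production alternative (multi-node cluster): run this instead of the
-- statement above. NetworkTopologyStrategy takes a replica count per
-- datacenter; 'datacenter1' must match the name shown by nodetool status.
CREATE KEYSPACE IF NOT EXISTS myapp
WITH replication = {
  'class': 'NetworkTopologyStrategy',
  'datacenter1': 3
}
AND durable_writes = true;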

Creating Tables with Proper Primary Keys:

-- Time-series data with compound primary key
CREATE TABLE sensor_data (
  sensor_id UUID,
  timestamp TIMESTAMP,
  reading DOUBLE,
  unit TEXT,
  PRIMARY KEY (sensor_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC)
AND compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_size': 1,
  'compaction_window_unit': 'DAYS'
};

-- User profile with static columns
CREATE TABLE user_profiles (
  user_id UUID,
  username TEXT STATIC,
  email TEXT STATIC,
  session_id UUID,
  login_time TIMESTAMP,
  ip_address TEXT,
  PRIMARY KEY (user_id, session_id)
) WITH CLUSTERING ORDER BY (session_id DESC);

Using Materialized Views:

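-- Create a materialized view for different query patterns
-- Materialized views are experimental and disabled by default since
-- Cassandra 4.0; set materialized_views_enabled: true in cassandra.yaml first
CREATE MATERIALIZED VIEW users_by_email AS
  SELECT user_id, username, email, created_at
  FROM users
  WHERE email IS NOT NULL AND user_id IS NOT NULL
  PRIMARY KEY (email, user_id);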

Useful nodetool Commands

When you need to manage your Cassandra instance, you can execute commands inside the container:

# Check cluster status
docker exec <container-id> nodetool status
# Check node info
docker exec <container-id> nodetool info
# View table statistics
docker exec <container-id> nodetool tablestats myapp.users
# Take a snapshot
docker exec <container-id> nodetool snapshot
# Clear all snapshots (Cassandra 4.0+ requires --all or -t <snapshot-name>)
docker exec <container-id> nodetool clearsnapshot --all
# Run repair (for consistency)
docker exec <container-id> nodetool repair
# Flush memtables to disk
docker exec <container-id> nodetool flush
# View thread pool stats
docker exec <container-id> nodetool tpstats
# Check compaction status
docker exec <container-id> nodetool compactionstats

Conclusion

Deploying Apache Cassandra to Klutch.sh with Docker gives you a scalable NoSQL database with persistent storage. By following this guide, you've set up a single-node Cassandra database with data persistence, security configuration, and client connectivity. High availability additionally requires a multi-node cluster with a replication factor greater than 1.

Cassandra’s distributed architecture and linear scalability make it an excellent choice for applications that need to handle massive amounts of data with predictable performance. Your database is now ready to support applications requiring high write throughput, low-latency reads, and the ability to scale horizontally as your data grows.

Remember to follow the production best practices outlined in this guide, regularly monitor your database performance, and adjust resources as your workload evolves. With proper data modeling and maintenance, Cassandra on Klutch.sh will provide reliable, high-performance data storage for your most demanding applications.