Deploying Checkmate
Introduction
Checkmate is a powerful, open-source monitoring and health check platform that helps you track the uptime and performance of your services, APIs, and websites. With Checkmate, you can configure automated health checks, receive instant notifications when services go down, and gain valuable insights into your infrastructure’s reliability.
Checkmate is designed to be:
- Reliable: Continuous monitoring with configurable check intervals
- Flexible: Support for HTTP/HTTPS, TCP, and custom health check protocols
- Real-time: Instant notifications via multiple channels (email, Slack, webhooks)
- Self-hosted: Complete control over your monitoring data and infrastructure
- Lightweight: Minimal resource footprint with efficient check scheduling
- User-friendly: Clean, intuitive dashboard for managing monitors and viewing status
- Scalable: Handles monitoring for applications of any size
Key features include:
- HTTP/HTTPS Monitoring: Check website availability and response times
- TCP Port Monitoring: Monitor database connections and custom services
- Status Pages: Public or private status pages for service transparency
- Alert Management: Multi-channel notifications with escalation policies
- Response Time Tracking: Historical performance data and trend analysis
- Custom Headers: Support for authenticated endpoints and API monitoring
- Incident Management: Track downtime events and resolution times
- Team Collaboration: Multi-user support with role-based access control
- API Integration: REST API for programmatic access and automation
- Webhook Support: Trigger custom actions on status changes
This comprehensive guide walks you through deploying Checkmate on Klutch.sh using Docker, including detailed installation steps, persistent storage configuration, environment variables setup, and production-ready best practices for monitoring your infrastructure.
Prerequisites
Before you begin deploying Checkmate to Klutch.sh, ensure you have the following:
- A Klutch.sh account
- A GitHub account with a repository for your Checkmate project
- Docker installed locally for testing (optional but recommended)
- Basic understanding of Docker, monitoring concepts, and HTTP protocols
- (Optional) SMTP credentials for email notifications
- (Optional) Webhook endpoints for integrations (Slack, Discord, etc.)
Installation and Setup
Step 1: Create Your Project Directory
First, create a new directory for your Checkmate deployment project:
mkdir checkmate-klutch
cd checkmate-klutch
git init
Step 2: Create the Dockerfile
Create a Dockerfile in your project root directory. This will define your Checkmate container configuration:
# Use Node.js as the base image
FROM node:18-alpine

# Set working directory
WORKDIR /app

# Install dependencies required for building
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    git \
    curl

# Clone Checkmate repository or copy application files
# For this example, we'll set up a typical Node.js health monitoring app
COPY package*.json ./

# Install application dependencies
# Note: npm ci requires a package-lock.json; run npm install once locally and commit the lockfile
RUN npm ci --omit=dev

# Copy application source code
COPY . .

# Create data directory for persistent storage
RUN mkdir -p /app/data

# Set environment variables with defaults
ENV NODE_ENV=production
ENV PORT=3000
ENV DATA_DIR=/app/data

# Expose the application port
EXPOSE 3000

# Health check to ensure the application is running
HEALTHCHECK --interval=30s --timeout=3s --start-period=40s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => {process.exit(r.statusCode === 200 ? 0 : 1)})"

# Start the application
CMD ["node", "server.js"]
Important Notes:
- The default port is 3000 but can be customized via environment variables
- Health checks ensure Klutch.sh can detect if the container is running properly
- The /app/data directory should be mounted to a persistent volume
Step 3: Create Package Configuration
Create a package.json file for your Checkmate application:
{ "name": "checkmate-monitor", "version": "1.0.0", "description": "Checkmate health monitoring application", "main": "server.js", "scripts": { "start": "node server.js", "dev": "nodemon server.js" }, "dependencies": { "express": "^4.18.2", "axios": "^1.6.0", "node-cron": "^3.0.2", "sqlite3": "^5.1.6", "nodemailer": "^6.9.7", "dotenv": "^16.3.1", "bcrypt": "^5.1.1", "jsonwebtoken": "^9.0.2" }, "engines": { "node": ">=18.0.0" }}Step 4: Create Application Server File
Create a server.js file as the entry point for your Checkmate application:
const express = require('express');
const fs = require('fs');
const path = require('path');
const sqlite3 = require('sqlite3').verbose();
const cron = require('node-cron');
const axios = require('axios');

const app = express();
const PORT = process.env.PORT || 3000;
const DATA_DIR = process.env.DATA_DIR || './data';

// Make sure the data directory exists before SQLite opens a file in it
fs.mkdirSync(DATA_DIR, { recursive: true });

// Middleware
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(express.static('public'));

// Initialize SQLite database
const db = new sqlite3.Database(path.join(DATA_DIR, 'checkmate.db'), (err) => {
  if (err) {
    console.error('Error opening database:', err);
  } else {
    console.log('Database connected successfully');
    initializeDatabase();
  }
});

// Create database tables
function initializeDatabase() {
  db.run(`
    CREATE TABLE IF NOT EXISTS monitors (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      name TEXT NOT NULL,
      url TEXT NOT NULL,
      method TEXT DEFAULT 'GET',
      interval INTEGER DEFAULT 300,
      timeout INTEGER DEFAULT 30,
      enabled BOOLEAN DEFAULT 1,
      created_at DATETIME DEFAULT CURRENT_TIMESTAMP
    )
  `);

  db.run(`
    CREATE TABLE IF NOT EXISTS checks (
      id INTEGER PRIMARY KEY AUTOINCREMENT,
      monitor_id INTEGER NOT NULL,
      status TEXT NOT NULL,
      response_time INTEGER,
      status_code INTEGER,
      error_message TEXT,
      checked_at DATETIME DEFAULT CURRENT_TIMESTAMP,
      FOREIGN KEY (monitor_id) REFERENCES monitors (id)
    )
  `);

  console.log('Database tables initialized');
}
// API Routes

// Health check endpoint
app.get('/health', (req, res) => {
  res.status(200).json({ status: 'healthy', timestamp: new Date().toISOString() });
});

// Get all monitors
app.get('/api/monitors', (req, res) => {
  db.all('SELECT * FROM monitors ORDER BY created_at DESC', [], (err, rows) => {
    if (err) {
      res.status(500).json({ error: err.message });
    } else {
      res.json({ monitors: rows });
    }
  });
});

// Create a new monitor
app.post('/api/monitors', (req, res) => {
  const { name, url, method, interval, timeout } = req.body;

  if (!name || !url) {
    return res.status(400).json({ error: 'Name and URL are required' });
  }

  const sql = 'INSERT INTO monitors (name, url, method, interval, timeout) VALUES (?, ?, ?, ?, ?)';
  db.run(sql, [name, url, method || 'GET', interval || 300, timeout || 30], function(err) {
    if (err) {
      res.status(500).json({ error: err.message });
    } else {
      res.status(201).json({ id: this.lastID, message: 'Monitor created successfully' });
    }
  });
});

// Get monitor details with recent checks
app.get('/api/monitors/:id', (req, res) => {
  const monitorId = req.params.id;

  db.get('SELECT * FROM monitors WHERE id = ?', [monitorId], (err, monitor) => {
    if (err) {
      return res.status(500).json({ error: err.message });
    }
    if (!monitor) {
      return res.status(404).json({ error: 'Monitor not found' });
    }

    db.all(
      'SELECT * FROM checks WHERE monitor_id = ? ORDER BY checked_at DESC LIMIT 100',
      [monitorId],
      (err, checks) => {
        if (err) {
          return res.status(500).json({ error: err.message });
        }
        res.json({ monitor, checks });
      }
    );
  });
});

// Delete a monitor
app.delete('/api/monitors/:id', (req, res) => {
  const monitorId = req.params.id;

  db.run('DELETE FROM monitors WHERE id = ?', [monitorId], function(err) {
    if (err) {
      res.status(500).json({ error: err.message });
    } else if (this.changes === 0) {
      res.status(404).json({ error: 'Monitor not found' });
    } else {
      // Also delete associated checks
      db.run('DELETE FROM checks WHERE monitor_id = ?', [monitorId]);
      res.json({ message: 'Monitor deleted successfully' });
    }
  });
});
// Monitor checking function
async function performHealthCheck(monitor) {
  const startTime = Date.now();

  try {
    const response = await axios({
      method: monitor.method || 'GET',
      url: monitor.url,
      timeout: (monitor.timeout || 30) * 1000,
      validateStatus: () => true // Accept any status code
    });

    const responseTime = Date.now() - startTime;
    const status = response.status >= 200 && response.status < 300 ? 'up' : 'down';

    // Record the check
    db.run(
      'INSERT INTO checks (monitor_id, status, response_time, status_code) VALUES (?, ?, ?, ?)',
      [monitor.id, status, responseTime, response.status]
    );

    console.log(`Monitor ${monitor.name}: ${status} (${responseTime}ms, HTTP ${response.status})`);
  } catch (error) {
    const responseTime = Date.now() - startTime;

    // Record the failed check
    db.run(
      'INSERT INTO checks (monitor_id, status, response_time, error_message) VALUES (?, ?, ?, ?)',
      [monitor.id, 'down', responseTime, error.message]
    );

    console.error(`Monitor ${monitor.name}: down - ${error.message}`);
  }
}
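// Remember when each monitor last ran so its configured interval is honored
const lastRun = new Map();

// Schedule health checks
function startMonitoring() {
  // Wake up every minute and run the monitors that are due.
  // Intervals shorter than 60 seconds therefore still run once per minute.
  cron.schedule('* * * * *', () => {
    db.all('SELECT * FROM monitors WHERE enabled = 1', [], (err, monitors) => {
      if (err) {
        console.error('Error fetching monitors:', err);
        return;
      }

      const now = Date.now();
      monitors.forEach(monitor => {
        // Check if enough time has passed since last check
        // (1 second of tolerance absorbs small timer drift between ticks)
        const intervalMs = (monitor.interval || 300) * 1000;
        const last = lastRun.get(monitor.id) || 0;
        if (now - last >= intervalMs - 1000) {
          lastRun.set(monitor.id, now);
          performHealthCheck(monitor);
        }
      });
    });
  });

  console.log('Health check scheduler started');
}

// Start the monitoring scheduler
startMonitoring();

// Start the server
app.listen(PORT, '0.0.0.0', () => {
  console.log(`Checkmate is running on port ${PORT}`);
  console.log(`Environment: ${process.env.NODE_ENV}`);
  console.log(`Data directory: ${DATA_DIR}`);
});

// Graceful shutdown
process.on('SIGTERM', () => {
  console.log('SIGTERM received, closing database...');
  db.close(() => {
    console.log('Database closed');
    process.exit(0);
  });
});
Step 5: Create a .dockerignore File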
Create a .dockerignore file to exclude unnecessary files from your Docker build:
.git
.github
.gitignore
README.md
.env
.env.local
*.log
node_modules
npm-debug.log
data
dist
coverage
.vscode
.idea
Step 6: Create a Basic Frontend (Optional)
Create a public/index.html file for a simple dashboard:
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Checkmate - Service Monitor</title>
  <style>
    * { margin: 0; padding: 0; box-sizing: border-box; }
    body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif; background: #f5f7fa; padding: 20px; }
    .container { max-width: 1200px; margin: 0 auto; }
    h1 { color: #2c3e50; margin-bottom: 30px; }
    .monitor-card { background: white; padding: 20px; margin-bottom: 15px; border-radius: 8px; box-shadow: 0 2px 4px rgba(0,0,0,0.1); }
    .monitor-header { display: flex; justify-content: space-between; align-items: center; margin-bottom: 10px; }
    .monitor-name { font-size: 18px; font-weight: 600; color: #2c3e50; }
    .status { padding: 4px 12px; border-radius: 4px; font-size: 12px; font-weight: 600; text-transform: uppercase; }
    .status.up { background: #d4edda; color: #155724; }
    .status.down { background: #f8d7da; color: #721c24; }
    .monitor-url { color: #6c757d; font-size: 14px; }
    .add-monitor { background: #007bff; color: white; padding: 12px 24px; border: none; border-radius: 6px; cursor: pointer; font-size: 16px; margin-bottom: 20px; }
    .add-monitor:hover { background: #0056b3; }
  </style>
</head>
<body>
  <div class="container">
    <h1>Checkmate Service Monitor</h1>
    <button class="add-monitor" onclick="addMonitor()">+ Add New Monitor</button>
    <div id="monitors"></div>
  </div>

  <script>
    // Escape user-supplied text before it is inserted via innerHTML
    function escapeHtml(value) {
      const div = document.createElement('div');
      div.textContent = String(value);
      return div.innerHTML;
    }

    async function loadMonitors() {
      try {
        const response = await fetch('/api/monitors');
        const data = await response.json();
        displayMonitors(data.monitors);
      } catch (error) {
        console.error('Error loading monitors:', error);
      }
    }

    function displayMonitors(monitors) {
      const container = document.getElementById('monitors');
      if (monitors.length === 0) {
        container.innerHTML = '<p>No monitors configured yet. Click "Add New Monitor" to get started.</p>';
        return;
      }

      // The badge reflects whether the monitor is enabled, not its latest check result
      container.innerHTML = monitors.map(monitor => `
        <div class="monitor-card">
          <div class="monitor-header">
            <div class="monitor-name">${escapeHtml(monitor.name)}</div>
            <span class="status ${monitor.enabled ? 'up' : 'down'}">
              ${monitor.enabled ? 'Active' : 'Inactive'}
            </span>
          </div>
          <div class="monitor-url">${escapeHtml(monitor.url)}</div>
        </div>
      `).join('');
    }

    function addMonitor() {
      const name = prompt('Monitor name:');
      const url = prompt('URL to monitor:');

      if (name && url) {
        fetch('/api/monitors', {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ name, url })
        })
          .then(response => response.json())
          .then(() => loadMonitors())
          .catch(error => console.error('Error adding monitor:', error));
      }
    }

    // Load monitors on page load
    loadMonitors();

    // Refresh monitors every 30 seconds
    setInterval(loadMonitors, 30000);
  </script>
</body>
</html>
Step 7: Test Locally (Optional)
Before deploying to Klutch.sh, you can test your Checkmate setup locally:
# Install dependencies
npm install

# Build the Docker image
docker build -t my-checkmate .

# Run the container locally
docker run -d \
  --name checkmate-test \
  -p 3000:3000 \
  -e NODE_ENV=production \
  -v "$(pwd)/data:/app/data" \
  my-checkmate

# Check if Checkmate is running
curl http://localhost:3000/health

# View logs
docker logs checkmate-test

# Access the dashboard
# Open http://localhost:3000 in your browser

# Stop and remove the test container when done
docker stop checkmate-test
docker rm checkmate-test
Step 8: Push to GitHub
Commit your Dockerfile and application code to your GitHub repository:
git add .
git commit -m "Add Checkmate monitoring application with Dockerfile"
# Rename the local branch to main in case git init created it as master
git branch -M main
git remote add origin https://github.com/yourusername/checkmate-klutch.git
git push -u origin main
Deploying to Klutch.sh
Now that your Checkmate project is ready and pushed to GitHub, follow these steps to deploy it on Klutch.sh with persistent storage.
Deployment Steps
- Log in to Klutch.sh
  Navigate to klutch.sh/app and sign in to your account.
- Create a New Project
  Click on “Create Project” and give your project a meaningful name (e.g., “Checkmate Monitoring”).
- Create a New App
  Navigate to “Create App” and configure the following settings:
- Select Your Repository
  - Choose GitHub as your Git source
  - Select the repository containing your Dockerfile
  - Choose the branch you want to deploy (usually main or master)
- Configure Traffic Type
  - Traffic Type: Select HTTP (Checkmate serves HTTP traffic)
  - Internal Port: Set to 3000 (the port that Checkmate listens on inside the container)
- Set Environment Variables
  Add the following environment variables for your Checkmate configuration. Note that the sample server.js from Step 4 reads only NODE_ENV, PORT, and DATA_DIR; the other variables take effect only once you add code that reads them (for example, the notification snippets later in this guide).
  Required Environment Variables:
  - NODE_ENV: Set to production
  - PORT: Set to 3000 (or your preferred port)
  - DATA_DIR: Set to /app/data (where persistent data is stored)
  - APP_URL: Your Klutch.sh app URL (e.g., https://example-app.klutch.sh)
  Optional but Recommended:
  - CHECK_INTERVAL: Default check interval in seconds (e.g., 60)
  - REQUEST_TIMEOUT: HTTP request timeout in seconds (e.g., 30)
  - MAX_REDIRECTS: Maximum number of HTTP redirects to follow (e.g., 5)
  - USER_AGENT: Custom user agent string for health checks
  Email Notification Settings (Optional):
  - SMTP_HOST: Your SMTP server hostname (e.g., smtp.gmail.com)
  - SMTP_PORT: SMTP port (typically 587 for STARTTLS or 465 for implicit TLS/SSL)
  - SMTP_SECURE: Set to true for implicit TLS/SSL (port 465), false for STARTTLS (port 587)
  - SMTP_USER: SMTP username (email address)
  - SMTP_PASSWORD: SMTP password or app-specific password
  - SMTP_FROM: Sender email address for notifications
  - ALERT_EMAIL: Email address to receive alerts
  Webhook Settings (Optional):
  - WEBHOOK_URL: Webhook URL for notifications (Slack, Discord, etc.)
  - WEBHOOK_METHOD: HTTP method for webhook (default: POST)
  Security Settings:
  - JWT_SECRET: Secret key for JWT token generation (generate with openssl rand -hex 32)
  - ADMIN_USERNAME: Initial admin username
  - ADMIN_PASSWORD: Initial admin password (use a strong password)
  Security Note: Never commit sensitive credentials to your repository. Always set them as environment variables in the Klutch.sh dashboard.
- Attach Persistent Volumes
  Checkmate requires persistent storage for its database and monitoring data. Configure the following volume:
  Volume - Database and Configuration Storage:
  - Mount Path: /app/data
  - Size: Choose based on expected monitoring data (e.g., 5GB for small deployments, 10GB+ for larger setups)
  Important: The persistent volume is critical for Checkmate. Without it, you’ll lose all monitoring configuration, historical data, and check results when the container restarts.
- Configure Additional Settings
  - Region: Select the region closest to your monitored services for optimal performance
  - Compute Resources:
    - Minimum: 0.5 CPU, 512MB RAM (for light monitoring)
    - Recommended: 1 CPU, 1GB RAM (for typical deployments)
    - High Load: 2+ CPU, 2GB+ RAM (for monitoring hundreds of endpoints)
  - Instances: Start with 1 instance (Checkmate is designed for single-instance deployment)
- Deploy Your Application
  Click “Create” to start the deployment. Klutch.sh will:
  - Automatically detect your Dockerfile in the repository root
  - Build the Docker image with Checkmate
  - Attach the persistent volume to the specified mount path
  - Start your Checkmate container
  - Assign a URL for accessing your Checkmate instance
- Access Your Checkmate Instance
  Once deployment is complete, you’ll receive a URL like example-app.klutch.sh. Your Checkmate instance will be accessible at:
  - Dashboard: https://example-app.klutch.sh/ (web-based monitoring interface)
  - API Endpoint: https://example-app.klutch.sh/api
  - Health Check: https://example-app.klutch.sh/health
  Open the dashboard to start adding monitors for your services.
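The environment variables from the deployment steps can be validated once at startup, so that a typo fails fast instead of surfacing later as a silent misconfiguration. The following is a minimal sketch, not part of the server.js from Step 4: loadConfig is a hypothetical helper, and the defaults for CHECK_INTERVAL, REQUEST_TIMEOUT, and MAX_REDIRECTS are simply the example values listed above.

```javascript
// Hypothetical helper: parse and validate environment variables at startup.
// Defaults mirror the example values used in this guide; adjust as needed.
function loadConfig(env) {
  // Parse an integer setting with a lower bound, falling back when unset
  function intAtLeast(name, min, fallback) {
    const raw = env[name];
    if (raw === undefined || raw === '') return fallback;
    const value = Number(raw);
    if (!Number.isInteger(value) || value < min) {
      throw new Error(`${name} must be an integer >= ${min}, got "${raw}"`);
    }
    return value;
  }

  return {
    nodeEnv: env.NODE_ENV || 'development',
    port: intAtLeast('PORT', 1, 3000),
    dataDir: env.DATA_DIR || './data',
    checkInterval: intAtLeast('CHECK_INTERVAL', 1, 60),   // seconds
    requestTimeout: intAtLeast('REQUEST_TIMEOUT', 1, 30), // seconds
    maxRedirects: intAtLeast('MAX_REDIRECTS', 0, 5),
  };
}
```

In server.js this could be called as `const config = loadConfig(process.env);` before the database is opened, so an invalid value stops the container during startup.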
Getting Started with Checkmate
After deploying Checkmate, here’s how to get started monitoring your services.
Adding Your First Monitor
- Access the Checkmate Dashboard
  Open your Checkmate instance at https://example-app.klutch.sh/ in your web browser.
- Create a New Monitor
  Click the “Add New Monitor” button and provide the following information:
  - Name: A descriptive name for your monitor (e.g., “Production API”)
  - URL: The endpoint to monitor (e.g., https://api.example.com/health)
  - Method: HTTP method to use (GET, POST, HEAD, etc.)
  - Interval: How often to check (in seconds)
  - Timeout: Request timeout (in seconds)
  The sample dashboard from Step 6 prompts only for the name and URL; method, interval, and timeout fall back to the server defaults (GET, 300, 30) unless you create the monitor through the API.
- View Monitor Status
  Once created, the monitor will start checking your endpoint at the specified interval. The check history (available from GET /api/monitors/:id) gives you:
  - Current status (up/down)
  - Response time
  - Last check timestamp
  - Historical uptime data
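The four items above can be derived from the checks array that GET /api/monitors/:id returns in the sample server.js. The sketch below assumes that array is ordered newest first (as the sample query does) and that each entry carries the status, response_time, and checked_at columns; summarizeChecks is a hypothetical helper name.

```javascript
// Summarize the `checks` array returned by GET /api/monitors/:id.
// Assumes the array is ordered newest first, as in the sample server.js query.
function summarizeChecks(checks) {
  if (checks.length === 0) {
    return { status: 'unknown', responseTime: null, lastChecked: null, uptimePercent: null };
  }
  const latest = checks[0];
  const upCount = checks.filter((c) => c.status === 'up').length;
  return {
    status: latest.status,               // current status (up/down)
    responseTime: latest.response_time,  // milliseconds for the latest check
    lastChecked: latest.checked_at,      // timestamp of the latest check
    // Share of checks in this window that were "up", rounded to 2 decimals
    uptimePercent: Math.round((upCount / checks.length) * 10000) / 100,
  };
}
```

Because the sample endpoint returns at most 100 checks, the uptime figure here covers only that recent window rather than the monitor's full history.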
Sample Code: Using the Checkmate API
Here are examples of how to interact with your deployed Checkmate instance programmatically:
JavaScript/Node.js:
const axios = require('axios');

const CHECKMATE_URL = 'https://example-app.klutch.sh';

// Create a new monitor
async function createMonitor(name, url, options = {}) {
  try {
    const response = await axios.post(`${CHECKMATE_URL}/api/monitors`, {
      name,
      url,
      method: options.method || 'GET',
      interval: options.interval || 300,
      timeout: options.timeout || 30
    });

    console.log('Monitor created:', response.data);
    return response.data;
  } catch (error) {
    console.error('Error creating monitor:', error.response?.data || error.message);
  }
}

// Get all monitors
async function getMonitors() {
  try {
    const response = await axios.get(`${CHECKMATE_URL}/api/monitors`);
    console.log('Monitors:', response.data.monitors);
    return response.data.monitors;
  } catch (error) {
    console.error('Error fetching monitors:', error.message);
  }
}

// Get monitor details with check history
async function getMonitorDetails(monitorId) {
  try {
    const response = await axios.get(`${CHECKMATE_URL}/api/monitors/${monitorId}`);
    console.log('Monitor details:', response.data);
    return response.data;
  } catch (error) {
    console.error('Error fetching monitor details:', error.message);
  }
}

// Delete a monitor
async function deleteMonitor(monitorId) {
  try {
    const response = await axios.delete(`${CHECKMATE_URL}/api/monitors/${monitorId}`);
    console.log('Monitor deleted:', response.data);
    return response.data;
  } catch (error) {
    console.error('Error deleting monitor:', error.message);
  }
}

// Example usage
(async () => {
  // Create monitors for your services
  await createMonitor('Production API', 'https://api.example.com/health', {
    interval: 60,
    timeout: 10
  });

  await createMonitor('Website Homepage', 'https://www.example.com', {
    method: 'HEAD',
    interval: 300
  });

  // List all monitors
  const monitors = await getMonitors();

  // Get details for a specific monitor
  if (monitors && monitors.length > 0) {
    await getMonitorDetails(monitors[0].id);
  }
})();
Python Example:
import requests

CHECKMATE_URL = 'https://example-app.klutch.sh'


def create_monitor(name, url, method='GET', interval=300, timeout=30):
    """Create a new monitor"""
    try:
        response = requests.post(
            f'{CHECKMATE_URL}/api/monitors',
            json={
                'name': name,
                'url': url,
                'method': method,
                'interval': interval,
                'timeout': timeout
            }
        )
        response.raise_for_status()
        print(f'Monitor created: {response.json()}')
        return response.json()
    except requests.exceptions.RequestException as e:
        print(f'Error creating monitor: {e}')
        return None


def get_monitors():
    """Get all monitors"""
    try:
        response = requests.get(f'{CHECKMATE_URL}/api/monitors')
        response.raise_for_status()
        monitors = response.json()['monitors']
        print(f'Found {len(monitors)} monitors')
        return monitors
    except requests.exceptions.RequestException as e:
        print(f'Error fetching monitors: {e}')
        return []


def get_monitor_details(monitor_id):
    """Get monitor details with check history"""
    try:
        response = requests.get(f'{CHECKMATE_URL}/api/monitors/{monitor_id}')
        response.raise_for_status()
        data = response.json()
        print(f"Monitor: {data['monitor']['name']}")
        print(f"Recent checks: {len(data['checks'])}")
        return data
    except requests.exceptions.RequestException as e:
        print(f'Error fetching monitor details: {e}')
        return None


def delete_monitor(monitor_id):
    """Delete a monitor"""
    try:
        response = requests.delete(f'{CHECKMATE_URL}/api/monitors/{monitor_id}')
        response.raise_for_status()
        print(f'Monitor deleted: {response.json()}')
        return True
    except requests.exceptions.RequestException as e:
        print(f'Error deleting monitor: {e}')
        return False


# Example usage
if __name__ == '__main__':
    # Create monitors
    create_monitor(
        'Production API',
        'https://api.example.com/health',
        interval=60,
        timeout=10
    )

    create_monitor(
        'Database Health',
        'https://db.example.com/ping',
        method='HEAD',
        interval=120
    )

    # List all monitors
    monitors = get_monitors()

    # Get details for the first monitor
    if monitors:
        get_monitor_details(monitors[0]['id'])
Shell Script Example:
#!/bin/bash

CHECKMATE_URL="https://example-app.klutch.sh"

# Create a new monitor
create_monitor() {
  local name="$1"
  local url="$2"

  curl -X POST "${CHECKMATE_URL}/api/monitors" \
    -H "Content-Type: application/json" \
    -d "{
      \"name\": \"${name}\",
      \"url\": \"${url}\",
      \"method\": \"GET\",
      \"interval\": 300,
      \"timeout\": 30
    }"
  echo
}

# Get all monitors
get_monitors() {
  curl -s "${CHECKMATE_URL}/api/monitors" | jq '.'
}

# Get monitor details
get_monitor_details() {
  local monitor_id="$1"
  curl -s "${CHECKMATE_URL}/api/monitors/${monitor_id}" | jq '.'
}

# Delete a monitor
delete_monitor() {
  local monitor_id="$1"
  curl -X DELETE "${CHECKMATE_URL}/api/monitors/${monitor_id}"
  echo
}

# Check health endpoint
check_health() {
  curl -s "${CHECKMATE_URL}/health" | jq '.'
}

# Example usage
echo "Creating monitors..."
create_monitor "Production API" "https://api.example.com/health"
create_monitor "Website" "https://www.example.com"

echo "Listing all monitors..."
get_monitors

echo "Checking Checkmate health..."
check_health
Configuring Email Notifications
To receive email alerts when services go down, configure SMTP settings in your environment variables:
// Example: Adding email notification logic to server.js
const nodemailer = require('nodemailer');

// Configure email transport
const transporter = nodemailer.createTransport({
  host: process.env.SMTP_HOST,
  port: parseInt(process.env.SMTP_PORT || '587', 10),
  secure: process.env.SMTP_SECURE === 'true',
  auth: {
    user: process.env.SMTP_USER,
    pass: process.env.SMTP_PASSWORD
  }
});

// Send alert email
async function sendAlert(monitor, check) {
  if (!process.env.ALERT_EMAIL) return;

  const subject = `[ALERT] ${monitor.name} is DOWN`;
  const html = `
    <h2>Service Alert</h2>
    <p><strong>Monitor:</strong> ${monitor.name}</p>
    <p><strong>URL:</strong> ${monitor.url}</p>
    <p><strong>Status:</strong> ${check.status}</p>
    <p><strong>Error:</strong> ${check.error_message || 'Unknown error'}</p>
    <p><strong>Time:</strong> ${check.checked_at}</p>
  `;

  try {
    await transporter.sendMail({
      from: process.env.SMTP_FROM,
      to: process.env.ALERT_EMAIL,
      subject,
      html
    });
    console.log(`Alert email sent for ${monitor.name}`);
  } catch (error) {
    console.error('Error sending alert email:', error);
  }
}
Setting Up Webhook Notifications
Configure webhook notifications for Slack, Discord, or custom integrations:
// Example: Adding webhook notification logic
const axios = require('axios');

async function sendWebhookAlert(monitor, check) {
  if (!process.env.WEBHOOK_URL) return;

  // Slack-style payload (text plus Block Kit blocks).
  // Discord webhooks expect a `content` field instead, so adapt the payload for other services.
  const payload = {
    text: `🚨 Alert: ${monitor.name} is DOWN`,
    blocks: [
      {
        type: 'section',
        text: {
          type: 'mrkdwn',
          text: `*Monitor:* ${monitor.name}\n*URL:* ${monitor.url}\n*Status:* ${check.status}\n*Error:* ${check.error_message || 'Unknown'}`
        }
      }
    ]
  };

  try {
    await axios.post(process.env.WEBHOOK_URL, payload);
    console.log(`Webhook alert sent for ${monitor.name}`);
  } catch (error) {
    console.error('Error sending webhook alert:', error);
  }
}
Production Best Practices
Security Recommendations
- Authentication: Implement authentication for the dashboard to prevent unauthorized access. Add JWT-based authentication or basic auth middleware.
- HTTPS Only: Always use HTTPS for your Checkmate instance to protect monitoring data in transit.
- Environment Variables: Store all sensitive credentials (SMTP passwords, JWT secrets, API keys) as environment variables in Klutch.sh, never in your code.
- Rate Limiting: Implement rate limiting on the API endpoints to prevent abuse.
- Database Backups: Regularly back up the SQLite database containing your monitoring configuration and history.
- API Keys: If exposing the API publicly, implement API key authentication for programmatic access.
- Input Validation: Validate all user inputs to prevent injection attacks and malicious URLs.
- Regular Updates: Keep your Node.js dependencies updated to patch security vulnerabilities.
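As one illustration of the Input Validation item, the request body for POST /api/monitors could be checked before it reaches the database. This is a sketch under assumptions: validateMonitorInput is a hypothetical helper, the allowed methods, the 100-character name limit, and the 60-second minimum interval (chosen because the sample scheduler ticks once per minute) are illustrative choices, not requirements stated by Checkmate.

```javascript
// Hypothetical validator for the body of POST /api/monitors.
// Limits below are illustrative assumptions, not Checkmate requirements.
const ALLOWED_METHODS = ['GET', 'HEAD', 'POST'];

function validateMonitorInput(body) {
  const errors = [];

  const name = typeof body.name === 'string' ? body.name.trim() : '';
  if (name.length === 0 || name.length > 100) {
    errors.push('name must be 1-100 characters');
  }

  // Only allow absolute http(s) URLs; new URL() throws on malformed input
  let protocol = null;
  try {
    protocol = new URL(body.url).protocol;
  } catch (e) {
    errors.push('url is not a valid URL');
  }
  if (protocol !== null && protocol !== 'http:' && protocol !== 'https:') {
    errors.push('url must use http or https');
  }

  const method = String(body.method || 'GET').toUpperCase();
  if (!ALLOWED_METHODS.includes(method)) {
    errors.push(`method must be one of ${ALLOWED_METHODS.join(', ')}`);
  }

  const interval = body.interval === undefined ? 300 : Number(body.interval);
  if (!Number.isInteger(interval) || interval < 60) {
    errors.push('interval must be an integer of at least 60 seconds');
  }

  const timeout = body.timeout === undefined ? 30 : Number(body.timeout);
  if (!Number.isInteger(timeout) || timeout < 1 || timeout > 120) {
    errors.push('timeout must be an integer between 1 and 120 seconds');
  }

  return {
    valid: errors.length === 0,
    errors,
    value: { name, url: body.url, method, interval, timeout },
  };
}
```

A route handler could return HTTP 400 with the errors array when valid is false. Note that this sketch does not block URLs that point at internal or private addresses; a publicly exposed instance would need an additional check for that.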
Performance Optimization
- Resource Allocation: Monitor your application and adjust CPU/memory based on the number of checks you’re running.
- Database Optimization: Add indexes on frequently queried columns (monitor_id, checked_at) for faster queries.
- Check Distribution: Distribute health checks evenly across intervals to avoid CPU spikes.
- Response Data Storage: Limit the amount of historical data stored; implement data retention policies.
- Concurrent Checks: Use promise-based concurrency to run multiple checks simultaneously without blocking.
- Caching: Implement caching for dashboard data to reduce database queries.
- Connection Pooling: Reuse HTTP connections for repeated checks to the same endpoints.
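The Check Distribution item can be sketched as a deterministic per-monitor offset, so that monitors sharing the same interval do not all fire on the same scheduler tick. The helper below is an assumption-laden illustration: isDueOnTick and the tickIndex counter are hypothetical (the sample server.js does not keep a tick counter), and the 60-second tick length matches the sample cron schedule.

```javascript
// Decide whether a monitor should run on a given scheduler tick.
// tickIndex: hypothetical counter incremented once per tick by the scheduler.
// tickSeconds: tick length in seconds (60 matches a once-per-minute cron job).
function isDueOnTick(monitor, tickIndex, tickSeconds = 60) {
  // Number of ticks in one check interval, never less than 1
  const ticksPerInterval = Math.max(1, Math.round(monitor.interval / tickSeconds));
  // Deterministic offset derived from the monitor id spreads monitors across ticks
  const offset = monitor.id % ticksPerInterval;
  return tickIndex % ticksPerInterval === offset;
}
```

With ten monitors on a 300-second interval and ids 1 through 10, each one-minute tick runs two of them instead of all ten at once. The trade-off is that a newly created monitor may wait up to one full interval for its first check.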
Example database index optimization:
CREATE INDEX idx_checks_monitor_id ON checks(monitor_id);
CREATE INDEX idx_checks_checked_at ON checks(checked_at DESC);
CREATE INDEX idx_monitors_enabled ON monitors(enabled);
Monitoring and Logging
Monitor your Checkmate instance for:
- Check Success Rate: Track the percentage of successful health checks
- Response Times: Monitor how quickly checks are completing
- Database Growth: Track database size and implement retention policies
- Failed Checks: Alert on patterns of failed checks that might indicate monitoring issues
- Resource Usage: CPU, memory, and disk I/O patterns
- API Usage: Track API endpoint usage and response times
Example logging configuration:
// Add structured logging
// Requires: npm install winston (it is not in the package.json from Step 3)
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: 'error.log', level: 'error' }),
    new winston.transports.File({ filename: 'combined.log' }),
    new winston.transports.Console({ format: winston.format.simple() })
  ]
});

// Use logger instead of console.log
// (monitor and check are the objects available inside performHealthCheck)
logger.info('Health check completed', {
  monitor: monitor.name,
  status: check.status,
  responseTime: check.response_time
});
Backup and Recovery
- Database Backups: Regularly back up the /app/data volume, especially the checkmate.db file.
- Configuration Backups: Export monitor configurations periodically and store them securely.
- Disaster Recovery Plan: Document your recovery procedures and test them regularly.
- Version Control: Keep your Dockerfile and application code in version control (Git) for easy rollback.
- Automated Backups: Implement automated backup scripts that run on a schedule.
Example backup script:
#!/bin/bash
BACKUP_DIR="/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
DB_PATH="/app/data/checkmate.db"

# Create backup
mkdir -p "$BACKUP_DIR"
sqlite3 "$DB_PATH" ".backup '${BACKUP_DIR}/checkmate_${TIMESTAMP}.db'"

# Keep only last 30 days of backups
find "$BACKUP_DIR" -name "checkmate_*.db" -mtime +30 -delete

echo "Backup completed: checkmate_${TIMESTAMP}.db"
Scaling Considerations
- Vertical Scaling: Increase CPU and memory for your Klutch.sh instance as you add more monitors
- Check Frequency: Balance check frequency with resource usage; not all services need minute-by-minute checks
- Data Retention: Implement data retention policies to prevent unlimited database growth
- Monitor Grouping: Group related monitors for easier management and reporting
- Distributed Checks: For global monitoring, consider deploying multiple Checkmate instances in different regions
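The Data Retention item could be implemented as a periodic DELETE against the checks table. The sketch below only builds the parameterized statement; buildRetentionQuery is a hypothetical helper, and it assumes checked_at is filled by SQLite's CURRENT_TIMESTAMP default (as in the schema from Step 4), so that it compares correctly against SQLite's datetime() output.

```javascript
// Build a parameterized DELETE for checks older than `retentionDays`.
// Assumes checks.checked_at uses SQLite's CURRENT_TIMESTAMP format (UTC).
function buildRetentionQuery(retentionDays) {
  if (!Number.isInteger(retentionDays) || retentionDays < 1) {
    throw new RangeError('retentionDays must be a positive integer');
  }
  return {
    // datetime('now', '-30 days') yields the cutoff in the same text format
    sql: "DELETE FROM checks WHERE checked_at < datetime('now', ?)",
    params: [`-${retentionDays} days`],
  };
}
```

With the sqlite3 handle from server.js, the result could be executed as `db.run(query.sql, query.params)` from a daily cron job. Deleted rows do not shrink the database file until SQLite reclaims the space (for example with VACUUM).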
Customizing Checkmate
Adding Custom Check Types
Extend Checkmate to support additional check types beyond HTTP:
// Add TCP port check
// Note: monitor.host, monitor.port, and monitor.domain are not columns in the
// monitors table from Step 4; extend the schema before using these checks.
const net = require('net');

async function performTCPCheck(monitor) {
  return new Promise((resolve, reject) => {
    const socket = net.createConnection({
      host: monitor.host,
      port: monitor.port,
      timeout: monitor.timeout * 1000
    });

    socket.on('connect', () => {
      socket.end();
      resolve({ status: 'up', message: 'TCP port is open' });
    });

    socket.on('timeout', () => {
      socket.destroy();
      reject(new Error('Connection timeout'));
    });

    socket.on('error', (err) => {
      reject(err);
    });
  });
}

// Add DNS check
const dns = require('dns').promises;

async function performDNSCheck(monitor) {
  try {
    const addresses = await dns.resolve4(monitor.domain);
    return {
      status: 'up',
      addresses: addresses,
      message: `Resolved to ${addresses.length} addresses`
    };
  } catch (error) {
    throw new Error(`DNS resolution failed: ${error.message}`);
  }
}
Adding Dashboard Features
Enhance the dashboard with additional features:
// Calculate uptime percentage
function calculateUptime(checks) {
  if (checks.length === 0) return 100;

  const upChecks = checks.filter(c => c.status === 'up').length;
  return ((upChecks / checks.length) * 100).toFixed(2);
}

// Get average response time
function getAverageResponseTime(checks) {
  if (checks.length === 0) return 0;

  const totalTime = checks.reduce((sum, c) => sum + (c.response_time || 0), 0);
  return Math.round(totalTime / checks.length);
}

// API endpoint for statistics
app.get('/api/monitors/:id/stats', (req, res) => {
  const monitorId = req.params.id;
  const parsedDays = parseInt(req.query.days || '7', 10);
  const days = Number.isInteger(parsedDays) && parsedDays > 0 ? parsedDays : 7;

  // checked_at is stored by SQLite as 'YYYY-MM-DD HH:MM:SS' (UTC), so compute the
  // cutoff with SQLite's datetime() instead of comparing against an ISO 8601 string
  db.all(
    "SELECT * FROM checks WHERE monitor_id = ? AND checked_at > datetime('now', ?) ORDER BY checked_at DESC",
    [monitorId, `-${days} days`],
    (err, checks) => {
      if (err) {
        return res.status(500).json({ error: err.message });
      }

      const stats = {
        uptime: calculateUptime(checks),
        averageResponseTime: getAverageResponseTime(checks),
        totalChecks: checks.length,
        failedChecks: checks.filter(c => c.status === 'down').length
      };

      res.json(stats);
    }
  );
});
Troubleshooting
Cannot Access Checkmate Dashboard
- Verify your app is deployed and running in the Klutch.sh dashboard
- Check that the internal port is set to 3000 (or your configured PORT)
- Ensure HTTP traffic type is selected (not TCP)
- Review deployment logs for any startup errors
- Verify environment variables are correctly set
- Test the health endpoint:
curl https://example-app.klutch.sh/health
Database or Configuration Not Persisting
- Verify the persistent volume is correctly attached at /app/data
- Check that the volume has sufficient space allocated (at least 5GB)
- Review logs for permission or write errors
- Ensure the DATA_DIR environment variable matches the mount path
- Check file permissions in the container:
ls -la /app/data
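The same check can be made from inside the application at startup. The following is a minimal sketch, not part of Checkmate itself: `checkDataDir` is a hypothetical helper that writes and removes a small probe file, so a missing mount or a permissions problem shows up in the logs immediately instead of as a later database error.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: reports whether dataDir exists and is writable by
// creating and then removing a small probe file inside it.
function checkDataDir(dataDir) {
  const probe = path.join(dataDir, `.write-test-${process.pid}`);
  try {
    fs.writeFileSync(probe, 'ok');
    fs.unlinkSync(probe);
    return { writable: true, error: null };
  } catch (err) {
    // Typical causes: the volume is not mounted (ENOENT) or the container
    // user lacks write permission (EACCES).
    return { writable: false, error: err.message };
  }
}

// Usage at startup; /app/data is the mount path used in this guide.
const dataDir = process.env.DATA_DIR || '/app/data';
const result = checkDataDir(dataDir);
if (!result.writable) {
  console.error(`Data directory ${dataDir} is not writable: ${result.error}`);
}
```

The helper deliberately does not create the directory: if the path is missing, that usually means the volume is not attached, which is worth surfacing rather than hiding.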
Monitors Not Running
- Check the application logs for cron scheduler errors
- Verify monitors are enabled in the database
- Ensure check intervals are properly configured
- Review system resources (CPU/memory) for resource constraints
- Check if the database connection is working: query the monitors table
Email Notifications Not Working
- Verify SMTP environment variables are correctly set
- Test SMTP credentials with your email provider
- Check spam folders for notification emails
- Review Checkmate logs for email sending errors
- Test SMTP connection:
telnet smtp.gmail.com 587
- Consider using a dedicated email service like SendGrid or Mailgun
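Misconfigured SMTP variables are easier to catch before the first alert is sent. The sketch below assumes the `SMTP_*` variable names listed in this guide; `buildSmtpOptions` is a hypothetical helper, not a Checkmate or Nodemailer function. It validates the variables and maps them onto the options object that Nodemailer's `createTransport()` accepts.

```javascript
// Hypothetical helper: validates the SMTP_* variables and converts them into
// Nodemailer transport options (host, port, secure, auth).
function buildSmtpOptions(env) {
  const missing = ['SMTP_HOST', 'SMTP_PORT', 'SMTP_USER', 'SMTP_PASSWORD']
    .filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing SMTP settings: ${missing.join(', ')}`);
  }

  const port = Number.parseInt(env.SMTP_PORT, 10);
  if (Number.isNaN(port)) {
    throw new Error(`SMTP_PORT is not a number: ${env.SMTP_PORT}`);
  }

  return {
    host: env.SMTP_HOST,
    port,
    // secure=true means implicit TLS (typically port 465); port 587 normally
    // uses STARTTLS, which corresponds to secure=false.
    secure: env.SMTP_SECURE === 'true',
    auth: { user: env.SMTP_USER, pass: env.SMTP_PASSWORD },
  };
}

// With Nodemailer installed, connection and credentials can be tested with:
//   const nodemailer = require('nodemailer');
//   nodemailer.createTransport(buildSmtpOptions(process.env)).verify()
//     .then(() => console.log('SMTP connection OK'))
//     .catch((err) => console.error('SMTP check failed:', err.message));
```

Running the `verify()` call once at startup gives a clear log line when credentials or ports are wrong.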
High Resource Usage
- Monitor the number of active monitors and their check intervals
- Reduce check frequency for less critical services
- Implement database cleanup for old check results
- Review and optimize database queries
- Consider increasing compute resources in Klutch.sh
- Check for memory leaks in custom code
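Database cleanup for old check results could look like the sketch below. It assumes the `checks` table and `checked_at` column from the statistics query in this guide, and that timestamps are stored as ISO 8601 strings. `buildCleanupQuery` is a hypothetical helper that only builds the statement, so it can be run with whatever database handle the application already has.

```javascript
// Hypothetical helper: builds a parameterized DELETE for check results older
// than the retention window. Assumes checked_at holds ISO 8601 strings.
function buildCleanupQuery(retentionDays, now = new Date()) {
  if (!Number.isInteger(retentionDays) || retentionDays <= 0) {
    throw new RangeError('retentionDays must be a positive integer');
  }
  const msPerDay = 24 * 60 * 60 * 1000;
  const cutoff = new Date(now.getTime() - retentionDays * msPerDay);
  return {
    sql: 'DELETE FROM checks WHERE checked_at < ?',
    params: [cutoff.toISOString()],
  };
}

// Example: run once a day against the existing sqlite3 handle (`db` is assumed).
//   const days = Number.parseInt(process.env.DB_RETENTION_DAYS, 10) || 90;
//   const { sql, params } = buildCleanupQuery(days);
//   db.run(sql, params, (err) => {
//     if (err) console.error('Cleanup failed:', err.message);
//   });
```

Keeping the cutoff as a bound parameter rather than interpolating it into the SQL string matches the parameterized style of the other queries in this guide.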
False Positive Alerts
- Increase timeout values for slow-responding services
- Configure appropriate HTTP status codes as acceptable
- Implement retry logic before marking a service as down
- Consider network latency between Checkmate and monitored services
- Review and adjust check intervals
- Add grace periods before triggering alerts
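One simple way to combine retries with a grace period is to alert only after several consecutive failures. The sketch below is an assumption about how this could be wired in, not Checkmate's built-in behavior: `shouldAlert` and `failureThreshold` are hypothetical names, and the input is expected newest first, as returned by a query ordered by `checked_at DESC`.

```javascript
// Hypothetical helper: returns true only when the most recent
// `failureThreshold` checks all failed. Expects checks sorted newest first.
function shouldAlert(recentChecks, failureThreshold = 3) {
  // Not enough history yet: stay quiet rather than alert on a single blip.
  if (recentChecks.length < failureThreshold) return false;
  return recentChecks
    .slice(0, failureThreshold)
    .every((check) => check.status === 'down');
}

// Example: three failures in a row trigger an alert, a single failure does not.
const flapping = [{ status: 'down' }, { status: 'up' }, { status: 'down' }];
const outage = [{ status: 'down' }, { status: 'down' }, { status: 'down' }];
console.log(shouldAlert(flapping)); // false
console.log(shouldAlert(outage));   // true
```

With a 60-second check interval and a threshold of 3, an alert fires roughly three minutes after a service actually goes down, which trades a little latency for fewer false positives.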
Advanced Configuration
Environment Variable Reference
Complete list of supported environment variables:
# Application Settings
NODE_ENV=production
PORT=3000
DATA_DIR=/app/data
APP_URL=https://example-app.klutch.sh

# Check Configuration
CHECK_INTERVAL=60          # Default check interval (seconds)
REQUEST_TIMEOUT=30         # Default request timeout (seconds)
MAX_REDIRECTS=5            # Maximum HTTP redirects
USER_AGENT=Checkmate/1.0   # Custom user agent

# Database Settings
DB_MAX_CHECKS=10000        # Maximum checks to retain per monitor
DB_RETENTION_DAYS=90       # Days to retain historical data

# Email Notifications
SMTP_HOST=smtp.gmail.com
SMTP_PORT=587
SMTP_SECURE=false
SMTP_USER=your-email@gmail.com
SMTP_PASSWORD=your-app-password
SMTP_FROM=noreply@example.com
ALERT_EMAIL=alerts@example.com

# Webhook Notifications
WEBHOOK_URL=https://hooks.slack.com/services/YOUR/WEBHOOK/URL
WEBHOOK_METHOD=POST

# Security Settings
JWT_SECRET=your-secret-key-here
ADMIN_USERNAME=admin
ADMIN_PASSWORD=strong-password
SESSION_TIMEOUT=3600       # Session timeout (seconds)

# Performance Tuning
MAX_CONCURRENT_CHECKS=10   # Maximum concurrent health checks
CHECK_QUEUE_SIZE=100       # Maximum queued checks

Customizing Nixpacks Build (Optional)
If you need to customize the build process, you can set the following Nixpacks environment variables.
Build-time environment variables:
- NIXPACKS_BUILD_CMD: Custom build command (e.g., npm run build)
- NIXPACKS_INSTALL_CMD: Custom install command (e.g., npm ci --production)
Runtime environment variables:
- NIXPACKS_START_CMD: Custom start command (e.g., node server.js --production)
Example: To change the start command, set this environment variable in Klutch.sh:
NIXPACKS_START_CMD=node server.js --port 3000 --data-dir /app/data

Additional Resources
- Klutch.sh Documentation
- Klutch.sh Volumes Guide
- Klutch.sh Networking Guide
- Klutch.sh Deployment Guide
- Node.js Documentation
- Express.js Documentation
- SQLite Documentation
- Nodemailer Documentation
Conclusion
Deploying Checkmate to Klutch.sh provides a powerful, self-hosted monitoring solution for tracking the health and performance of your services. With Docker-based deployment, persistent storage, and comprehensive configuration options, you can build a production-ready monitoring platform that keeps you informed about your infrastructure’s status. By following this guide, you’ve set up a secure, high-performance Checkmate instance ready to monitor your web applications, APIs, and services with automated health checks, real-time alerts, and detailed performance metrics.