Deploying Operational.co
Introduction
Operational.co is an open-source incident management and on-call platform designed to help engineering teams respond to and resolve incidents efficiently. It provides tools for managing on-call rotations, escalation policies, incident tracking, and post-incident reviews, creating a structured approach to handling production issues.
Built for modern DevOps and SRE practices, Operational.co integrates with monitoring tools and alerting systems to streamline incident response workflows. Teams can define escalation policies, manage on-call schedules, and track incidents from detection through resolution and review.
Key highlights of Operational.co:
- Incident Management: Create, track, and resolve incidents with structured workflows
- On-Call Scheduling: Define and manage on-call rotations for teams
- Escalation Policies: Configure automatic escalation when incidents are not acknowledged
- Alert Integration: Connect with monitoring tools and alerting systems
- Status Pages: Communicate incident status to stakeholders
- Post-Mortems: Document and learn from incidents with structured reviews
- Team Management: Organize responders into teams with specific responsibilities
- Notification Channels: Alert via email, SMS, Slack, and other channels
- API Access: Programmatic access for automation and integration
- Self-Hosted: Complete control over your incident data
This guide walks through deploying Operational.co on Klutch.sh using Docker, configuring integrations, and setting up incident management for your team.
Why Deploy Operational.co on Klutch.sh
Deploying Operational.co on Klutch.sh provides several advantages:
Simplified Deployment: Klutch.sh builds and deploys your incident management platform automatically whenever you push to GitHub.
Persistent Storage: Attach persistent volumes for database and configuration. Your incident history and configurations survive container restarts.
HTTPS by Default: Klutch.sh provides automatic SSL certificates for secure access to your incident platform.
Always-On Availability: Your incident management system runs 24/7, essential for receiving alerts and managing on-call schedules.
GitHub Integration: Store configuration in Git for version-controlled infrastructure.
Scalable Resources: Allocate resources based on team size and alert volume.
Custom Domains: Use your organization’s domain for professional incident management URLs.
Prerequisites
Before deploying Operational.co on Klutch.sh, ensure you have:
- A Klutch.sh account
- A GitHub account with a repository for your configuration
- Basic familiarity with Docker and containerization concepts
- An SMTP server or email service for notifications
- (Optional) Slack workspace for chat notifications
Understanding Operational.co Architecture
Operational.co consists of several components:
API Server: Handles incident management, user authentication, and business logic.
Web Interface: Dashboard for managing incidents, schedules, and team configuration.
Background Workers: Process notifications, escalations, and scheduled tasks.
Database: Stores incidents, users, schedules, and configuration. PostgreSQL is recommended.
Cache: Redis for session management and background job queues.
Preparing Your Repository
Create a GitHub repository containing your Dockerfile and configuration.
Repository Structure
operational-deploy/
├── Dockerfile
├── .dockerignore
└── README.md
Creating the Dockerfile
Create a Dockerfile for Operational.co:
FROM node:18-alpine
WORKDIR /app
# Install dependencies
RUN apk add --no-cache git python3 make g++
# Clone Operational.co repository
RUN git clone https://github.com/operational-co/operational.git .
# Install dependencies
RUN npm install
# Build the application
RUN npm run build
# Environment configuration
ENV NODE_ENV=production
ENV PORT=3000
# Expose the application port
EXPOSE 3000
# Start the application
CMD ["npm", "start"]
Environment Variables Reference
| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | PostgreSQL connection string |
| REDIS_URL | Yes | Redis connection string |
| SECRET_KEY | Yes | Application secret for sessions |
| SMTP_HOST | Yes | SMTP server hostname |
| SMTP_PORT | Yes | SMTP server port |
| SMTP_USER | Yes | SMTP username |
| SMTP_PASS | Yes | SMTP password |
| APP_URL | Yes | Public URL of the application |
| SLACK_WEBHOOK_URL | No | Slack webhook for notifications |
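A missing variable is easier to catch before the application starts than after a failed deploy. The following is a minimal sketch of such a check, assuming the variable names in the table above; the function names `missing_vars` and `check_env` are hypothetical helpers and are not part of Operational.co.

```shell
#!/bin/sh
# Print the name of every required variable that is unset or empty.
# The list mirrors the rows marked "Yes" in the table above.
missing_vars() {
  for name in DATABASE_URL REDIS_URL SECRET_KEY SMTP_HOST SMTP_PORT SMTP_USER SMTP_PASS APP_URL; do
    # Indirect lookup via eval; the names come from the fixed list above, never from user input.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}

# Report missing variables on stderr and return non-zero if any are absent.
check_env() {
  missing=$(missing_vars)
  if [ -n "$missing" ]; then
    echo "Missing required variables:" >&2
    echo "$missing" >&2
    return 1
  fi
  echo "All required variables are set."
}
```

Running `check_env` in a shell that has the deployment variables loaded gives a quick pass/fail answer before you start a build.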
Deploying Operational.co on Klutch.sh
Follow the steps below to deploy your incident management platform. Along the way you will:
- Deploy the required services: a PostgreSQL instance and a Redis instance for caching and job queues
- Configure HTTP traffic: select HTTP as the traffic type and set the internal port to 3000
- Complete the initial setup: create your admin account, set up your organization, configure notification channels, and add team members
Generate Security Keys
Generate secure keys for your deployment:
# Generate secret key
openssl rand -hex 32
Deploy Required Services
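Operational.co requires:
- PostgreSQL Database: Deploy a PostgreSQL instance
- Redis: Deploy a Redis instance for caching and job queues
Note the connection URLs of both services for configuration.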
Push Your Repository to GitHub
Initialize and push your repository:
git init
git add Dockerfile .dockerignore README.md
git commit -m "Initial Operational.co configuration"
# Rename the current branch to main so the push below targets a branch that exists
git branch -M main
git remote add origin https://github.com/yourusername/operational-deploy.git
git push -u origin main
Create a New Project on Klutch.sh
Navigate to the Klutch.sh dashboard and create a new project named “operational” or “incident-management”.
Create a New App
Within your project, create a new app. Connect your GitHub account and select the repository containing your Dockerfile.
Configure HTTP Traffic
In the deployment settings:
- Select HTTP as the traffic type
- Set the internal port to 3000
Set Environment Variables
Configure your instance:
| Variable | Value |
|---|---|
| DATABASE_URL | Your PostgreSQL connection string |
| REDIS_URL | Your Redis connection string |
| SECRET_KEY | Your generated secret key |
| SMTP_HOST | Your SMTP server |
| SMTP_PORT | SMTP port (typically 587) |
| SMTP_USER | SMTP username |
| SMTP_PASS | SMTP password |
| APP_URL | https://your-app-name.klutch.sh |
Attach Persistent Volumes
Add persistent storage:
| Mount Path | Recommended Size | Purpose |
|---|---|---|
| /app/data | 10 GB | Application data and uploads |
Deploy Your Application
Click Deploy to start the build process.
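Once the build finishes, it can help to confirm from a terminal that the app is answering requests. The sketch below polls a URL until it responds; `wait_for_http` is a hypothetical helper, and treating any 2xx or 3xx status as "up" is an assumption (the root path may redirect to a login page).

```shell
#!/bin/sh
# Poll a URL until it answers with a 2xx or 3xx status, or give up.
# Arguments: URL, max attempts (default 30), seconds between attempts (default 10).
wait_for_http() {
  url=$1
  attempts=${2:-30}
  delay=${3:-10}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s hides progress, -o discards the body, -w prints only the status code.
    # "|| true" keeps the loop going when the connection itself fails.
    status=$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)
    case "$status" in
      2??|3??)
        echo "ready after $i attempt(s)"
        return 0
        ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_http https://your-app-name.klutch.sh 30 10` waits up to roughly five minutes before reporting failure.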
Complete Initial Setup
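Access Operational.co at https://your-app-name.klutch.sh, then:
- Create your admin account
- Set up your organization
- Configure notification channels
- Add team members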
Initial Configuration
Creating Your Organization
Set up your organization structure:
- Create your organization
- Add teams (e.g., Backend, Frontend, Infrastructure)
- Invite team members
- Assign roles and permissions
Configuring Notification Channels
Set up how alerts are delivered:
- Email: Configure SMTP for email notifications
- Slack: Add Slack webhook URL for channel notifications
- SMS: Configure SMS provider for urgent alerts
- Phone: Set up voice calls for critical incidents
Setting Up On-Call Schedules
Create on-call rotations:
- Navigate to Schedules
- Create a new schedule
- Define rotation type (daily, weekly, custom)
- Add team members to the rotation
- Set rotation times and handoff procedures
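The arithmetic behind a simple weekly rotation can be sketched as follows. This only illustrates round-robin scheduling and is not how Operational.co computes schedules; `weeks_since` and `on_call_for` are hypothetical names, and timestamps are assumed to be Unix epoch seconds.

```shell
#!/bin/sh
# Number of whole weeks between two Unix timestamps (seconds).
weeks_since() {
  start=$1
  now=$2
  echo $(( (now - start) / 604800 ))   # 604800 seconds = 7 days
}

# Pick the responder for a zero-based rotation index.
# Usage: on_call_for INDEX MEMBER...
on_call_for() {
  index=$1
  shift
  count=$#
  [ "$count" -gt 0 ] || return 1
  # Positional parameters are 1-based, so add 1 after the modulo.
  position=$(( index % count + 1 ))
  eval "printf '%s\n' \"\${$position}\""
}
```

With three members, `on_call_for 4 alice bob carol` wraps around the list and selects the second member, so each person is on call every third week.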
Incident Management
Creating Incidents
When an incident occurs:
- Create a new incident manually or via API/integration
- Set severity level
- Assign to on-call responder
- Add initial description and impact
Incident Workflow
Standard incident lifecycle:
- Triggered: Incident is created
- Acknowledged: Responder confirms awareness
- Investigating: Active investigation in progress
- Resolved: Issue has been fixed
- Post-Mortem: Review and documentation
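The lifecycle above can be read as a linear state machine. The sketch below encodes it that way for scripts that need to validate status changes; the lowercase state names and the `next_state` helper are assumptions for illustration, and the real product may allow other transitions (for example, resolving straight from acknowledged).

```shell
#!/bin/sh
# Print the state that follows the given one in the lifecycle listed above.
# Returns non-zero for the final state or an unknown state.
next_state() {
  case "$1" in
    triggered)     echo "acknowledged" ;;
    acknowledged)  echo "investigating" ;;
    investigating) echo "resolved" ;;
    resolved)      echo "post-mortem" ;;
    *)             return 1 ;;
  esac
}
```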
Escalation Policies
Configure automatic escalation:
- Create an escalation policy
- Define escalation levels
- Set timeout periods for each level
- Assign responders at each level
Example escalation:
- Level 1: Primary on-call (5-minute timeout)
- Level 2: Secondary on-call (10-minute timeout)
- Level 3: Team lead (15-minute timeout)
- Level 4: Engineering manager
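To see how these timeouts add up, the sketch below computes which level is being notified after a given number of minutes without acknowledgement. It assumes each level's timeout starts when that level is notified, so the deadlines are cumulative (5, 15, and 30 minutes); `escalation_level` is a hypothetical helper, not an Operational.co command.

```shell
#!/bin/sh
# Print the escalation level active after ELAPSED minutes without acknowledgement.
# Usage: escalation_level ELAPSED TIMEOUT_LEVEL1 TIMEOUT_LEVEL2 ...
escalation_level() {
  elapsed=$1
  shift
  level=1
  deadline=0
  for timeout in "$@"; do
    # Each timeout extends the cumulative deadline for the current level.
    deadline=$((deadline + timeout))
    if [ "$elapsed" -lt "$deadline" ]; then
      break
    fi
    level=$((level + 1))
  done
  echo "$level"
}
```

Calling `escalation_level 20 5 10 15` reports the level reached 20 minutes in; under the cumulative assumption, the engineering manager is reached only after 30 minutes.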
Integration Setup
Monitoring Integration
Connect with monitoring tools:
- Configure webhook endpoints
- Map alert severity to incident priority
- Test integration with sample alerts
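Mapping alert severity to incident priority usually comes down to a small lookup table. The sketch below shows one possible mapping; the input labels and the four output values are assumptions chosen for illustration (only "high" appears elsewhere in this guide), so adjust them to what your monitoring tool emits.

```shell
#!/bin/sh
# Map a monitoring tool's severity label to an incident severity.
severity_to_priority() {
  # Lowercase the input so "CRITICAL" and "critical" hit the same branch.
  label=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$label" in
    critical|page|emergency) echo "critical" ;;
    error|high)              echo "high" ;;
    warning|warn|medium)     echo "medium" ;;
    info|low|notice)         echo "low" ;;
    *)                       echo "medium" ;;  # unknown labels default to medium
  esac
}
```

Defaulting unknown labels to a middle value is a design choice: it avoids paging someone for an unrecognized label while still creating a visible incident.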
Slack Integration
Enable Slack notifications:
- Create a Slack app or incoming webhook
- Add webhook URL to configuration
- Configure which events trigger notifications
- Test with a sample incident
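Before wiring the webhook into the application, you can confirm it works by posting a message yourself. The sketch below builds the JSON body that Slack incoming webhooks accept (a `text` field) and posts it with curl. The helper names are hypothetical, `SLACK_WEBHOOK_URL` is assumed to be set in your shell, and the escaping handles single-line text only.

```shell
#!/bin/sh
# Escape backslashes and double quotes so the text is safe inside a JSON string.
# Newlines and control characters are not handled; pass single-line text.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the JSON body for a Slack incoming webhook.
slack_payload() {
  printf '{"text":"%s"}' "$(json_escape "$1")"
}

# Post a message to the webhook stored in SLACK_WEBHOOK_URL.
send_slack_test() {
  curl -s -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "$1")" "$SLACK_WEBHOOK_URL"
}
```

Running `send_slack_test "Test incident notification"` should make the message appear in the channel the webhook is bound to.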
API Integration
Use the API for automation:
# Create an incident via API
curl -X POST https://your-app-name.klutch.sh/api/incidents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Database connection errors",
    "severity": "high",
    "description": "Multiple services reporting database timeouts"
  }'
Status Pages
Creating a Status Page
Communicate with stakeholders:
- Navigate to Status Pages
- Create a new status page
- Add components to monitor
- Configure public URL
Updating Status
During incidents:
- Link incident to affected components
- Update component status
- Post status updates for subscribers
Post-Incident Reviews
Creating Post-Mortems
Learn from incidents:
- After resolution, create a post-mortem
- Document timeline of events
- Identify root cause
- Define action items
- Share with team
Post-Mortem Template
Standard sections:
- Summary: Brief description of what happened
- Impact: Who/what was affected and for how long
- Timeline: Chronological events during the incident
- Root Cause: Why the incident occurred
- Resolution: How the incident was resolved
- Action Items: Follow-up tasks to prevent recurrence
- Lessons Learned: What the team learned
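If you keep post-mortems as plain-text files, a small generator keeps the sections consistent across incidents. The sketch below prints a skeleton with the sections listed above; `postmortem_template` and the placeholder text are hypothetical and independent of any template feature in Operational.co.

```shell
#!/bin/sh
# Print a plain-text post-mortem skeleton for the given incident title.
postmortem_template() {
  title=$1
  printf 'Post-Mortem: %s\n\n' "$title"
  for section in "Summary" "Impact" "Timeline" "Root Cause" "Resolution" "Action Items" "Lessons Learned"; do
    # Each section gets a heading line followed by a placeholder to fill in.
    printf '%s\n(to be completed)\n\n' "$section"
  done
}
```

For example, `postmortem_template "Database connection errors" > postmortem.txt` writes a skeleton you can commit alongside your runbooks.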
Best Practices
On-Call Management
- Define clear escalation policies
- Ensure adequate coverage across time zones
- Rotate fairly to prevent burnout
- Provide runbooks for common issues
Incident Response
- Acknowledge quickly to stop escalation
- Communicate status regularly
- Restore service first; leave root-cause investigation for after the incident
- Document everything during the incident
Continuous Improvement
- Review all incidents in post-mortems
- Track action items to completion
- Share learnings across teams
- Update runbooks based on incidents
Troubleshooting Common Issues
Notifications Not Sending
Symptoms: Alerts not reaching responders.
Solutions:
- Verify SMTP configuration
- Check notification channel settings
- Verify recipient contact information
- Review application logs
On-Call Not Escalating
Symptoms: Unacknowledged incidents not escalating.
Solutions:
- Verify escalation policy configuration
- Check timeout settings
- Ensure on-call schedule is active
- Verify responders are in rotation
Integration Not Working
Symptoms: External alerts not creating incidents.
Solutions:
- Verify webhook URL is correct
- Check authentication tokens
- Review incoming webhook logs
- Test with sample payloads
Additional Resources
- Operational.co Official Website
- Operational.co GitHub Repository
- Operational.co Documentation
- Klutch.sh Persistent Volumes
- Klutch.sh Deployments
Conclusion
Deploying Operational.co on Klutch.sh provides a comprehensive incident management solution for your engineering team. With on-call scheduling, escalation policies, and incident tracking, Operational.co brings structure to your incident response process.
The combination of persistent storage for incident history, reliable uptime for alert reception, and HTTPS security makes Klutch.sh well-suited for hosting Operational.co. Whether you manage a small team or a large engineering organization, a self-hosted incident platform gives you direct control over your incident data and configuration.
Start with basic on-call schedules and incident tracking, then expand with integrations and status pages as your needs grow. With Operational.co on Klutch.sh, you own your incident management infrastructure.