Deploying Davis
Davis is an AI-powered virtual assistant specifically designed for Dynatrace monitoring and observability platforms. Named after the Davis AI engine within Dynatrace, this conversational interface allows teams to interact with their monitoring data using natural language queries instead of navigating complex dashboards or writing custom queries. Whether you’re investigating performance issues, analyzing metrics, or checking system health, Davis translates your questions into actionable insights pulled directly from your Dynatrace environment.
What sets Davis apart is its ability to understand context and intent. Rather than requiring precise syntax or memorizing query languages, you can ask questions like “What’s causing the high response time on my production API?” or “Show me CPU usage for the last hour” and receive intelligent responses with relevant charts, metrics, and recommendations. Davis integrates with Slack, Microsoft Teams, and other collaboration platforms, bringing monitoring insights directly into your team’s workflow. This makes monitoring data accessible to everyone, not just specialists who know how to navigate APM tools.
Why Deploy Davis on Klutch.sh?
Deploying Davis on Klutch.sh offers several advantages for hosting your AI monitoring assistant:
- Automatic Docker Detection: Klutch.sh recognizes your Dockerfile and handles containerization without manual configuration
- Persistent Storage: Built-in volume management ensures your conversation history and configuration persist across deployments
- HTTPS by Default: Secure access to your Davis instance with automatic SSL certificates
- Environment Management: Securely configure Dynatrace API tokens, webhook URLs, and integration credentials through environment variables
- Webhook Support: Receive real-time notifications and integrate with collaboration tools through HTTP endpoints
- Rapid Deployment: Go from configuration to production in minutes with GitHub integration
- Always-On Availability: Keep your monitoring assistant running 24/7 without managing infrastructure
Prerequisites
Before deploying Davis to Klutch.sh, ensure you have:
- A Klutch.sh account (sign up here)
- A GitHub account with a repository for your Davis deployment
- Basic understanding of Docker and containerization
- A Dynatrace account (SaaS or Managed)
- Dynatrace API token with appropriate permissions
- Dynatrace environment ID and URL
- Slack workspace (optional, for Slack integration)
- Git installed on your local development machine
- Familiarity with REST APIs and webhooks
Understanding Davis Architecture
Davis follows a microservices architecture designed for intelligent monitoring interactions:
Core Components
Node.js Application Server
Davis is built with Node.js, providing a responsive web application and API endpoints for handling user interactions. The application processes natural language queries, communicates with Dynatrace APIs, and formats responses in user-friendly formats. Express.js handles HTTP routing, middleware, and webhook endpoints for integrations with collaboration platforms.
Natural Language Processing
The NLP engine interprets user queries and extracts intent, entities, and context. When you ask “What’s wrong with my application?”, Davis parses the question, identifies relevant time ranges, application names, and problem categories, then constructs appropriate Dynatrace API queries to retrieve the answer.
Dynatrace API Integration
Davis connects to Dynatrace through REST APIs, accessing monitoring data, metrics, events, and problems. The integration layer handles authentication with API tokens, manages rate limiting, and caches frequent queries for performance. Davis can access:
- Application performance metrics
- Infrastructure monitoring data
- Problem detection and root cause analysis
- Log analytics and traces
- Custom metrics and events
- Synthetic monitoring results
Conversation Manager
The conversation manager maintains context across multiple interactions, allowing follow-up questions without repeating context. If you ask “Show me errors in production” followed by “What about staging?”, Davis remembers the context (errors) and adjusts the query scope (staging environment).
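The module that implements this (`lib/conversation-manager.js`, which server.js imports) is not shown in this guide. As a minimal in-memory sketch — with method names matching how server.js calls it, and persistence to `/app/data` or Redis left out for brevity — it could look like:

```javascript
// Minimal in-memory sketch of lib/conversation-manager.js.
// A production version would persist turns to /app/data or Redis.
class ConversationManager {
  constructor(contextWindow = 5) {
    this.sessions = new Map(); // sessionKey -> recent turns
    this.contextWindow = contextWindow;
  }

  key(userId, sessionId) {
    return `${userId}:${sessionId}`;
  }

  async getContext(userId, sessionId) {
    return this.sessions.get(this.key(userId, sessionId)) || [];
  }

  async updateContext(userId, sessionId, turn) {
    const key = this.key(userId, sessionId);
    const turns = this.sessions.get(key) || [];
    turns.push(turn);
    // Keep only the last N turns so follow-up questions stay cheap
    this.sessions.set(key, turns.slice(-this.contextWindow));
  }

  async getHistory(userId) {
    // All turns across this user's sessions
    const history = [];
    for (const [key, turns] of this.sessions) {
      if (key.startsWith(`${userId}:`)) history.push(...turns);
    }
    return history;
  }
}
// in lib/conversation-manager.js: module.exports = new ConversationManager();
```

The sliding `contextWindow` is what lets "What about staging?" be answered by merging the new query with the stored intent of "Show me errors in production".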
Response Formatter
Davis formats Dynatrace data into human-readable responses with charts, tables, and recommendations. Complex JSON responses from Dynatrace are transformed into conversational answers, graphs are rendered for visual analysis, and actionable suggestions are highlighted.
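As a sketch of that transformation — assuming a deliberately simplified payload shape (the real Dynatrace `/metrics/query` response carries more structure) — a one-line metric answer might be built like this:

```javascript
// Sketch of the response-formatting idea: turn a Dynatrace-style
// metrics payload into a conversational answer. The payload shape
// here is simplified for illustration.
function formatMetricAnswer(metricName, payload) {
  // Dynatrace metric series can contain null gaps; drop them first
  const values = payload.result[0].data[0].values.filter(v => v !== null);
  const current = values[values.length - 1];
  const avg = values.reduce((sum, v) => sum + v, 0) / values.length;
  return `${metricName} is currently ${current.toFixed(1)}% ` +
         `(average ${avg.toFixed(1)}% over the selected timeframe).`;
}
```

For example, `formatMetricAnswer('CPU usage', { result: [{ data: [{ values: [40, 50, null, 60] }] }] })` yields "CPU usage is currently 60.0% (average 50.0% over the selected timeframe)."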
Integration Layer
Davis integrates with multiple platforms:
- Slack: Slash commands and interactive messages
- Microsoft Teams: Bot framework integration
- Webhook: Generic webhook support for custom integrations
- Web UI: Browser-based chat interface
Each integration maintains its own session state and user authentication.
Configuration System
Davis uses environment variables and configuration files to manage:
- Dynatrace connection settings
- API authentication tokens
- Integration platform credentials
- Feature flags and customization
- Response templates and formatting rules
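Note that the `config.json` shown later in this guide stores secrets as `${VAR}` placeholders, and JSON performs no interpolation on its own — the loader has to substitute values from `process.env`. A minimal substitution pass (the helper name is illustrative, not part of Davis) might look like:

```javascript
// Recursively replace "${VAR}" placeholders in a parsed config object
// with values from the environment. Non-string values pass through.
function resolveEnvPlaceholders(value, env = process.env) {
  if (typeof value === 'string') {
    return value.replace(/\$\{(\w+)\}/g, (_, name) => env[name] ?? '');
  }
  if (Array.isArray(value)) {
    return value.map(v => resolveEnvPlaceholders(v, env));
  }
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, resolveEnvPlaceholders(v, env)])
    );
  }
  return value; // numbers, booleans, null are untouched
}
```

Applied to the parsed `config.json`, this turns `"apiToken": "${DYNATRACE_API_TOKEN}"` into the actual token at startup.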
Query Flow
- User asks question through chat interface (Slack, Teams, or web UI)
- Davis receives message and authenticates user
- NLP engine parses query to extract intent and entities
- Conversation manager retrieves context from previous interactions
- Query builder constructs appropriate Dynatrace API requests
- API calls are made to Dynatrace with authentication
- Response data is retrieved and cached
- Response formatter converts data to human-readable format
- Formatted response is sent back through integration channel
- Conversation context is updated for follow-up questions
Storage Requirements
Davis requires persistent storage for:
- Conversation History: Past interactions and context for users
- Cache Data: Frequently accessed metrics and query results
- Configuration: Custom response templates and user preferences
- Session State: Active conversation sessions and authentication tokens
A typical deployment needs 1GB-5GB for conversation history and cache data, growing based on user activity and cache retention policies.
Installation and Setup
Let’s walk through setting up Davis for deployment on Klutch.sh.
Step 1: Create the Project Structure
First, create a new directory for your Davis deployment:
```bash
mkdir davis-deployment
cd davis-deployment
git init
```

Step 2: Create Configuration File
Create a config.json file with your Dynatrace configuration:
```json
{
  "dynatrace": {
    "environmentId": "your-environment-id",
    "apiUrl": "https://your-environment-id.live.dynatrace.com/api/v2",
    "apiToken": "${DYNATRACE_API_TOKEN}",
    "timeout": 30000
  },
  "server": {
    "port": 3000,
    "host": "0.0.0.0"
  },
  "nlp": {
    "confidenceThreshold": 0.6,
    "contextWindow": 5,
    "enableFollowUp": true
  },
  "cache": {
    "enabled": true,
    "ttl": 300,
    "maxSize": 100
  },
  "integrations": {
    "slack": {
      "enabled": "${SLACK_ENABLED}",
      "botToken": "${SLACK_BOT_TOKEN}",
      "signingSecret": "${SLACK_SIGNING_SECRET}",
      "appToken": "${SLACK_APP_TOKEN}"
    },
    "teams": {
      "enabled": "${TEAMS_ENABLED}",
      "appId": "${TEAMS_APP_ID}",
      "appPassword": "${TEAMS_APP_PASSWORD}"
    },
    "webhook": {
      "enabled": true,
      "secret": "${WEBHOOK_SECRET}"
    }
  },
  "features": {
    "problemAnalysis": true,
    "performanceMetrics": true,
    "logAnalytics": true,
    "customQueries": true,
    "aiInsights": true
  },
  "logging": {
    "level": "info",
    "format": "json"
  }
}
```

Step 3: Create Environment Template
Create a .env.example file:
```bash
# Dynatrace Configuration
DYNATRACE_ENVIRONMENT_ID=your-environment-id
DYNATRACE_API_URL=https://your-environment-id.live.dynatrace.com/api/v2
DYNATRACE_API_TOKEN=your-dynatrace-api-token

# Server Configuration
PORT=3000
NODE_ENV=production

# Slack Integration (Optional)
SLACK_ENABLED=false
SLACK_BOT_TOKEN=xoxb-your-bot-token
SLACK_SIGNING_SECRET=your-signing-secret
SLACK_APP_TOKEN=xapp-your-app-token

# Microsoft Teams Integration (Optional)
TEAMS_ENABLED=false
TEAMS_APP_ID=your-teams-app-id
TEAMS_APP_PASSWORD=your-teams-password

# Webhook Configuration
WEBHOOK_SECRET=your-webhook-secret

# Cache Configuration
REDIS_URL=redis://localhost:6379
CACHE_TTL=300

# Logging
LOG_LEVEL=info
LOG_FORMAT=json

# Session Configuration
SESSION_SECRET=your-session-secret
SESSION_TTL=86400
```

Step 4: Create the Dockerfile
Create a Dockerfile in the root directory:
```dockerfile
FROM node:18-alpine

# Set environment variables
ENV NODE_ENV=production \
    NPM_CONFIG_LOGLEVEL=warn \
    PORT=3000

# Install system dependencies
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    curl

# Create app directory
WORKDIR /app

# Create davis user
RUN addgroup -g 1000 davis && \
    adduser -D -u 1000 -G davis davis

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production && \
    npm cache clean --force

# Copy application files
COPY --chown=davis:davis . .

# Create necessary directories
RUN mkdir -p /app/data /app/logs && \
    chown -R davis:davis /app

# Switch to davis user
USER davis

# Expose port
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD node healthcheck.js || exit 1

# Start application
CMD ["node", "server.js"]
```

Step 5: Create Application Files
Create package.json:
```json
{
  "name": "davis-ai-assistant",
  "version": "1.0.0",
  "description": "AI-powered assistant for Dynatrace monitoring",
  "main": "server.js",
  "scripts": {
    "start": "node server.js",
    "dev": "nodemon server.js",
    "test": "jest",
    "lint": "eslint ."
  },
  "dependencies": {
    "express": "^4.18.2",
    "axios": "^1.6.0",
    "dotenv": "^16.3.1",
    "express-rate-limit": "^7.1.0",
    "helmet": "^7.1.0",
    "cors": "^2.8.5",
    "morgan": "^1.10.0",
    "joi": "^17.11.0",
    "natural": "^6.10.0",
    "compromise": "^14.10.0",
    "ioredis": "^5.3.2",
    "express-session": "^1.17.3",
    "winston": "^3.11.0",
    "@slack/bolt": "^3.15.0",
    "botbuilder": "^4.21.0"
  },
  "devDependencies": {
    "nodemon": "^3.0.1",
    "jest": "^29.7.0",
    "eslint": "^8.54.0"
  },
  "engines": {
    "node": ">=18.0.0",
    "npm": ">=9.0.0"
  }
}
```

Create server.js:
```javascript
require('dotenv').config();
const express = require('express');
const helmet = require('helmet');
const cors = require('cors');
const morgan = require('morgan');
const rateLimit = require('express-rate-limit');
const session = require('express-session');
const winston = require('winston');

// Import custom modules
const dynatraceClient = require('./lib/dynatrace-client');
const nlpProcessor = require('./lib/nlp-processor');
const conversationManager = require('./lib/conversation-manager');
const responseFormatter = require('./lib/response-formatter');
const slackIntegration = require('./integrations/slack');
const teamsIntegration = require('./integrations/teams');
const webhookHandler = require('./integrations/webhook');

// Initialize Express app
const app = express();
const PORT = process.env.PORT || 3000;

// Configure logger
const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.Console({ format: winston.format.simple() }),
    new winston.transports.File({ filename: '/app/logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: '/app/logs/combined.log' })
  ]
});

// Middleware
app.use(helmet());
app.use(cors());
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(morgan('combined', { stream: { write: message => logger.info(message.trim()) } }));

// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // limit each IP to 100 requests per windowMs
});
app.use('/api/', limiter);

// Session configuration
app.use(session({
  secret: process.env.SESSION_SECRET || 'your-secret-key',
  resave: false,
  saveUninitialized: false,
  cookie: {
    secure: process.env.NODE_ENV === 'production',
    maxAge: parseInt(process.env.SESSION_TTL) * 1000 || 86400000
  }
}));

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

// Query endpoint
app.post('/api/query', async (req, res) => {
  try {
    const { query, userId, sessionId } = req.body;

    if (!query) {
      return res.status(400).json({ error: 'Query is required' });
    }

    logger.info(`Processing query: ${query}`, { userId, sessionId });

    // Parse natural language query
    const intent = await nlpProcessor.parse(query);

    // Get conversation context
    const context = await conversationManager.getContext(userId, sessionId);

    // Build Dynatrace query
    const dynatraceQuery = await nlpProcessor.buildQuery(intent, context);

    // Execute query against Dynatrace
    const data = await dynatraceClient.query(dynatraceQuery);

    // Format response
    const response = await responseFormatter.format(data, intent);

    // Update conversation context
    await conversationManager.updateContext(userId, sessionId, {
      query,
      response,
      intent
    });

    res.json({
      answer: response.text,
      data: response.data,
      visualizations: response.charts,
      suggestions: response.suggestions
    });
  } catch (error) {
    logger.error('Error processing query:', error);
    res.status(500).json({
      error: 'Failed to process query',
      message: error.message
    });
  }
});

// Conversation history endpoint
app.get('/api/conversations/:userId', async (req, res) => {
  try {
    const { userId } = req.params;
    const history = await conversationManager.getHistory(userId);
    res.json({ conversations: history });
  } catch (error) {
    logger.error('Error fetching conversation history:', error);
    res.status(500).json({ error: 'Failed to fetch conversations' });
  }
});

// Initialize integrations
if (process.env.SLACK_ENABLED === 'true') {
  slackIntegration.initialize(app, logger);
  logger.info('Slack integration initialized');
}

if (process.env.TEAMS_ENABLED === 'true') {
  teamsIntegration.initialize(app, logger);
  logger.info('Teams integration initialized');
}

webhookHandler.initialize(app, logger);
logger.info('Webhook handler initialized');

// Error handling middleware
app.use((err, req, res, next) => {
  logger.error('Unhandled error:', err);
  res.status(500).json({
    error: 'Internal server error',
    message: process.env.NODE_ENV === 'development' ? err.message : undefined
  });
});

// Start server (keep a reference so SIGTERM can close it)
const server = app.listen(PORT, '0.0.0.0', () => {
  logger.info(`Davis AI Assistant listening on port ${PORT}`);
  logger.info(`Environment: ${process.env.NODE_ENV}`);
  logger.info(`Dynatrace Environment: ${process.env.DYNATRACE_ENVIRONMENT_ID}`);
});

// Graceful shutdown
process.on('SIGTERM', () => {
  logger.info('SIGTERM signal received: closing HTTP server');
  server.close(() => {
    logger.info('HTTP server closed');
    process.exit(0);
  });
});
```

Create healthcheck.js:
```javascript
const http = require('http');

const options = {
  host: 'localhost',
  port: process.env.PORT || 3000,
  path: '/health',
  timeout: 2000
};

const request = http.request(options, (res) => {
  if (res.statusCode === 200) {
    process.exit(0);
  } else {
    process.exit(1);
  }
});

request.on('error', () => {
  process.exit(1);
});

request.end();
```

Step 6: Create Library Modules
Create lib/dynatrace-client.js:
```javascript
const axios = require('axios');
const logger = require('winston');

class DynatraceClient {
  constructor() {
    this.apiUrl = process.env.DYNATRACE_API_URL;
    this.apiToken = process.env.DYNATRACE_API_TOKEN;
    this.environmentId = process.env.DYNATRACE_ENVIRONMENT_ID;

    this.client = axios.create({
      baseURL: this.apiUrl,
      headers: {
        'Authorization': `Api-Token ${this.apiToken}`,
        'Content-Type': 'application/json'
      },
      timeout: 30000
    });
  }

  async query(params) {
    try {
      const { endpoint, method = 'GET', data = null, queryParams = {} } = params;

      const response = await this.client.request({
        method,
        url: endpoint,
        data,
        params: queryParams
      });

      return response.data;
    } catch (error) {
      logger.error('Dynatrace API error:', error.message);
      throw new Error(`Failed to query Dynatrace: ${error.message}`);
    }
  }

  async getProblems(timeframe = 'now-2h') {
    return this.query({
      endpoint: '/problems',
      queryParams: {
        from: timeframe,
        fields: '+impactAnalysis,+rootCauseEntity'
      }
    });
  }

  async getMetrics(metricSelector, timeframe = 'now-1h') {
    return this.query({
      endpoint: '/metrics/query',
      queryParams: {
        metricSelector,
        from: timeframe,
        resolution: '1m'
      }
    });
  }

  async getEntities(entityType, fields = []) {
    return this.query({
      endpoint: `/entities/${entityType}`,
      queryParams: {
        fields: fields.join(',')
      }
    });
  }

  async getApplications() {
    return this.getEntities('applications', ['displayName', 'tags', 'entityId']);
  }

  async getHosts() {
    return this.getEntities('hosts', ['displayName', 'osType', 'tags']);
  }

  async getServices() {
    return this.getEntities('services', ['displayName', 'serviceType', 'tags']);
  }
}

module.exports = new DynatraceClient();
```

Create lib/nlp-processor.js:
```javascript
const natural = require('natural');
const compromise = require('compromise');

class NLPProcessor {
  constructor() {
    this.tokenizer = new natural.WordTokenizer();
    this.tfidf = new natural.TfIdf();

    // Intent patterns
    this.intents = {
      problemQuery: ['problem', 'issue', 'error', 'down', 'fail', 'crash'],
      metricsQuery: ['cpu', 'memory', 'response time', 'throughput', 'metric'],
      statusQuery: ['status', 'health', 'running', 'up', 'available'],
      listQuery: ['list', 'show', 'display', 'what are', 'get'],
      analyzeQuery: ['analyze', 'investigate', 'why', 'cause', 'root cause']
    };
  }

  async parse(query) {
    const doc = compromise(query);
    const tokens = this.tokenizer.tokenize(query.toLowerCase());

    // Extract intent
    const intent = this.extractIntent(tokens);

    // Extract entities
    const entities = {
      timeframe: this.extractTimeframe(doc, query),
      environment: this.extractEnvironment(tokens),
      entityType: this.extractEntityType(tokens),
      metric: this.extractMetric(tokens)
    };

    return { intent, entities, originalQuery: query };
  }

  extractIntent(tokens) {
    for (const [intentName, keywords] of Object.entries(this.intents)) {
      for (const keyword of keywords) {
        if (tokens.some(token => token.includes(keyword) || keyword.includes(token))) {
          return intentName;
        }
      }
    }
    return 'generalQuery';
  }

  extractTimeframe(doc, query) {
    // Extract time expressions
    const timeMatch = query.match(/(last|past)\s+(\d+)\s+(minute|hour|day|week)s?/i);
    if (timeMatch) {
      const value = parseInt(timeMatch[2]);
      const unit = timeMatch[3].toLowerCase();
      return `now-${value}${unit[0]}`;
    }

    // Default timeframe
    return 'now-1h';
  }

  extractEnvironment(tokens) {
    const environments = ['production', 'staging', 'development', 'prod', 'dev', 'stage'];
    for (const env of environments) {
      if (tokens.includes(env)) {
        return env;
      }
    }
    return null;
  }

  extractEntityType(tokens) {
    const entityTypes = {
      application: ['app', 'application'],
      service: ['service', 'api'],
      host: ['host', 'server', 'machine'],
      database: ['database', 'db']
    };

    for (const [type, keywords] of Object.entries(entityTypes)) {
      if (tokens.some(token => keywords.includes(token))) {
        return type;
      }
    }
    return null;
  }

  extractMetric(tokens) {
    const metrics = {
      'cpu': 'builtin:host.cpu.usage',
      'memory': 'builtin:host.mem.usage',
      'response': 'builtin:service.response.time',
      'throughput': 'builtin:service.requestCount.total',
      'errors': 'builtin:service.errors.total.count'
    };

    for (const [keyword, metricId] of Object.entries(metrics)) {
      if (tokens.some(token => token.includes(keyword))) {
        return metricId;
      }
    }
    return null;
  }

  async buildQuery(parsedIntent, context) {
    const { intent, entities } = parsedIntent;

    switch (intent) {
      case 'problemQuery':
        return {
          endpoint: '/problems',
          queryParams: {
            from: entities.timeframe,
            entitySelector: this.buildEntitySelector(entities)
          }
        };

      case 'metricsQuery':
        return {
          endpoint: '/metrics/query',
          queryParams: {
            metricSelector: entities.metric || 'builtin:host.cpu.usage',
            from: entities.timeframe,
            resolution: '1m'
          }
        };

      case 'statusQuery':
        return {
          endpoint: '/entities',
          queryParams: {
            entitySelector: this.buildEntitySelector(entities),
            fields: 'healthState,displayName'
          }
        };

      default:
        return {
          endpoint: '/entities',
          queryParams: {}
        };
    }
  }

  buildEntitySelector(entities) {
    const selectors = [];

    if (entities.entityType) {
      selectors.push(`type("${entities.entityType}")`);
    }

    if (entities.environment) {
      selectors.push(`tag("environment:${entities.environment}")`);
    }

    return selectors.join(',') || undefined;
  }
}

module.exports = new NLPProcessor();
```

Step 7: Create .dockerignore
Create a .dockerignore file:
```
node_modules
npm-debug.log
.env
.env.local
.git
.gitignore
*.md
README.md
.DS_Store
Thumbs.db
logs/
*.log
test/
tests/
.vscode/
.idea/
coverage/
```

Step 8: Create Documentation
Create README.md:
```markdown
# Davis AI Assistant Deployment

This repository contains a Davis AI Assistant deployment configured for Klutch.sh.

## Features

- Natural language queries for Dynatrace monitoring
- Real-time problem detection and analysis
- Performance metrics visualization
- Slack and Microsoft Teams integration
- Conversation history and context awareness
- Intelligent suggestions and recommendations

## Configuration

Set the following environment variables:

- `DYNATRACE_API_TOKEN`: Your Dynatrace API token
- `DYNATRACE_ENVIRONMENT_ID`: Your environment ID
- `DYNATRACE_API_URL`: Your Dynatrace API URL

## Example Queries

- "What problems occurred in the last hour?"
- "Show me CPU usage for production servers"
- "Are all services healthy?"
- "What's causing high response time?"
- "List all applications"

## Deployment

This application is configured to deploy on Klutch.sh with automatic Docker detection.
```

Step 9: Initialize Git Repository
```bash
git add .
git commit -m "Initial Davis AI Assistant setup for Klutch.sh deployment"
git branch -M master
git remote add origin https://github.com/yourusername/davis-deployment.git
git push -u origin master
```

Deploying to Klutch.sh
Now that your Davis application is configured, let’s deploy it to Klutch.sh.
1. Log in to Klutch.sh

   Navigate to klutch.sh/app and sign in with your GitHub account.

2. Create a New Project

   Click “New Project” and select “Import from GitHub”. Choose the repository containing your Davis deployment.

3. Configure Build Settings

   Klutch.sh automatically detects the Dockerfile in your repository and uses it to build your container.

4. Configure Traffic Settings

   Select “HTTP” as the traffic type. Davis serves its web interface and API on port 3000, and Klutch.sh will route HTTPS traffic to this port.

5. Set Environment Variables

   In the project settings, add the following environment variables:

   - `DYNATRACE_API_TOKEN`: Your Dynatrace API token (requires API v2 read permissions)
   - `DYNATRACE_ENVIRONMENT_ID`: Your environment ID (e.g., `abc12345`)
   - `DYNATRACE_API_URL`: `https://your-environment-id.live.dynatrace.com/api/v2`
   - `PORT`: `3000`
   - `NODE_ENV`: `production`
   - `SESSION_SECRET`: Generate using `openssl rand -hex 32`
   - `LOG_LEVEL`: `info`

   For Slack integration (optional):

   - `SLACK_ENABLED`: `true`
   - `SLACK_BOT_TOKEN`: Your Slack bot token (starts with `xoxb-`)
   - `SLACK_SIGNING_SECRET`: Your Slack signing secret
   - `SLACK_APP_TOKEN`: Your Slack app token (starts with `xapp-`)

6. Configure Persistent Storage

   Davis requires persistent storage for conversation history and cache:

   - Data Volume:
     - Mount path: `/app/data`
     - Size: 5GB
   - Logs Volume:
     - Mount path: `/app/logs`
     - Size: 2GB

   These volumes ensure your conversation history and logs persist across deployments.

7. Deploy the Application

   Click “Deploy” to start the build process. Klutch.sh will:

   - Clone your repository
   - Build the Docker image using your Dockerfile
   - Install Node.js dependencies
   - Deploy the container with Davis
   - Provision an HTTPS endpoint

   The build process typically takes 2-3 minutes.

8. Access Your Davis Instance

   Once deployment completes, Klutch.sh will provide a URL like `example-app.klutch.sh`. Your Davis AI assistant will be available at this URL.
Getting Started with Davis
Once your Davis instance is deployed, here’s how to use it:
Using the Web Interface
Navigate to Your Deployment
Visit your deployed URL (e.g., https://example-app.klutch.sh) to access the Davis web interface.
Ask Questions
Type natural language questions in the chat interface:
```
What problems occurred in the last 2 hours?
```

Response includes:
- List of detected problems
- Severity levels
- Affected entities
- Root cause analysis
- Recommended actions
Follow-Up Questions
Davis maintains conversation context:
```
You: Show me errors in production
Davis: [Lists production errors]
You: What about staging?
Davis: [Lists staging errors - understands context]
```

View Metrics
Request performance metrics:
```
Show me CPU usage for the last hour
```

Davis returns:
- Time series chart
- Current value
- Average, min, max values
- Trend analysis
- Anomaly detection
API Usage
Query Endpoint
Send natural language queries via API:
```bash
curl -X POST https://example-app.klutch.sh/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the response time for my API services?",
    "userId": "user123",
    "sessionId": "session456"
  }'
```

Response:
```json
{
  "answer": "The average response time for your API services in the last hour is 245ms. The checkout-service has the highest response time at 380ms.",
  "data": {
    "metrics": [
      { "service": "checkout-service", "responseTime": 380, "unit": "ms" },
      { "service": "auth-service", "responseTime": 150, "unit": "ms" }
    ]
  },
  "visualizations": [
    { "type": "line-chart", "data": "..." }
  ],
  "suggestions": [
    "Investigate high response time on checkout-service",
    "Check database query performance"
  ]
}
```

Health Check
Monitor Davis availability:
```bash
curl https://example-app.klutch.sh/health
```

Conversation History
Retrieve past conversations:
```bash
curl https://example-app.klutch.sh/api/conversations/user123
```

Example Queries
Problem Detection
```
What problems do I have right now?
Show me critical issues from yesterday
Are there any errors in production?
What went wrong with my application?
```

Performance Metrics
```
Show me CPU usage for production hosts
What's the memory consumption?
Display response time for all services
How many requests per minute am I getting?
Show me throughput trends for the last week
```

Status Checks
```
Are all my services healthy?
What's the status of my infrastructure?
Is everything running normally?
Which applications are down?
```

Specific Entity Queries
```
Show me metrics for the payment-service
What's happening with the database host?
Display errors for the checkout application
How is my frontend performing?
```

Time-Based Queries
```
Show me problems from the last 24 hours
What happened between 2pm and 3pm today?
Display yesterday's performance metrics
Show me last week's error rate
```

Root Cause Analysis
```
Why is my API slow?
What's causing high CPU usage?
Investigate the recent outage
Why are users experiencing errors?
What's the root cause of the problem?
```

Slack Integration
Configure Davis to work within your Slack workspace:
Step 1: Create Slack App
- Go to Slack API Apps
- Click “Create New App”
- Choose “From scratch”
- Name: “Davis AI Assistant”
- Select your workspace
- Click “Create App”
Step 2: Configure Bot Token
- Navigate to “OAuth & Permissions”
- Add these Bot Token Scopes:
```
chat:write
chat:write.public
commands
im:history
im:read
im:write
channels:history
channels:read
groups:history
groups:read
```
- Install app to workspace
- Copy “Bot User OAuth Token” (starts with `xoxb-`)
Step 3: Enable Socket Mode
- Navigate to “Socket Mode”
- Enable Socket Mode
- Generate App-Level Token with `connections:write` scope
- Copy token (starts with `xapp-`)
Step 4: Configure Slash Command
- Navigate to “Slash Commands”
- Create new command:
  - Command: `/davis`
  - Request URL: `https://example-app.klutch.sh/slack/events`
  - Short Description: “Ask Davis about monitoring”
  - Usage Hint: `[your question]`
Step 5: Update Environment Variables
Add to Klutch.sh environment variables:
```bash
SLACK_ENABLED=true
SLACK_BOT_TOKEN=xoxb-your-token
SLACK_SIGNING_SECRET=your-signing-secret
SLACK_APP_TOKEN=xapp-your-token
```

Step 6: Use in Slack
In any Slack channel:
```
/davis What problems occurred in the last hour?
```

Or direct message @Davis:
```
@Davis show me CPU usage
```

Microsoft Teams Integration
Set up Davis for Microsoft Teams:
Step 1: Register Bot
- Go to Bot Framework
- Create new bot registration
- Name: “Davis AI Assistant”
- Messaging endpoint: `https://example-app.klutch.sh/api/teams/messages`
- Copy App ID and generate App Password
Step 2: Configure Teams App
Create manifest.json for Teams app package:
```json
{
  "$schema": "https://developer.microsoft.com/json-schemas/teams/v1.16/MicrosoftTeams.schema.json",
  "manifestVersion": "1.16",
  "version": "1.0.0",
  "id": "your-app-id",
  "packageName": "com.davis.assistant",
  "developer": {
    "name": "Your Company",
    "websiteUrl": "https://example-app.klutch.sh",
    "privacyUrl": "https://example-app.klutch.sh/privacy",
    "termsOfUseUrl": "https://example-app.klutch.sh/terms"
  },
  "name": {
    "short": "Davis",
    "full": "Davis AI Monitoring Assistant"
  },
  "description": {
    "short": "AI assistant for Dynatrace monitoring",
    "full": "Natural language interface for monitoring and observability"
  },
  "icons": {
    "outline": "outline.png",
    "color": "color.png"
  },
  "accentColor": "#1F8FE8",
  "bots": [
    {
      "botId": "your-app-id",
      "scopes": ["personal", "team"],
      "supportsFiles": false,
      "isNotificationOnly": false
    }
  ],
  "permissions": ["identity", "messageTeamMembers"],
  "validDomains": ["example-app.klutch.sh"]
}
```

Step 3: Update Environment Variables
```bash
TEAMS_ENABLED=true
TEAMS_APP_ID=your-app-id
TEAMS_APP_PASSWORD=your-app-password
```

Step 4: Deploy to Teams
- Package manifest.json with icon files into zip
- Upload to Teams app catalog
- Install in your Teams workspace
Step 5: Use in Teams
Chat with Davis bot or mention in channels:
@Davis what's the status of my services?Advanced Configuration
Custom Query Templates
Create custom response templates in templates/responses.json:
```json
{
  "problemSummary": {
    "template": "Found {{count}} problem(s):\n{{#each problems}}- {{title}} ({{severity}})\n{{/each}}",
    "includeCharts": true,
    "suggestions": [
      "View problem details",
      "Check affected entities",
      "See root cause analysis"
    ]
  },
  "metricsSummary": {
    "template": "{{metricName}}: {{currentValue}}{{unit}}\nAverage: {{average}}{{unit}}\nTrend: {{trend}}",
    "includeCharts": true,
    "timeframes": ["1h", "24h", "7d"]
  }
}
```

Caching with Redis
For improved performance, integrate Redis caching:
Update Dockerfile:
```dockerfile
# Add Redis client (already in package.json dependencies)
# Configure Redis connection in config.json
```

Update environment variables:
```bash
REDIS_URL=redis://your-redis-host:6379
CACHE_TTL=300
```

Authentication and Authorization
Implement user authentication:
Create lib/auth-middleware.js:
```javascript
const jwt = require('jsonwebtoken');

function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];

  if (!token) {
    return res.status(401).json({ error: 'Authentication required' });
  }

  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    req.user = decoded;
    next();
  } catch (error) {
    res.status(401).json({ error: 'Invalid token' });
  }
}

module.exports = { authenticate };
```

Note that `jsonwebtoken` is not in the package.json shown earlier; add it to your dependencies, and set a `JWT_SECRET` environment variable.

Apply to protected routes:
```javascript
const { authenticate } = require('./lib/auth-middleware');

app.post('/api/query', authenticate, async (req, res) => {
  // Query handling
});
```

Custom Dynatrace Metrics
Query custom metrics from Dynatrace:
```javascript
// In lib/dynatrace-client.js
async getCustomMetric(metricKey, entitySelector) {
  return this.query({
    endpoint: '/metrics/query',
    queryParams: {
      metricSelector: `ext:${metricKey}`,
      entitySelector,
      resolution: '1m'
    }
  });
}
```

Use in queries:
```
Show me custom metric shopify.orders.total for the last hour
```

Webhook Notifications
Configure webhooks for proactive notifications:
```javascript
// In integrations/webhook.js
async function sendNotification(problem) {
  const webhookUrl = process.env.WEBHOOK_URL;

  await axios.post(webhookUrl, {
    text: `New problem detected: ${problem.title}`,
    severity: problem.severity,
    affectedEntities: problem.impactedEntities,
    rootCause: problem.rootCause,
    link: `${process.env.DYNATRACE_URL}/ui/problems/${problem.id}`
  });
}
```

Multi-Environment Support
Support multiple Dynatrace environments:
const environments = { production: { apiUrl: process.env.PROD_DYNATRACE_API_URL, apiToken: process.env.PROD_DYNATRACE_API_TOKEN }, staging: { apiUrl: process.env.STAGING_DYNATRACE_API_URL, apiToken: process.env.STAGING_DYNATRACE_API_TOKEN }};
// Select environment based on query contextconst env = environments[parsedIntent.entities.environment] || environments.production;Production Best Practices
Follow these recommendations for running Davis in production:
Security
API Token Security
Never commit API tokens to version control:
```bash
# Use environment variables
DYNATRACE_API_TOKEN=your-token

# Rotate tokens regularly
# Use tokens with minimal required permissions
```

Dynatrace API Permissions
Create a token with only the necessary scopes:
- Read entities
- Read metrics
- Read problems
- Read logs (if using log analytics)
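As a sanity check, you can verify that a token's scopes cover Davis's needs before starting the service. This is an illustrative helper, not part of Davis itself; the scope identifiers follow Dynatrace API v2 naming conventions but should be confirmed against your environment:

```javascript
// Minimal read scopes Davis needs (scope names assumed per Dynatrace API v2;
// verify against your environment's token documentation)
const REQUIRED_SCOPES = ['entities.read', 'metrics.read', 'problems.read'];

// Return the scopes a token is missing; an empty array means it is sufficient
function missingScopes(tokenScopes) {
  return REQUIRED_SCOPES.filter(scope => !tokenScopes.includes(scope));
}
```

Failing fast on a misconfigured token at startup is far easier to debug than scattered 403 responses at query time.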
Rate Limiting
Protect your API from abuse:
```js
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,
  message: 'Too many requests from this IP'
});

// Apply to API routes
app.use('/api/', limiter);
```

HTTPS Only
Klutch.sh provides automatic HTTPS. Ensure all webhook URLs and API endpoints use HTTPS.
Input Validation
Validate all user inputs:
```js
const Joi = require('joi');

const querySchema = Joi.object({
  query: Joi.string().min(3).max(500).required(),
  userId: Joi.string().required(),
  sessionId: Joi.string().optional()
});
```

Performance Optimization
Response Caching
Cache frequent queries:
```js
const CACHE_TTL = 5 * 60 * 1000; // 5 minutes
const cache = new Map();

async function getCachedResponse(query) {
  const cacheKey = `query:${query}`;
  const cached = cache.get(cacheKey);

  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }

  const data = await executeQuery(query);
  cache.set(cacheKey, { data, timestamp: Date.now() });
  return data;
}
```

Connection Pooling
Reuse Dynatrace API connections:
```js
const https = require('https');
const axios = require('axios');

const agent = new https.Agent({ keepAlive: true });

const client = axios.create({
  httpsAgent: agent,
  timeout: 30000
});
```

Async Processing
Handle long-running queries asynchronously:
```js
app.post('/api/query/async', async (req, res) => {
  const queryId = generateId();

  // Start processing in background
  processQuery(req.body.query, queryId);

  // Return immediately
  res.json({ queryId, status: 'processing' });
});

app.get('/api/query/:queryId/status', (req, res) => {
  const status = getQueryStatus(req.params.queryId);
  res.json(status);
});
```

Monitoring
Application Metrics
Track Davis performance:
```js
const metrics = {
  queriesProcessed: 0,
  averageResponseTime: 0,
  errorRate: 0
};

// Update metrics
function recordQuery(duration, success) {
  metrics.queriesProcessed++;
  metrics.averageResponseTime =
    (metrics.averageResponseTime * (metrics.queriesProcessed - 1) + duration) /
    metrics.queriesProcessed;
  if (!success) metrics.errorRate++;
}

// Expose metrics endpoint
app.get('/metrics', (req, res) => {
  res.json(metrics);
});
```

Health Checks
Comprehensive health monitoring:
```js
app.get('/health/detailed', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {
      dynatrace: await checkDynatraceConnection(),
      database: await checkDatabaseConnection(),
      redis: await checkRedisConnection()
    }
  };

  const allHealthy = Object.values(health.checks).every(c => c.status === 'ok');
  res.status(allHealthy ? 200 : 503).json(health);
});
```

Error Tracking
Log and track errors:
```js
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [new winston.transports.Console()]
});

logger.error('Query processing failed', {
  query: req.body.query,
  userId: req.body.userId,
  error: error.message,
  stack: error.stack
});
```

Scaling Considerations
Horizontal Scaling
Davis can be scaled horizontally. Deploy multiple instances behind a load balancer:
- Session state stored in Redis or database
- Stateless API design
- Shared cache layer
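For example, session state can be externalized so that any instance can resume any conversation. The sketch below uses an in-memory Map as a stand-in for Redis or a database; the interface and key naming are illustrative assumptions, not Davis's actual storage layer:

```javascript
// Sketch: conversation/session state kept in a shared store so any Davis
// instance behind the load balancer can serve a follow-up query.
// `store` would be a Redis client in production; a Map stands in here.
function createSessionStore(store = new Map()) {
  return {
    save(sessionId, context) {
      // Serialize so the same code works against Redis string values
      store.set(`session:${sessionId}`, JSON.stringify(context));
    },
    load(sessionId) {
      const raw = store.get(`session:${sessionId}`);
      return raw ? JSON.parse(raw) : null;
    }
  };
}
```

With a real Redis client you would also set a TTL on each key so abandoned conversations expire on their own.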
Resource Allocation
Typical resource requirements:
- CPU: 0.5-1 core for up to 100 queries/minute
- Memory: 512MB-1GB
- Storage: 5GB for conversation history and cache
- Network: Depends on query complexity and response size
Load Balancing
Use a round-robin or least-connections algorithm to distribute traffic across Davis instances.
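As an illustration, round-robin selection is just a rotating index over the instance list. Klutch.sh's load balancer handles this for you; the sketch (with made-up instance URLs) only shows the idea:

```javascript
// Minimal round-robin selector over Davis instance URLs (illustrative only)
function createRoundRobin(instances) {
  let index = 0;
  return function next() {
    const instance = instances[index % instances.length];
    index++;
    return instance;
  };
}
```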
Troubleshooting
Connection Issues
Problem: Cannot connect to Dynatrace API
Solutions:
- Verify API token is valid and has correct permissions
- Check API URL format: https://{environment-id}.live.dynatrace.com/api/v2
- Ensure network connectivity from container
- Verify no firewall blocking outbound HTTPS
- Test with curl:

```bash
curl -H "Authorization: Api-Token YOUR_TOKEN" \
  https://your-env.live.dynatrace.com/api/v2/entities
```
Problem: Webhook not receiving events
Solutions:
- Verify webhook URL is publicly accessible
- Check webhook secret matches configuration
- Review webhook logs for incoming requests
- Test webhook endpoint with curl
- Ensure HTTPS endpoint (some services require HTTPS)
Query Issues
Problem: Davis doesn’t understand queries
Solutions:
- Simplify query language
- Use explicit entity names from Dynatrace
- Include timeframes explicitly
- Check NLP logs for parsing results
- Add custom intent patterns for domain-specific queries
- Review and improve NLP training data
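One of the bullets above suggests adding custom intent patterns. A minimal version of such a pattern table might look like the following; the structure, intent names, and selectors are illustrative assumptions, not Davis's actual NLP configuration format:

```javascript
// Hypothetical custom intent patterns for domain-specific queries:
// each entry maps a regex to the Dynatrace selectors a handler would use.
const customIntents = [
  {
    name: 'checkout_latency',
    pattern: /checkout (latency|response time)/i,
    metricSelector: 'builtin:service.response.time',        // assumed metric key
    entitySelector: 'type(SERVICE),entityName("checkout")'  // assumed entity name
  }
];

// Return the first matching intent, or null if nothing matches
function matchIntent(query) {
  return customIntents.find(intent => intent.pattern.test(query)) || null;
}
```

Explicit patterns like this catch domain phrasing ("checkout latency") that a general-purpose NLP model may parse inconsistently.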
Problem: Incorrect or empty responses
Solutions:
- Verify Dynatrace data exists for query timeframe
- Check entity selectors are correct
- Review Dynatrace API response in logs
- Ensure metric keys match Dynatrace schema
- Validate query parameters being sent to API
Integration Issues
Problem: Slack integration not working
Solutions:
- Verify bot token starts with xoxb-
- Check signing secret is correct
- Ensure Socket Mode is enabled
- Verify app is installed in workspace
- Review Slack app event subscriptions
- Check bot has necessary permissions
- Test slash command configuration
Problem: Teams bot not responding
Solutions:
- Verify messaging endpoint is accessible
- Check app ID and password are correct
- Ensure bot is registered in Bot Framework
- Review Teams app manifest configuration
- Test bot endpoint with Bot Framework Emulator
- Check bot is added to Teams workspace
Performance Issues
Problem: Slow query responses
Solutions:
- Implement caching for frequent queries
- Reduce Dynatrace API query complexity
- Optimize NLP processing
- Increase container resources
- Use connection pooling for API calls
- Monitor Dynatrace API response times
- Consider async query processing for complex requests
Problem: High memory usage
Solutions:
- Clear conversation history cache periodically
- Reduce cache size limits
- Monitor for memory leaks in NLP processing
- Increase container memory limits
- Implement memory-efficient data structures
- Review and optimize conversation context storage
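The cache-size suggestion above can be sketched as a bounded cache that evicts its oldest entry once a cap is reached (a Map preserves insertion order, so the first key is the oldest). The cap and the simple FIFO eviction policy here are illustrative choices:

```javascript
// Size-bounded cache: evict the oldest entry when the cap is reached.
// MAX_ENTRIES is an illustrative limit; tune it to your memory budget.
const MAX_ENTRIES = 1000;
const cache = new Map();

function cacheSet(key, value) {
  if (cache.size >= MAX_ENTRIES) {
    const oldestKey = cache.keys().next().value; // Maps iterate in insertion order
    cache.delete(oldestKey);
  }
  cache.set(key, value);
}
```

For true LRU behavior (evict least recently *used*, not least recently *inserted*), a library such as lru-cache is a more robust choice.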
Data Issues
Problem: Conversation history not persisting
Solutions:
- Verify persistent volume is mounted at /app/data
- Check directory permissions
- Ensure adequate storage space
- Test write access to data directory
- Review session storage configuration
- Check Redis connection if using external cache
Problem: Missing metrics or entities
Solutions:
- Verify entity exists in Dynatrace
- Check timeframe includes relevant data
- Ensure API token has read access to entity type
- Review entity selector syntax
- Test query directly against Dynatrace API
- Check for entity naming mismatches
Additional Resources
- Dynatrace API Documentation
- Dynatrace API Authentication
- Dynatrace Entities API
- Dynatrace Metrics API
- Slack Bot Documentation
- Microsoft Teams Bot Documentation
- Klutch.sh Documentation
- Persistent Volumes Guide
Conclusion
Davis transforms how teams interact with monitoring data by bringing AI-powered natural language interfaces to Dynatrace. Instead of learning complex query languages or navigating intricate dashboards, team members can simply ask questions and receive intelligent, context-aware answers. This democratizes access to observability data, making it available to everyone from developers to product managers.
Deploying Davis on Klutch.sh gives you the infrastructure to run this AI assistant without managing servers or worrying about scaling. The integration with collaboration tools like Slack and Teams brings monitoring insights directly into your team’s daily workflow, enabling faster incident response and better understanding of system health. Whether you’re troubleshooting production issues, analyzing performance trends, or conducting post-mortems, Davis provides instant access to the data you need.
Start having conversations with your monitoring data today and experience the power of AI-assisted observability.