
Deploying Davis

Davis is an AI-powered virtual assistant designed specifically for the Dynatrace monitoring and observability platform. Named after the Davis AI engine within Dynatrace, this conversational interface lets teams interact with their monitoring data using natural language queries instead of navigating complex dashboards or writing custom queries. Whether you’re investigating performance issues, analyzing metrics, or checking system health, Davis translates your questions into actionable insights pulled directly from your Dynatrace environment.

What sets Davis apart is its ability to understand context and intent. Rather than requiring precise syntax or memorizing query languages, you can ask questions like “What’s causing the high response time on my production API?” or “Show me CPU usage for the last hour” and receive intelligent responses with relevant charts, metrics, and recommendations. Davis integrates with Slack, Microsoft Teams, and other collaboration platforms, bringing monitoring insights directly into your team’s workflow. This makes monitoring data accessible to everyone, not just specialists who know how to navigate APM tools.

Why Deploy Davis on Klutch.sh?

Deploying Davis on Klutch.sh offers several advantages for hosting your AI monitoring assistant:

  • Automatic Docker Detection: Klutch.sh recognizes your Dockerfile and handles containerization without manual configuration
  • Persistent Storage: Built-in volume management ensures your conversation history and configuration persist across deployments
  • HTTPS by Default: Secure access to your Davis instance with automatic SSL certificates
  • Environment Management: Securely configure Dynatrace API tokens, webhook URLs, and integration credentials through environment variables
  • Webhook Support: Receive real-time notifications and integrate with collaboration tools through HTTP endpoints
  • Rapid Deployment: Go from configuration to production in minutes with GitHub integration
  • Always-On Availability: Keep your monitoring assistant running 24/7 without managing infrastructure

Prerequisites

Before deploying Davis to Klutch.sh, ensure you have:

  • A Klutch.sh account (sign up at klutch.sh)
  • A GitHub account with a repository for your Davis deployment
  • Basic understanding of Docker and containerization
  • A Dynatrace account (SaaS or Managed)
  • Dynatrace API token with appropriate permissions
  • Dynatrace environment ID and URL
  • Slack workspace (optional, for Slack integration)
  • Git installed on your local development machine
  • Familiarity with REST APIs and webhooks

Understanding Davis Architecture

Davis follows a microservices architecture designed for intelligent monitoring interactions:

Core Components

Node.js Application Server

Davis is built with Node.js, providing a responsive web application and API endpoints for handling user interactions. The application processes natural language queries, communicates with Dynatrace APIs, and formats responses for readability. Express.js handles HTTP routing, middleware, and webhook endpoints for integrations with collaboration platforms.

Natural Language Processing

The NLP engine interprets user queries and extracts intent, entities, and context. When you ask “What’s wrong with my application?”, Davis parses the question, identifies relevant time ranges, application names, and problem categories, then constructs appropriate Dynatrace API queries to retrieve the answer.
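
For example, the lib/nlp-processor.js module built later in this guide turns a question into a structured intent object roughly like the following; the field names match that module, and the values shown are illustrative:

{
  "intent": "problemQuery",
  "entities": {
    "timeframe": "now-2h",
    "environment": "production",
    "entityType": null,
    "metric": null
  },
  "originalQuery": "What problems occurred in production in the last 2 hours?"
}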

Dynatrace API Integration

Davis connects to Dynatrace through REST APIs, accessing monitoring data, metrics, events, and problems. The integration layer handles authentication with API tokens, manages rate limiting, and caches frequent queries for performance. Davis can access:

  • Application performance metrics
  • Infrastructure monitoring data
  • Problem detection and root cause analysis
  • Log analytics and traces
  • Custom metrics and events
  • Synthetic monitoring results

Conversation Manager

The conversation manager maintains context across multiple interactions, allowing follow-up questions without repeating context. If you ask “Show me errors in production” followed by “What about staging?”, Davis remembers the context (errors) and adjusts the query scope (staging environment).

Response Formatter

Davis formats Dynatrace data into human-readable responses with charts, tables, and recommendations. Complex JSON responses from Dynatrace are transformed into conversational answers, graphs are rendered for visual analysis, and actionable suggestions are highlighted.

Integration Layer

Davis integrates with multiple platforms:

  • Slack: Slash commands and interactive messages
  • Microsoft Teams: Bot framework integration
  • Webhook: Generic webhook support for custom integrations
  • Web UI: Browser-based chat interface

Each integration maintains its own session state and user authentication.

Configuration System

Davis uses environment variables and configuration files to manage:

  • Dynatrace connection settings
  • API authentication tokens
  • Integration platform credentials
  • Feature flags and customization
  • Response templates and formatting rules

Query Flow

  1. User asks question through chat interface (Slack, Teams, or web UI)
  2. Davis receives message and authenticates user
  3. NLP engine parses query to extract intent and entities
  4. Conversation manager retrieves context from previous interactions
  5. Query builder constructs appropriate Dynatrace API requests
  6. API calls are made to Dynatrace with authentication
  7. Response data is retrieved and cached
  8. Response formatter converts data to human-readable format
  9. Formatted response is sent back through integration channel
  10. Conversation context is updated for follow-up questions

Storage Requirements

Davis requires persistent storage for:

  • Conversation History: Past interactions and context for users
  • Cache Data: Frequently accessed metrics and query results
  • Configuration: Custom response templates and user preferences
  • Session State: Active conversation sessions and authentication tokens

A typical deployment needs 1GB-5GB for conversation history and cache data, growing based on user activity and cache retention policies.

Installation and Setup

Let’s walk through setting up Davis for deployment on Klutch.sh.

Step 1: Create the Project Structure

First, create a new directory for your Davis deployment:

mkdir davis-deployment
cd davis-deployment
git init

Step 2: Create Configuration File

Create a config.json file with your Dynatrace configuration:

{
  "dynatrace": {
    "environmentId": "your-environment-id",
    "apiUrl": "https://your-environment-id.live.dynatrace.com/api/v2",
    "apiToken": "${DYNATRACE_API_TOKEN}",
    "timeout": 30000
  },
  "server": {
    "port": 3000,
    "host": "0.0.0.0"
  },
  "nlp": {
    "confidenceThreshold": 0.6,
    "contextWindow": 5,
    "enableFollowUp": true
  },
  "cache": {
    "enabled": true,
    "ttl": 300,
    "maxSize": 100
  },
  "integrations": {
    "slack": {
      "enabled": "${SLACK_ENABLED}",
      "botToken": "${SLACK_BOT_TOKEN}",
      "signingSecret": "${SLACK_SIGNING_SECRET}",
      "appToken": "${SLACK_APP_TOKEN}"
    },
    "teams": {
      "enabled": "${TEAMS_ENABLED}",
      "appId": "${TEAMS_APP_ID}",
      "appPassword": "${TEAMS_APP_PASSWORD}"
    },
    "webhook": {
      "enabled": true,
      "secret": "${WEBHOOK_SECRET}"
    }
  },
  "features": {
    "problemAnalysis": true,
    "performanceMetrics": true,
    "logAnalytics": true,
    "customQueries": true,
    "aiInsights": true
  },
  "logging": {
    "level": "info",
    "format": "json"
  }
}

Step 3: Create Environment Template

Create a .env.example file:

# Dynatrace Configuration
DYNATRACE_ENVIRONMENT_ID=your-environment-id
DYNATRACE_API_URL=https://your-environment-id.live.dynatrace.com/api/v2
DYNATRACE_API_TOKEN=your-dynatrace-api-token

# Server Configuration
PORT=3000
NODE_ENV=production

# Slack Integration (Optional)
SLACK_ENABLED=false
SLACK_BOT_TOKEN=xoxb-your-bot-token
SLACK_SIGNING_SECRET=your-signing-secret
SLACK_APP_TOKEN=xapp-your-app-token

# Microsoft Teams Integration (Optional)
TEAMS_ENABLED=false
TEAMS_APP_ID=your-teams-app-id
TEAMS_APP_PASSWORD=your-teams-password

# Webhook Configuration
WEBHOOK_SECRET=your-webhook-secret

# Cache Configuration
REDIS_URL=redis://localhost:6379
CACHE_TTL=300

# Logging
LOG_LEVEL=info
LOG_FORMAT=json

# Session Configuration
SESSION_SECRET=your-session-secret
SESSION_TTL=86400

Step 4: Create the Dockerfile

Create a Dockerfile in the root directory:

FROM node:18-alpine

# Set environment variables
ENV NODE_ENV=production \
    NPM_CONFIG_LOGLEVEL=warn \
    PORT=3000

# Install system dependencies
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    curl

# Create app directory
WORKDIR /app

# Create davis user
RUN addgroup -g 1000 davis && \
    adduser -D -u 1000 -G davis davis

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production && \
    npm cache clean --force

# Copy application files
COPY --chown=davis:davis . .

# Create necessary directories
RUN mkdir -p /app/data /app/logs && \
    chown -R davis:davis /app

# Switch to davis user
USER davis

# Expose port
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD node healthcheck.js || exit 1

# Start application
CMD ["node", "server.js"]

Step 5: Create Application Files

Create package.json:

{
  "name": "davis-ai-assistant",
  "version": "1.0.0",
  "description": "AI-powered assistant for Dynatrace monitoring",
  "main": "server.js",
  "scripts": {
    "start": "node server.js",
    "dev": "nodemon server.js",
    "test": "jest",
    "lint": "eslint ."
  },
  "dependencies": {
    "express": "^4.18.2",
    "axios": "^1.6.0",
    "dotenv": "^16.3.1",
    "express-rate-limit": "^7.1.0",
    "helmet": "^7.1.0",
    "cors": "^2.8.5",
    "morgan": "^1.10.0",
    "joi": "^17.11.0",
    "natural": "^6.10.0",
    "compromise": "^14.10.0",
    "ioredis": "^5.3.2",
    "express-session": "^1.17.3",
    "winston": "^3.11.0",
    "@slack/bolt": "^3.15.0",
    "botbuilder": "^4.21.0"
  },
  "devDependencies": {
    "nodemon": "^3.0.1",
    "jest": "^29.7.0",
    "eslint": "^8.54.0"
  },
  "engines": {
    "node": ">=18.0.0",
    "npm": ">=9.0.0"
  }
}

Create server.js:

require('dotenv').config();
const express = require('express');
const helmet = require('helmet');
const cors = require('cors');
const morgan = require('morgan');
const rateLimit = require('express-rate-limit');
const session = require('express-session');
const winston = require('winston');

// Import custom modules
const dynatraceClient = require('./lib/dynatrace-client');
const nlpProcessor = require('./lib/nlp-processor');
const conversationManager = require('./lib/conversation-manager');
const responseFormatter = require('./lib/response-formatter');
const slackIntegration = require('./integrations/slack');
const teamsIntegration = require('./integrations/teams');
const webhookHandler = require('./integrations/webhook');

// Initialize Express app
const app = express();
const PORT = process.env.PORT || 3000;

// Configure logger
const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.Console({
      format: winston.format.simple()
    }),
    new winston.transports.File({ filename: '/app/logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: '/app/logs/combined.log' })
  ]
});

// Middleware
app.use(helmet());
app.use(cors());
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(morgan('combined', { stream: { write: message => logger.info(message.trim()) } }));

// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // limit each IP to 100 requests per windowMs
});
app.use('/api/', limiter);

// Session configuration
app.use(session({
  secret: process.env.SESSION_SECRET || 'your-secret-key',
  resave: false,
  saveUninitialized: false,
  cookie: {
    secure: process.env.NODE_ENV === 'production',
    maxAge: parseInt(process.env.SESSION_TTL) * 1000 || 86400000
  }
}));

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

// Query endpoint
app.post('/api/query', async (req, res) => {
  try {
    const { query, userId, sessionId } = req.body;
    if (!query) {
      return res.status(400).json({ error: 'Query is required' });
    }
    logger.info(`Processing query: ${query}`, { userId, sessionId });

    // Parse natural language query
    const intent = await nlpProcessor.parse(query);

    // Get conversation context
    const context = await conversationManager.getContext(userId, sessionId);

    // Build Dynatrace query
    const dynatraceQuery = await nlpProcessor.buildQuery(intent, context);

    // Execute query against Dynatrace
    const data = await dynatraceClient.query(dynatraceQuery);

    // Format response
    const response = await responseFormatter.format(data, intent);

    // Update conversation context
    await conversationManager.updateContext(userId, sessionId, { query, response, intent });

    res.json({
      answer: response.text,
      data: response.data,
      visualizations: response.charts,
      suggestions: response.suggestions
    });
  } catch (error) {
    logger.error('Error processing query:', error);
    res.status(500).json({
      error: 'Failed to process query',
      message: error.message
    });
  }
});

// Conversation history endpoint
app.get('/api/conversations/:userId', async (req, res) => {
  try {
    const { userId } = req.params;
    const history = await conversationManager.getHistory(userId);
    res.json({ conversations: history });
  } catch (error) {
    logger.error('Error fetching conversation history:', error);
    res.status(500).json({ error: 'Failed to fetch conversations' });
  }
});

// Initialize integrations
if (process.env.SLACK_ENABLED === 'true') {
  slackIntegration.initialize(app, logger);
  logger.info('Slack integration initialized');
}
if (process.env.TEAMS_ENABLED === 'true') {
  teamsIntegration.initialize(app, logger);
  logger.info('Teams integration initialized');
}
webhookHandler.initialize(app, logger);
logger.info('Webhook handler initialized');

// Error handling middleware
app.use((err, req, res, next) => {
  logger.error('Unhandled error:', err);
  res.status(500).json({
    error: 'Internal server error',
    message: process.env.NODE_ENV === 'development' ? err.message : undefined
  });
});
// Start server (keep a reference to the HTTP server for graceful shutdown)
const server = app.listen(PORT, '0.0.0.0', () => {
  logger.info(`Davis AI Assistant listening on port ${PORT}`);
  logger.info(`Environment: ${process.env.NODE_ENV}`);
  logger.info(`Dynatrace Environment: ${process.env.DYNATRACE_ENVIRONMENT_ID}`);
});

// Graceful shutdown (Express apps have no close(); close the HTTP server instead)
process.on('SIGTERM', () => {
  logger.info('SIGTERM signal received: closing HTTP server');
  server.close(() => {
    logger.info('HTTP server closed');
    process.exit(0);
  });
});

Create healthcheck.js:

const http = require('http');

const options = {
  host: 'localhost',
  port: process.env.PORT || 3000,
  path: '/health',
  timeout: 2000
};

const request = http.request(options, (res) => {
  if (res.statusCode === 200) {
    process.exit(0);
  } else {
    process.exit(1);
  }
});

request.on('error', () => {
  process.exit(1);
});

request.end();

Step 6: Create Library Modules

Create lib/dynatrace-client.js:

const axios = require('axios');
const winston = require('winston');

// winston's default logger has no transports configured, so create one explicitly
const logger = winston.createLogger({
  transports: [new winston.transports.Console()]
});

class DynatraceClient {
  constructor() {
    this.apiUrl = process.env.DYNATRACE_API_URL;
    this.apiToken = process.env.DYNATRACE_API_TOKEN;
    this.environmentId = process.env.DYNATRACE_ENVIRONMENT_ID;
    this.client = axios.create({
      baseURL: this.apiUrl,
      headers: {
        'Authorization': `Api-Token ${this.apiToken}`,
        'Content-Type': 'application/json'
      },
      timeout: 30000
    });
  }

  async query(params) {
    try {
      const { endpoint, method = 'GET', data = null, queryParams = {} } = params;
      const response = await this.client.request({
        method,
        url: endpoint,
        data,
        params: queryParams
      });
      return response.data;
    } catch (error) {
      logger.error('Dynatrace API error:', error.message);
      throw new Error(`Failed to query Dynatrace: ${error.message}`);
    }
  }

  async getProblems(timeframe = 'now-2h') {
    return this.query({
      endpoint: '/problems',
      queryParams: {
        from: timeframe,
        fields: '+impactAnalysis,+rootCauseEntity'
      }
    });
  }

  async getMetrics(metricSelector, timeframe = 'now-1h') {
    return this.query({
      endpoint: '/metrics/query',
      queryParams: {
        metricSelector,
        from: timeframe,
        resolution: '1m'
      }
    });
  }

  async getEntities(entityType, fields = []) {
    return this.query({
      endpoint: `/entities/${entityType}`,
      queryParams: {
        fields: fields.join(',')
      }
    });
  }

  async getApplications() {
    return this.getEntities('applications', ['displayName', 'tags', 'entityId']);
  }

  async getHosts() {
    return this.getEntities('hosts', ['displayName', 'osType', 'tags']);
  }

  async getServices() {
    return this.getEntities('services', ['displayName', 'serviceType', 'tags']);
  }
}

module.exports = new DynatraceClient();

Create lib/nlp-processor.js:

const natural = require('natural');
const compromise = require('compromise');

class NLPProcessor {
  constructor() {
    this.tokenizer = new natural.WordTokenizer();
    this.tfidf = new natural.TfIdf();
    // Intent patterns
    this.intents = {
      problemQuery: ['problem', 'issue', 'error', 'down', 'fail', 'crash'],
      metricsQuery: ['cpu', 'memory', 'response time', 'throughput', 'metric'],
      statusQuery: ['status', 'health', 'running', 'up', 'available'],
      listQuery: ['list', 'show', 'display', 'what are', 'get'],
      analyzeQuery: ['analyze', 'investigate', 'why', 'cause', 'root cause']
    };
  }

  async parse(query) {
    const doc = compromise(query);
    const tokens = this.tokenizer.tokenize(query.toLowerCase());
    // Extract intent
    const intent = this.extractIntent(tokens);
    // Extract entities
    const entities = {
      timeframe: this.extractTimeframe(doc, query),
      environment: this.extractEnvironment(tokens),
      entityType: this.extractEntityType(tokens),
      metric: this.extractMetric(tokens)
    };
    return { intent, entities, originalQuery: query };
  }

  extractIntent(tokens) {
    for (const [intentName, keywords] of Object.entries(this.intents)) {
      for (const keyword of keywords) {
        if (tokens.some(token => token.includes(keyword) || keyword.includes(token))) {
          return intentName;
        }
      }
    }
    return 'generalQuery';
  }

  extractTimeframe(doc, query) {
    // Extract time expressions
    const timeMatch = query.match(/(last|past)\s+(\d+)\s+(minute|hour|day|week)s?/i);
    if (timeMatch) {
      const value = parseInt(timeMatch[2]);
      const unit = timeMatch[3].toLowerCase();
      return `now-${value}${unit[0]}`;
    }
    // Default timeframe
    return 'now-1h';
  }

  extractEnvironment(tokens) {
    const environments = ['production', 'staging', 'development', 'prod', 'dev', 'stage'];
    for (const env of environments) {
      if (tokens.includes(env)) {
        return env;
      }
    }
    return null;
  }

  extractEntityType(tokens) {
    const entityTypes = {
      application: ['app', 'application'],
      service: ['service', 'api'],
      host: ['host', 'server', 'machine'],
      database: ['database', 'db']
    };
    for (const [type, keywords] of Object.entries(entityTypes)) {
      if (tokens.some(token => keywords.includes(token))) {
        return type;
      }
    }
    return null;
  }

  extractMetric(tokens) {
    const metrics = {
      'cpu': 'builtin:host.cpu.usage',
      'memory': 'builtin:host.mem.usage',
      'response': 'builtin:service.response.time',
      'throughput': 'builtin:service.requestCount.total',
      'errors': 'builtin:service.errors.total.count'
    };
    for (const [keyword, metricId] of Object.entries(metrics)) {
      if (tokens.some(token => token.includes(keyword))) {
        return metricId;
      }
    }
    return null;
  }

  async buildQuery(parsedIntent, context) {
    const { intent, entities } = parsedIntent;
    switch (intent) {
      case 'problemQuery':
        return {
          endpoint: '/problems',
          queryParams: {
            from: entities.timeframe,
            entitySelector: this.buildEntitySelector(entities)
          }
        };
      case 'metricsQuery':
        return {
          endpoint: '/metrics/query',
          queryParams: {
            metricSelector: entities.metric || 'builtin:host.cpu.usage',
            from: entities.timeframe,
            resolution: '1m'
          }
        };
      case 'statusQuery':
        return {
          endpoint: '/entities',
          queryParams: {
            entitySelector: this.buildEntitySelector(entities),
            fields: 'healthState,displayName'
          }
        };
      default:
        return {
          endpoint: '/entities',
          queryParams: {}
        };
    }
  }

  buildEntitySelector(entities) {
    const selectors = [];
    if (entities.entityType) {
      selectors.push(`type("${entities.entityType}")`);
    }
    if (entities.environment) {
      selectors.push(`tag("environment:${entities.environment}")`);
    }
    return selectors.join(',') || undefined;
  }
}

module.exports = new NLPProcessor();
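
server.js also imports lib/conversation-manager.js, lib/response-formatter.js, and the three integration modules, which this guide does not show in full. The sketches below are minimal in-memory starting points; the data shapes and trimming logic are assumptions to adapt for your deployment (for multi-instance setups, back the conversation store with Redis instead of a Map).

Create lib/conversation-manager.js:

// Minimal in-memory conversation store; swap the Map for Redis when scaling out
const CONTEXT_WINDOW = 5; // keep the last N exchanges per session

class ConversationManager {
  constructor() {
    // key: "userId:sessionId" -> array of { query, intent, response }
    this.sessions = new Map();
  }

  key(userId, sessionId) {
    return `${userId}:${sessionId || 'default'}`;
  }

  async getContext(userId, sessionId) {
    return this.sessions.get(this.key(userId, sessionId)) || [];
  }

  async updateContext(userId, sessionId, entry) {
    const k = this.key(userId, sessionId);
    const history = this.sessions.get(k) || [];
    history.push(entry);
    // Trim to the context window so memory stays bounded
    this.sessions.set(k, history.slice(-CONTEXT_WINDOW));
  }

  async getHistory(userId) {
    // Collect all sessions belonging to this user
    const result = [];
    for (const [k, entries] of this.sessions) {
      if (k.startsWith(`${userId}:`)) result.push(...entries);
    }
    return result;
  }
}

module.exports = new ConversationManager();

Create lib/response-formatter.js:

// Converts raw Dynatrace payloads into the { text, data, charts, suggestions }
// shape that server.js returns to clients
class ResponseFormatter {
  async format(data, parsedIntent) {
    if (parsedIntent.intent === 'problemQuery' && Array.isArray(data.problems)) {
      const lines = data.problems.map(p => `- ${p.title} (${p.severityLevel})`);
      return {
        text: `Found ${data.problems.length} problem(s):\n${lines.join('\n')}`,
        data,
        charts: [],
        suggestions: ['View problem details', 'See root cause analysis']
      };
    }
    // Generic fallback: return the raw payload with a neutral summary
    return { text: 'Here is what I found.', data, charts: [], suggestions: [] };
  }
}

module.exports = new ResponseFormatter();

Create integrations/webhook.js. The Slack and Teams modules follow the same initialize(app, logger) contract; until you wire them up in the integration sections later in this guide, exporting the same stub shape keeps the top-level require calls in server.js from failing:

function initialize(app, logger) {
  app.post('/webhook', (req, res) => {
    // Reject requests that do not present the shared secret, if one is configured
    // (the x-webhook-secret header name is an assumption; align it with your sender)
    const secret = process.env.WEBHOOK_SECRET;
    if (secret && req.headers['x-webhook-secret'] !== secret) {
      return res.status(401).json({ error: 'Invalid webhook secret' });
    }
    logger.info('Webhook received', { body: req.body });
    res.json({ received: true });
  });
}

module.exports = { initialize };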

Step 7: Create .dockerignore

Create a .dockerignore file:

node_modules
npm-debug.log
.env
.env.local
.git
.gitignore
*.md
README.md
.DS_Store
Thumbs.db
logs/
*.log
test/
tests/
.vscode/
.idea/
coverage/

Step 8: Create Documentation

Create README.md:

# Davis AI Assistant Deployment

This repository contains a Davis AI Assistant deployment configured for Klutch.sh.

## Features

- Natural language queries for Dynatrace monitoring
- Real-time problem detection and analysis
- Performance metrics visualization
- Slack and Microsoft Teams integration
- Conversation history and context awareness
- Intelligent suggestions and recommendations

## Configuration

Set the following environment variables:

- `DYNATRACE_API_TOKEN`: Your Dynatrace API token
- `DYNATRACE_ENVIRONMENT_ID`: Your environment ID
- `DYNATRACE_API_URL`: Your Dynatrace API URL

## Example Queries

- "What problems occurred in the last hour?"
- "Show me CPU usage for production servers"
- "Are all services healthy?"
- "What's causing high response time?"
- "List all applications"

## Deployment

This application is configured to deploy on Klutch.sh with automatic Docker detection.

Step 9: Initialize Git Repository

git add .
git commit -m "Initial Davis AI Assistant setup for Klutch.sh deployment"
git branch -M master
git remote add origin https://github.com/yourusername/davis-deployment.git
git push -u origin master

Deploying to Klutch.sh

Now that your Davis application is configured, let’s deploy it to Klutch.sh.

  1. Log in to Klutch.sh

    Navigate to klutch.sh/app and sign in with your GitHub account.

  2. Create a New Project

    Click “New Project” and select “Import from GitHub”. Choose the repository containing your Davis deployment.

  3. Configure Build Settings

    Klutch.sh will automatically detect the Dockerfile in your repository. The platform will use this for building your container.

  4. Configure Traffic Settings

    Select “HTTP” as the traffic type. Davis serves its web interface and API on port 3000, and Klutch.sh will route HTTPS traffic to this port.

  5. Set Environment Variables

    In the project settings, add the following environment variables:

    • DYNATRACE_API_TOKEN: Your Dynatrace API token (requires Read API v2 permissions)
    • DYNATRACE_ENVIRONMENT_ID: Your environment ID (e.g., abc12345)
    • DYNATRACE_API_URL: https://your-environment-id.live.dynatrace.com/api/v2
    • PORT: 3000
    • NODE_ENV: production
    • SESSION_SECRET: Generate using openssl rand -hex 32
    • LOG_LEVEL: info

    For Slack integration (optional):

    • SLACK_ENABLED: true
    • SLACK_BOT_TOKEN: Your Slack bot token (starts with xoxb-)
    • SLACK_SIGNING_SECRET: Your Slack signing secret
    • SLACK_APP_TOKEN: Your Slack app token (starts with xapp-)
  6. Configure Persistent Storage

    Davis requires persistent storage for conversation history and cache:

    • Data Volume:
      • Mount path: /app/data
      • Size: 5GB
    • Logs Volume:
      • Mount path: /app/logs
      • Size: 2GB

    These volumes ensure your conversation history and logs persist across deployments.

  7. Deploy the Application

    Click “Deploy” to start the build process. Klutch.sh will:

    • Clone your repository
    • Build the Docker image using your Dockerfile
    • Install Node.js dependencies
    • Deploy the container with Davis
    • Provision an HTTPS endpoint

    The build process typically takes 2-3 minutes.

  8. Access Your Davis Instance

    Once deployment completes, Klutch.sh will provide a URL like example-app.klutch.sh. Your Davis AI assistant will be available at this URL.

Getting Started with Davis

Once your Davis instance is deployed, here’s how to use it:

Using the Web Interface

Navigate to Your Deployment

Visit your deployed URL (e.g., https://example-app.klutch.sh) to access the Davis web interface.

Ask Questions

Type natural language questions in the chat interface:

What problems occurred in the last 2 hours?

Response includes:

  • List of detected problems
  • Severity levels
  • Affected entities
  • Root cause analysis
  • Recommended actions

Follow-Up Questions

Davis maintains conversation context:

You: Show me errors in production
Davis: [Lists production errors]
You: What about staging?
Davis: [Lists staging errors - understands context]

View Metrics

Request performance metrics:

Show me CPU usage for the last hour

Davis returns:

  • Time series chart
  • Current value
  • Average, min, max values
  • Trend analysis
  • Anomaly detection

API Usage

Query Endpoint

Send natural language queries via API:

curl -X POST https://example-app.klutch.sh/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the response time for my API services?",
    "userId": "user123",
    "sessionId": "session456"
  }'

Response:

{
  "answer": "The average response time for your API services in the last hour is 245ms. The checkout-service has the highest response time at 380ms.",
  "data": {
    "metrics": [
      {
        "service": "checkout-service",
        "responseTime": 380,
        "unit": "ms"
      },
      {
        "service": "auth-service",
        "responseTime": 150,
        "unit": "ms"
      }
    ]
  },
  "visualizations": [
    {
      "type": "line-chart",
      "data": "..."
    }
  ],
  "suggestions": [
    "Investigate high response time on checkout-service",
    "Check database query performance"
  ]
}

Health Check

Monitor Davis availability:

curl https://example-app.klutch.sh/health

Conversation History

Retrieve past conversations:

curl https://example-app.klutch.sh/api/conversations/user123

Example Queries

Problem Detection

What problems do I have right now?
Show me critical issues from yesterday
Are there any errors in production?
What went wrong with my application?

Performance Metrics

Show me CPU usage for production hosts
What's the memory consumption?
Display response time for all services
How many requests per minute am I getting?
Show me throughput trends for the last week

Status Checks

Are all my services healthy?
What's the status of my infrastructure?
Is everything running normally?
Which applications are down?

Specific Entity Queries

Show me metrics for the payment-service
What's happening with the database host?
Display errors for the checkout application
How is my frontend performing?

Time-Based Queries

Show me problems from the last 24 hours
What happened between 2pm and 3pm today?
Display yesterday's performance metrics
Show me last week's error rate

Root Cause Analysis

Why is my API slow?
What's causing high CPU usage?
Investigate the recent outage
Why are users experiencing errors?
What's the root cause of the problem?

Slack Integration

Configure Davis to work within your Slack workspace:

Step 1: Create Slack App

  1. Go to the Slack API Apps page at api.slack.com/apps
  2. Click “Create New App”
  3. Choose “From scratch”
  4. Name: “Davis AI Assistant”
  5. Select your workspace
  6. Click “Create App”

Step 2: Configure Bot Token

  1. Navigate to “OAuth & Permissions”
  2. Add these Bot Token Scopes:
    • chat:write
    • chat:write.public
    • commands
    • im:history
    • im:read
    • im:write
    • channels:history
    • channels:read
    • groups:history
    • groups:read
  3. Install app to workspace
  4. Copy “Bot User OAuth Token” (starts with xoxb-)

Step 3: Enable Socket Mode

  1. Navigate to “Socket Mode”
  2. Enable Socket Mode
  3. Generate App-Level Token with connections:write scope
  4. Copy token (starts with xapp-)

Step 4: Configure Slash Command

  1. Navigate to “Slash Commands”
  2. Create new command:
    • Command: /davis
    • Request URL: https://example-app.klutch.sh/slack/events
    • Short Description: “Ask Davis about monitoring”
    • Usage Hint: [your question]

Step 5: Update Environment Variables

Add to Klutch.sh environment variables:

SLACK_ENABLED=true
SLACK_BOT_TOKEN=xoxb-your-token
SLACK_SIGNING_SECRET=your-signing-secret
SLACK_APP_TOKEN=xapp-your-token

Step 6: Use in Slack

In any Slack channel:

/davis What problems occurred in the last hour?

Or direct message @Davis:

@Davis show me CPU usage
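
With the tokens in place, the integrations/slack.js module referenced in server.js can be built on the @slack/bolt dependency from package.json. The following is a minimal sketch, not a full implementation; error handling and response formatting are simplified, and it reuses the same pipeline modules as the HTTP API:

const { App } = require('@slack/bolt');
const nlpProcessor = require('../lib/nlp-processor');
const dynatraceClient = require('../lib/dynatrace-client');
const conversationManager = require('../lib/conversation-manager');
const responseFormatter = require('../lib/response-formatter');

function initialize(expressApp, logger) {
  const slackApp = new App({
    token: process.env.SLACK_BOT_TOKEN,
    signingSecret: process.env.SLACK_SIGNING_SECRET,
    appToken: process.env.SLACK_APP_TOKEN,
    socketMode: true // matches the Socket Mode setup in Step 3
  });

  // Handle the /davis slash command configured in Step 4
  slackApp.command('/davis', async ({ command, ack, respond }) => {
    await ack(); // acknowledge within Slack's 3-second deadline
    try {
      const intent = await nlpProcessor.parse(command.text);
      const context = await conversationManager.getContext(command.user_id, command.channel_id);
      const query = await nlpProcessor.buildQuery(intent, context);
      const data = await dynatraceClient.query(query);
      const response = await responseFormatter.format(data, intent);
      await respond(response.text);
    } catch (err) {
      logger.error('Slack command failed', { error: err.message });
      await respond('Sorry, I could not process that query.');
    }
  });

  slackApp.start().then(() => logger.info('Slack Socket Mode connection started'));
}

module.exports = { initialize };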

Microsoft Teams Integration

Set up Davis for Microsoft Teams:

Step 1: Register Bot

  1. Go to the Bot Framework portal at dev.botframework.com
  2. Create new bot registration
  3. Name: “Davis AI Assistant”
  4. Messaging endpoint: https://example-app.klutch.sh/api/teams/messages
  5. Copy App ID and generate App Password

Step 2: Configure Teams App

Create manifest.json for Teams app package:

{
  "$schema": "https://developer.microsoft.com/json-schemas/teams/v1.16/MicrosoftTeams.schema.json",
  "manifestVersion": "1.16",
  "version": "1.0.0",
  "id": "your-app-id",
  "packageName": "com.davis.assistant",
  "developer": {
    "name": "Your Company",
    "websiteUrl": "https://example-app.klutch.sh",
    "privacyUrl": "https://example-app.klutch.sh/privacy",
    "termsOfUseUrl": "https://example-app.klutch.sh/terms"
  },
  "name": {
    "short": "Davis",
    "full": "Davis AI Monitoring Assistant"
  },
  "description": {
    "short": "AI assistant for Dynatrace monitoring",
    "full": "Natural language interface for monitoring and observability"
  },
  "icons": {
    "outline": "outline.png",
    "color": "color.png"
  },
  "accentColor": "#1F8FE8",
  "bots": [
    {
      "botId": "your-app-id",
      "scopes": ["personal", "team"],
      "supportsFiles": false,
      "isNotificationOnly": false
    }
  ],
  "permissions": ["identity", "messageTeamMembers"],
  "validDomains": ["example-app.klutch.sh"]
}

Step 3: Update Environment Variables

TEAMS_ENABLED=true
TEAMS_APP_ID=your-app-id
TEAMS_APP_PASSWORD=your-app-password

Step 4: Deploy to Teams

  1. Package manifest.json with icon files into zip
  2. Upload to Teams app catalog
  3. Install in your Teams workspace

Step 5: Use in Teams

Chat with Davis bot or mention in channels:

@Davis what's the status of my services?
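
With the bot registered, a minimal integrations/teams.js can be built on the botbuilder dependency from package.json. This is a sketch only: BotFrameworkAdapter is the classic 4.x adapter, and the full query pipeline (as in the Slack sketch) is elided:

const { BotFrameworkAdapter } = require('botbuilder');
const nlpProcessor = require('../lib/nlp-processor');

function initialize(app, logger) {
  const adapter = new BotFrameworkAdapter({
    appId: process.env.TEAMS_APP_ID,
    appPassword: process.env.TEAMS_APP_PASSWORD
  });

  // Must match the messaging endpoint registered in Step 1
  app.post('/api/teams/messages', (req, res) => {
    adapter.processActivity(req, res, async (context) => {
      if (context.activity.type === 'message') {
        const intent = await nlpProcessor.parse(context.activity.text);
        // ...build the Dynatrace query and format the answer as in the Slack sketch
        await context.sendActivity(`Looking into: ${intent.originalQuery}`);
      }
    });
  });
}

module.exports = { initialize };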

Advanced Configuration

Custom Query Templates

Create custom response templates in templates/responses.json:

{
  "problemSummary": {
    "template": "Found {{count}} problem(s):\n{{#each problems}}- {{title}} ({{severity}})\n{{/each}}",
    "includeCharts": true,
    "suggestions": [
      "View problem details",
      "Check affected entities",
      "See root cause analysis"
    ]
  },
  "metricsSummary": {
    "template": "{{metricName}}: {{currentValue}}{{unit}}\nAverage: {{average}}{{unit}}\nTrend: {{trend}}",
    "includeCharts": true,
    "timeframes": ["1h", "24h", "7d"]
  }
}
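
These templates use Handlebars-style placeholders. The handlebars package is not in package.json, so treat the following rendering sketch as an optional addition (npm install handlebars):

const Handlebars = require('handlebars'); // assumed extra dependency
const templates = require('./templates/responses.json');

// Compile the problemSummary template once and render it with query results
const renderProblemSummary = Handlebars.compile(templates.problemSummary.template);

console.log(renderProblemSummary({
  count: 1,
  problems: [{ title: 'High CPU on host-1', severity: 'CRITICAL' }]
}));
// -> "Found 1 problem(s):\n- High CPU on host-1 (CRITICAL)\n"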

Caching with Redis

For improved performance, integrate Redis caching:

No Dockerfile changes are required; the ioredis client is already listed in package.json. Point Davis at your Redis instance through environment variables:

REDIS_URL=redis://your-redis-host:6379
CACHE_TTL=300
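
A minimal cache wrapper using the ioredis dependency from package.json might look like this; the key scheme and module layout are illustrative:

const Redis = require('ioredis');

const redis = new Redis(process.env.REDIS_URL);
const ttlSeconds = parseInt(process.env.CACHE_TTL, 10) || 300;

// Cache Dynatrace query results keyed by the serialized query parameters
async function cachedQuery(dynatraceClient, params) {
  const key = `davis:cache:${JSON.stringify(params)}`;
  const hit = await redis.get(key);
  if (hit) return JSON.parse(hit);
  const data = await dynatraceClient.query(params);
  await redis.set(key, JSON.stringify(data), 'EX', ttlSeconds); // expire after TTL
  return data;
}

module.exports = { cachedQuery };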

Authentication and Authorization

Implement user authentication:

Create lib/auth-middleware.js:

// Note: jsonwebtoken is not in package.json; add it with `npm install jsonwebtoken`
const jwt = require('jsonwebtoken');

function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) {
    return res.status(401).json({ error: 'Authentication required' });
  }
  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    req.user = decoded;
    next();
  } catch (error) {
    res.status(401).json({ error: 'Invalid token' });
  }
}

module.exports = { authenticate };

Apply to protected routes:

const { authenticate } = require('./lib/auth-middleware');

app.post('/api/query', authenticate, async (req, res) => {
  // Query handling
});

Custom Dynatrace Metrics

Query custom metrics from Dynatrace:

// In lib/dynatrace-client.js
async getCustomMetric(metricKey, entitySelector) {
  return this.query({
    endpoint: '/metrics/query',
    queryParams: {
      metricSelector: `ext:${metricKey}`,
      entitySelector,
      resolution: '1m'
    }
  });
}

Use in queries:

Show me custom metric shopify.orders.total for the last hour

Webhook Notifications

Configure webhooks for proactive notifications:

// In integrations/webhook.js
const axios = require('axios');

async function sendNotification(problem) {
  const webhookUrl = process.env.WEBHOOK_URL;
  await axios.post(webhookUrl, {
    text: `New problem detected: ${problem.title}`,
    severity: problem.severity,
    affectedEntities: problem.impactedEntities,
    rootCause: problem.rootCause,
    link: `${process.env.DYNATRACE_URL}/ui/problems/${problem.id}`
  });
}

Multi-Environment Support

Support multiple Dynatrace environments:

const environments = {
  production: {
    apiUrl: process.env.PROD_DYNATRACE_API_URL,
    apiToken: process.env.PROD_DYNATRACE_API_TOKEN
  },
  staging: {
    apiUrl: process.env.STAGING_DYNATRACE_API_URL,
    apiToken: process.env.STAGING_DYNATRACE_API_TOKEN
  }
};

// Select environment based on query context
const env = environments[parsedIntent.entities.environment] || environments.production;

Production Best Practices

Follow these recommendations for running Davis in production:

Security

API Token Security

Never commit API tokens to version control:

# Use environment variables
DYNATRACE_API_TOKEN=your-token
# Rotate tokens regularly
# Use tokens with minimal required permissions

Dynatrace API Permissions

Create token with only necessary scopes:

  • Read entities
  • Read metrics
  • Read problems
  • Read logs (if using log analytics)

Rate Limiting

Protect your API from abuse:

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  message: 'Too many requests from this IP'
});

HTTPS Only

Klutch.sh provides automatic HTTPS. Ensure all webhook URLs and API endpoints use HTTPS.

Input Validation

Validate all user inputs:

const Joi = require('joi');

const querySchema = Joi.object({
  query: Joi.string().min(3).max(500).required(),
  userId: Joi.string().required(),
  sessionId: Joi.string().optional()
});

Performance Optimization

Response Caching

Cache frequent queries:

const cache = new Map();
const CACHE_TTL = (parseInt(process.env.CACHE_TTL, 10) || 300) * 1000; // milliseconds

async function getCachedResponse(query) {
  const cacheKey = `query:${query}`;
  const cached = cache.get(cacheKey);
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }
  const data = await executeQuery(query);
  cache.set(cacheKey, { data, timestamp: Date.now() });
  return data;
}

Connection Pooling

Reuse Dynatrace API connections:

const https = require('https');
const axios = require('axios');

const agent = new https.Agent({ keepAlive: true });
const client = axios.create({
  httpsAgent: agent,
  timeout: 30000
});

Async Processing

Handle long-running queries asynchronously:

// generateId, processQuery, and getQueryStatus are app-defined helpers
app.post('/api/query/async', async (req, res) => {
  const queryId = generateId();
  // Start processing in background
  processQuery(req.body.query, queryId);
  // Return immediately
  res.json({ queryId, status: 'processing' });
});

app.get('/api/query/:queryId/status', (req, res) => {
  const status = getQueryStatus(req.params.queryId);
  res.json(status);
});

Monitoring

Application Metrics

Track Davis performance:

const metrics = {
  queriesProcessed: 0,
  averageResponseTime: 0,
  errorCount: 0 // running count of failures; derive a rate from queriesProcessed
};

// Update metrics with a running average of response times
function recordQuery(duration, success) {
  metrics.queriesProcessed++;
  metrics.averageResponseTime =
    (metrics.averageResponseTime * (metrics.queriesProcessed - 1) + duration) / metrics.queriesProcessed;
  if (!success) metrics.errorCount++;
}

// Expose metrics endpoint
app.get('/metrics', (req, res) => {
  res.json(metrics);
});

Health Checks

Comprehensive health monitoring:

// checkDynatraceConnection, checkDatabaseConnection, and checkRedisConnection
// are app-defined helpers returning objects like { status: 'ok' }
app.get('/health/detailed', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {
      dynatrace: await checkDynatraceConnection(),
      database: await checkDatabaseConnection(),
      redis: await checkRedisConnection()
    }
  };
  const allHealthy = Object.values(health.checks).every(c => c.status === 'ok');
  res.status(allHealthy ? 200 : 503).json(health);
});

Error Tracking

Log and track errors:

// Reuse the winston logger configured in server.js
logger.error('Query processing failed', {
  query: req.body.query,
  userId: req.body.userId,
  error: error.message,
  stack: error.stack
});

Scaling Considerations

Horizontal Scaling

Davis can be scaled horizontally. Deploy multiple instances behind a load balancer:

  • Session state stored in Redis or a database (see the sketch below)
  • Stateless API design
  • Shared cache layer
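
For example, sessions can be moved out of process memory with connect-redis; this is an assumed extra dependency, and its import style varies across versions, so check the one you install:

// npm install connect-redis
const session = require('express-session');
const { RedisStore } = require('connect-redis'); // named export in recent versions
const Redis = require('ioredis');

const redisClient = new Redis(process.env.REDIS_URL);

// Any Davis instance behind the load balancer can now serve any user's session
app.use(session({
  store: new RedisStore({ client: redisClient }),
  secret: process.env.SESSION_SECRET,
  resave: false,
  saveUninitialized: false
}));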

Resource Allocation

Typical resource requirements:

  • CPU: 0.5-1 core for up to 100 queries/minute
  • Memory: 512MB-1GB
  • Storage: 5GB for conversation history and cache
  • Network: Depends on query complexity and response size

Load Balancing

Use round-robin or least-connections algorithm for distributing traffic across Davis instances.

Troubleshooting

Connection Issues

Problem: Cannot connect to Dynatrace API

Solutions:

  • Verify API token is valid and has correct permissions
  • Check API URL format: https://{environment-id}.live.dynatrace.com/api/v2
  • Ensure network connectivity from container
  • Verify no firewall blocking outbound HTTPS
  • Test with curl: curl -H "Authorization: Api-Token YOUR_TOKEN" https://your-env.live.dynatrace.com/api/v2/entities

Problem: Webhook not receiving events

Solutions:

  • Verify webhook URL is publicly accessible
  • Check webhook secret matches configuration
  • Review webhook logs for incoming requests
  • Test webhook endpoint with curl
  • Ensure HTTPS endpoint (some services require HTTPS)

Query Issues

Problem: Davis doesn’t understand queries

Solutions:

  • Simplify query language
  • Use explicit entity names from Dynatrace
  • Include timeframes explicitly
  • Check NLP logs for parsing results
  • Add custom intent patterns for domain-specific queries
  • Review and improve NLP training data

Problem: Incorrect or empty responses

Solutions:

  • Verify Dynatrace data exists for query timeframe
  • Check entity selectors are correct
  • Review Dynatrace API response in logs
  • Ensure metric keys match Dynatrace schema
  • Validate query parameters being sent to API

Integration Issues

Problem: Slack integration not working

Solutions:

  • Verify bot token starts with xoxb-
  • Check signing secret is correct
  • Ensure Socket Mode is enabled
  • Verify app is installed in workspace
  • Review Slack app event subscriptions
  • Check bot has necessary permissions
  • Test slash command configuration

Problem: Teams bot not responding

Solutions:

  • Verify messaging endpoint is accessible
  • Check app ID and password are correct
  • Ensure bot is registered in Bot Framework
  • Review Teams app manifest configuration
  • Test bot endpoint with Bot Framework Emulator
  • Check bot is added to Teams workspace

Performance Issues

Problem: Slow query responses

Solutions:

  • Implement caching for frequent queries
  • Reduce Dynatrace API query complexity
  • Optimize NLP processing
  • Increase container resources
  • Use connection pooling for API calls
  • Monitor Dynatrace API response times
  • Consider async query processing for complex requests

Problem: High memory usage

Solutions:

  • Clear conversation history cache periodically
  • Reduce cache size limits
  • Monitor for memory leaks in NLP processing
  • Increase container memory limits
  • Implement memory-efficient data structures
  • Review and optimize conversation context storage

Data Issues

Problem: Conversation history not persisting

Solutions:

  • Verify persistent volume is mounted at /app/data
  • Check directory permissions
  • Ensure adequate storage space
  • Test write access to data directory
  • Review session storage configuration
  • Check Redis connection if using external cache

Problem: Missing metrics or entities

Solutions:

  • Verify entity exists in Dynatrace
  • Check timeframe includes relevant data
  • Ensure API token has read access to entity type
  • Review entity selector syntax
  • Test query directly against Dynatrace API
  • Check for entity naming mismatches

Conclusion

Davis transforms how teams interact with monitoring data by bringing AI-powered natural language interfaces to Dynatrace. Instead of learning complex query languages or navigating intricate dashboards, team members can simply ask questions and receive intelligent, context-aware answers. This democratizes access to observability data, making it available to everyone from developers to product managers.

Deploying Davis on Klutch.sh gives you the infrastructure to run this AI assistant without managing servers or worrying about scaling. The integration with collaboration tools like Slack and Teams brings monitoring insights directly into your team’s daily workflow, enabling faster incident response and better understanding of system health. Whether you’re troubleshooting production issues, analyzing performance trends, or conducting post-mortems, Davis provides instant access to the data you need.

Start having conversations with your monitoring data today and experience the power of AI-assisted observability.