Deploying Davis
Davis is an AI-powered virtual assistant specifically designed for Dynatrace monitoring and observability platforms. Named after the Davis AI engine within Dynatrace, this conversational interface allows teams to interact with their monitoring data using natural language queries instead of navigating complex dashboards or writing custom queries. Whether you’re investigating performance issues, analyzing metrics, or checking system health, Davis translates your questions into actionable insights pulled directly from your Dynatrace environment.
What sets Davis apart is its ability to understand context and intent. Rather than requiring precise syntax or memorizing query languages, you can ask questions like “What’s causing the high response time on my production API?” or “Show me CPU usage for the last hour” and receive intelligent responses with relevant charts, metrics, and recommendations. Davis integrates with Slack, Microsoft Teams, and other collaboration platforms, bringing monitoring insights directly into your team’s workflow. This makes monitoring data accessible to everyone, not just specialists who know how to navigate APM tools.
Why Deploy Davis on Klutch.sh?
Deploying Davis on Klutch.sh offers several advantages for hosting your AI monitoring assistant:
- Automatic Docker Detection: Klutch.sh recognizes your Dockerfile and handles containerization without manual configuration
- Persistent Storage: Built-in volume management ensures your conversation history and configuration persist across deployments
- HTTPS by Default: Secure access to your Davis instance with automatic SSL certificates
- Environment Management: Securely configure Dynatrace API tokens, webhook URLs, and integration credentials through environment variables
- Webhook Support: Receive real-time notifications and integrate with collaboration tools through HTTP endpoints
- Rapid Deployment: Go from configuration to production in minutes with GitHub integration
- Always-On Availability: Keep your monitoring assistant running 24/7 without managing infrastructure
Prerequisites
Before deploying Davis to Klutch.sh, ensure you have:
- A Klutch.sh account (sign up here)
- A GitHub account with a repository for your Davis deployment
- Basic understanding of Docker and containerization
- A Dynatrace account (SaaS or Managed)
- Dynatrace API token with appropriate permissions
- Dynatrace environment ID and URL
- Slack workspace (optional, for Slack integration)
- Git installed on your local development machine
- Familiarity with REST APIs and webhooks
Understanding Davis Architecture
Davis follows a microservices architecture designed for intelligent monitoring interactions:
Core Components
Node.js Application Server
Davis is built with Node.js, providing a responsive web application and API endpoints for handling user interactions. The application processes natural language queries, communicates with Dynatrace APIs, and formats responses in user-friendly formats. Express.js handles HTTP routing, middleware, and webhook endpoints for integrations with collaboration platforms.
Natural Language Processing
The NLP engine interprets user queries and extracts intent, entities, and context. When you ask “What’s wrong with my application?”, Davis parses the question, identifies relevant time ranges, application names, and problem categories, then constructs appropriate Dynatrace API queries to retrieve the answer.
Dynatrace API Integration
Davis connects to Dynatrace through REST APIs, accessing monitoring data, metrics, events, and problems. The integration layer handles authentication with API tokens, manages rate limiting, and caches frequent queries for performance. Davis can access:
- Application performance metrics
- Infrastructure monitoring data
- Problem detection and root cause analysis
- Log analytics and traces
- Custom metrics and events
- Synthetic monitoring results
Conversation Manager
The conversation manager maintains context across multiple interactions, allowing follow-up questions without repeating context. If you ask “Show me errors in production” followed by “What about staging?”, Davis remembers the context (errors) and adjusts the query scope (staging environment).
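The module that implements this (`lib/conversation-manager.js`, which server.js imports) is not shown in this guide. As a minimal in-memory sketch — with method names matching how server.js calls it, and persistence to `/app/data` or Redis left out for brevity — it could look like:

```javascript
// Minimal in-memory sketch of lib/conversation-manager.js.
// A production version would persist turns to /app/data or Redis.
class ConversationManager {
  constructor(contextWindow = 5) {
    this.sessions = new Map(); // sessionKey -> recent turns
    this.contextWindow = contextWindow;
  }

  key(userId, sessionId) {
    return `${userId}:${sessionId}`;
  }

  async getContext(userId, sessionId) {
    return this.sessions.get(this.key(userId, sessionId)) || [];
  }

  async updateContext(userId, sessionId, turn) {
    const key = this.key(userId, sessionId);
    const turns = this.sessions.get(key) || [];
    turns.push(turn);
    // Keep only the last N turns so follow-up questions stay cheap
    this.sessions.set(key, turns.slice(-this.contextWindow));
  }

  async getHistory(userId) {
    // All turns across this user's sessions
    const history = [];
    for (const [key, turns] of this.sessions) {
      if (key.startsWith(`${userId}:`)) history.push(...turns);
    }
    return history;
  }
}
// in lib/conversation-manager.js: module.exports = new ConversationManager();
```

The sliding `contextWindow` is what lets "What about staging?" be answered by merging the new query with the stored intent of "Show me errors in production".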
Response Formatter
Davis formats Dynatrace data into human-readable responses with charts, tables, and recommendations. Complex JSON responses from Dynatrace are transformed into conversational answers, graphs are rendered for visual analysis, and actionable suggestions are highlighted.
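As a sketch of that transformation — assuming a deliberately simplified payload shape (the real Dynatrace `/metrics/query` response carries more structure) — a one-line metric answer might be built like this:

```javascript
// Sketch of the response-formatting idea: turn a Dynatrace-style
// metrics payload into a conversational answer. The payload shape
// here is simplified for illustration.
function formatMetricAnswer(metricName, payload) {
  // Dynatrace metric series can contain null gaps; drop them first
  const values = payload.result[0].data[0].values.filter(v => v !== null);
  const current = values[values.length - 1];
  const avg = values.reduce((sum, v) => sum + v, 0) / values.length;
  return `${metricName} is currently ${current.toFixed(1)}% ` +
         `(average ${avg.toFixed(1)}% over the selected timeframe).`;
}
```

For example, `formatMetricAnswer('CPU usage', { result: [{ data: [{ values: [40, 50, null, 60] }] }] })` yields "CPU usage is currently 60.0% (average 50.0% over the selected timeframe)."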
Integration Layer
Davis integrates with multiple platforms:
- Slack: Slash commands and interactive messages
- Microsoft Teams: Bot framework integration
- Webhook: Generic webhook support for custom integrations
- Web UI: Browser-based chat interface
Each integration maintains its own session state and user authentication.
Configuration System
Davis uses environment variables and configuration files to manage:
- Dynatrace connection settings
- API authentication tokens
- Integration platform credentials
- Feature flags and customization
- Response templates and formatting rules
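Note that the `config.json` shown later in this guide stores secrets as `${VAR}` placeholders, and JSON performs no interpolation on its own — the loader has to substitute values from `process.env`. A minimal substitution pass (the helper name is illustrative, not part of Davis) might look like:

```javascript
// Recursively replace "${VAR}" placeholders in a parsed config object
// with values from the environment. Non-string values pass through.
function resolveEnvPlaceholders(value, env = process.env) {
  if (typeof value === 'string') {
    return value.replace(/\$\{(\w+)\}/g, (_, name) => env[name] ?? '');
  }
  if (Array.isArray(value)) {
    return value.map(v => resolveEnvPlaceholders(v, env));
  }
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, resolveEnvPlaceholders(v, env)])
    );
  }
  return value; // numbers, booleans, null are untouched
}
```

Applied to the parsed `config.json`, this turns `"apiToken": "${DYNATRACE_API_TOKEN}"` into the actual token at startup.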
Query Flow
- User asks question through chat interface (Slack, Teams, or web UI)
- Davis receives message and authenticates user
- NLP engine parses query to extract intent and entities
- Conversation manager retrieves context from previous interactions
- Query builder constructs appropriate Dynatrace API requests
- API calls are made to Dynatrace with authentication
- Response data is retrieved and cached
- Response formatter converts data to human-readable format
- Formatted response is sent back through integration channel
- Conversation context is updated for follow-up questions
Storage Requirements
Davis requires persistent storage for:
- Conversation History: Past interactions and context for users
- Cache Data: Frequently accessed metrics and query results
- Configuration: Custom response templates and user preferences
- Session State: Active conversation sessions and authentication tokens
A typical deployment needs 1GB-5GB for conversation history and cache data, growing based on user activity and cache retention policies.
Installation and Setup
Let’s walk through setting up Davis for deployment on Klutch.sh.
Step 1: Create the Project Structure
First, create a new directory for your Davis deployment:
```bash
mkdir davis-deployment
cd davis-deployment
git init
```

Step 2: Create Configuration File
Create a config.json file with your Dynatrace configuration:
```json
{
  "dynatrace": {
    "environmentId": "your-environment-id",
    "apiUrl": "https://your-environment-id.live.dynatrace.com/api/v2",
    "apiToken": "${DYNATRACE_API_TOKEN}",
    "timeout": 30000
  },
  "server": {
    "port": 3000,
    "host": "0.0.0.0"
  },
  "nlp": {
    "confidenceThreshold": 0.6,
    "contextWindow": 5,
    "enableFollowUp": true
  },
  "cache": {
    "enabled": true,
    "ttl": 300,
    "maxSize": 100
  },
  "integrations": {
    "slack": {
      "enabled": "${SLACK_ENABLED}",
      "botToken": "${SLACK_BOT_TOKEN}",
      "signingSecret": "${SLACK_SIGNING_SECRET}",
      "appToken": "${SLACK_APP_TOKEN}"
    },
    "teams": {
      "enabled": "${TEAMS_ENABLED}",
      "appId": "${TEAMS_APP_ID}",
      "appPassword": "${TEAMS_APP_PASSWORD}"
    },
    "webhook": {
      "enabled": true,
      "secret": "${WEBHOOK_SECRET}"
    }
  },
  "features": {
    "problemAnalysis": true,
    "performanceMetrics": true,
    "logAnalytics": true,
    "customQueries": true,
    "aiInsights": true
  },
  "logging": {
    "level": "info",
    "format": "json"
  }
}
```

Step 3: Create Environment Template
Create a .env.example file:
```bash
# Dynatrace Configuration
DYNATRACE_ENVIRONMENT_ID=your-environment-id
DYNATRACE_API_URL=https://your-environment-id.live.dynatrace.com/api/v2
DYNATRACE_API_TOKEN=your-dynatrace-api-token

# Server Configuration
PORT=3000
NODE_ENV=production

# Slack Integration (Optional)
SLACK_ENABLED=false
SLACK_BOT_TOKEN=xoxb-your-bot-token
SLACK_SIGNING_SECRET=your-signing-secret
SLACK_APP_TOKEN=xapp-your-app-token

# Microsoft Teams Integration (Optional)
TEAMS_ENABLED=false
TEAMS_APP_ID=your-teams-app-id
TEAMS_APP_PASSWORD=your-teams-password

# Webhook Configuration
WEBHOOK_SECRET=your-webhook-secret

# Cache Configuration
REDIS_URL=redis://localhost:6379
CACHE_TTL=300

# Logging
LOG_LEVEL=info
LOG_FORMAT=json

# Session Configuration
SESSION_SECRET=your-session-secret
SESSION_TTL=86400
```

Step 4: Create the Dockerfile
Create a Dockerfile in the root directory:
```dockerfile
FROM node:18-alpine

# Set environment variables
ENV NODE_ENV=production \
    NPM_CONFIG_LOGLEVEL=warn \
    PORT=3000

# Install system dependencies
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    curl

# Create app directory
WORKDIR /app

# Create davis user
RUN addgroup -g 1000 davis && \
    adduser -D -u 1000 -G davis davis

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm ci --only=production && \
    npm cache clean --force

# Copy application files
COPY --chown=davis:davis . .

# Create necessary directories
RUN mkdir -p /app/data /app/logs && \
    chown -R davis:davis /app

# Switch to davis user
USER davis

# Expose port
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD node healthcheck.js || exit 1

# Start application
CMD ["node", "server.js"]
```

Step 5: Create Application Files
Create package.json:
```json
{
  "name": "davis-ai-assistant",
  "version": "1.0.0",
  "description": "AI-powered assistant for Dynatrace monitoring",
  "main": "server.js",
  "scripts": {
    "start": "node server.js",
    "dev": "nodemon server.js",
    "test": "jest",
    "lint": "eslint ."
  },
  "dependencies": {
    "express": "^4.18.2",
    "axios": "^1.6.0",
    "dotenv": "^16.3.1",
    "express-rate-limit": "^7.1.0",
    "helmet": "^7.1.0",
    "cors": "^2.8.5",
    "morgan": "^1.10.0",
    "joi": "^17.11.0",
    "natural": "^6.10.0",
    "compromise": "^14.10.0",
    "ioredis": "^5.3.2",
    "express-session": "^1.17.3",
    "winston": "^3.11.0",
    "@slack/bolt": "^3.15.0",
    "botbuilder": "^4.21.0"
  },
  "devDependencies": {
    "nodemon": "^3.0.1",
    "jest": "^29.7.0",
    "eslint": "^8.54.0"
  },
  "engines": {
    "node": ">=18.0.0",
    "npm": ">=9.0.0"
  }
}
```

Create server.js:
```javascript
require('dotenv').config();
const express = require('express');
const helmet = require('helmet');
const cors = require('cors');
const morgan = require('morgan');
const rateLimit = require('express-rate-limit');
const session = require('express-session');
const winston = require('winston');

// Import custom modules
const dynatraceClient = require('./lib/dynatrace-client');
const nlpProcessor = require('./lib/nlp-processor');
const conversationManager = require('./lib/conversation-manager');
const responseFormatter = require('./lib/response-formatter');
const slackIntegration = require('./integrations/slack');
const teamsIntegration = require('./integrations/teams');
const webhookHandler = require('./integrations/webhook');

// Initialize Express app
const app = express();
const PORT = process.env.PORT || 3000;

// Configure logger
const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.Console({ format: winston.format.simple() }),
    new winston.transports.File({ filename: '/app/logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: '/app/logs/combined.log' })
  ]
});

// Middleware
app.use(helmet());
app.use(cors());
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(morgan('combined', { stream: { write: message => logger.info(message.trim()) } }));

// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // limit each IP to 100 requests per windowMs
});
app.use('/api/', limiter);

// Session configuration
app.use(session({
  secret: process.env.SESSION_SECRET || 'your-secret-key',
  resave: false,
  saveUninitialized: false,
  cookie: {
    secure: process.env.NODE_ENV === 'production',
    maxAge: parseInt(process.env.SESSION_TTL) * 1000 || 86400000
  }
}));

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

// Query endpoint
app.post('/api/query', async (req, res) => {
  try {
    const { query, userId, sessionId } = req.body;

    if (!query) {
      return res.status(400).json({ error: 'Query is required' });
    }

    logger.info(`Processing query: ${query}`, { userId, sessionId });

    // Parse natural language query
    const intent = await nlpProcessor.parse(query);

    // Get conversation context
    const context = await conversationManager.getContext(userId, sessionId);

    // Build Dynatrace query
    const dynatraceQuery = await nlpProcessor.buildQuery(intent, context);

    // Execute query against Dynatrace
    const data = await dynatraceClient.query(dynatraceQuery);

    // Format response
    const response = await responseFormatter.format(data, intent);

    // Update conversation context
    await conversationManager.updateContext(userId, sessionId, {
      query,
      response,
      intent
    });

    res.json({
      answer: response.text,
      data: response.data,
      visualizations: response.charts,
      suggestions: response.suggestions
    });
  } catch (error) {
    logger.error('Error processing query:', error);
    res.status(500).json({
      error: 'Failed to process query',
      message: error.message
    });
  }
});

// Conversation history endpoint
app.get('/api/conversations/:userId', async (req, res) => {
  try {
    const { userId } = req.params;
    const history = await conversationManager.getHistory(userId);
    res.json({ conversations: history });
  } catch (error) {
    logger.error('Error fetching conversation history:', error);
    res.status(500).json({ error: 'Failed to fetch conversations' });
  }
});

// Initialize integrations
if (process.env.SLACK_ENABLED === 'true') {
  slackIntegration.initialize(app, logger);
  logger.info('Slack integration initialized');
}

if (process.env.TEAMS_ENABLED === 'true') {
  teamsIntegration.initialize(app, logger);
  logger.info('Teams integration initialized');
}

webhookHandler.initialize(app, logger);
logger.info('Webhook handler initialized');

// Error handling middleware
app.use((err, req, res, next) => {
  logger.error('Unhandled error:', err);
  res.status(500).json({
    error: 'Internal server error',
    message: process.env.NODE_ENV === 'development' ? err.message : undefined
  });
});

// Start server (keep a reference so SIGTERM can close it)
const server = app.listen(PORT, '0.0.0.0', () => {
  logger.info(`Davis AI Assistant listening on port ${PORT}`);
  logger.info(`Environment: ${process.env.NODE_ENV}`);
  logger.info(`Dynatrace Environment: ${process.env.DYNATRACE_ENVIRONMENT_ID}`);
});

// Graceful shutdown
process.on('SIGTERM', () => {
  logger.info('SIGTERM signal received: closing HTTP server');
  server.close(() => {
    logger.info('HTTP server closed');
    process.exit(0);
  });
});
```

Create healthcheck.js:
```javascript
const http = require('http');

const options = {
  host: 'localhost',
  port: process.env.PORT || 3000,
  path: '/health',
  timeout: 2000
};

const request = http.request(options, (res) => {
  if (res.statusCode === 200) {
    process.exit(0);
  } else {
    process.exit(1);
  }
});

request.on('error', () => {
  process.exit(1);
});

request.end();
```

Step 6: Create Library Modules
Create lib/dynatrace-client.js:
```javascript
const axios = require('axios');
const logger = require('winston');

class DynatraceClient {
  constructor() {
    this.apiUrl = process.env.DYNATRACE_API_URL;
    this.apiToken = process.env.DYNATRACE_API_TOKEN;
    this.environmentId = process.env.DYNATRACE_ENVIRONMENT_ID;

    this.client = axios.create({
      baseURL: this.apiUrl,
      headers: {
        'Authorization': `Api-Token ${this.apiToken}`,
        'Content-Type': 'application/json'
      },
      timeout: 30000
    });
  }

  async query(params) {
    try {
      const { endpoint, method = 'GET', data = null, queryParams = {} } = params;

      const response = await this.client.request({
        method,
        url: endpoint,
        data,
        params: queryParams
      });

      return response.data;
    } catch (error) {
      logger.error('Dynatrace API error:', error.message);
      throw new Error(`Failed to query Dynatrace: ${error.message}`);
    }
  }

  async getProblems(timeframe = 'now-2h') {
    return this.query({
      endpoint: '/problems',
      queryParams: {
        from: timeframe,
        fields: '+impactAnalysis,+rootCauseEntity'
      }
    });
  }

  async getMetrics(metricSelector, timeframe = 'now-1h') {
    return this.query({
      endpoint: '/metrics/query',
      queryParams: {
        metricSelector,
        from: timeframe,
        resolution: '1m'
      }
    });
  }

  async getEntities(entityType, fields = []) {
    return this.query({
      endpoint: `/entities/${entityType}`,
      queryParams: {
        fields: fields.join(',')
      }
    });
  }

  async getApplications() {
    return this.getEntities('applications', ['displayName', 'tags', 'entityId']);
  }

  async getHosts() {
    return this.getEntities('hosts', ['displayName', 'osType', 'tags']);
  }

  async getServices() {
    return this.getEntities('services', ['displayName', 'serviceType', 'tags']);
  }
}

module.exports = new DynatraceClient();
```

Create lib/nlp-processor.js:
```javascript
const natural = require('natural');
const compromise = require('compromise');

class NLPProcessor {
  constructor() {
    this.tokenizer = new natural.WordTokenizer();
    this.tfidf = new natural.TfIdf();

    // Intent patterns
    this.intents = {
      problemQuery: ['problem', 'issue', 'error', 'down', 'fail', 'crash'],
      metricsQuery: ['cpu', 'memory', 'response time', 'throughput', 'metric'],
      statusQuery: ['status', 'health', 'running', 'up', 'available'],
      listQuery: ['list', 'show', 'display', 'what are', 'get'],
      analyzeQuery: ['analyze', 'investigate', 'why', 'cause', 'root cause']
    };
  }

  async parse(query) {
    const doc = compromise(query);
    const tokens = this.tokenizer.tokenize(query.toLowerCase());

    // Extract intent
    const intent = this.extractIntent(tokens);

    // Extract entities
    const entities = {
      timeframe: this.extractTimeframe(doc, query),
      environment: this.extractEnvironment(tokens),
      entityType: this.extractEntityType(tokens),
      metric: this.extractMetric(tokens)
    };

    return { intent, entities, originalQuery: query };
  }

  extractIntent(tokens) {
    for (const [intentName, keywords] of Object.entries(this.intents)) {
      for (const keyword of keywords) {
        if (tokens.some(token => token.includes(keyword) || keyword.includes(token))) {
          return intentName;
        }
      }
    }
    return 'generalQuery';
  }

  extractTimeframe(doc, query) {
    // Extract time expressions
    const timeMatch = query.match(/(last|past)\s+(\d+)\s+(minute|hour|day|week)s?/i);
    if (timeMatch) {
      const value = parseInt(timeMatch[2]);
      const unit = timeMatch[3].toLowerCase();
      return `now-${value}${unit[0]}`;
    }

    // Default timeframe
    return 'now-1h';
  }

  extractEnvironment(tokens) {
    const environments = ['production', 'staging', 'development', 'prod', 'dev', 'stage'];
    for (const env of environments) {
      if (tokens.includes(env)) {
        return env;
      }
    }
    return null;
  }

  extractEntityType(tokens) {
    const entityTypes = {
      application: ['app', 'application'],
      service: ['service', 'api'],
      host: ['host', 'server', 'machine'],
      database: ['database', 'db']
    };

    for (const [type, keywords] of Object.entries(entityTypes)) {
      if (tokens.some(token => keywords.includes(token))) {
        return type;
      }
    }
    return null;
  }

  extractMetric(tokens) {
    const metrics = {
      'cpu': 'builtin:host.cpu.usage',
      'memory': 'builtin:host.mem.usage',
      'response': 'builtin:service.response.time',
      'throughput': 'builtin:service.requestCount.total',
      'errors': 'builtin:service.errors.total.count'
    };

    for (const [keyword, metricId] of Object.entries(metrics)) {
      if (tokens.some(token => token.includes(keyword))) {
        return metricId;
      }
    }
    return null;
  }

  async buildQuery(parsedIntent, context) {
    const { intent, entities } = parsedIntent;

    switch (intent) {
      case 'problemQuery':
        return {
          endpoint: '/problems',
          queryParams: {
            from: entities.timeframe,
            entitySelector: this.buildEntitySelector(entities)
          }
        };

      case 'metricsQuery':
        return {
          endpoint: '/metrics/query',
          queryParams: {
            metricSelector: entities.metric || 'builtin:host.cpu.usage',
            from: entities.timeframe,
            resolution: '1m'
          }
        };

      case 'statusQuery':
        return {
          endpoint: '/entities',
          queryParams: {
            entitySelector: this.buildEntitySelector(entities),
            fields: 'healthState,displayName'
          }
        };

      default:
        return {
          endpoint: '/entities',
          queryParams: {}
        };
    }
  }

  buildEntitySelector(entities) {
    const selectors = [];

    if (entities.entityType) {
      selectors.push(`type("${entities.entityType}")`);
    }

    if (entities.environment) {
      selectors.push(`tag("environment:${entities.environment}")`);
    }

    return selectors.join(',') || undefined;
  }
}

module.exports = new NLPProcessor();
```

Step 7: Create .dockerignore
Create a .dockerignore file:
```
node_modules
npm-debug.log
.env
.env.local
.git
.gitignore
*.md
README.md
.DS_Store
Thumbs.db
logs/
*.log
test/
tests/
.vscode/
.idea/
coverage/
```

Step 8: Create Documentation
Create README.md:
```markdown
# Davis AI Assistant Deployment

This repository contains a Davis AI Assistant deployment configured for Klutch.sh.

## Features

- Natural language queries for Dynatrace monitoring
- Real-time problem detection and analysis
- Performance metrics visualization
- Slack and Microsoft Teams integration
- Conversation history and context awareness
- Intelligent suggestions and recommendations

## Configuration

Set the following environment variables:

- `DYNATRACE_API_TOKEN`: Your Dynatrace API token
- `DYNATRACE_ENVIRONMENT_ID`: Your environment ID
- `DYNATRACE_API_URL`: Your Dynatrace API URL

## Example Queries

- "What problems occurred in the last hour?"
- "Show me CPU usage for production servers"
- "Are all services healthy?"
- "What's causing high response time?"
- "List all applications"

## Deployment

This application is configured to deploy on Klutch.sh with automatic Docker detection.
```

Step 9: Initialize Git Repository
```bash
git add .
git commit -m "Initial Davis AI Assistant setup for Klutch.sh deployment"
git branch -M master
git remote add origin https://github.com/yourusername/davis-deployment.git
git push -u origin master
```

Deploying to Klutch.sh
Now that your Davis application is configured, let’s deploy it to Klutch.sh.
1. Log in to Klutch.sh

   Navigate to klutch.sh/app and sign in with your GitHub account.

2. Create a New Project

   Click “New Project” and select “Import from GitHub”. Choose the repository containing your Davis deployment.

3. Configure Build Settings

   Klutch.sh automatically detects the Dockerfile in your repository and uses it to build your container.

4. Configure Traffic Settings

   Select “HTTP” as the traffic type. Davis serves its web interface and API on port 3000, and Klutch.sh will route HTTPS traffic to this port.

5. Set Environment Variables

   In the project settings, add the following environment variables:

   - `DYNATRACE_API_TOKEN`: Your Dynatrace API token (requires API v2 read permissions)
   - `DYNATRACE_ENVIRONMENT_ID`: Your environment ID (e.g., `abc12345`)
   - `DYNATRACE_API_URL`: `https://your-environment-id.live.dynatrace.com/api/v2`
   - `PORT`: `3000`
   - `NODE_ENV`: `production`
   - `SESSION_SECRET`: Generate using `openssl rand -hex 32`
   - `LOG_LEVEL`: `info`

   For Slack integration (optional):

   - `SLACK_ENABLED`: `true`
   - `SLACK_BOT_TOKEN`: Your Slack bot token (starts with `xoxb-`)
   - `SLACK_SIGNING_SECRET`: Your Slack signing secret
   - `SLACK_APP_TOKEN`: Your Slack app token (starts with `xapp-`)

6. Configure Persistent Storage

   Davis requires persistent storage for conversation history and cache:

   - Data Volume:
     - Mount path: `/app/data`
     - Size: 5GB
   - Logs Volume:
     - Mount path: `/app/logs`
     - Size: 2GB

   These volumes ensure your conversation history and logs persist across deployments.

7. Deploy the Application

   Click “Deploy” to start the build process. Klutch.sh will:

   - Clone your repository
   - Build the Docker image using your Dockerfile
   - Install Node.js dependencies
   - Deploy the container with Davis
   - Provision an HTTPS endpoint

   The build process typically takes 2-3 minutes.

8. Access Your Davis Instance

   Once deployment completes, Klutch.sh will provide a URL like `example-app.klutch.sh`. Your Davis AI assistant will be available at this URL.
Getting Started with Davis
Once your Davis instance is deployed, here’s how to use it:
Using the Web Interface
Navigate to Your Deployment
Visit your deployed URL (e.g., https://example-app.klutch.sh) to access the Davis web interface.
Ask Questions
Type natural language questions in the chat interface:
```
What problems occurred in the last 2 hours?
```

Response includes:
- List of detected problems
- Severity levels
- Affected entities
- Root cause analysis
- Recommended actions
Follow-Up Questions
Davis maintains conversation context:
```
You: Show me errors in production
Davis: [Lists production errors]
You: What about staging?
Davis: [Lists staging errors - understands context]
```

View Metrics
Request performance metrics:
```
Show me CPU usage for the last hour
```

Davis returns:
- Time series chart
- Current value
- Average, min, max values
- Trend analysis
- Anomaly detection
API Usage
Query Endpoint
Send natural language queries via API:
```bash
curl -X POST https://example-app.klutch.sh/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the response time for my API services?",
    "userId": "user123",
    "sessionId": "session456"
  }'
```

Response:
```json
{
  "answer": "The average response time for your API services in the last hour is 245ms. The checkout-service has the highest response time at 380ms.",
  "data": {
    "metrics": [
      { "service": "checkout-service", "responseTime": 380, "unit": "ms" },
      { "service": "auth-service", "responseTime": 150, "unit": "ms" }
    ]
  },
  "visualizations": [
    { "type": "line-chart", "data": "..." }
  ],
  "suggestions": [
    "Investigate high response time on checkout-service",
    "Check database query performance"
  ]
}
```

Health Check
Monitor Davis availability:
```bash
curl https://example-app.klutch.sh/health
```

Conversation History
Retrieve past conversations:
```bash
curl https://example-app.klutch.sh/api/conversations/user123
```

Example Queries
Problem Detection
```
What problems do I have right now?
Show me critical issues from yesterday
Are there any errors in production?
What went wrong with my application?
```

Performance Metrics
```
Show me CPU usage for production hosts
What's the memory consumption?
Display response time for all services
How many requests per minute am I getting?
Show me throughput trends for the last week
```

Status Checks
```
Are all my services healthy?
What's the status of my infrastructure?
Is everything running normally?
Which applications are down?
```

Specific Entity Queries
```
Show me metrics for the payment-service
What's happening with the database host?
Display errors for the checkout application
How is my frontend performing?
```

Time-Based Queries
```
Show me problems from the last 24 hours
What happened between 2pm and 3pm today?
Display yesterday's performance metrics
Show me last week's error rate
```

Root Cause Analysis
```
Why is my API slow?
What's causing high CPU usage?
Investigate the recent outage
Why are users experiencing errors?
What's the root cause of the problem?
```

Slack Integration
Configure Davis to work within your Slack workspace:
Step 1: Create Slack App
- Go to Slack API Apps
- Click “Create New App”
- Choose “From scratch”
- Name: “Davis AI Assistant”
- Select your workspace
- Click “Create App”
Step 2: Configure Bot Token
- Navigate to “OAuth & Permissions”
- Add these Bot Token Scopes:
```
chat:write
chat:write.public
commands
im:history
im:read
im:write
channels:history
channels:read
groups:history
groups:read
```
- Install app to workspace
- Copy “Bot User OAuth Token” (starts with `xoxb-`)
Step 3: Enable Socket Mode
- Navigate to “Socket Mode”
- Enable Socket Mode
- Generate App-Level Token with `connections:write` scope
- Copy token (starts with `xapp-`)
Step 4: Configure Slash Command
- Navigate to “Slash Commands”
- Create new command:
  - Command: `/davis`
  - Request URL: `https://example-app.klutch.sh/slack/events`
  - Short Description: “Ask Davis about monitoring”
  - Usage Hint: `[your question]`
Step 5: Update Environment Variables
Add to Klutch.sh environment variables:
```bash
SLACK_ENABLED=true
SLACK_BOT_TOKEN=xoxb-your-token
SLACK_SIGNING_SECRET=your-signing-secret
SLACK_APP_TOKEN=xapp-your-token
```

Step 6: Use in Slack
In any Slack channel:
```
/davis What problems occurred in the last hour?
```

Or direct message @Davis:
```
@Davis show me CPU usage
```

Microsoft Teams Integration
Set up Davis for Microsoft Teams:
Step 1: Register Bot
- Go to Bot Framework
- Create new bot registration
- Name: “Davis AI Assistant”
- Messaging endpoint: `https://example-app.klutch.sh/api/teams/messages`
- Copy App ID and generate App Password
Step 2: Configure Teams App
Create manifest.json for Teams app package:
```json
{
  "$schema": "https://developer.microsoft.com/json-schemas/teams/v1.16/MicrosoftTeams.schema.json",
  "manifestVersion": "1.16",
  "version": "1.0.0",
  "id": "your-app-id",
  "packageName": "com.davis.assistant",
  "developer": {
    "name": "Your Company",
    "websiteUrl": "https://example-app.klutch.sh",
    "privacyUrl": "https://example-app.klutch.sh/privacy",
    "termsOfUseUrl": "https://example-app.klutch.sh/terms"
  },
  "name": {
    "short": "Davis",
    "full": "Davis AI Monitoring Assistant"
  },
  "description": {
    "short": "AI assistant for Dynatrace monitoring",
    "full": "Natural language interface for monitoring and observability"
  },
  "icons": {
    "outline": "outline.png",
    "color": "color.png"
  },
  "accentColor": "#1F8FE8",
  "bots": [
    {
      "botId": "your-app-id",
      "scopes": ["personal", "team"],
      "supportsFiles": false,
      "isNotificationOnly": false
    }
  ],
  "permissions": ["identity", "messageTeamMembers"],
  "validDomains": ["example-app.klutch.sh"]
}
```

Step 3: Update Environment Variables
```bash
TEAMS_ENABLED=true
TEAMS_APP_ID=your-app-id
TEAMS_APP_PASSWORD=your-app-password
```

Step 4: Deploy to Teams
- Package manifest.json with icon files into zip
- Upload to Teams app catalog
- Install in your Teams workspace
Step 5: Use in Teams
Chat with Davis bot or mention in channels:
@Davis what's the status of my services?Advanced Configuration
Custom Query Templates
Create custom response templates in templates/responses.json:
```json
{
  "problemSummary": {
    "template": "Found {{count}} problem(s):\n{{#each problems}}- {{title}} ({{severity}})\n{{/each}}",
    "includeCharts": true,
    "suggestions": [
      "View problem details",
      "Check affected entities",
      "See root cause analysis"
    ]
  },
  "metricsSummary": {
    "template": "{{metricName}}: {{currentValue}}{{unit}}\nAverage: {{average}}{{unit}}\nTrend: {{trend}}",
    "includeCharts": true,
    "timeframes": ["1h", "24h", "7d"]
  }
}
```

Caching with Redis
For improved performance, integrate Redis caching:
Update Dockerfile:
```dockerfile
# Add Redis client (already in package.json dependencies)
# Configure Redis connection in config.json
```

Update environment variables:
```bash
REDIS_URL=redis://your-redis-host:6379
CACHE_TTL=300
```

Authentication and Authorization
Implement user authentication:
Create lib/auth-middleware.js:
```javascript
const jwt = require('jsonwebtoken');

function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];

  if (!token) {
    return res.status(401).json({ error: 'Authentication required' });
  }

  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET);
    req.user = decoded;
    next();
  } catch (error) {
    res.status(401).json({ error: 'Invalid token' });
  }
}

module.exports = { authenticate };
```

Note that `jsonwebtoken` is not in the package.json shown earlier; add it to your dependencies, and set a `JWT_SECRET` environment variable.

Apply to protected routes:
```javascript
const { authenticate } = require('./lib/auth-middleware');

app.post('/api/query', authenticate, async (req, res) => {
  // Query handling
});
```

Custom Dynatrace Metrics
Query custom metrics from Dynatrace:
```javascript
// In lib/dynatrace-client.js
async getCustomMetric(metricKey, entitySelector) {
  return this.query({
    endpoint: '/metrics/query',
    queryParams: {
      metricSelector: `ext:${metricKey}`,
      entitySelector,
      resolution: '1m'
    }
  });
}
```

Use in queries:
```
Show me custom metric shopify.orders.total for the last hour
```

Webhook Notifications
Configure webhooks for proactive notifications:
```javascript
// In integrations/webhook.js
async function sendNotification(problem) {
  const webhookUrl = process.env.WEBHOOK_URL;

  await axios.post(webhookUrl, {
    text: `New problem detected: ${problem.title}`,
    severity: problem.severity,
    affectedEntities: problem.impactedEntities,
    rootCause: problem.rootCause,
    link: `${process.env.DYNATRACE_URL}/ui/problems/${problem.id}`
  });
}
```

Multi-Environment Support
Support multiple Dynatrace environments:
const environments = { production: { apiUrl: process.env.PROD_DYNATRACE_API_URL, apiToken: process.env.PROD_DYNATRACE_API_TOKEN }, staging: { apiUrl: process.env.STAGING_DYNATRACE_API_URL, apiToken: process.env.STAGING_DYNATRACE_API_TOKEN }};
// Select environment based on query contextconst env = environments[parsedIntent.entities.environment] || environments.production;Production Best Practices
Follow these recommendations for running Davis in production:
Security
API Token Security
Never commit API tokens to version control:
```bash
# Use environment variables
DYNATRACE_API_TOKEN=your-token

# Rotate tokens regularly
# Use tokens with minimal required permissions
```

Dynatrace API Permissions
Create a token with only the necessary scopes:
- Read entities
- Read metrics
- Read problems
- Read logs (if using log analytics)
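As a sanity check, you can verify that a token's scopes cover Davis's needs before starting the service. This is an illustrative helper, not part of Davis itself; the scope identifiers follow Dynatrace API v2 naming conventions but should be confirmed against your environment:

```javascript
// Minimal read scopes Davis needs (scope names assumed per Dynatrace API v2;
// verify against your environment's token documentation)
const REQUIRED_SCOPES = ['entities.read', 'metrics.read', 'problems.read'];

// Return the scopes a token is missing; an empty array means it is sufficient
function missingScopes(tokenScopes) {
  return REQUIRED_SCOPES.filter(scope => !tokenScopes.includes(scope));
}
```

Failing fast on a misconfigured token at startup is far easier to debug than scattered 403 responses at query time.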
Rate Limiting
Protect your API from abuse:
```js
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,
  message: 'Too many requests from this IP'
});

// Apply to API routes
app.use('/api/', limiter);
```

HTTPS Only
Klutch.sh provides automatic HTTPS. Ensure all webhook URLs and API endpoints use HTTPS.
Input Validation
Validate all user inputs:
```js
const Joi = require('joi');

const querySchema = Joi.object({
  query: Joi.string().min(3).max(500).required(),
  userId: Joi.string().required(),
  sessionId: Joi.string().optional()
});
```

Performance Optimization
Response Caching
Cache frequent queries:
```js
const CACHE_TTL = 5 * 60 * 1000; // 5 minutes
const cache = new Map();

async function getCachedResponse(query) {
  const cacheKey = `query:${query}`;
  const cached = cache.get(cacheKey);

  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }

  const data = await executeQuery(query);
  cache.set(cacheKey, { data, timestamp: Date.now() });
  return data;
}
```

Connection Pooling
Reuse Dynatrace API connections:
```js
const https = require('https');
const axios = require('axios');

const agent = new https.Agent({ keepAlive: true });

const client = axios.create({
  httpsAgent: agent,
  timeout: 30000
});
```

Async Processing
Handle long-running queries asynchronously:
```js
app.post('/api/query/async', async (req, res) => {
  const queryId = generateId();

  // Start processing in background
  processQuery(req.body.query, queryId);

  // Return immediately
  res.json({ queryId, status: 'processing' });
});

app.get('/api/query/:queryId/status', (req, res) => {
  const status = getQueryStatus(req.params.queryId);
  res.json(status);
});
```

Monitoring
Application Metrics
Track Davis performance:
```js
const metrics = {
  queriesProcessed: 0,
  averageResponseTime: 0,
  errorRate: 0
};

// Update metrics
function recordQuery(duration, success) {
  metrics.queriesProcessed++;
  metrics.averageResponseTime =
    (metrics.averageResponseTime * (metrics.queriesProcessed - 1) + duration) /
    metrics.queriesProcessed;
  if (!success) metrics.errorRate++;
}

// Expose metrics endpoint
app.get('/metrics', (req, res) => {
  res.json(metrics);
});
```

Health Checks
Comprehensive health monitoring:
```js
app.get('/health/detailed', async (req, res) => {
  const health = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    checks: {
      dynatrace: await checkDynatraceConnection(),
      database: await checkDatabaseConnection(),
      redis: await checkRedisConnection()
    }
  };

  const allHealthy = Object.values(health.checks).every(c => c.status === 'ok');
  res.status(allHealthy ? 200 : 503).json(health);
});
```

Error Tracking
Log and track errors:
```js
const winston = require('winston');

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [new winston.transports.Console()]
});

logger.error('Query processing failed', {
  query: req.body.query,
  userId: req.body.userId,
  error: error.message,
  stack: error.stack
});
```

Scaling Considerations
Horizontal Scaling
Davis can be scaled horizontally. Deploy multiple instances behind a load balancer:
- Session state stored in Redis or database
- Stateless API design
- Shared cache layer
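For example, session state can be externalized so that any instance can resume any conversation. The sketch below uses an in-memory Map as a stand-in for Redis or a database; the interface and key naming are illustrative assumptions, not Davis's actual storage layer:

```javascript
// Sketch: conversation/session state kept in a shared store so any Davis
// instance behind the load balancer can serve a follow-up query.
// `store` would be a Redis client in production; a Map stands in here.
function createSessionStore(store = new Map()) {
  return {
    save(sessionId, context) {
      // Serialize so the same code works against Redis string values
      store.set(`session:${sessionId}`, JSON.stringify(context));
    },
    load(sessionId) {
      const raw = store.get(`session:${sessionId}`);
      return raw ? JSON.parse(raw) : null;
    }
  };
}
```

With a real Redis client you would also set a TTL on each key so abandoned conversations expire on their own.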
Resource Allocation
Typical resource requirements:
- CPU: 0.5-1 core for up to 100 queries/minute
- Memory: 512MB-1GB
- Storage: 5GB for conversation history and cache
- Network: Depends on query complexity and response size
Load Balancing
Use a round-robin or least-connections algorithm to distribute traffic across Davis instances.
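As an illustration, round-robin selection is just a rotating index over the instance list. Klutch.sh's load balancer handles this for you; the sketch (with made-up instance URLs) only shows the idea:

```javascript
// Minimal round-robin selector over Davis instance URLs (illustrative only)
function createRoundRobin(instances) {
  let index = 0;
  return function next() {
    const instance = instances[index % instances.length];
    index++;
    return instance;
  };
}
```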
Troubleshooting
Connection Issues
Problem: Cannot connect to Dynatrace API
Solutions:
- Verify API token is valid and has correct permissions
- Check API URL format: https://{environment-id}.live.dynatrace.com/api/v2
- Ensure network connectivity from container
- Verify no firewall blocking outbound HTTPS
- Test with curl:

```bash
curl -H "Authorization: Api-Token YOUR_TOKEN" \
  https://your-env.live.dynatrace.com/api/v2/entities
```
Problem: Webhook not receiving events
Solutions:
- Verify webhook URL is publicly accessible
- Check webhook secret matches configuration
- Review webhook logs for incoming requests
- Test webhook endpoint with curl
- Ensure HTTPS endpoint (some services require HTTPS)
Query Issues
Problem: Davis doesn’t understand queries
Solutions:
- Simplify query language
- Use explicit entity names from Dynatrace
- Include timeframes explicitly
- Check NLP logs for parsing results
- Add custom intent patterns for domain-specific queries
- Review and improve NLP training data
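One of the bullets above suggests adding custom intent patterns. A minimal version of such a pattern table might look like the following; the structure, intent names, and selectors are illustrative assumptions, not Davis's actual NLP configuration format:

```javascript
// Hypothetical custom intent patterns for domain-specific queries:
// each entry maps a regex to the Dynatrace selectors a handler would use.
const customIntents = [
  {
    name: 'checkout_latency',
    pattern: /checkout (latency|response time)/i,
    metricSelector: 'builtin:service.response.time',        // assumed metric key
    entitySelector: 'type(SERVICE),entityName("checkout")'  // assumed entity name
  }
];

// Return the first matching intent, or null if nothing matches
function matchIntent(query) {
  return customIntents.find(intent => intent.pattern.test(query)) || null;
}
```

Explicit patterns like this catch domain phrasing ("checkout latency") that a general-purpose NLP model may parse inconsistently.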
Problem: Incorrect or empty responses
Solutions:
- Verify Dynatrace data exists for query timeframe
- Check entity selectors are correct
- Review Dynatrace API response in logs
- Ensure metric keys match Dynatrace schema
- Validate query parameters being sent to API
Integration Issues
Problem: Slack integration not working
Solutions:
- Verify bot token starts with xoxb-
- Check signing secret is correct
- Ensure Socket Mode is enabled
- Verify app is installed in workspace
- Review Slack app event subscriptions
- Check bot has necessary permissions
- Test slash command configuration
Problem: Teams bot not responding
Solutions:
- Verify messaging endpoint is accessible
- Check app ID and password are correct
- Ensure bot is registered in Bot Framework
- Review Teams app manifest configuration
- Test bot endpoint with Bot Framework Emulator
- Check bot is added to Teams workspace
Performance Issues
Problem: Slow query responses
Solutions:
- Implement caching for frequent queries
- Reduce Dynatrace API query complexity
- Optimize NLP processing
- Increase container resources
- Use connection pooling for API calls
- Monitor Dynatrace API response times
- Consider async query processing for complex requests
Problem: High memory usage
Solutions:
- Clear conversation history cache periodically
- Reduce cache size limits
- Monitor for memory leaks in NLP processing
- Increase container memory limits
- Implement memory-efficient data structures
- Review and optimize conversation context storage
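The cache-size suggestion above can be sketched as a bounded cache that evicts its oldest entry once a cap is reached (a Map preserves insertion order, so the first key is the oldest). The cap and the simple FIFO eviction policy here are illustrative choices:

```javascript
// Size-bounded cache: evict the oldest entry when the cap is reached.
// MAX_ENTRIES is an illustrative limit; tune it to your memory budget.
const MAX_ENTRIES = 1000;
const cache = new Map();

function cacheSet(key, value) {
  if (cache.size >= MAX_ENTRIES) {
    const oldestKey = cache.keys().next().value; // Maps iterate in insertion order
    cache.delete(oldestKey);
  }
  cache.set(key, value);
}
```

For true LRU behavior (evict least recently *used*, not least recently *inserted*), a library such as lru-cache is a more robust choice.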
Data Issues
Problem: Conversation history not persisting
Solutions:
- Verify persistent volume is mounted at /app/data
- Check directory permissions
- Ensure adequate storage space
- Test write access to data directory
- Review session storage configuration
- Check Redis connection if using external cache
Problem: Missing metrics or entities
Solutions:
- Verify entity exists in Dynatrace
- Check timeframe includes relevant data
- Ensure API token has read access to entity type
- Review entity selector syntax
- Test query directly against Dynatrace API
- Check for entity naming mismatches
Additional Resources
- Dynatrace API Documentation
- Dynatrace API Authentication
- Dynatrace Entities API
- Dynatrace Metrics API
- Slack Bot Documentation
- Microsoft Teams Bot Documentation
- Klutch.sh Documentation
- Persistent Volumes Guide
Conclusion
Davis transforms how teams interact with monitoring data by bringing AI-powered natural language interfaces to Dynatrace. Instead of learning complex query languages or navigating intricate dashboards, team members can simply ask questions and receive intelligent, context-aware answers. This democratizes access to observability data, making it available to everyone from developers to product managers.
Deploying Davis on Klutch.sh gives you the infrastructure to run this AI assistant without managing servers or worrying about scaling. The integration with collaboration tools like Slack and Teams brings monitoring insights directly into your team’s daily workflow, enabling faster incident response and better understanding of system health. Whether you’re troubleshooting production issues, analyzing performance trends, or conducting post-mortems, Davis provides instant access to the data you need.
Start having conversations with your monitoring data today and experience the power of AI-assisted observability.