
Deploying a Puppeteer App

Puppeteer is a powerful Node.js library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. It enables developers to automate web scraping, performance testing, rendering, form submission, testing Chrome Extensions, and generating screenshots and PDFs of web pages. With Puppeteer, you can build sophisticated browser automation workflows that interact with modern web applications in a scriptable, reliable manner. It’s trusted by developers worldwide for mission-critical automation tasks.

This comprehensive guide walks you through deploying a Puppeteer application to Klutch.sh, covering both automatic Nixpacks-based deployments and Docker-based deployments. You’ll learn installation steps, explore sample code, configure environment variables, and discover best practices for production deployments.

Table of Contents

  • Prerequisites
  • Getting Started: Install Puppeteer
  • Sample Code Examples
  • Project Structure
  • Deploying Without a Dockerfile (Nixpacks)
  • Deploying With a Dockerfile
  • Environment Variables & Configuration
  • Browser Automation Patterns
  • Performance & Resource Management
  • Troubleshooting
  • Resources

Prerequisites

To deploy a Puppeteer application on Klutch.sh, ensure you have:

  • Node.js 18 or higher - Puppeteer requires a modern Node.js version
  • npm or yarn - For managing dependencies
  • Git - For version control
  • GitHub account - Klutch.sh integrates with GitHub for continuous deployments
  • Klutch.sh account - Sign up for free

Getting Started: Install Puppeteer

Create a New Puppeteer Project

Follow these steps to create and set up a new Puppeteer application:

  1. Create a new directory for your project and initialize npm:
    mkdir my-puppeteer-app
    cd my-puppeteer-app
    npm init -y
  2. Install Puppeteer and development dependencies:
    npm install puppeteer express
    npm install --save-dev nodemon

    We’re including Express for a simple API server to wrap Puppeteer functionality.

  3. Create a basic Puppeteer server. Create a file called `index.js`:
    const express = require('express');
    const puppeteer = require('puppeteer');
    const app = express();
    app.use(express.json());
    const PORT = process.env.PORT || 3000;
    // Liveness endpoint used by health checks
    app.get('/health', (req, res) => {
      res.json({ status: 'healthy', uptime: process.uptime() });
    });
    // Render the given URL and return a base64-encoded PNG
    app.post('/api/screenshot', async (req, res) => {
      const { url } = req.body;
      if (!url) {
        return res.status(400).json({ error: 'URL is required' });
      }
      let browser;
      try {
        browser = await puppeteer.launch({ args: ['--no-sandbox'] });
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: 'networkidle2' });
        const screenshot = await page.screenshot({ encoding: 'base64' });
        res.json({ success: true, screenshot });
      } catch (error) {
        res.status(500).json({ error: error.message });
      } finally {
        // Close the browser even when navigation or capture fails
        if (browser) {
          await browser.close();
        }
      }
    });
    app.listen(PORT, () => {
      console.log(`Puppeteer server running on port ${PORT}`);
    });
  4. Update your `package.json` with startup scripts:
    {
    "name": "my-puppeteer-app",
    "version": "1.0.0",
    "description": "Puppeteer app on Klutch.sh",
    "main": "index.js",
    "scripts": {
    "start": "node index.js",
    "dev": "nodemon index.js"
    },
    "dependencies": {
    "puppeteer": "^21.0.0",
    "express": "^4.18.0"
    },
    "devDependencies": {
    "nodemon": "^3.0.1"
    }
    }
  5. Test your app locally:
    npm run dev

    Visit http://localhost:3000/health to verify the server is running.
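
To exercise the screenshot endpoint from step 3, a small client can post a URL and write the returned base64 string to disk. The sketch below assumes Node.js 18+ (for the global `fetch`) and the server running locally on port 3000. `decodeScreenshot`, `requestScreenshot`, and the default base URL are illustrative names and values, not part of Puppeteer or Express.

```javascript
const fs = require('node:fs/promises');

// Convert the base64 string returned by /api/screenshot into a Buffer
// and check that it starts with the 8-byte PNG signature.
function decodeScreenshot(base64) {
  const buffer = Buffer.from(base64, 'base64');
  const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (buffer.length < 8 || !buffer.subarray(0, 8).equals(pngSignature)) {
    throw new Error('Response does not look like a PNG image');
  }
  return buffer;
}

// POST a URL to the server and save the screenshot to outputPath.
// baseUrl is an assumption for this example (the local dev server).
async function requestScreenshot(targetUrl, outputPath, baseUrl = 'http://localhost:3000') {
  const response = await fetch(`${baseUrl}/api/screenshot`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url: targetUrl })
  });
  const body = await response.json();
  if (!response.ok) {
    throw new Error(body.error || `Request failed with status ${response.status}`);
  }
  await fs.writeFile(outputPath, decodeScreenshot(body.screenshot));
  return outputPath;
}
```

With the dev server running, `requestScreenshot('https://example.com', 'example.png')` should leave a PNG file in the current directory.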


Sample Code Examples

Basic Web Scraping with Puppeteer

Here’s a complete example for scraping web content:

scraper.js
const puppeteer = require('puppeteer');
async function scrapeWebsite(url) {
let browser;
try {
browser = await puppeteer.launch({
headless: true,
args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage();
await page.goto(url, { waitUntil: 'networkidle2' });
const data = await page.evaluate(() => {
return {
title: document.title,
url: document.URL,
headings: Array.from(document.querySelectorAll('h1, h2'))
.map(h => h.textContent),
links: Array.from(document.querySelectorAll('a'))
.map(a => ({ text: a.textContent, href: a.href }))
.slice(0, 10)
};
});
return data;
} finally {
if (browser) {
await browser.close();
}
}
}
module.exports = scrapeWebsite;
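
The `links` array returned by `scrapeWebsite` holds raw `textContent`, which often carries extra whitespace, and the same target can appear more than once. A minimal post-processing sketch, assuming the result shape shown above (`normalizeLinks` is a hypothetical helper):

```javascript
// Trim link text, drop non-HTTP(S) targets, and remove duplicate hrefs.
// Input shape matches the `links` array from scrapeWebsite: [{ text, href }].
function normalizeLinks(links) {
  const seen = new Set();
  const result = [];
  for (const { text, href } of links) {
    let parsed;
    try {
      parsed = new URL(href);
    } catch {
      continue; // skip empty or malformed hrefs
    }
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') continue;
    if (seen.has(parsed.href)) continue;
    seen.add(parsed.href);
    result.push({
      text: (text || '').replace(/\s+/g, ' ').trim(),
      href: parsed.href
    });
  }
  return result;
}
```

Running the cleanup in Node.js rather than inside `page.evaluate` keeps the in-page function small and makes the logic testable without a browser.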

PDF Generation Service

pdf-service.js
const puppeteer = require('puppeteer');
const path = require('path');
async function generatePDF(htmlContent, outputPath) {
let browser;
try {
browser = await puppeteer.launch({
headless: true,
args: ['--no-sandbox']
});
const page = await browser.newPage();
await page.setContent(htmlContent, { waitUntil: 'networkidle2' });
await page.pdf({
path: outputPath,
format: 'A4',
margin: {
top: '20mm',
right: '20mm',
bottom: '20mm',
left: '20mm'
}
});
return { success: true, path: outputPath };
} catch (error) {
throw new Error(`PDF generation failed: ${error.message}`);
} finally {
if (browser) {
await browser.close();
}
}
}
module.exports = generatePDF;

Performance Testing with Puppeteer

performance-test.js
const puppeteer = require('puppeteer');
async function testPagePerformance(url) {
let browser;
try {
browser = await puppeteer.launch({
headless: true,
args: ['--no-sandbox']
});
const page = await browser.newPage();
// Measure page load metrics
await page.goto(url, { waitUntil: 'networkidle2' });
const metrics = await page.metrics();
const performanceMetrics = await page.evaluate(() => {
const navigation = performance.getEntriesByType('navigation')[0];
return {
domContentLoaded: navigation.domContentLoadedEventEnd - navigation.domContentLoadedEventStart,
loadComplete: navigation.loadEventEnd - navigation.loadEventStart,
timeToFirstByte: navigation.responseStart - navigation.requestStart
};
});
return {
jsHeapSize: metrics.JSHeapUsedSize,
jsHeapTotal: metrics.JSHeapTotalSize, // total allocated JS heap in bytes, not the heap limit
performanceMetrics
};
} finally {
if (browser) {
await browser.close();
}
}
}
module.exports = testPagePerformance;
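
The timings returned by `testPagePerformance` become more useful when compared against a budget. A sketch, assuming the `performanceMetrics` shape shown above; the `checkBudget` name and the threshold values are illustrative only:

```javascript
// Compare measured timings (milliseconds) against per-metric budgets.
// Returns the list of metrics that exceeded their budget.
function checkBudget(performanceMetrics, budget) {
  const violations = [];
  for (const [name, limit] of Object.entries(budget)) {
    const actual = performanceMetrics[name];
    if (typeof actual === 'number' && actual > limit) {
      violations.push({ name, actual, limit });
    }
  }
  return violations;
}

// Example budget; tune the numbers for your own pages.
const exampleBudget = { timeToFirstByte: 500, loadComplete: 100 };
```

An empty array means every measured metric stayed within its budget; metrics missing from the measurement are ignored rather than reported.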

Project Structure

A typical Puppeteer project has this structure:

my-puppeteer-app/
├── node_modules/
├── src/
│ ├── scrapers/
│ │ ├── website.js
│ │ └── ecommerce.js
│ ├── services/
│ │ ├── pdf-generator.js
│ │ ├── screenshot-service.js
│ │ └── performance-tester.js
│ └── middleware/
│ └── errorHandler.js
├── output/
│ ├── screenshots/
│ └── pdfs/
├── .env
├── .gitignore
├── index.js
├── package.json
└── package-lock.json

Deploying Without a Dockerfile (Nixpacks)

Klutch.sh uses Nixpacks to automatically detect and build your Puppeteer application. This is the simplest deployment option that requires no additional configuration files.

  1. Test your Puppeteer app locally to ensure it works correctly:
    npm start
  2. Push your Puppeteer application to a GitHub repository with all your source code, `package.json`, and `package-lock.json` files.
  3. Log in to your Klutch.sh dashboard.
  4. Create a new project and give it a name (e.g., "My Puppeteer App").
  5. Create a new app with the following configuration:
    • Repository - Select your Puppeteer GitHub repository and the branch to deploy
    • Traffic Type - Select HTTP (for web applications serving HTTP traffic)
    • Internal Port - Set to 3000 (the port the Express server in this guide listens on by default; Puppeteer itself has no default port)
    • Region - Choose your preferred region for deployment
    • Compute - Select the appropriate compute resource size (Puppeteer requires more memory than typical Node apps)
    • Instances - Choose how many instances to run (start with 1 for testing)
    • Environment Variables - Add any environment variables your app needs (API keys, URLs to scrape, etc.)

    If you need to customize the start command or build process, you can set Nixpacks environment variables:

    • START_COMMAND: Override the default start command (e.g., node index.js)
    • BUILD_COMMAND: Override the default build command
  6. Click "Create" to deploy. Klutch.sh will automatically detect your Node.js project, install dependencies, and start your Puppeteer application.
  7. Once deployed, your app will be available at a URL like `example-app.klutch.sh`. Test it by visiting the URL in your browser and checking the health endpoint at `/health`.
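
After a deployment, the health endpoint can also be checked from a script instead of a browser. A minimal sketch, assuming Node.js 18+ (global `fetch` and `AbortSignal.timeout`) and the `/health` response shape from `index.js`; `checkHealth` is a hypothetical name and the URL in the usage note is a placeholder:

```javascript
// Request <baseUrl>/health and report whether the app answered as healthy.
async function checkHealth(baseUrl, timeoutMs = 5000) {
  try {
    const response = await fetch(new URL('/health', baseUrl), {
      signal: AbortSignal.timeout(timeoutMs)
    });
    if (!response.ok) {
      return { healthy: false, reason: `HTTP ${response.status}` };
    }
    const body = await response.json();
    return { healthy: body.status === 'healthy', uptime: body.uptime };
  } catch (error) {
    // Network errors and timeouts end up here
    return { healthy: false, reason: error.message };
  }
}
```

Calling `checkHealth('https://example-app.klutch.sh')` resolves to an object instead of throwing, which makes it easy to use in a post-deploy script.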

Deploying With a Dockerfile

If you prefer more control over the build and runtime environment, you can use a Dockerfile. Klutch.sh will automatically detect and use any Dockerfile in your repository’s root directory. This is especially important for Puppeteer to ensure all browser dependencies are installed.

  1. Create a `Dockerfile` in your project root:
    # Multi-stage build for optimized Puppeteer deployment
    FROM node:18-bullseye-slim AS builder
    WORKDIR /app
    # Skip Puppeteer's browser download; the runtime stage uses the system Chromium.
    # The variable name differs across Puppeteer versions, so both are set.
    ENV PUPPETEER_SKIP_DOWNLOAD=true \
        PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
    # Copy package files
    COPY package*.json ./
    # Install production dependencies only
    RUN npm ci --omit=dev
    # Production stage with browser dependencies
    FROM node:18-bullseye-slim
    WORKDIR /app
    # Install Chromium and its runtime dependencies
    RUN apt-get update && apt-get install -y \
        chromium \
        ca-certificates \
        fonts-liberation \
        libasound2 \
        libatk-bridge2.0-0 \
        libatk1.0-0 \
        libcups2 \
        libdrm2 \
        libgbm1 \
        libgtk-3-0 \
        libnspr4 \
        libnss3 \
        libx11-xcb1 \
        libxcomposite1 \
        libxdamage1 \
        libxrandr2 \
        xdg-utils \
        wget \
        --no-install-recommends \
        && rm -rf /var/lib/apt/lists/*
    # Copy dependencies from builder
    COPY --from=builder /app/node_modules ./node_modules
    COPY --from=builder /app/package*.json ./
    # Copy application code (node_modules is excluded by .dockerignore)
    COPY . .
    # Set environment variables; point Puppeteer at the system Chromium
    ENV NODE_ENV=production \
        PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium
    # Expose port
    EXPOSE 3000
    # Health check: exit 0 on HTTP 200, exit 1 on any other status or connection error
    HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
        CMD node -e "require('http').get('http://localhost:3000/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"
    # Start the application
    CMD ["npm", "start"]
  2. Create a `.dockerignore` file to exclude unnecessary files from the Docker build:
    node_modules
    npm-debug.log
    .git
    .gitignore
    README.md
    .env
    .env.local
    .vscode
    .idea
    .DS_Store
    output
    screenshots
    pdfs
  3. Push your code (with Dockerfile and .dockerignore) to GitHub.
  4. Follow the same deployment steps as the Nixpacks method:
    • Log in to Klutch.sh
    • Create a new project
    • Create a new app pointing to your GitHub repository
    • Set the traffic type to HTTP and internal port to 3000
    • Add any required environment variables
    • Click “Create”

    Klutch.sh will automatically detect your Dockerfile and use it to build and deploy your application.

  5. Your deployed app will be available at `example-app.klutch.sh` once the build and deployment complete.

Environment Variables & Configuration

Puppeteer applications use environment variables for configuration. Set these in the Klutch.sh dashboard during app creation or update them afterward.

Common Environment Variables

# Server configuration
PORT=3000
NODE_ENV=production
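# Browser configuration
# Skip the bundled browser download (older Puppeteer releases use PUPPETEER_SKIP_CHROMIUM_DOWNLOAD)
PUPPETEER_SKIP_DOWNLOAD=true
# Path to a system-installed Chromium; the binary must exist in the runtime image
PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium
# Application-level timeout in milliseconds, read by config.js (not a Puppeteer built-in)
PUPPETEER_TIMEOUT=30000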
# Application settings
LOG_LEVEL=info
MAX_WORKERS=2
QUEUE_SIZE=100
# URLs to process
TARGET_URL=https://example.com
API_BASE_URL=https://api.example.com
# API Keys and Secrets
API_KEY=your_api_key_here
SECRET_KEY=your_secret_key_here
# Output configuration
OUTPUT_DIR=/tmp/output
SCREENSHOT_FORMAT=png
PDF_FORMAT=A4

Using Environment Variables in Your App

config.js
module.exports = {
port: process.env.PORT || 3000,
environment: process.env.NODE_ENV || 'development',
puppeteer: {
timeout: parseInt(process.env.PUPPETEER_TIMEOUT || '30000', 10),
headless: true,
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
`--user-data-dir=${process.env.USER_DATA_DIR || '/tmp/chrome'}`
]
},
maxWorkers: parseInt(process.env.MAX_WORKERS || '2', 10),
queueSize: parseInt(process.env.QUEUE_SIZE || '100', 10),
targetUrl: process.env.TARGET_URL,
logLevel: process.env.LOG_LEVEL || 'info'
};
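
`parseInt` returns `NaN` when a variable holds a non-numeric value, and the config above would pass that through silently. A sketch of a stricter reader, assuming the variable names from the list above; `readIntEnv` and `loadLimits` are hypothetical helpers:

```javascript
// Read an integer environment variable, falling back to a default
// and failing fast when the value is present but not a positive integer.
function readIntEnv(env, name, defaultValue) {
  const raw = env[name];
  if (raw === undefined || raw === '') {
    return defaultValue;
  }
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Passing `env` explicitly keeps the function easy to test.
function loadLimits(env = process.env) {
  return {
    timeout: readIntEnv(env, 'PUPPETEER_TIMEOUT', 30000),
    maxWorkers: readIntEnv(env, 'MAX_WORKERS', 2),
    queueSize: readIntEnv(env, 'QUEUE_SIZE', 100)
  };
}
```

Failing at startup on a bad value is usually easier to diagnose than a timeout of `NaN` surfacing later inside a Puppeteer call.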

Customizing Build and Start Commands with Nixpacks

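If you deploy with Nixpacks (no Dockerfile), you can customize the build and start commands by setting environment variables. Set BUILD_COMMAND only if your `package.json` defines a build script; the sample project in this guide does not.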

BUILD_COMMAND: npm run build
START_COMMAND: npm start

Set these as environment variables during app creation on Klutch.sh.


Browser Automation Patterns

Connection Pooling for Multiple Pages

browser-pool.js
const puppeteer = require('puppeteer');
class BrowserPool {
constructor(maxBrowsers = 2) {
this.maxBrowsers = maxBrowsers;
this.browsers = [];
this.nextIndex = 0; // round-robin cursor into this.browsers
}
async initialize() {
for (let i = 0; i < this.maxBrowsers; i++) {
const browser = await puppeteer.launch({
headless: true,
args: ['--no-sandbox', '--disable-setuid-sandbox']
});
this.browsers.push(browser);
}
}
async getPage() {
if (this.browsers.length === 0) {
throw new Error('BrowserPool is not initialized; call initialize() first');
}
// Rotate through the browsers so pages are spread evenly
const browser = this.browsers[this.nextIndex];
this.nextIndex = (this.nextIndex + 1) % this.browsers.length;
return browser.newPage();
}
async closePage(page) {
if (page) {
await page.close();
}
}
async close() {
for (const browser of this.browsers) {
await browser.close();
}
}
}
module.exports = BrowserPool;
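
Pages taken from the pool have to be returned even when the work on them throws. One way to guarantee that is a small wrapper. The sketch below assumes a pool object exposing the `getPage()` and `closePage()` methods of `BrowserPool`; `withPage` is a hypothetical name:

```javascript
// Borrow a page from the pool, run `task` with it, and always close the page.
async function withPage(pool, task) {
  const page = await pool.getPage();
  try {
    return await task(page);
  } finally {
    // Runs on success and on failure, so pages never leak
    await pool.closePage(page);
  }
}
```

For example, `withPage(pool, (page) => page.title())` resolves to the page title and closes the page afterwards.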

Error Handling and Retry Logic

retry-handler.js
async function retryWithBackoff(fn, maxRetries = 3, backoffMs = 1000) {
for (let i = 0; i < maxRetries; i++) {
try {
return await fn();
} catch (error) {
if (i === maxRetries - 1) {
throw error;
}
console.log(`Attempt ${i + 1} failed, retrying in ${backoffMs}ms...`);
await new Promise(resolve => setTimeout(resolve, backoffMs));
backoffMs *= 2; // Exponential backoff
}
}
}
module.exports = retryWithBackoff;
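
Not every failure is worth retrying: a navigation timeout may succeed on the next attempt, while an invalid URL will not. The following variant is a sketch, not a replacement for `retryWithBackoff`. It retries only errors accepted by a caller-supplied predicate and adds random jitter to the delay. `retryWhen` and `isTimeout` are hypothetical names, and the predicate assumes Puppeteer timeout errors report the name `TimeoutError`:

```javascript
// Retry `fn` with exponential backoff and jitter, but only for errors
// that `shouldRetry` accepts (for example timeouts, not validation errors).
async function retryWhen(fn, shouldRetry, { maxRetries = 3, backoffMs = 1000 } = {}) {
  let delay = backoffMs;
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= maxRetries || !shouldRetry(error)) {
        throw error;
      }
      // Wait between 50% and 100% of the current delay, then double it
      const wait = delay / 2 + Math.random() * (delay / 2);
      await new Promise((resolve) => setTimeout(resolve, wait));
      delay *= 2;
    }
  }
}

// Assumption: Puppeteer timeout errors carry the name 'TimeoutError'
const isTimeout = (error) => error.name === 'TimeoutError';
```

The jitter spreads retries out when several requests fail at the same moment, so they do not all hit the target again in lockstep.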

Performance & Resource Management

Memory and Resource Optimization

  1. Limit Browser Instances - Control the number of concurrent browser instances
  2. Close Resources Properly - Always close pages and browsers after use
  3. Monitor Memory Usage - Track heap size and close unused pages
  4. Set Timeouts - Prevent hanging requests with appropriate timeouts
  5. Use Headless Mode - Always run in headless mode for server deployments
  6. Disable GPU - Add --disable-gpu flag to reduce resource usage
  7. Set Page Timeout - Configure page load timeouts

Example with optimization:

optimized-scraper.js
const puppeteer = require('puppeteer');
async function optimizedScrape(url) {
let browser;
let page;
try {
browser = await puppeteer.launch({
headless: true,
args: [
'--no-sandbox',
'--disable-setuid-sandbox',
'--disable-gpu',
'--disable-dev-shm-usage'
]
});
page = await browser.newPage();
// Set viewport and timeouts
await page.setViewport({ width: 1280, height: 720 });
page.setDefaultTimeout(30000);
page.setDefaultNavigationTimeout(30000);
// Enable request interception, then block resources the scraper does not need
await page.setRequestInterception(true);
page.on('request', (request) => {
if (['image', 'stylesheet', 'font'].includes(request.resourceType())) {
request.abort();
} else {
request.continue();
}
});
// Navigate and scrape
await page.goto(url, { waitUntil: 'domcontentloaded' });
const data = await page.evaluate(() => ({
title: document.title,
text: document.body.innerText.substring(0, 500)
}));
return data;
} finally {
if (page) {
await page.close();
}
if (browser) {
await browser.close();
}
}
}
module.exports = optimizedScrape;
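
Item 1 in the list above (limiting concurrent browser instances) can be enforced with a small in-process limiter. The sketch assumes limits like the `MAX_WORKERS` and `QUEUE_SIZE` settings from the configuration section; `createLimiter` is a hypothetical helper, not a Puppeteer API:

```javascript
// Run at most `maxWorkers` tasks at once; queue up to `queueSize` more
// and reject anything beyond that so memory use stays bounded.
function createLimiter(maxWorkers = 2, queueSize = 100) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxWorkers || queue.length === 0) return;
    const { task, resolve, reject } = queue.shift();
    active++;
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next waiting task, if any
      });
  };

  return function run(task) {
    // Reject only when all workers are busy and the waiting queue is full
    if (active >= maxWorkers && queue.length >= queueSize) {
      return Promise.reject(new Error('Queue is full'));
    }
    return new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
  };
}
```

A request handler could then call `run(() => optimizedScrape(url))`, so that only `maxWorkers` browsers exist at any time and excess requests fail fast instead of piling up.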

Troubleshooting

Application Won’t Start

Problem - Deployment completes but the app shows as unhealthy

Solution:

  • Verify your Puppeteer app starts locally: npm start
  • Check that package.json has a valid start script
  • Ensure the app listens on the internal port configured for the app (3000 in this guide)
  • Verify all browser dependencies are installed (if using Docker)
  • Check application logs in the Klutch.sh dashboard
  • Verify environment variables are set correctly

Browser Launch Fails

Problem - “Failed to launch Chrome/Chromium” error

Solution:

  • Pass the --no-sandbox flag when launching in containers
  • Add the --disable-setuid-sandbox flag
  • Add the --disable-dev-shm-usage flag (containers often have a small /dev/shm, which Chromium uses for shared memory)
  • For Docker, ensure the browser dependencies from the Dockerfile above are installed; the bullseye-slim base image does not include them
  • Check the memory available in your Klutch.sh compute tier
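
The flags listed above can be collected in one helper so that every `puppeteer.launch()` call uses the same set. A sketch, assuming the result is passed to `puppeteer.launch()`; `buildLaunchOptions` is a hypothetical name, and `executablePath` is only set when `PUPPETEER_EXECUTABLE_PATH` is present in the environment:

```javascript
// Build launch options for containerized environments from the process env.
function buildLaunchOptions(env = process.env) {
  const options = {
    headless: true,
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage', // avoid relying on a small /dev/shm
      '--disable-gpu'
    ]
  };
  if (env.PUPPETEER_EXECUTABLE_PATH) {
    // Use a system-installed browser instead of the bundled download
    options.executablePath = env.PUPPETEER_EXECUTABLE_PATH;
  }
  return options;
}
```

Usage would look like `const browser = await puppeteer.launch(buildLaunchOptions());`.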

Memory Issues

Problem - App crashes with out of memory errors

Solution:

  • Reduce number of concurrent browser instances
  • Limit page creation and ensure proper cleanup
  • Add --disable-dev-shm-usage to browser arguments
  • Implement page pooling instead of creating new browsers per request
  • Monitor memory with console.log(process.memoryUsage())
  • Consider upgrading compute tier for Puppeteer workloads
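
Raw `process.memoryUsage()` output is in bytes and easy to misread in logs. The helper below is a sketch that reports megabytes and flags when resident memory crosses a threshold. It measures the Node.js process only, not the Chromium child processes; `memorySnapshot` is a hypothetical name and the 512 MB default is an arbitrary example value:

```javascript
// Summarize Node.js process memory in megabytes (one decimal place).
// Chromium runs in separate processes, so its memory is NOT included here.
function memorySnapshot(limitMb = 512, usage = process.memoryUsage()) {
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  const rssMb = toMb(usage.rss);
  return {
    rssMb,
    heapUsedMb: toMb(usage.heapUsed),
    heapTotalMb: toMb(usage.heapTotal),
    overLimit: rssMb > limitMb
  };
}
```

Logging `memorySnapshot()` on an interval (for example with `setInterval`) gives a rough trend line; container-level metrics are still needed to see the browser's share.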

Timeout Errors

Problem - Pages fail to load with timeout errors

Solution:

  • Increase timeout values appropriately
  • Use waitUntil: 'domcontentloaded' instead of networkidle2 if appropriate
  • Check target URLs for actual availability
  • Implement retry logic with exponential backoff
  • Check network connectivity in your Klutch.sh region
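
Puppeteer's own timeouts cover navigation and waiting, but a whole job (launch, navigate, extract) can still run longer than expected. A sketch of an overall deadline using `Promise.race` follows. `withTimeout` is a hypothetical helper; it only stops waiting and does not cancel the underlying work, so the code doing the work still has to close its browser:

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Clear the timer either way so it does not keep the process alive
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}
```

For example, `withTimeout(optimizedScrape(url), 60000, 'scrape')` rejects after one minute if the scrape has not finished.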

High CPU Usage

Problem - Deployment uses excessive CPU

Solution:

  • Limit concurrent browser instances
  • Use request interception to block unnecessary resources
  • Implement proper queuing for requests
  • Add CPU limiting via Klutch.sh compute configuration
  • Monitor resource usage during development

Resources


Summary

Deploying a Puppeteer application on Klutch.sh works with either Nixpacks or Docker. Start with Nixpacks for simplicity, or use Docker for complete control over browser dependencies. Combined with Klutch.sh’s load balancing and environment management, either method lets you take browser automation workloads such as scraping, testing, and rendering from development to production.