
Deploying Solr

Introduction

Apache Solr is a powerful, open-source enterprise search platform built on Apache Lucene. Trusted by companies like Netflix, Instagram, and eBay, Solr provides lightning-fast full-text search, hit highlighting, faceted navigation, real-time indexing, dynamic clustering, database integration, and rich document handling. Whether you’re building an e-commerce search engine, document management system, or data analytics platform, Solr delivers the scalability and performance you need.

Solr excels at indexing and searching large volumes of text-centric data, making it ideal for product catalogs, content management systems, log analysis, and any application requiring sophisticated search capabilities. With support for complex queries, relevancy tuning, and distributed search via SolrCloud, it’s the go-to solution for enterprise-grade search infrastructure.

This guide walks you through deploying Apache Solr on Klutch.sh using a Dockerfile. You’ll learn how to create cores, configure schemas, index documents, execute queries, and optimize your search deployment for production workloads.


What You’ll Learn

  • How to deploy Apache Solr with a Dockerfile on Klutch.sh
  • Creating and configuring Solr cores for your data
  • Defining schemas and field types
  • Indexing documents via REST API
  • Executing search queries with filters and facets
  • Setting up persistent storage for your search index
  • Best practices for production search deployments

Prerequisites

Before you begin, ensure you have:

  • A Klutch.sh account
  • A GitHub repository for your Solr project
  • Basic understanding of search concepts (indexing, querying, relevancy)
  • (Optional) Sample data to index

Understanding Solr Architecture

Apache Solr consists of several key components:

  • Core: A single index with its own configuration (schema, solrconfig.xml)
  • Collection: In SolrCloud, a logical index that can span multiple shards
  • Schema: Defines the structure of your documents (fields, field types, analyzers)
  • Request Handlers: Process different types of requests (search, update, admin)
  • Solr Home: /var/solr stores index data, logs, and core configuration

Solr exposes a REST API on port 8983 for all operations including indexing, querying, and administration.


Step 1: Prepare Your GitHub Repository

    1. Create a new GitHub repository for your Solr deployment.

    2. Create the project structure:

      my-solr-project/
      ├── Dockerfile
      ├── configsets/
      │   └── myconfig/
      │       └── conf/
      │           ├── solrconfig.xml
      │           ├── managed-schema.xml
      │           ├── stopwords.txt
      │           └── synonyms.txt
      ├── scripts/
      │   └── init-solr.sh
      └── .dockerignore
    3. Create a custom schema at configsets/myconfig/conf/managed-schema.xml:

      <?xml version="1.0" encoding="UTF-8" ?>
      <schema name="myconfig" version="1.6">

        <!-- Field Types -->
        <fieldType name="string" class="solr.StrField" sortMissingLast="true" docValues="true"/>
        <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
        <fieldType name="pint" class="solr.IntPointField" docValues="true"/>
        <fieldType name="pfloat" class="solr.FloatPointField" docValues="true"/>
        <fieldType name="plong" class="solr.LongPointField" docValues="true"/>
        <fieldType name="pdouble" class="solr.DoublePointField" docValues="true"/>
        <fieldType name="pdate" class="solr.DatePointField" docValues="true"/>

        <!-- Text field with standard analysis -->
        <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
          <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.LowerCaseFilterFactory"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
            <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
            <filter class="solr.LowerCaseFilterFactory"/>
          </analyzer>
        </fieldType>

        <!-- Text field optimized for autocomplete -->
        <fieldType name="text_autocomplete" class="solr.TextField" positionIncrementGap="100">
          <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
          </analyzer>
          <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
          </analyzer>
        </fieldType>

        <!-- Required unique key field -->
        <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>

        <!-- Version field for optimistic locking -->
        <field name="_version_" type="plong" indexed="false" stored="false"/>

        <!-- Catch-all field for searching all content -->
        <field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>

        <!-- Common fields for a product catalog example -->
        <field name="name" type="text_general" indexed="true" stored="true"/>
        <field name="name_autocomplete" type="text_autocomplete" indexed="true" stored="false"/>
        <field name="description" type="text_general" indexed="true" stored="true"/>
        <field name="category" type="string" indexed="true" stored="true" multiValued="true"/>
        <field name="price" type="pfloat" indexed="true" stored="true"/>
        <field name="in_stock" type="boolean" indexed="true" stored="true"/>
        <field name="rating" type="pfloat" indexed="true" stored="true"/>
        <field name="created_at" type="pdate" indexed="true" stored="true"/>
        <field name="updated_at" type="pdate" indexed="true" stored="true"/>
        <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
        <field name="brand" type="string" indexed="true" stored="true"/>
        <field name="sku" type="string" indexed="true" stored="true"/>

        <!-- Unique key -->
        <uniqueKey>id</uniqueKey>

        <!-- Copy fields for search-all functionality -->
        <copyField source="name" dest="_text_"/>
        <copyField source="description" dest="_text_"/>
        <copyField source="category" dest="_text_"/>
        <copyField source="tags" dest="_text_"/>
        <copyField source="brand" dest="_text_"/>

        <!-- Copy name to autocomplete field -->
        <copyField source="name" dest="name_autocomplete"/>
      </schema>
    4. Create a Solr configuration at configsets/myconfig/conf/solrconfig.xml:

      <?xml version="1.0" encoding="UTF-8" ?>
      <config>
        <luceneMatchVersion>9.0</luceneMatchVersion>

        <!-- Data directory -->
        <dataDir>${solr.data.dir:}</dataDir>

        <!-- Index configuration -->
        <indexConfig>
          <ramBufferSizeMB>100</ramBufferSizeMB>
          <maxBufferedDocs>1000</maxBufferedDocs>
          <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
            <int name="maxMergeAtOnce">10</int>
            <int name="segmentsPerTier">10</int>
          </mergePolicyFactory>
        </indexConfig>

        <!-- Update handler configuration -->
        <updateHandler class="solr.DirectUpdateHandler2">
          <updateLog>
            <str name="dir">${solr.ulog.dir:}</str>
            <int name="numVersionBuckets">65536</int>
          </updateLog>
          <autoCommit>
            <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
            <openSearcher>false</openSearcher>
          </autoCommit>
          <autoSoftCommit>
            <maxTime>${solr.autoSoftCommit.maxTime:1000}</maxTime>
          </autoSoftCommit>
        </updateHandler>

        <!-- Query configuration -->
        <query>
          <maxBooleanClauses>1024</maxBooleanClauses>
          <filterCache class="solr.CaffeineCache"
                       size="512"
                       initialSize="512"
                       autowarmCount="0"/>
          <queryResultCache class="solr.CaffeineCache"
                            size="512"
                            initialSize="512"
                            autowarmCount="0"/>
          <documentCache class="solr.CaffeineCache"
                         size="512"
                         initialSize="512"
                         autowarmCount="0"/>
          <enableLazyFieldLoading>true</enableLazyFieldLoading>
          <queryResultWindowSize>20</queryResultWindowSize>
          <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
        </query>

        <!-- Request dispatcher -->
        <requestDispatcher>
          <requestParsers enableRemoteStreaming="false"
                          multipartUploadLimitInKB="2048000"
                          formdataUploadLimitInKB="2048"
                          addHttpRequestToContext="false"/>
          <httpCaching never304="true"/>
        </requestDispatcher>

        <!-- Search handler -->
        <requestHandler name="/select" class="solr.SearchHandler">
          <lst name="defaults">
            <str name="echoParams">explicit</str>
            <int name="rows">10</int>
            <str name="df">_text_</str>
            <str name="wt">json</str>
          </lst>
        </requestHandler>

        <!-- Query handler with edismax -->
        <requestHandler name="/query" class="solr.SearchHandler">
          <lst name="defaults">
            <str name="echoParams">explicit</str>
            <str name="wt">json</str>
            <str name="defType">edismax</str>
            <str name="qf">name^3 description^2 category tags brand</str>
            <str name="pf">name^5 description^3</str>
            <int name="rows">10</int>
          </lst>
        </requestHandler>

        <!-- Autocomplete/suggest handler -->
        <requestHandler name="/autocomplete" class="solr.SearchHandler">
          <lst name="defaults">
            <str name="echoParams">explicit</str>
            <str name="wt">json</str>
            <str name="defType">edismax</str>
            <str name="qf">name_autocomplete^2 name</str>
            <int name="rows">10</int>
            <str name="fl">id,name,category</str>
          </lst>
        </requestHandler>

        <!-- Update handler -->
        <requestHandler name="/update" class="solr.UpdateRequestHandler">
          <lst name="defaults">
            <str name="update.chain">dedupe</str>
          </lst>
        </requestHandler>

        <!-- Update handler for JSON documents -->
        <requestHandler name="/update/json/docs" class="solr.UpdateRequestHandler">
          <lst name="defaults">
            <str name="stream.contentType">application/json</str>
          </lst>
        </requestHandler>

        <!-- Admin handlers -->
        <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
          <lst name="invariants">
            <str name="q">*:*</str>
          </lst>
          <lst name="defaults">
            <str name="echoParams">all</str>
          </lst>
        </requestHandler>

        <!-- Highlighting -->
        <searchComponent class="solr.HighlightComponent" name="highlight">
          <highlighting>
            <fragmenter name="gap" default="true" class="solr.highlight.GapFragmenter">
              <lst name="defaults">
                <int name="hl.fragsize">100</int>
              </lst>
            </fragmenter>
            <formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
              <lst name="defaults">
                <str name="hl.simple.pre"><![CDATA[<em>]]></str>
                <str name="hl.simple.post"><![CDATA[</em>]]></str>
              </lst>
            </formatter>
          </highlighting>
        </searchComponent>

        <!-- "dedupe" update processor chain; add a solr.processor.SignatureUpdateProcessorFactory
             before RunUpdateProcessorFactory to enable actual duplicate detection -->
        <updateRequestProcessorChain name="dedupe">
          <processor class="solr.LogUpdateProcessorFactory"/>
          <processor class="solr.RunUpdateProcessorFactory"/>
        </updateRequestProcessorChain>
      </config>
    5. Create supporting files for the configset:

      Create configsets/myconfig/conf/stopwords.txt:

      # Standard English stopwords
      a
      an
      and
      are
      as
      at
      be
      but
      by
      for
      if
      in
      into
      is
      it
      no
      not
      of
      on
      or
      such
      that
      the
      their
      then
      there
      these
      they
      this
      to
      was
      will
      with

      Create configsets/myconfig/conf/synonyms.txt:

      # Synonym mappings
      # Format: word1,word2,word3 => normalized
      # Or: word1,word2,word3 (all equivalent)
      laptop,notebook => laptop
      phone,mobile,cellphone => phone
      tv,television => television
      couch,sofa => sofa
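
The text_autocomplete field type in the schema above indexes edge n-grams: each token is expanded into its leading prefixes so that partial queries like "wire" match "wireless". A pure-Python sketch of what the EdgeNGramFilter emits (for illustration only; this is not Solr code):

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Emit the prefixes the EdgeNGramFilter would produce for one token."""
    return [token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)]

# "wireless" is indexed as wi, wir, wire, ..., wireless,
# so the query "wire" matches it directly.
print(edge_ngrams("wireless"))
```

Because expansion happens only at index time (note the simpler query analyzer), the user's partial input is matched against these stored prefixes without being n-grammed itself.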

Step 2: Create the Dockerfile

Klutch.sh automatically detects a Dockerfile in your repository’s root directory.

    1. Create a Dockerfile in your project root:

      # Use the official Apache Solr image
      FROM solr:9-slim
      # Set environment variables
      ENV SOLR_HEAP=512m
      ENV SOLR_JAVA_MEM="-Xms512m -Xmx512m"
      # Switch to root to copy files
      USER root
      # Create directories for custom configuration
      RUN mkdir -p /opt/solr/server/solr/configsets/myconfig/conf
      # Copy custom configset
      COPY --chown=solr:solr configsets/myconfig/conf/ /opt/solr/server/solr/configsets/myconfig/conf/
      # Copy initialization script
      COPY --chown=solr:solr scripts/init-solr.sh /docker-entrypoint-initdb.d/
      # Make script executable
      RUN chmod +x /docker-entrypoint-initdb.d/init-solr.sh
      # Switch back to solr user
      USER solr
      # Expose Solr port
      EXPOSE 8983
      # Health check
      HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
      CMD curl -f http://localhost:8983/solr/admin/ping || exit 1
      # Start Solr with precreated core
      CMD ["solr-precreate", "products", "/opt/solr/server/solr/configsets/myconfig"]
    2. Create the initialization script at scripts/init-solr.sh:

      #!/bin/bash
      set -e

      echo "=== Solr Initialization Script ==="
      echo "Running custom initialization..."

      # NOTE: scripts in /docker-entrypoint-initdb.d run before Solr starts,
      # so this helper is only useful when invoked from a process that runs
      # after startup (e.g. a post-deploy hook).
      wait_for_solr() {
        local max_attempts=30
        local attempt=1
        while [ $attempt -le $max_attempts ]; do
          if curl -s "http://localhost:8983/solr/admin/ping" > /dev/null 2>&1; then
            echo "Solr is ready!"
            return 0
          fi
          echo "Waiting for Solr... (attempt $attempt/$max_attempts)"
          sleep 2
          attempt=$((attempt + 1))
        done
        echo "Solr did not become ready in time"
        return 1
      }

      echo "Initialization complete!"
    3. Create a .dockerignore file:

      .git
      .gitignore
      README.md
      .DS_Store
      *.log
      .env
      .env.local
      node_modules/

Step 3: Deploy to Klutch.sh

    1. Commit and push your changes to GitHub:

      git add .
      git commit -m "Add Solr configuration for Klutch.sh deployment"
      git push origin main
    2. Log in to Klutch.sh and navigate to the dashboard.

    3. Create a new project (if you don’t have one already) by clicking “New Project”.

    4. Create a new app within your project:

      • Click “New App”
      • Select your GitHub repository
      • Choose the branch to deploy (e.g., main)
      • Klutch.sh will automatically detect your Dockerfile
    5. Configure the app settings:

      • Traffic Type: Select HTTP
      • Internal Port: Set to 8983 (Solr’s default port)
    6. Configure environment variables (optional):

      # Memory settings
      SOLR_HEAP=1g
      SOLR_JAVA_MEM=-Xms1g -Xmx1g
      # Timezone
      TZ=UTC
    7. Click “Create” to deploy your Solr instance.

    Once deployed, your Solr Admin UI will be accessible at https://example-app.klutch.sh/solr/.


Step 4: Configure Persistent Storage

Solr requires persistent storage to retain your search index and configuration across deployments.

    1. In your app settings, navigate to the “Volumes” section.

    2. Add a persistent volume with the following configuration:

      • Mount Path: /var/solr
      • Size: 20 GB (adjust based on your index size requirements)
    3. Save the configuration and redeploy your app.

The /var/solr directory contains:

  • Index data for all cores
  • Transaction logs
  • Core configuration
  • Solr logs

For more details on managing storage, see the Volumes Guide.


Getting Started: Sample Code

Indexing Documents

Once your Solr instance is running, you can index documents using the REST API.

Index a single document:

curl -X POST "https://example-app.klutch.sh/solr/products/update/json/docs?commit=true" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "product-001",
    "name": "Wireless Bluetooth Headphones",
    "description": "Premium noise-canceling wireless headphones with 30-hour battery life",
    "category": ["Electronics", "Audio"],
    "price": 199.99,
    "in_stock": true,
    "rating": 4.5,
    "brand": "AudioTech",
    "sku": "AT-WBH-001",
    "tags": ["wireless", "bluetooth", "noise-canceling", "premium"],
    "created_at": "2024-01-15T10:30:00Z"
  }'
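
Solr pdate fields expect ISO-8601 timestamps in UTC with a trailing Z, like the created_at value above; Solr is strict about this format. A small helper for producing them from Python datetimes:

```python
from datetime import datetime, timezone

def solr_date(dt: datetime) -> str:
    """Format a timezone-aware datetime as a Solr-compatible UTC timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

print(solr_date(datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc)))
# 2024-01-15T10:30:00Z
```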

Index multiple documents:

curl -X POST "https://example-app.klutch.sh/solr/products/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '[
    {
      "id": "product-002",
      "name": "4K Smart TV 55 inch",
      "description": "Ultra HD smart television with HDR and streaming apps",
      "category": ["Electronics", "Television"],
      "price": 599.99,
      "in_stock": true,
      "rating": 4.7,
      "brand": "ViewMax",
      "sku": "VM-TV55-4K",
      "tags": ["4k", "smart-tv", "hdr", "streaming"]
    },
    {
      "id": "product-003",
      "name": "Ergonomic Office Chair",
      "description": "Adjustable lumbar support office chair with breathable mesh",
      "category": ["Furniture", "Office"],
      "price": 349.99,
      "in_stock": true,
      "rating": 4.3,
      "brand": "ComfortPlus",
      "sku": "CP-EOC-001",
      "tags": ["ergonomic", "office", "mesh", "adjustable"]
    },
    {
      "id": "product-004",
      "name": "Running Shoes Pro",
      "description": "Lightweight running shoes with responsive cushioning",
      "category": ["Sports", "Footwear"],
      "price": 129.99,
      "in_stock": false,
      "rating": 4.6,
      "brand": "SpeedRunner",
      "sku": "SR-RSP-001",
      "tags": ["running", "lightweight", "cushioned", "athletic"]
    }
  ]'
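
Appending commit=true to every request, as in the examples above, forces a hard commit per call, which is expensive under sustained indexing. Since the solrconfig.xml from Step 1 already enables autoCommit and autoSoftCommit, you can omit it, or pass commitWithin (in milliseconds) to let Solr batch commits. A sketch that only builds the request, using this guide's placeholder URL:

```python
def commit_within_request(base_url, docs, commit_within_ms=5000):
    """Describe an update request that lets Solr commit within a time window
    instead of forcing a hard commit on every call."""
    return {
        "url": f"{base_url}/update",
        "params": {"commitWithin": commit_within_ms},  # milliseconds
        "json": docs,
    }

req = commit_within_request("https://example-app.klutch.sh/solr/products",
                            [{"id": "product-010", "name": "USB-C Hub"}])
# POST req["url"] with req["params"] and req["json"] using any HTTP client
```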

Querying Documents

Basic search:

# Search for "headphones"
curl "https://example-app.klutch.sh/solr/products/select?q=headphones&wt=json"

Search with filters:

# Search for electronics under $300
curl "https://example-app.klutch.sh/solr/products/select?q=*:*&fq=category:Electronics&fq=price:[0%20TO%20300]&wt=json"
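
When q or fq values come from user input, Lucene query syntax characters (+ - && || ! ( ) { } [ ] ^ " ~ * ? : / \) should be backslash-escaped so they are treated as literals. A hypothetical helper:

```python
import re

def escape_solr(value: str) -> str:
    """Backslash-escape Lucene/Solr query syntax characters in user input."""
    return re.sub(r'([+\-&|!(){}\[\]^"~*?:\\/])', r'\\\1', value)

print(escape_solr("C++ (new)"))
# C\+\+ \(new\)
```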

Faceted search:

# Get products with facets on category and brand
curl "https://example-app.klutch.sh/solr/products/select?q=*:*&facet=true&facet.field=category&facet.field=brand&wt=json"
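
In the JSON response, each entry under facet_fields is a flat list alternating values and counts (e.g. ["Electronics", 2, "Furniture", 1]). A small helper to convert that into a dict:

```python
def parse_facet_field(flat):
    """Convert Solr's alternating [value, count, ...] facet list into a dict."""
    return dict(zip(flat[::2], flat[1::2]))

category_facets = ["Electronics", 2, "Furniture", 1, "Sports", 1]
print(parse_facet_field(category_facets))
# {'Electronics': 2, 'Furniture': 1, 'Sports': 1}
```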

Search with highlighting:

# Search with highlighted results
curl "https://example-app.klutch.sh/solr/products/select?q=wireless+bluetooth&hl=true&hl.fl=name,description&wt=json"

Using the enhanced query handler (edismax):

# Use the custom /query handler with boosted fields
curl "https://example-app.klutch.sh/solr/products/query?q=premium+audio&wt=json"

Autocomplete:

# Use the autocomplete handler
curl "https://example-app.klutch.sh/solr/products/autocomplete?q=wire&wt=json"

Using Solr from Code

Python example:

import requests
import json

SOLR_URL = "https://example-app.klutch.sh/solr/products"

def index_document(doc):
    """Index a single document"""
    response = requests.post(
        f"{SOLR_URL}/update/json/docs?commit=true",
        headers={"Content-Type": "application/json"},
        data=json.dumps(doc),
    )
    return response.json()

def search(query, filters=None, rows=10):
    """Search for documents"""
    params = {"q": query, "wt": "json", "rows": rows}
    if filters:
        params["fq"] = filters
    response = requests.get(f"{SOLR_URL}/select", params=params)
    return response.json()

def faceted_search(query, facet_fields):
    """Search with facets"""
    params = {
        "q": query,
        "wt": "json",
        "facet": "true",
        "facet.field": facet_fields,
    }
    response = requests.get(f"{SOLR_URL}/select", params=params)
    return response.json()

# Example usage
if __name__ == "__main__":
    # Index a document
    product = {
        "id": "product-005",
        "name": "Mechanical Keyboard RGB",
        "description": "Gaming mechanical keyboard with RGB backlighting",
        "category": ["Electronics", "Gaming"],
        "price": 89.99,
        "in_stock": True,
        "rating": 4.4,
        "brand": "KeyMaster",
        "tags": ["mechanical", "rgb", "gaming"]
    }
    print("Indexing:", index_document(product))

    # Search
    results = search("keyboard", rows=5)
    print(f"Found {results['response']['numFound']} documents")

    # Faceted search
    facets = faceted_search("*:*", ["category", "brand"])
    print("Facets:", facets.get("facet_counts", {}).get("facet_fields", {}))

Node.js example:

const axios = require('axios');

const SOLR_URL = 'https://example-app.klutch.sh/solr/products';

async function indexDocument(doc) {
  const response = await axios.post(
    `${SOLR_URL}/update/json/docs?commit=true`,
    doc,
    { headers: { 'Content-Type': 'application/json' } }
  );
  return response.data;
}

async function search(query, options = {}) {
  const params = {
    q: query,
    wt: 'json',
    rows: options.rows || 10,
    ...options
  };
  const response = await axios.get(`${SOLR_URL}/select`, { params });
  return response.data;
}

async function deleteDocument(id) {
  const response = await axios.post(
    `${SOLR_URL}/update?commit=true`,
    { delete: { id } },
    { headers: { 'Content-Type': 'application/json' } }
  );
  return response.data;
}

// Example usage
(async () => {
  try {
    // Search for products
    const results = await search('wireless', { rows: 5 });
    console.log(`Found ${results.response.numFound} documents`);
    results.response.docs.forEach(doc => {
      console.log(`- ${doc.name} ($${doc.price})`);
    });

    // Faceted search
    const facetedResults = await search('*:*', {
      facet: true,
      'facet.field': ['category', 'brand']
    });
    console.log('Categories:', facetedResults.facet_counts?.facet_fields?.category);
  } catch (error) {
    console.error('Error:', error.message);
  }
})();

Advanced Configuration

Custom Field Types

Add specialized field types for specific use cases:

<!-- Phonetic matching for fuzzy name searches -->
<fieldType name="phonetic" class="solr.TextField" stored="false" indexed="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
  </analyzer>
</fieldType>

<!-- Currency field for prices -->
<fieldType name="currency" class="solr.CurrencyFieldType"
           currencyConfig="currency.xml" defaultCurrency="USD"/>

<!-- Location field for geo-spatial search -->
<fieldType name="location" class="solr.LatLonPointSpatialField" docValues="true"/>

<!-- Text field for exact phrase matching -->
<fieldType name="text_phrase" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Add location-based search capabilities:

<!-- Add to schema -->
<field name="location" type="location" indexed="true" stored="true"/>

Index with location:

curl -X POST "https://example-app.klutch.sh/solr/products/update?commit=true" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "store-001",
    "name": "Downtown Store",
    "location": "40.7128,-74.0060"
  }'

Search by distance:

# Find stores within 10km of a point
curl "https://example-app.klutch.sh/solr/products/select?q=*:*&fq={!geofilt}&sfield=location&pt=40.7128,-74.0060&d=10&wt=json"
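
The same geofilt query can be built programmatically: sfield names the location field, pt is the "lat,lon" center point, and d is the radius in kilometers. A sketch using this guide's example field names:

```python
def geofilt_params(lat, lon, radius_km, field="location", query="*:*"):
    """Build Solr query params for a radius search with the geofilt parser."""
    return {
        "q": query,
        "fq": "{!geofilt}",
        "sfield": field,
        "pt": f"{lat},{lon}",
        "d": radius_km,
        "wt": "json",
    }

print(geofilt_params(40.7128, -74.0060, 10)["pt"])
# 40.7128,-74.006
```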

Spell Checking

Add spell checking to your configuration:

<!-- Add to solrconfig.xml -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">text_general</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">_text_</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">4</int>
    <float name="maxQueryFrequency">0.01</float>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.count">5</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

Environment Variables

Configure Solr using environment variables in the Klutch.sh dashboard:

# Memory configuration
SOLR_HEAP=1g
SOLR_JAVA_MEM=-Xms1g -Xmx1g
# JVM options
GC_LOG_OPTS=-verbose:gc -XX:+PrintGCDetails
# Solr options
SOLR_OPTS=-Dsolr.autoSoftCommit.maxTime=3000
# Timezone
TZ=UTC
# Enable verbose logging (for debugging)
VERBOSE=yes

Memory Tuning

Adjust memory based on your index size:

Index Size   Recommended Heap
< 1 GB       512m - 1g
1-5 GB       1g - 2g
5-20 GB      2g - 4g
> 20 GB      4g - 8g

Performance Optimization

Query Performance

    1. Use filter queries (fq) for non-scoring filters:

      # Good: uses the filter cache
      curl "https://example-app.klutch.sh/solr/products/select?q=headphones&fq=category:Electronics&fq=in_stock:true"
      # Less optimal: all clauses in the main query
      curl "https://example-app.klutch.sh/solr/products/select?q=headphones+AND+category:Electronics+AND+in_stock:true"
    2. Request only needed fields:

      curl "https://example-app.klutch.sh/solr/products/select?q=*:*&fl=id,name,price&wt=json"
    3. Use cursor-based pagination for deep paging:

      # First page
      curl "https://example-app.klutch.sh/solr/products/select?q=*:*&sort=id+asc&cursorMark=*&rows=100"
      # Next page (use cursorMark from previous response)
      curl "https://example-app.klutch.sh/solr/products/select?q=*:*&sort=id+asc&cursorMark=AoE...&rows=100"
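
Each response includes a nextCursorMark; you send it back as cursorMark and stop when it repeats. The loop can be sketched with an injected fetch function so it works with any HTTP client (a hypothetical helper, which also makes it testable without a live server):

```python
def cursor_pages(fetch, query="*:*", sort="id asc", rows=100):
    """Yield each page of docs using Solr cursorMark pagination.

    `fetch(params)` must return the parsed JSON response from /select.
    """
    cursor = "*"
    while True:
        resp = fetch({"q": query, "sort": sort, "rows": rows, "cursorMark": cursor})
        yield resp["response"]["docs"]
        next_cursor = resp["nextCursorMark"]
        if next_cursor == cursor:  # cursor repeated: no more results
            break
        cursor = next_cursor
```

In practice, fetch would wrap something like requests.get(f"{SOLR_URL}/select", params=params).json(). Note that cursor paging requires a sort that includes the uniqueKey field, as in the examples above.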

Index Optimization

# Optimize index (use sparingly, resource-intensive)
curl "https://example-app.klutch.sh/solr/products/update?optimize=true&waitFlush=true"
# Commit pending changes
curl "https://example-app.klutch.sh/solr/products/update?commit=true"

Cache Configuration

Tune caches in solrconfig.xml based on your query patterns:

<filterCache class="solr.CaffeineCache"
             size="1024"
             initialSize="512"
             autowarmCount="128"/>
<queryResultCache class="solr.CaffeineCache"
                  size="1024"
                  initialSize="512"
                  autowarmCount="128"/>
<documentCache class="solr.CaffeineCache"
               size="1024"
               initialSize="512"/>

Security Considerations

Authentication

For production deployments, consider enabling Solr's Basic Authentication plugin by placing a security.json file in the Solr home directory (/var/solr/data on the official image):

{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "admin": "hashed_password_here"
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      {"name": "read", "role": "user"},
      {"name": "update", "role": "admin"},
      {"name": "all", "role": "admin"}
    ],
    "user-role": {
      "admin": ["admin"],
      "reader": ["user"]
    }
  }
}
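
Note that hashed_password_here is not a plaintext password: the BasicAuthPlugin stores base64(sha256(sha256(salt + password))) followed by the base64-encoded salt, separated by a space. A sketch of generating and verifying such a credential, assuming that double-SHA-256 scheme (used by Solr's Sha256AuthenticationProvider):

```python
import base64
import hashlib
import os

def solr_hash(password, salt=None):
    """Produce a 'hash salt' credential string for security.json."""
    salt = salt or os.urandom(32)
    digest = hashlib.sha256(
        hashlib.sha256(salt + password.encode("utf-8")).digest()
    ).digest()
    return f"{base64.b64encode(digest).decode()} {base64.b64encode(salt).decode()}"

def solr_verify(password, stored):
    """Check a password against a stored 'hash salt' credential."""
    _, salt_b64 = stored.split(" ")
    return solr_hash(password, base64.b64decode(salt_b64)) == stored

entry = solr_hash("SolrRocks")
print(solr_verify("SolrRocks", entry))
```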

Request Rate Limiting

Add rate limiting in your application layer or use a reverse proxy.


Monitoring and Administration

Admin UI

Access the Solr Admin UI at https://example-app.klutch.sh/solr/.

Key sections:

  • Dashboard: Overview of Solr instance
  • Core Admin: Manage cores
  • Query: Test search queries
  • Analysis: Test text analysis
  • Schema: View/modify schema

Health Check Endpoints

# Ping endpoint
curl "https://example-app.klutch.sh/solr/products/admin/ping"
# System info
curl "https://example-app.klutch.sh/solr/admin/info/system"
# Core status
curl "https://example-app.klutch.sh/solr/admin/cores?action=STATUS"

Monitoring Queries

# Get query statistics
curl "https://example-app.klutch.sh/solr/products/admin/mbeans?cat=QUERYHANDLER&stats=true&wt=json"
# Get cache statistics
curl "https://example-app.klutch.sh/solr/products/admin/mbeans?cat=CACHE&stats=true&wt=json"

Troubleshooting

Common Issues

Core not found:

  • Check that the core was created during startup
  • Verify the configset path in the Dockerfile
  • Check Solr logs for schema errors

Out of memory errors:

  • Increase SOLR_HEAP environment variable
  • Optimize queries to use filter queries
  • Enable doc values for sorting/faceting fields

Slow queries:

  • Use debugQuery=true to analyze query execution
  • Check if appropriate fields are indexed
  • Review filter cache hit rates

Index corruption:

  • Stop Solr gracefully before redeployment
  • Use commit=true after bulk indexing
  • Enable transaction logs for recovery

Viewing Logs

Logs are available in the Klutch.sh dashboard. For more detailed logging:

# Enable debug logging
SOLR_OPTS=-Dsolr.log.level=DEBUG


Conclusion

You now have Apache Solr deployed on Klutch.sh, configured with:

  • Custom schema for product catalog search
  • Optimized query handlers with boosted fields
  • Autocomplete functionality with edge n-gram analysis
  • Persistent storage for your search index
  • Performance-tuned caching and indexing

Solr’s powerful full-text search capabilities make it an excellent choice for building sophisticated search experiences in your applications. Whether you’re powering e-commerce product search, document management systems, or analytics dashboards, Solr on Klutch.sh provides the scalability and performance you need.