Deploying Fess
Introduction
Fess is a powerful, open-source enterprise search server that makes it easy to build and deploy search functionality across websites, file systems, databases, and other data sources. Built on top of Elasticsearch/OpenSearch, Fess provides a user-friendly web interface for crawling, indexing, and searching content without requiring deep technical knowledge of search engines.
Whether you’re building an internal knowledge base, creating a customer-facing search portal, or implementing full-text search for your organization, Fess simplifies the complexity of enterprise search with features like:
- Web Crawling: Automatically discover and index content from websites and web applications
- File System Indexing: Search through documents in local and network file systems (PDF, Word, Excel, PowerPoint, and more)
- Database Integration: Index and search data from relational databases
- User Authentication: Built-in support for LDAP, Active Directory, and SSO
- Search Relevance Tuning: Powerful ranking and relevance configuration options
- Multi-language Support: Built-in support for Japanese, English, Chinese, Korean, and more
- RESTful API: Programmatic access to search functionality
- Role-based Access Control: Secure search results based on user permissions
Deploying Fess on Klutch.sh gives you a production-ready search platform with automatic HTTPS, persistent storage for your search indexes, and easy configuration through environment variables.
What You’ll Learn
- How to deploy Fess with a Dockerfile on Klutch.sh
- Setting up Elasticsearch/OpenSearch as the backend search engine
- Configuring persistent storage for crawled data and search indexes
- Implementing environment variables for production deployment
- Best practices for security, performance, and scaling
Prerequisites
Before you begin, ensure you have:
- A Klutch.sh account
- A GitHub account with a repository for your Fess project
- An Elasticsearch or OpenSearch instance (you can deploy one on Klutch.sh or use a managed service)
- Basic familiarity with Docker and search engine concepts
- (Optional) Familiarity with web crawling and search configuration
Understanding Fess Architecture
Fess consists of several key components:
- Fess Application Server: The main web application that provides the admin interface and search frontend
- Search Engine Backend: Elasticsearch or OpenSearch for storing and querying search indexes
- Crawler: Background job system that crawls and indexes content from various sources
- Storage Layer: Persistent storage for configuration, logs, and temporary files
The application runs on port 8080 by default and communicates with Elasticsearch/OpenSearch to store and retrieve search data.
Step 1: Prepare Your GitHub Repository
1. Create a new GitHub repository for your Fess deployment or use an existing repository.

2. Create a `Dockerfile` in the root of your repository with the following content:

   ```dockerfile
   FROM codelibs/fess:14.14

   # Set the working directory
   WORKDIR /opt/fess

   # Expose the Fess web interface port
   EXPOSE 8080

   # The Fess image includes all necessary configurations.
   # Environment variables will be set through the Klutch.sh dashboard.
   # Data will be persisted to /var/lib/fess and /opt/fess/logs.
   ```

   Note: This uses Fess version 14.14. You can check Docker Hub for the latest version tags.

3. (Optional) Create a `.dockerignore` file to exclude unnecessary files:

   ```
   .git
   .github
   *.md
   README.md
   .env
   .env.local
   docker-compose.yml
   ```

4. Create a `README.md` file with basic information about your deployment:

   ```markdown
   # Fess Enterprise Search Deployment

   This repository contains the configuration for deploying Fess on Klutch.sh.

   ## Environment Variables

   The following environment variables are required:

   - `FESS_DICTIONARY_PATH`: Path to dictionary files
   - `ES_HTTP_URL`: Elasticsearch/OpenSearch HTTP endpoint
   - `FESS_ADMIN_PASSWORD`: Admin user password

   See the Klutch.sh dashboard for the full list of configured variables.
   ```

5. Commit and push your changes to GitHub:

   ```bash
   git add Dockerfile .dockerignore README.md
   git commit -m "Add Dockerfile for Fess deployment on Klutch.sh"
   git push origin main
   ```

Step 2: Deploy Elasticsearch or OpenSearch
Fess requires Elasticsearch or OpenSearch as its backend. You have two options:
Option A: Use a Managed Service
Use a managed Elasticsearch/OpenSearch service like:
- AWS OpenSearch Service
- Elastic Cloud
- Aiven for OpenSearch
Make note of the HTTP endpoint URL, username, and password.
Option B: Deploy on Klutch.sh
You can deploy Elasticsearch or OpenSearch as a separate app on Klutch.sh:
Create a `Dockerfile` for Elasticsearch in a separate repository:

```dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:8.11.3

# Disable security for simplified setup (enable it in production)
ENV discovery.type=single-node
ENV xpack.security.enabled=false
ENV ES_JAVA_OPTS="-Xms512m -Xmx512m"

EXPOSE 9200 9300
```

Important considerations for Elasticsearch/OpenSearch:
- Use TCP traffic type in Klutch.sh
- Set internal port to 9200
- Attach a persistent volume to `/usr/share/elasticsearch/data`
- Allocate at least 2GB RAM and 10GB storage
Step 3: Create Your App on Klutch.sh
1. Log in to Klutch.sh and navigate to the dashboard.

2. Create a new project (if you don’t have one already) by clicking “New Project” and providing a project name like “Enterprise Search”.

3. Create a new app within your project by clicking “New App”.

4. Connect your GitHub repository by selecting it from the list of available repositories.

5. Configure the build settings:
   - Klutch.sh will automatically detect the Dockerfile in your repository root
   - The build will use this Dockerfile automatically

6. Set the internal port to `8080` (Fess’s default port). This is the port that traffic will be routed to within the container.

7. Select HTTP traffic for the app’s traffic type since Fess serves a web interface.
Step 4: Configure Persistent Storage
Fess requires persistent storage to retain your search configuration, crawled data, and logs across deployments.
1. In your app settings, navigate to the “Volumes” section.

2. Add the first persistent volume for Fess data:
   - Mount Path: `/var/lib/fess`
   - Size: Start with at least 20 GB (adjust based on the amount of content you’ll be indexing)

3. Add a second persistent volume for logs:
   - Mount Path: `/opt/fess/logs`
   - Size: 5 GB is usually sufficient for logs

4. Save the volume configuration. This ensures all your crawled data, configuration, and logs persist even when the container is restarted or redeployed.

The persistent volumes store:

- `/var/lib/fess`: Fess configuration, crawl schedules, job queues, and temporary files
- `/opt/fess/logs`: Application logs, error logs, and crawl logs
For more details on managing persistent storage, see the Volumes Guide.
Step 5: Configure Environment Variables
Fess requires several environment variables to connect to Elasticsearch/OpenSearch and configure its behavior.
1. In your app settings, navigate to the “Environment Variables” section.

2. Add the following required Elasticsearch/OpenSearch variables:

   ```bash
   # Elasticsearch/OpenSearch connection (required)
   ES_HTTP_URL=http://your-elasticsearch.klutch.sh:8000
   ES_TRANSPORT_URL=your-elasticsearch.klutch.sh:8000

   # If using authentication
   ES_HTTP_USERNAME=elastic
   ES_HTTP_PASSWORD=your-elasticsearch-password
   ```

   Note: If you deployed Elasticsearch on Klutch.sh using TCP traffic, the external port will be 8000. Replace `your-elasticsearch.klutch.sh` with your actual Elasticsearch app URL.

3. Add Fess-specific configuration variables:

   ```bash
   # Admin credentials (required - change these!)
   FESS_ADMIN_PASSWORD=your-strong-admin-password

   # Java heap size (adjust based on your instance size)
   FESS_JAVA_OPTS=-Xms512m -Xmx1g

   # Dictionary path (for language processing)
   FESS_DICTIONARY_PATH=/opt/fess/app/WEB-INF/classes/fess_dict

   # Logging level (optional)
   FESS_LOG_LEVEL=info

   # Session timeout in seconds (optional, default is 3600)
   FESS_SESSION_TIMEOUT=7200

   # Max upload size in bytes (optional, default is 4MB)
   FESS_MAX_UPLOAD_SIZE=10485760
   ```

4. Add optional crawler configuration:

   ```bash
   # Number of crawler threads
   FESS_CRAWLER_THREADS=5

   # Crawler user agent
   FESS_CRAWLER_USER_AGENT=Fess/14.14

   # Maximum depth for web crawling
   FESS_CRAWLER_MAX_DEPTH=3

   # Crawl interval in milliseconds
   FESS_CRAWLER_INTERVAL=1000
   ```

5. Add optional search result configuration:

   ```bash
   # Number of search results per page
   FESS_SEARCH_PAGE_SIZE=20

   # Default search timeout in milliseconds
   FESS_SEARCH_TIMEOUT=30000

   # Enable query suggestions
   FESS_SEARCH_SUGGEST=true
   ```

6. Mark sensitive values as secrets in the Klutch.sh UI (passwords, API keys) to prevent them from appearing in logs.

Important Security Notes:

- Never commit passwords or secrets to your repository
- Always use strong, randomly generated passwords for `FESS_ADMIN_PASSWORD`
- Use Klutch.sh environment variables for all sensitive data
- Consider enabling Elasticsearch/OpenSearch authentication in production
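As a quick helper for the "strong, randomly generated password" point above, here is a small Python sketch that produces a value suitable for `FESS_ADMIN_PASSWORD` using the standard library's `secrets` module (the 24-character length is just a reasonable default, not a Fess requirement):

```python
import secrets
import string

def generate_admin_password(length=24):
    """Generate a random alphanumeric password for FESS_ADMIN_PASSWORD."""
    alphabet = string.ascii_letters + string.digits
    return ''.join(secrets.choice(alphabet) for _ in range(length))

password = generate_admin_password()
print(password)
```

Paste the generated value into the Klutch.sh dashboard as a secret rather than committing it anywhere.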
Step 6: Deploy Your Application
1. Review your configuration to ensure all settings are correct:
   - Dockerfile is detected
   - Internal port is set to `8080`
   - Persistent volumes are mounted to `/var/lib/fess` and `/opt/fess/logs`
   - Environment variables are configured with proper Elasticsearch/OpenSearch connection details
   - Traffic type is set to HTTP

2. Click “Deploy” to start the build and deployment process.

3. Monitor the build logs to ensure the deployment completes successfully. The initial build typically takes 3-5 minutes.

4. Wait for the deployment to complete. Once done, you’ll see your app URL (e.g., `https://example-app.klutch.sh`).
Step 7: Initial Setup and Configuration
1. Access your Fess instance by navigating to your app URL (e.g., `https://example-app.klutch.sh`).

2. Log in to the admin interface:
   - Click on the “Admin” link in the top right
   - Default username: `admin`
   - Password: The value you set for `FESS_ADMIN_PASSWORD`
   - URL: `https://example-app.klutch.sh/admin`

3. Configure your first web crawler:
   - Navigate to Crawler > Web
   - Click “Create New”
   - Enter a name for your crawler (e.g., “My Website”)
   - Enter the URL to crawl (e.g., `https://www.example.com`)
   - Configure crawl settings:
     - Max Depth: How deep to follow links (default: 3)
     - Max Access Count: Maximum pages to crawl (default: 1000)
     - Interval: Delay between requests in milliseconds (default: 1000)
   - Click “Create”

4. Start the crawler:
   - Go to Scheduler > Default Crawler
   - Click “Start Now” to begin crawling immediately
   - Or configure a schedule for automatic crawling

5. Wait for indexing to complete:
   - Monitor crawl progress in System Info > Crawling Info
   - Check for errors in the logs if needed

6. Test your search:
   - Return to the home page (`https://example-app.klutch.sh`)
   - Enter a search query related to the content you crawled
   - Verify that search results are displayed correctly
Getting Started: Sample Usage
Here are common tasks and code examples for working with Fess:
Basic Web Search Interface
Access the search interface at your app URL: `https://example-app.klutch.sh`. Enter queries in the search box to find indexed content.
REST API Search
Fess provides a JSON API for programmatic search access:
```bash
# Basic search query
curl "https://example-app.klutch.sh/json/?q=search+term"

# Search with paging parameters
curl "https://example-app.klutch.sh/json/?q=fess&num=20&start=0"

# Get search suggestions
curl "https://example-app.klutch.sh/suggest/?q=fe"
```

JavaScript Integration
Embed search results in your web application:
```html
<!DOCTYPE html>
<html>
<head>
  <title>Fess Search Integration</title>
</head>
<body>
  <input type="text" id="searchQuery" placeholder="Search...">
  <button onclick="performSearch()">Search</button>
  <div id="results"></div>

  <script>
    function performSearch() {
      const query = document.getElementById('searchQuery').value;
      const apiUrl = `https://example-app.klutch.sh/json/?q=${encodeURIComponent(query)}`;

      fetch(apiUrl)
        .then(response => response.json())
        .then(data => {
          let html = '<h2>Search Results</h2>';
          data.response.result.forEach(item => {
            html += `
              <div class="result">
                <h3><a href="${item.url}" target="_blank">${item.title}</a></h3>
                <p>${item.content_description}</p>
                <small>${item.url}</small>
              </div>
            `;
          });
          document.getElementById('results').innerHTML = html;
        })
        .catch(error => console.error('Search error:', error));
    }
  </script>
</body>
</html>
```

Python API Client
Use the Fess JSON API from Python:
```python
import requests

class FessClient:
    def __init__(self, base_url):
        self.base_url = base_url
        self.api_url = f"{base_url}/json/"

    def search(self, query, num=10, start=0):
        """Perform a search query"""
        params = {'q': query, 'num': num, 'start': start}
        response = requests.get(self.api_url, params=params)
        return response.json()

    def suggest(self, query):
        """Get search suggestions"""
        suggest_url = f"{self.base_url}/suggest/"
        response = requests.get(suggest_url, params={'q': query})
        return response.json()

# Usage
client = FessClient("https://example-app.klutch.sh")
results = client.search("artificial intelligence", num=20)

for item in results['response']['result']:
    print(f"Title: {item['title']}")
    print(f"URL: {item['url']}")
    print(f"Score: {item['score']}")
    print("---")
```

Node.js API Client
```javascript
const axios = require('axios');

class FessClient {
  constructor(baseUrl) {
    this.baseUrl = baseUrl;
    this.apiUrl = `${baseUrl}/json/`;
  }

  async search(query, options = {}) {
    const params = {
      q: query,
      num: options.num || 10,
      start: options.start || 0,
      ...options
    };

    try {
      const response = await axios.get(this.apiUrl, { params });
      return response.data;
    } catch (error) {
      console.error('Search error:', error);
      throw error;
    }
  }

  async suggest(query) {
    const suggestUrl = `${this.baseUrl}/suggest/`;
    try {
      const response = await axios.get(suggestUrl, { params: { q: query } });
      return response.data;
    } catch (error) {
      console.error('Suggestion error:', error);
      throw error;
    }
  }
}

// Usage
const client = new FessClient('https://example-app.klutch.sh');

(async () => {
  const results = await client.search('machine learning', { num: 20 });

  results.response.result.forEach(item => {
    console.log(`Title: ${item.title}`);
    console.log(`URL: ${item.url}`);
    console.log(`Score: ${item.score}`);
    console.log('---');
  });
})();
```

Advanced Configuration
Custom Dockerfile for Additional Features
If you need custom dictionaries, plugins, or configurations:
```dockerfile
FROM codelibs/fess:14.14

# Install additional system packages
USER root
RUN apt-get update && apt-get install -y \
    curl \
    vim \
    && rm -rf /var/lib/apt/lists/*

# Copy custom dictionaries
COPY ./custom_dicts /opt/fess/app/WEB-INF/classes/fess_dict/

# Copy custom configuration
COPY ./fess_config.properties /opt/fess/app/WEB-INF/classes/

# Switch back to the fess user
USER fess

WORKDIR /opt/fess
EXPOSE 8080
```

File System Crawler Setup
To crawl local file systems or network shares:
1. Add a persistent volume for the files to crawl:
   - Mount Path: `/var/fess-files`
   - Size: Based on your file storage needs

2. Configure a file system crawler:
   - Navigate to Crawler > File System
   - Click “Create New”
   - Enter the path: `file:///var/fess-files/`
   - Configure supported file types (PDF, Word, Excel, etc.)
   - Save and start the crawler

3. Upload files to the volume through your deployment workflow or SFTP
Data Store Crawler Setup
To index data from databases:
1. Install JDBC drivers in a custom Dockerfile if needed

2. Configure a data store crawler:
   - Navigate to Crawler > Data Store
   - Click “Create New”
   - Select database type (MySQL, PostgreSQL, etc.)
   - Enter connection details:

     ```
     url=jdbc:postgresql://your-db.klutch.sh:8000/dbname
     username=dbuser
     password=dbpass
     ```

   - Define the SQL query to fetch data:

     ```sql
     SELECT id, title, content, url, updated_at
     FROM articles
     WHERE published = true
     ```

   - Map fields to Fess fields
   - Save and start the crawler
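To make the field-mapping step concrete, here is a hypothetical Python sketch of the transformation the data store crawler performs: each row returned by the SQL query above becomes one search document. The Fess-side field names shown (`url`, `title`, `content`, `last_modified`) are common document fields, but verify them against the field mappings in your Fess version:

```python
def row_to_fess_doc(row):
    """Map one database row (keys match the SQL above) to a Fess document.

    Field names on the right are assumptions based on common Fess document
    fields; confirm them in your instance's admin UI before relying on them.
    """
    return {
        "url": row["url"],
        "title": row["title"],
        "content": row["content"],
        "last_modified": row["updated_at"],
    }

# Illustrative row, as the SELECT above would return it
row = {"id": 1, "title": "Hello", "content": "Body text",
       "url": "https://example.com/articles/1", "updated_at": "2024-01-01"}
print(row_to_fess_doc(row))
```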
Production Best Practices
Security
- Change default admin password: Always use a strong, unique password for the admin account
- Enable Elasticsearch authentication: Configure username/password authentication for Elasticsearch
- Use HTTPS: Klutch.sh provides automatic HTTPS for all apps
- Implement user authentication: Configure LDAP, Active Directory, or SSO for multi-user access
- Restrict admin access: Use role-based permissions to limit administrative functions
- Regular security updates: Keep Fess and Elasticsearch updated to the latest versions
Performance Optimization
1. Allocate sufficient resources:
   - Minimum 1 CPU core and 2GB RAM for Fess
   - At least 2GB RAM for Elasticsearch
   - Scale based on crawl volume and search traffic

2. Configure crawler throttling:
   - Set an appropriate `FESS_CRAWLER_INTERVAL` to avoid overwhelming target sites
   - Limit `FESS_CRAWLER_THREADS` based on your resources

3. Optimize Elasticsearch:
   - Configure proper Java heap size (50% of available RAM, max 32GB)
   - Use SSD storage for persistent volumes
   - Implement index lifecycle management for old data

4. Enable caching:
   - Configure HTTP caching headers for search results
   - Use a CDN for static assets

5. Monitor performance:
   - Track search response times
   - Monitor crawler job duration
   - Watch Elasticsearch cluster health
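The heap-sizing rule above (half of available RAM, capped at 32 GB) is simple enough to express as a helper; this is just the arithmetic behind the guideline, not an official Elasticsearch tool:

```python
def es_heap_gb(total_ram_gb):
    """Heap size per the common rule: half of RAM, never more than 32 GB."""
    return min(total_ram_gb // 2, 32)

print(es_heap_gb(4))    # 2  -> e.g. ES_JAVA_OPTS="-Xms2g -Xmx2g"
print(es_heap_gb(128))  # 32 -> capped, leaving the rest for the OS cache
```

Set `-Xms` and `-Xmx` to the same value so the heap does not resize at runtime.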
Backup Strategy
1. Backup Fess configuration:
   - Export crawler configurations regularly
   - Document environment variables and settings

2. Backup Elasticsearch indexes:
   - Configure Elasticsearch snapshots
   - Store snapshots in remote storage (S3, GCS, etc.)
   - Test restoration procedures

3. Backup logs:
   - Archive crawl logs for audit trails
   - Monitor and rotate log files to prevent disk space issues
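A sketch of the snapshot setup, using Elasticsearch's snapshot API. The repository name, bucket, and index pattern below are placeholders — adjust them to your storage provider and index naming, and note that the `s3` repository type requires the corresponding repository plugin:

```python
import json

# Hypothetical repository settings for a PUT /_snapshot/fess-backups request.
repo_body = {
    "type": "s3",
    "settings": {"bucket": "my-fess-snapshots", "region": "us-east-1"},
}

# Hypothetical snapshot body for PUT /_snapshot/fess-backups/snapshot-1,
# capturing indexes whose names start with "fess".
snapshot_body = {"indices": "fess*", "ignore_unavailable": True}

print(json.dumps(repo_body))
print(json.dumps(snapshot_body))
```

Send these bodies with `curl -X PUT -H "Content-Type: application/json"` against your Elasticsearch endpoint, then verify with `GET /_snapshot/fess-backups/_all`.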
Scaling Considerations
For high-traffic deployments:
1. Vertical scaling:
   - Increase CPU and memory allocation
   - Use larger persistent volumes

2. Elasticsearch scaling:
   - Consider a multi-node Elasticsearch cluster
   - Use index sharding for large datasets
   - Implement read replicas for search performance

3. Crawler optimization:
   - Distribute crawling across multiple time windows
   - Use incremental crawling for large sites
   - Prioritize high-value content
Troubleshooting
Application Won’t Start
Issue: Container starts but Fess doesn’t respond
Solutions:
- Verify internal port is set to `8080`
- Check Elasticsearch connection in environment variables
- Review application logs for startup errors
- Ensure `ES_HTTP_URL` is accessible from the Fess container
- Verify Java heap size settings aren’t exceeding available memory
Cannot Connect to Elasticsearch
Issue: Fess reports Elasticsearch connection errors
Solutions:
- Verify the `ES_HTTP_URL` format is correct (include `http://` or `https://`)
- Check that Elasticsearch is running and accessible
- Confirm port 8000 (or 9200) is correct for your Elasticsearch deployment
- Test Elasticsearch connectivity: `curl http://your-elasticsearch.klutch.sh:8000`
- Verify authentication credentials if using secured Elasticsearch
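The URL-format check above can be done offline with Python's `urllib.parse`; the specific problems flagged here are illustrative, not an exhaustive validation:

```python
from urllib.parse import urlparse

def check_es_url(url):
    """Return a list of likely problems with an ES_HTTP_URL value."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("missing http:// or https:// scheme")
    if not parsed.hostname:
        problems.append("missing hostname")
    if parsed.hostname and parsed.port is None:
        problems.append("no explicit port (expected e.g. 8000 or 9200)")
    return problems

# A bare host:port value is a common mistake; the scheme is required
print(check_es_url("your-elasticsearch.klutch.sh:8000"))
print(check_es_url("http://your-elasticsearch.klutch.sh:8000"))
```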
Crawler Not Indexing Content
Issue: Crawler runs but no content appears in search results
Solutions:
- Check crawler logs in `/opt/fess/logs/fess-crawler.log`
- Verify the target URL is accessible from the Fess container
- Ensure robots.txt allows crawling
- Check crawler configuration for correct URL patterns
- Verify Elasticsearch has sufficient storage space
- Review crawler user agent settings if sites are blocking
Search Results Not Appearing
Issue: Search queries return no results despite successful crawling
Solutions:
- Verify Elasticsearch indexing completed successfully
- Check Elasticsearch cluster health
- Review Fess search configuration
- Ensure proper field mappings in Elasticsearch
- Clear and rebuild search indexes if necessary
- Check for Elasticsearch storage capacity issues
Out of Storage Space
Issue: Cannot crawl more content or persistent volume is full
Solutions:
- Increase persistent volume size in Klutch.sh
- Clean up old or unused crawled data
- Implement index rotation and cleanup policies
- Delete old Elasticsearch indexes
- Archive and compress old log files
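The index rotation and cleanup idea above can be sketched as a retention check; the index names and dates here are hypothetical, and actual deletion would go through Elasticsearch's delete index API:

```python
from datetime import date, timedelta

def indexes_to_delete(index_dates, today, keep_days=90):
    """Pick (name, created) pairs older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    return [name for name, created in index_dates if created < cutoff]

# Illustrative dated indexes, as a date-suffixed naming scheme would produce
indexes = [
    ("crawl-2023-01-01", date(2023, 1, 1)),
    ("crawl-2024-06-01", date(2024, 6, 1)),
]
print(indexes_to_delete(indexes, today=date(2024, 7, 1)))
# → ['crawl-2023-01-01']
```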
Slow Search Performance
Issue: Search queries take too long to return results
Solutions:
- Increase Fess and Elasticsearch resources (CPU/RAM)
- Optimize Elasticsearch index settings
- Reduce the number of fields being searched
- Implement query result caching
- Add more Elasticsearch nodes for larger datasets
- Review and optimize crawler schedules to reduce load
Memory Issues
Issue: Container crashes with out-of-memory errors
Solutions:
- Increase the instance memory allocation
- Adjust `FESS_JAVA_OPTS` heap size settings
- Reduce `FESS_CRAWLER_THREADS` to lower memory usage
- Optimize Elasticsearch memory settings
- Review and reduce concurrent crawler jobs
Monitoring and Maintenance
Health Checks
Monitor these endpoints for system health:
```bash
# Check Fess status
curl https://example-app.klutch.sh/admin/system

# Check Elasticsearch health
curl http://your-elasticsearch.klutch.sh:8000/_cluster/health
```

Log Monitoring
Important log files to monitor:
- Application logs: `/opt/fess/logs/fess.log`
- Crawler logs: `/opt/fess/logs/fess-crawler.log`
- Audit logs: `/opt/fess/logs/audit.log`
- Error logs: `/opt/fess/logs/error.log`
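A minimal sketch for scanning log excerpts for problems; the log line format shown is illustrative, so match the `ERROR`/`WARN` markers to your actual Fess log layout:

```python
def error_lines(log_text):
    """Extract lines that look like errors or warnings from a log excerpt."""
    return [line for line in log_text.splitlines()
            if " ERROR " in line or " WARN " in line]

# Illustrative excerpt in a typical timestamp-level-message layout
sample = (
    "2024-01-01 10:00:00 INFO  Crawler started\n"
    "2024-01-01 10:00:05 ERROR Failed to fetch https://example.com/page\n"
)
print(error_lines(sample))
```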
Regular Maintenance Tasks
1. Weekly:
   - Review crawler logs for errors
   - Check search analytics and popular queries
   - Monitor storage usage

2. Monthly:
   - Update Fess and Elasticsearch to latest patch versions
   - Review and optimize crawler schedules
   - Analyze search performance metrics
   - Archive old logs

3. Quarterly:
   - Review and update crawler configurations
   - Optimize Elasticsearch indexes
   - Test backup and restore procedures
   - Review user access and permissions
Integration Examples
WordPress Plugin Integration
Create a WordPress search plugin that uses Fess:
```php
<?php
function fess_search_results($query) {
    $fess_url = 'https://example-app.klutch.sh/json/';
    // Pass the raw query: http_build_query URL-encodes values itself,
    // so calling urlencode() here as well would double-encode it.
    $params = array(
        'q' => $query,
        'num' => 10
    );

    $url = $fess_url . '?' . http_build_query($params);
    $response = wp_remote_get($url);

    if (is_wp_error($response)) {
        return array('error' => 'Search unavailable');
    }

    $body = wp_remote_retrieve_body($response);
    return json_decode($body, true);
}

// Add shortcode for search results
add_shortcode('fess_search', function($atts) {
    $query = isset($_GET['q']) ? sanitize_text_field($_GET['q']) : '';

    if (empty($query)) {
        return '<p>Please enter a search query.</p>';
    }

    $results = fess_search_results($query);

    $html = '<div class="fess-results">';
    foreach ($results['response']['result'] as $item) {
        $html .= sprintf(
            '<div class="result"><h3><a href="%s">%s</a></h3><p>%s</p></div>',
            esc_url($item['url']),
            esc_html($item['title']),
            esc_html($item['content_description'])
        );
    }
    $html .= '</div>';

    return $html;
});
```

React Search Component
```jsx
import React, { useState } from 'react';

const FessSearch = () => {
  const [query, setQuery] = useState('');
  const [results, setResults] = useState([]);
  const [loading, setLoading] = useState(false);

  const handleSearch = async (e) => {
    e.preventDefault();
    setLoading(true);

    try {
      const response = await fetch(
        `https://example-app.klutch.sh/json/?q=${encodeURIComponent(query)}&num=20`
      );
      const data = await response.json();
      setResults(data.response.result);
    } catch (error) {
      console.error('Search error:', error);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="fess-search">
      <form onSubmit={handleSearch}>
        <input
          type="text"
          value={query}
          onChange={(e) => setQuery(e.target.value)}
          placeholder="Search..."
        />
        <button type="submit">Search</button>
      </form>

      {loading && <p>Searching...</p>}

      <div className="results">
        {results.map((item, index) => (
          <div key={index} className="result-item">
            <h3>
              <a href={item.url} target="_blank" rel="noopener noreferrer">
                {item.title}
              </a>
            </h3>
            <p>{item.content_description}</p>
            <small>{item.url}</small>
          </div>
        ))}
      </div>
    </div>
  );
};

export default FessSearch;
```

Migrating from Other Search Platforms
From Apache Solr
If you’re migrating from Solr to Fess:
- Export Solr data using the Solr export API
- Transform data to match Fess’s expected format
- Import into Elasticsearch using bulk API
- Configure Fess crawlers to maintain data freshness
- Update application search endpoints to use Fess JSON API
From Algolia
Migrating from Algolia to Fess:
- Export Algolia indexes using their API
- Map Algolia attributes to Fess fields
- Import data into Elasticsearch
- Configure search relevance to match Algolia behavior
- Update client code to use Fess REST API
Cost Optimization
Resource Allocation
- Start with minimal resources and scale based on usage
- Monitor actual resource utilization before upgrading
- Use crawler schedules during off-peak hours
Elasticsearch Optimization
- Use index lifecycle management to delete old data
- Implement index compression
- Reduce replica count for non-critical indexes
Crawler Efficiency
- Limit crawl depth to essential levels
- Use robots.txt to exclude unnecessary pages
- Implement incremental crawling for large sites
- Schedule crawls during low-traffic periods
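The robots.txt point above can be checked offline with Python's `urllib.robotparser` — the same kind of check Fess's crawler performs before fetching a page. The rules below are illustrative:

```python
from urllib.robotparser import RobotFileParser

# Parse illustrative robots.txt rules directly from lines
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Pages outside the disallowed prefix may be crawled; others may not
print(rp.can_fetch("Fess/14.14", "https://www.example.com/docs/intro"))
print(rp.can_fetch("Fess/14.14", "https://www.example.com/private/report"))
```

Excluding whole sections this way cuts crawl time and index size before tuning depth or schedules.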
Resources
- Fess Official Documentation
- Fess GitHub Repository
- Fess Docker Hub
- Fess API Documentation
- Klutch.sh Quick Start Guide
- Klutch.sh Volumes Guide
- Klutch.sh Builds Guide
- Klutch.sh Deployments Guide
Conclusion
You now have a fully operational Fess enterprise search server running on Klutch.sh with persistent storage, configured Elasticsearch backend, and production-ready settings. Your search platform is ready to:
- Crawl and index websites, file systems, and databases
- Provide powerful full-text search capabilities
- Handle multi-language content and search queries
- Integrate with your applications through REST APIs
- Scale as your content and search traffic grow
With Fess deployed on Klutch.sh, you have a robust, self-hosted search solution that gives you complete control over your search data and functionality. For questions or community support, refer to the Fess GitHub Discussions or the official documentation.