Deploying Fedora Commons Repository

Introduction

Fedora Commons Repository (often simply called Fedora) is a robust, open-source digital repository system designed for the management and preservation of complex digital objects. Built on Java and leveraging proven enterprise technologies, Fedora provides libraries, museums, archives, universities, and research institutions with a flexible framework for organizing, securing, and providing access to digital collections spanning documents, images, audio, video, datasets, and more.

Originally developed at Cornell University and the University of Virginia, Fedora has become a cornerstone of digital preservation infrastructure worldwide. The system uses a modular architecture that separates storage, access, and preservation concerns, allowing organizations to build custom solutions that meet specific institutional needs while maintaining interoperability with other systems through open standards.

Key Features

Flexible Content Model - Store any type of digital content with custom metadata schemas
Linked Data Support - RDF-based resource descriptions enable semantic web integration
RESTful API - Comprehensive HTTP API for programmatic access and automation
Modular Architecture - Pluggable storage backends, indexing systems, and authentication mechanisms
Preservation Services - Versioning, fixity checking, and format validation for long-term preservation
Access Control - Fine-grained permissions using Web Access Control (WebAC)
Binary Storage - Efficient storage of large files with support for external storage systems
Transactions - Atomic operations ensure data consistency across complex updates
SPARQL Endpoint - Query repository metadata using SPARQL query language
Integration Ready - Works with Solr for search, PostgreSQL for metadata, and various storage backends
Standards Compliant - Implements Portland Common Data Model, LDP, Memento, and other specifications
Audit Trail - Complete history of all repository operations for compliance and accountability
Container Hierarchy - Organize resources in nested containers for logical structure
External Content - Reference content stored in other systems without duplication
Open Source - Apache 2.0 licensed with active community development

Use Cases

Digital Libraries - Build comprehensive digital collections with full-text search and metadata harvesting
Institutional Repositories - Manage research outputs, theses, dissertations, and scholarly communications
Digital Preservation - Long-term preservation of cultural heritage and institutional memory
Data Management - Research data repositories with version control and access management
Media Archives - Organize and provide access to audio, video, and image collections
Digital Humanities - Support scholarly projects with complex digital objects and relationships

Why Deploy Fedora Commons on Klutch.sh?

Deploying Fedora Repository on Klutch.sh provides institutional-grade digital asset management infrastructure:

Enterprise Reliability - High-availability infrastructure for mission-critical digital collections
Cost Effective - No per-object or per-storage-TB pricing common with managed services
Data Sovereignty - Complete control over your digital assets and metadata
Fast Deployment - Automatic Dockerfile detection eliminates complex server configuration
Persistent Storage - Reliable volumes for repository data, indexes, and backups
Custom Domains - Professional access at your institution’s domain
Automatic HTTPS - Built-in SSL certificates for secure repository access
Scalable Performance - Handle large collections with millions of objects
Integration Flexibility - Deploy alongside PostgreSQL, Solr, and other services
Development Environments - Spin up testing instances without infrastructure overhead

Prerequisites

Before deploying Fedora Commons on Klutch.sh, ensure you have:

A Klutch.sh account
A Git repository on GitHub containing your project files
A basic understanding of Java applications and repository systems
A PostgreSQL database deployed (see our PostgreSQL guide)
A Solr instance deployed (see our Solr guide)
An understanding of RDF and linked data concepts (helpful but not required)

Preparing Your Fedora Commons Repository

Fedora Commons is a Java application. In this guide it uses PostgreSQL for metadata storage and Solr for indexing, both deployed as separate services (see Prerequisites). The container built below runs Fedora itself and connects to those services through environment variables.

Project Structure

Create the following structure in your Git repository:

fedora-repository/
├── Dockerfile
├── config/
│   ├── application.yaml
│   └── repository.json
└── README.md

Create the Dockerfile

Create a Dockerfile in your repository root:

FROM openjdk:11-jre-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    wget \
    curl \
    unzip \
    && rm -rf /var/lib/apt/lists/*

# Set Fedora version
ENV FEDORA_VERSION=6.2.0

# Create application directory
WORKDIR /opt/fedora

# Download Fedora Commons (confirm the exact asset name for your version on the fcrepo releases page)
RUN wget -q https://github.com/fcrepo/fcrepo/releases/download/fcrepo-${FEDORA_VERSION}/fcrepo-webapp-${FEDORA_VERSION}.jar \
    -O fcrepo-webapp.jar

# Create data and configuration directories
RUN mkdir -p /data/fcrepo \
    && mkdir -p /config

# Copy custom configuration (COPY does not support shell redirection or "|| true",
# so the config/ directory must exist in the build context)
COPY config/ /config/

# Set environment variables for configuration
ENV FCREPO_HOME=/data/fcrepo \
    FCREPO_CONFIG_FILE=/config/application.yaml \
    JAVA_OPTS="-Xmx2g -Xms1g"

# Expose Fedora port
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8080/rest/ || exit 1

# Start Fedora (exec replaces the shell so the JVM receives stop signals directly)
CMD exec java $JAVA_OPTS \
    -Dfcrepo.home=$FCREPO_HOME \
    -Dfcrepo.config.file=$FCREPO_CONFIG_FILE \
    -jar fcrepo-webapp.jar

Configuration Files

Create config/application.yaml for Fedora configuration:

# Fedora Repository Configuration
server:
  port: 8080
  servlet:
    context-path: /

spring:
  datasource:
    url: ${DATABASE_URL:jdbc:postgresql://localhost:5432/fedora}
    username: ${DATABASE_USER:fedora}
    password: ${DATABASE_PASSWORD:changeme}
    driver-class-name: org.postgresql.Driver
  jpa:
    hibernate:
      ddl-auto: update
    properties:
      hibernate:
        dialect: org.hibernate.dialect.PostgreSQLDialect

fcrepo:
  # Repository paths
  home: ${FCREPO_HOME:/data/fcrepo}

  # Storage configuration
  storage:
    type: file
    location: ${FCREPO_HOME:/data/fcrepo}/objects

  # Database configuration
  persistence:
    type: database

  # Indexing configuration
  indexing:
    enabled: true
    solr:
      url: ${SOLR_URL:http://localhost:8983/solr/fedora}

  # Authentication
  auth:
    webac:
      enabled: true
      path: ${FCREPO_HOME:/data/fcrepo}/acl

  # Versioning
  versioning:
    enabled: true

  # External content
  external-content:
    enabled: true
    allowed-schemes:
      - http
      - https
      - file

# Logging
logging:
  level:
    org.fcrepo: INFO
    org.springframework: WARN
  file:
    name: ${FCREPO_HOME:/data/fcrepo}/logs/fedora.log
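
The `${NAME:default}` placeholders in the file above take the value of the named environment variable and fall back to the text after the first colon when the variable is unset. The sketch below illustrates only that rule. `resolve_placeholders` is a hypothetical helper written for this guide, not part of Fedora; the real substitution happens inside the application at startup.

```python
import re

# Matches ${NAME} or ${NAME:default}. The default runs to the closing brace,
# so it may itself contain colons (as in a JDBC URL).
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def resolve_placeholders(text, env):
    """Replace each placeholder with env[NAME], or its default if unset."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"{name} is not set and has no default")

    return _PLACEHOLDER.sub(substitute, text)


line = "location: ${FCREPO_HOME:/data/fcrepo}/objects"
print(resolve_placeholders(line, {}))
print(resolve_placeholders(line, {"FCREPO_HOME": "/srv/fedora"}))
```

A placeholder without a default, such as `${SOLR_URL}`, raises an error in this sketch when the variable is missing, which is one reasonable way to surface a forgotten setting early.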

Create config/repository.json for repository metadata:

{
  "@context": {
    "dc": "http://purl.org/dc/elements/1.1/",
    "fedora": "http://fedora.info/definitions/v4/repository#"
  },
  "@id": "",
  "@type": "fedora:Repository",
  "dc:title": "Fedora Repository",
  "dc:description": "Digital asset management and preservation repository"
}

Alternative: Simplified Dockerfile

If you prefer a simpler setup without custom configuration:

FROM openjdk:11-jre-slim

# Install dependencies
RUN apt-get update && apt-get install -y wget curl && rm -rf /var/lib/apt/lists/*

# Set version and working directory
ENV FEDORA_VERSION=6.2.0
WORKDIR /opt/fedora

# Download Fedora
RUN wget -q https://github.com/fcrepo/fcrepo/releases/download/fcrepo-${FEDORA_VERSION}/fcrepo-webapp-${FEDORA_VERSION}.jar \
    -O fcrepo-webapp.jar

# Create data directory
RUN mkdir -p /data/fcrepo

ENV FCREPO_HOME=/data/fcrepo

EXPOSE 8080

CMD ["java", "-Dfcrepo.home=/data/fcrepo", "-jar", "fcrepo-webapp.jar"]

Environment Variables

Fedora Commons can be configured using environment variables:

Required:

DATABASE_URL - PostgreSQL connection string (e.g., jdbc:postgresql://postgres.klutch.sh:8000/fedora)
DATABASE_USER - Database username
DATABASE_PASSWORD - Database password (mark as sensitive)

Optional:

SOLR_URL - Solr instance URL for indexing (e.g., http://solr.klutch.sh/solr/fedora)
FCREPO_HOME - Base directory for repository data (default: /data/fcrepo)
JAVA_OPTS - JVM options (e.g., -Xmx4g -Xms2g)
FCREPO_MODESHAPE_CONFIGURATION - Path to custom ModeShape configuration (Fedora 4.x and 5.x only; Fedora 6 no longer uses ModeShape)
FCREPO_SPRING_CONFIGURATION - Additional Spring configuration
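
A deployment is easier to debug when missing settings are caught before the JVM starts. The following sketch checks the required variables listed above. `missing_required` is a hypothetical helper, and the variable names are the ones this guide defines.

```python
import os

# Names taken from the "Required" list in this guide.
REQUIRED = ("DATABASE_URL", "DATABASE_USER", "DATABASE_PASSWORD")


def missing_required(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_required(os.environ)
if missing:
    print("Missing required variables:", ", ".join(missing))
else:
    print("All required variables are set")
```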

Setting Up Database

Before deploying Fedora, create a PostgreSQL database. Connect to your PostgreSQL instance and run:

-- Create database for Fedora
CREATE DATABASE fedora;

-- Create user with appropriate permissions
CREATE USER fedora_user WITH PASSWORD 'your_secure_password';

-- Grant permissions
GRANT ALL PRIVILEGES ON DATABASE fedora TO fedora_user;

-- Connect to the database
\c fedora

-- Grant schema permissions
GRANT ALL ON SCHEMA public TO fedora_user;

Setting Up Solr

Create a Solr core for Fedora indexing by calling the CoreAdmin API on your Solr instance:

# Create Fedora core with basic configuration
curl "http://your-solr-instance:8983/solr/admin/cores?action=CREATE&name=fedora&configSet=_default"

For production deployments, you’ll want to customize the Solr schema for optimal Fedora integration. Create a custom schema that includes Fedora-specific fields.

Deploying to Klutch.sh

Push to GitHub: Commit and push your Dockerfile and configuration to your GitHub repository:
```
git add .
git commit -m "Add Fedora Commons configuration"
git push origin main
```
Create New App: Log in to the Klutch.sh dashboard and click "Create New App".
Connect Repository: Select your GitHub repository containing the Fedora Commons configuration. Klutch.sh will automatically detect the Dockerfile.
Configure App Settings:
- App Name: Choose a name (e.g., fedora-repository)
- Region: Select your preferred deployment region
- Branch: Choose the branch to deploy (typically main)
Set Environment Variables: Add the following environment variables in the dashboard:
- DATABASE_URL: jdbc:postgresql://your-postgres.klutch.sh:8000/fedora
- DATABASE_USER: fedora_user
- DATABASE_PASSWORD: Your secure password (mark as sensitive)
- SOLR_URL: http://your-solr.klutch.sh/solr/fedora
- FCREPO_HOME: /data/fcrepo
- JAVA_OPTS: -Xmx2g -Xms1g -Djava.awt.headless=true
Configure Networking:
- Traffic Type: Select HTTP
- Internal Port: Set to 8080 (Fedora's default port)
Attach Persistent Volume: Fedora requires persistent storage for repository data:
- Mount Path: /data/fcrepo
- Size: Start with 10GB, scale based on collection size (repositories can grow to terabytes)
This volume stores all repository objects, metadata, versions, and access control data.
Deploy: Click "Deploy" to start the deployment. Klutch.sh will build the Docker image and launch Fedora Commons.
Monitor Deployment: Watch the deployment logs to ensure Fedora starts correctly. Look for messages indicating:
- Database connection successful
- Repository initialization complete
- REST API available
- Server started on port 8080
Access Your Repository: Once deployed, access Fedora at:
```
https://fedora-repository.klutch.sh/rest/
```
The /rest/ endpoint is the main entry point for the Fedora REST API.

Initial Configuration

Verify Repository Status

Check that your Fedora repository is running:

# Check repository root
curl -i https://fedora-repository.klutch.sh/rest/

# Response should show:
# HTTP/1.1 200 OK
# Link: <http://www.w3.org/ns/ldp#BasicContainer>; rel="type"
# Link: <http://www.w3.org/ns/ldp#Resource>; rel="type"
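
The `Link` headers shown above are what identify the repository root as an LDP container. As a minimal sketch, the check can be automated by extracting every `rel="type"` URI from the header values. `link_types` and `is_basic_container` are hypothetical helpers, and the parser assumes `rel` is the first parameter after the URI, as in the sample response.

```python
import re

LDP_BASIC_CONTAINER = "http://www.w3.org/ns/ldp#BasicContainer"

# Captures the URI of every link whose first parameter is rel="type".
_TYPE_LINK = re.compile(r'<([^>]*)>\s*;\s*rel="?type"?')


def link_types(link_headers):
    """Collect rel="type" URIs from one or more Link header values."""
    types = []
    for value in link_headers:
        # A single header value may carry several comma-separated links.
        types.extend(_TYPE_LINK.findall(value))
    return types


def is_basic_container(link_headers):
    """True when the headers advertise the LDP BasicContainer type."""
    return LDP_BASIC_CONTAINER in link_types(link_headers)


headers = [
    '<http://www.w3.org/ns/ldp#BasicContainer>; rel="type"',
    '<http://www.w3.org/ns/ldp#Resource>; rel="type"',
]
print(is_basic_container(headers))  # True
```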

Create Your First Container

Create a container to organize your digital objects:

# Create a container
curl -X POST \
  https://fedora-repository.klutch.sh/rest/ \
  -H "Content-Type: text/turtle" \
  -H "Slug: collections" \
  --data-binary '@-' << EOF
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<> dc:title "Digital Collections" ;
   dcterms:description "Top-level container for all collections" .
EOF

Add a Digital Object

Upload a digital object with metadata:

# Create an object
curl -X POST \
  https://fedora-repository.klutch.sh/rest/collections \
  -H "Content-Type: text/turtle" \
  -H "Slug: photo-001" \
  --data-binary '@-' << EOF
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<> dc:title "Historic Photograph" ;
   dc:creator "Unknown Photographer" ;
   dcterms:created "1920-01-01"^^xsd:date ;
   dcterms:subject "Architecture", "History" .
EOF

# Upload binary content (image file)
curl -X PUT \
  https://fedora-repository.klutch.sh/rest/collections/photo-001/photo \
  -H "Content-Type: image/jpeg" \
  --data-binary @photo.jpg

Query with SPARQL

Query repository metadata using SPARQL:

# SPARQL query to find all objects
curl -X POST \
  https://fedora-repository.klutch.sh/rest/fcr:search \
  -H "Content-Type: application/sparql-query" \
  --data-binary '@-' << EOF
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?subject ?title
WHERE {
  ?subject dc:title ?title .
}
LIMIT 100
EOF

Set Access Controls

Implement Web Access Control (WebAC):

# Create ACL for a resource
curl -X PUT \
  https://fedora-repository.klutch.sh/rest/collections/photo-001/fcr:acl \
  -H "Content-Type: text/turtle" \
  --data-binary '@-' << EOF
@prefix acl: <http://www.w3.org/ns/auth/acl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<#publicRead>
  a acl:Authorization ;
  acl:accessTo <> ;
  acl:mode acl:Read ;
  acl:agentClass foaf:Agent .

<#adminControl>
  a acl:Authorization ;
  acl:accessTo <> ;
  acl:mode acl:Read, acl:Write, acl:Control ;
  acl:agent <http://example.org/users/admin> .
EOF
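
When many resources need similar rules, the ACL body can be generated rather than written by hand. The sketch below renders authorization blocks in the same shape as the example above. `authorization` is a hypothetical helper; the four mode names come from the Web Access Control vocabulary.

```python
ACL_PREFIX = "@prefix acl: <http://www.w3.org/ns/auth/acl#> .\n"
VALID_MODES = ("Read", "Write", "Append", "Control")


def authorization(name, modes, agent=None, agent_class=None):
    """Render one acl:Authorization block in Turtle."""
    unknown = [mode for mode in modes if mode not in VALID_MODES]
    if not modes or unknown:
        raise ValueError(f"modes must be a non-empty subset of {VALID_MODES}")
    if (agent is None) == (agent_class is None):
        raise ValueError("give exactly one of agent or agent_class")
    lines = [
        f"<#{name}>",
        "  a acl:Authorization ;",
        "  acl:accessTo <> ;",
        "  acl:mode " + ", ".join(f"acl:{mode}" for mode in modes) + " ;",
    ]
    if agent is not None:
        lines.append(f"  acl:agent <{agent}> .")
    else:
        lines.append(f"  acl:agentClass <{agent_class}> .")
    return "\n".join(lines) + "\n"


FOAF_AGENT = "http://xmlns.com/foaf/0.1/Agent"
body = ACL_PREFIX + "\n" + authorization("publicRead", ["Read"], agent_class=FOAF_AGENT)
print(body)
```

The resulting string can be sent as the request body of the `fcr:acl` PUT shown above.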

Working with Fedora Repository

REST API Examples

Python - Interact with Fedora using Python:

import requests
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import DC, DCTERMS, RDF

# Fedora repository URL
FEDORA_URL = "https://fedora-repository.klutch.sh/rest/"

# Create a container
def create_container(container_name, title, description):
    # Create RDF metadata
    g = Graph()
    subject = URIRef("")
    g.add((subject, DC.title, Literal(title)))
    g.add((subject, DCTERMS.description, Literal(description)))

    # Serialize to Turtle
    turtle_data = g.serialize(format='turtle')

    # POST to Fedora
    response = requests.post(
        FEDORA_URL,
        headers={
            'Content-Type': 'text/turtle',
            'Slug': container_name
        },
        data=turtle_data
    )

    if response.status_code == 201:
        print(f"Created: {response.headers['Location']}")
        return response.headers['Location']
    else:
        print(f"Error: {response.status_code}")
        return None

# Upload a binary file
def upload_binary(parent_path, filename, file_path):
    with open(file_path, 'rb') as f:
        response = requests.put(
            f"{FEDORA_URL}{parent_path}/{filename}",
            headers={'Content-Type': 'image/jpeg'},
            data=f
        )

    # 201 = new resource created, 204 = existing resource replaced
    if response.status_code in (201, 204):
        print(f"Uploaded: {response.url}")
        return True
    return False

# Get resource metadata
def get_metadata(resource_path):
    response = requests.get(
        f"{FEDORA_URL}{resource_path}",
        headers={'Accept': 'text/turtle'}
    )

    if response.status_code == 200:
        g = Graph()
        g.parse(data=response.text, format='turtle')
        return g
    return None

# Search with SPARQL
def search_by_title(title_substring):
    sparql_query = f"""
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    SELECT ?subject ?title
    WHERE {{
        ?subject dc:title ?title .
        FILTER(CONTAINS(LCASE(?title), "{title_substring.lower()}"))
    }}
    """

    response = requests.post(
        f"{FEDORA_URL}fcr:search",
        headers={'Content-Type': 'application/sparql-query'},
        data=sparql_query
    )

    return response.text

# Example usage
if __name__ == "__main__":
    # Create a collection
    collection_url = create_container(
        "photographs",
        "Photograph Collection",
        "Historic photographs from the archive"
    )

    # Upload an image
    if collection_url:
        upload_binary("photographs", "img001", "photo.jpg")

    # Search for items
    results = search_by_title("photograph")
    print(results)
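
The search function above places user input directly inside a quoted SPARQL string, so a title containing a double quote or backslash would break the query. A minimal sketch of escaping such values follows. `quote_literal` and `title_filter` are hypothetical helpers, and the escape rules are the ones shared by double-quoted Turtle and SPARQL strings.

```python
def quote_literal(value):
    """Return value as a double-quoted Turtle/SPARQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")  # backslash first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
        .replace("\t", "\\t")
    )
    return f'"{escaped}"'


def title_filter(substring):
    """Build a case-insensitive CONTAINS filter for the ?title variable."""
    return f"FILTER(CONTAINS(LCASE(?title), {quote_literal(substring.lower())}))"


print(title_filter('the "old" mill'))
```

The same `quote_literal` call applies wherever titles or descriptions are interpolated into a Turtle body.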

JavaScript/Node.js - Node.js integration using axios (the example relies on require and fs, so it does not run in a browser as written):

const axios = require('axios');
const fs = require('fs');

const FEDORA_URL = 'https://fedora-repository.klutch.sh/rest/';

// Create a container with RDF metadata
async function createContainer(containerName, metadata) {
  const turtle = `
    @prefix dc: <http://purl.org/dc/elements/1.1/> .
    @prefix dcterms: <http://purl.org/dc/terms/> .

    <> dc:title "${metadata.title}" ;
       dcterms:description "${metadata.description}" .
  `;

  try {
    const response = await axios.post(FEDORA_URL, turtle, {
      headers: {
        'Content-Type': 'text/turtle',
        'Slug': containerName
      }
    });

    console.log('Created:', response.headers.location);
    return response.headers.location;
  } catch (error) {
    console.error('Error:', error.response?.status, error.message);
    return null;
  }
}

// Upload binary file
async function uploadBinary(parentPath, filename, filePath) {
  const fileStream = fs.createReadStream(filePath);

  try {
    const response = await axios.put(
      `${FEDORA_URL}${parentPath}/${filename}`,
      fileStream,
      {
        headers: {
          'Content-Type': 'application/octet-stream'
        }
      }
    );

    console.log('Uploaded:', response.headers.location);
    return true;
  } catch (error) {
    console.error('Upload error:', error.message);
    return false;
  }
}

// Get resource metadata
async function getMetadata(resourcePath) {
  try {
    const response = await axios.get(`${FEDORA_URL}${resourcePath}`, {
      headers: {
        'Accept': 'application/ld+json'
      }
    });

    return response.data;
  } catch (error) {
    console.error('Error:', error.message);
    return null;
  }
}

// Update resource metadata
async function updateMetadata(resourcePath, newTitle) {
  const sparqlUpdate = `
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    DELETE { <> dc:title ?oldTitle }
    INSERT { <> dc:title "${newTitle}" }
    WHERE { <> dc:title ?oldTitle }
  `;

  try {
    await axios.patch(
      `${FEDORA_URL}${resourcePath}`,
      sparqlUpdate,
      {
        headers: {
          'Content-Type': 'application/sparql-update'
        }
      }
    );

    console.log('Metadata updated');
    return true;
  } catch (error) {
    console.error('Update error:', error.message);
    return false;
  }
}

// Delete resource
async function deleteResource(resourcePath) {
  try {
    await axios.delete(`${FEDORA_URL}${resourcePath}`);
    console.log('Resource deleted');
    return true;
  } catch (error) {
    console.error('Delete error:', error.message);
    return false;
  }
}

// Example usage
(async () => {
  // Create collection
  const collectionUrl = await createContainer('documents', {
    title: 'Document Collection',
    description: 'Archive documents'
  });

  if (collectionUrl) {
    // Upload file
    await uploadBinary('documents', 'doc001.pdf', './sample.pdf');

    // Get metadata
    const metadata = await getMetadata('documents/doc001.pdf');
    console.log('Metadata:', metadata);
  }
})();

Java - Enterprise integration example:

import org.apache.http.client.methods.*;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class FedoraClient {
    private static final String FEDORA_URL = "https://fedora-repository.klutch.sh/rest/";
    private final CloseableHttpClient httpClient;

    public FedoraClient() {
        this.httpClient = HttpClients.createDefault();
    }

    public String createContainer(String slug, String title, String description)
            throws Exception {
        HttpPost post = new HttpPost(FEDORA_URL);

        String turtle = String.format(
            "@prefix dc: <http://purl.org/dc/elements/1.1/> .\n" +
            "<> dc:title \"%s\" ;\n" +
            "   dc:description \"%s\" .",
            title, description
        );

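        // Encode the body as UTF-8 so non-Latin characters in titles survive
        post.setEntity(new StringEntity(turtle, java.nio.charset.StandardCharsets.UTF_8));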
        post.setHeader("Content-Type", "text/turtle");
        post.setHeader("Slug", slug);

        try (CloseableHttpResponse response = httpClient.execute(post)) {
            if (response.getStatusLine().getStatusCode() == 201) {
                return response.getFirstHeader("Location").getValue();
            }
            throw new Exception("Failed to create container");
        }
    }

    public String getMetadata(String resourcePath) throws Exception {
        HttpGet get = new HttpGet(FEDORA_URL + resourcePath);
        get.setHeader("Accept", "text/turtle");

        try (CloseableHttpResponse response = httpClient.execute(get)) {
            return EntityUtils.toString(response.getEntity());
        }
    }

    public void close() throws Exception {
        httpClient.close();
    }
}

Batch Import Script

For migrating existing collections:

import requests
from pathlib import Path

FEDORA_URL = "https://fedora-repository.klutch.sh/rest/"

def batch_import(source_dir, target_container):
    """Import all files from a directory to Fedora"""

    # Create target container (the Content-Type header is needed for the
    # Turtle body to be stored as RDF metadata)
    requests.post(
        FEDORA_URL,
        headers={'Slug': target_container, 'Content-Type': 'text/turtle'},
        data='<> <http://purl.org/dc/elements/1.1/title> "Batch Import" .'
    )

    # Containers already requested, so each directory is created only once
    created = {target_container}

    # Process all files
    for file_path in sorted(Path(source_dir).rglob('*')):
        if file_path.is_file():
            relative_path = file_path.relative_to(source_dir)

            # Create parent containers if needed
            current_path = target_container
            for part in relative_path.parts[:-1]:
                current_path = f"{current_path}/{part}"
                if current_path not in created:
                    # PUT targets this exact path; POST with a Slug may be renamed
                    requests.put(f"{FEDORA_URL}{current_path}")
                    created.add(current_path)

            # Upload file (as_posix keeps forward slashes on every platform)
            with open(file_path, 'rb') as f:
                response = requests.put(
                    f"{FEDORA_URL}{target_container}/{relative_path.as_posix()}",
                    data=f,
                    headers={'Content-Type': 'application/octet-stream'}
                )

            if response.ok:
                print(f"Imported: {relative_path}")
            else:
                print(f"Failed ({response.status_code}): {relative_path}")

# Usage
if __name__ == "__main__":
    batch_import('/path/to/files', 'imported_collection')
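
The import script uploads every file as `application/octet-stream`, which hides the real media type from later consumers. One possible refinement, sketched here with the standard-library `mimetypes` module, is to guess the type from the file extension and fall back to the generic type. `content_type_for` is a hypothetical helper name.

```python
import mimetypes
from pathlib import Path

DEFAULT_TYPE = "application/octet-stream"


def content_type_for(file_path):
    """Guess a Content-Type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(Path(file_path).name)
    return guessed or DEFAULT_TYPE


for name in ("scan.jpg", "thesis.pdf", "README"):
    print(name, "->", content_type_for(name))
```

The guess is based on the extension only, so compressed files such as `.tar.gz` report the type of the inner archive; content sniffing would be needed for stricter validation.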

Advanced Configuration

Custom Storage Backend

Configure alternative storage backends in application.yaml:

fcrepo:
  storage:
    type: s3
    s3:
      bucket: my-fedora-bucket
      region: us-east-1
      access-key: ${AWS_ACCESS_KEY}
      secret-key: ${AWS_SECRET_KEY}

Indexing Configuration

Advanced Solr indexing configuration:

fcrepo:
  indexing:
    enabled: true
    solr:
      url: ${SOLR_URL}
      commit-within: 1000
    predicate-filter:
      - http://www.w3.org/1999/02/22-rdf-syntax-ns#type
      - http://purl.org/dc/elements/1.1/title

Performance Tuning

Optimize Java heap and thread pools:

# Set via JAVA_OPTS environment variable
JAVA_OPTS=-Xmx4g -Xms2g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Djava.awt.headless=true

Enable Transactions

Configure transaction timeouts:

fcrepo:
  transaction:
    timeout: 180000  # 3 minutes in milliseconds

Integrations

Islandora Integration

Fedora is the backend for Islandora, a popular digital repository framework:

Deploy Fedora as described above
Configure Islandora to point to your Fedora instance
Use Islandora’s Drupal-based frontend for user access

Hydra/Samvera Integration

Integrate with Samvera (formerly Hydra) applications:

# In your Hydra application's config
ActiveFedora::Base.configure do |config|
  config.rest_api_url = "https://fedora-repository.klutch.sh/rest/"
end

OAI-PMH Harvesting

Expose Fedora content via OAI-PMH for metadata harvesting by other systems.

Backup and Disaster Recovery

Backup Strategy

Database Backup (PostgreSQL):

# Backup PostgreSQL database
pg_dump -h postgres.klutch.sh -p 8000 -U fedora_user fedora > fedora_backup.sql

Binary Content Backup:

# Backup repository files from persistent volume
tar -czf fedora_binaries_$(date +%Y%m%d).tar.gz /data/fcrepo

Export Repository (using Fedora API):

# Export all resources
curl -X GET \
  "https://fedora-repository.klutch.sh/rest/fcr:export" \
  -o repository_export.zip

Restore Procedures

Restore Database:

psql -h postgres.klutch.sh -p 8000 -U fedora_user fedora < fedora_backup.sql

Restore Binary Content:

tar -xzf fedora_binaries_20241220.tar.gz -C /

Import Repository:

curl -X POST \
  "https://fedora-repository.klutch.sh/rest/fcr:import" \
  -F "file=@repository_export.zip"

Automated Backup Script

#!/bin/bash
# Stop on errors, unset variables, and failures inside pipelines
set -euo pipefail

BACKUP_DIR="/backups/fedora"
DATE=$(date +%Y%m%d_%H%M%S)

# Make sure the backup directory exists
mkdir -p "$BACKUP_DIR"

# Backup database
pg_dump -h postgres.klutch.sh -p 8000 -U fedora_user fedora | \
  gzip > "$BACKUP_DIR/db_$DATE.sql.gz"

# Backup binaries via API (--fail makes curl exit non-zero on HTTP errors)
curl --fail -X GET \
  "https://fedora-repository.klutch.sh/rest/fcr:export" \
  -o "$BACKUP_DIR/binaries_$DATE.zip"

# Rotate old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "*.zip" -mtime +30 -delete

echo "Backup completed: $DATE"
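
The rotation step relies on `find -mtime +30`, which counts age in whole days and therefore matches files that are at least 31 days old. The sketch below reproduces that rule as a pure function so the retention window can be checked before anything is deleted. `expired` is a hypothetical helper and the file names are made up for the example.

```python
import time

SECONDS_PER_DAY = 86400


def expired(files, keep_days=30, now=None):
    """Return names whose age in whole days exceeds keep_days.

    files is an iterable of (name, mtime_epoch_seconds) pairs.
    """
    now = time.time() if now is None else now
    return [
        name
        for name, mtime in files
        if int((now - mtime) // SECONDS_PER_DAY) > keep_days
    ]


now = 1_700_000_000
sample = [
    ("db_new.sql.gz", now - 2 * SECONDS_PER_DAY),
    ("db_edge.sql.gz", now - 30 * SECONDS_PER_DAY),
    ("db_old.sql.gz", now - 31 * SECONDS_PER_DAY),
]
print(expired(sample, now=now))  # ['db_old.sql.gz']
```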

Production Best Practices

Security

Enable Authentication: Configure authentication providers in application.yaml
Use WebAC: Implement fine-grained access control for all resources
Secure Credentials: Store database passwords as sensitive environment variables
HTTPS Only: Use Klutch.sh’s automatic HTTPS for all repository access
Regular Updates: Keep Fedora version updated for security patches

Performance

Resource Limits: Set appropriate heap size based on repository size (-Xmx4g for large repositories)
Database Tuning: Optimize PostgreSQL connection pool and query performance
Solr Optimization: Configure Solr for optimal indexing performance
Content Delivery: Use CDN for frequently accessed binary content
Monitoring: Track repository size, API response times, and indexing lag

Monitoring

Health Checks:

# Check repository health
curl -I https://fedora-repository.klutch.sh/rest/

# Check Solr integration
curl https://fedora-repository.klutch.sh/rest/fcr:search

# Check database connectivity
curl https://fedora-repository.klutch.sh/rest/fcr:health

Metrics to Monitor:

Repository size (number of objects and total storage)
API request rate and response times
Database query performance
Solr indexing lag
Failed transactions
Authentication failures

Scaling

Horizontal Scaling: Deploy multiple Fedora instances behind a load balancer
Storage Scaling: Increase persistent volume size as collection grows
Database Scaling: Use PostgreSQL replication for read-heavy workloads
Caching: Implement HTTP caching for frequently accessed resources

Troubleshooting

Common Issues

Problem: Fedora fails to start with “Database connection failed”

Solution: Verify DATABASE_URL, DATABASE_USER, and DATABASE_PASSWORD are correct.
Check that PostgreSQL is running and accessible on port 8000.
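
A quick way to separate credential problems from network problems is to parse the host and port out of DATABASE_URL and test whether a TCP connection can be opened. This is a minimal sketch using only the standard library. `parse_jdbc_url` and `can_connect` are hypothetical helpers, and the URL is the example value from this guide.

```python
import socket
from urllib.parse import urlsplit


def parse_jdbc_url(jdbc_url):
    """Split jdbc:postgresql://host:port/db into (host, port, database)."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:'")
    parts = urlsplit(jdbc_url[len(prefix):])
    if parts.scheme != "postgresql" or not parts.hostname:
        raise ValueError("expected jdbc:postgresql://host[:port]/database")
    # 5432 is PostgreSQL's conventional default when no port is given.
    return parts.hostname, parts.port or 5432, parts.path.lstrip("/")


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


host, port, database = parse_jdbc_url("jdbc:postgresql://postgres.klutch.sh:8000/fedora")
print(host, port, database)
```

Calling `can_connect(host, port)` from the same network as the Fedora app then shows whether the port is reachable; if it is, the remaining suspects are the username, password, and database name.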

Problem: “Out of memory” errors

Solution: Increase heap size in JAVA_OPTS:
JAVA_OPTS=-Xmx4g -Xms2g

Also check that your Klutch.sh plan has sufficient memory.

Problem: Slow indexing or search not working

Solution: Verify SOLR_URL points to a running Solr instance.
Check Solr logs for errors.
Manually trigger reindexing: curl -X POST https://fedora-repository.klutch.sh/rest/fcr:reindex

Problem: “Transaction timeout” errors

Solution: Increase transaction timeout in application.yaml:
fcrepo:
  transaction:
    timeout: 300000  # 5 minutes

Problem: Cannot access uploaded binaries

Solution: Ensure persistent volume is properly mounted at /data/fcrepo.
Check that files exist: ls -la /data/fcrepo/objects
Verify file permissions allow read access.

Problem: SPARQL queries return no results

Solution: Verify Solr is indexing properly.
Check indexing configuration in application.yaml.
Trigger manual reindex if needed.

Problem: High memory usage over time

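Solution: A JVM typically grows its heap toward the -Xmx limit and then holds it, so steady memory use near that limit is expected.
Tune garbage collection: -XX:+UseG1GC -XX:MaxGCPauseMillis=200
If usage keeps climbing or OutOfMemoryError appears, raise -Xmx or investigate a leak. Restarting Fedora during maintenance windows is only a stopgap.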

Problem: Cannot create resources (401/403 errors)

Solution: Check authentication configuration.
Verify WebAC permissions are set correctly.
To isolate the cause, temporarily disable authentication on a non-production test instance only.

Debugging Tips

Enable Debug Logging:

logging:
  level:
    org.fcrepo: DEBUG
    org.springframework: DEBUG

Check Repository Integrity:

# Run fixity check on all binaries
curl -X POST https://fedora-repository.klutch.sh/rest/fcr:fixity

Inspect Resource Details:

# Get full resource information including metadata
curl -H "Accept: application/ld+json" \
  https://fedora-repository.klutch.sh/rest/path/to/resource
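
Once the JSON-LD response is saved, individual properties can be pulled out for inspection. The sketch below assumes the response is expanded JSON-LD, where each property is keyed by its full IRI and holds a list of `@value` objects. That shape is an assumption about the server output, and `titles_by_subject` is a hypothetical helper.

```python
import json

DC_TITLE = "http://purl.org/dc/elements/1.1/title"


def titles_by_subject(jsonld_text):
    """Map each node's @id to its dc:title values (expanded JSON-LD assumed)."""
    nodes = json.loads(jsonld_text)
    if isinstance(nodes, dict):
        # Some serializers wrap the nodes in a top-level @graph.
        nodes = nodes.get("@graph", [nodes])
    result = {}
    for node in nodes:
        values = [
            item["@value"]
            for item in node.get(DC_TITLE, [])
            if isinstance(item, dict) and "@value" in item
        ]
        if values:
            result[node.get("@id")] = values
    return result


sample = json.dumps([
    {
        "@id": "https://fedora-repository.klutch.sh/rest/collections/photo-001",
        DC_TITLE: [{"@value": "Historic Photograph"}],
    }
])
print(titles_by_subject(sample))
```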

Migration from Other Systems

From Fedora 3.x

Fedora 6 uses a different architecture than Fedora 3. Migration requires:

Export Fedora 3 objects using fcrepo-export-tools
Transform FOXML to modern RDF/LDP format
Import into Fedora 6 using batch import scripts
Verify checksums and relationships
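
The checksum verification in the last step can be sketched as follows: compute a digest for each binary before migration, compute it again from the content read back afterwards, and report any names that differ. `sha256_of_stream` and `mismatches` are hypothetical helpers, the manifest contents are invented for the example, and SHA-256 is an illustrative choice of algorithm.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Read a binary stream in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def mismatches(expected, actual):
    """Names whose digest in actual is missing or differs from expected."""
    return sorted(name for name, value in expected.items() if actual.get(name) != value)


# Digests recorded before migration (hypothetical manifest) ...
before = {"photo-001": sha256_of_stream(io.BytesIO(b"original bytes"))}
# ... compared with digests of the content read back afterwards.
after = {"photo-001": sha256_of_stream(io.BytesIO(b"original bytes"))}
print(mismatches(before, after))  # []
```

For real files, pass `open(path, "rb")` instead of the in-memory streams; reading in chunks keeps memory use flat for large binaries.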

From DSpace

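# Sample migration script from DSpace
# Outline only: parse_dublin_core, create_fedora_object, and upload_to_fedora
# are placeholders to implement for your own export layout.
from pathlib import Path

def migrate_from_dspace(dspace_export_dir):
    """Migrate DSpace items to Fedora"""
    for item_dir in sorted(Path(dspace_export_dir).glob('item_*')):
        # Read DSpace metadata
        metadata = parse_dublin_core(item_dir / 'dublin_core.xml')

        # Create Fedora object and keep its URL for the uploads below
        object_url = create_fedora_object(metadata)

        # Upload bitstreams beneath the new object
        for bitstream in (item_dir / 'bitstreams').glob('*'):
            upload_to_fedora(object_url, bitstream)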
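
The migration script calls `parse_dublin_core` without defining it. One possible implementation is sketched below, assuming the `dublin_core.xml` layout of a DSpace Simple Archive Format export, where each value is a `dcvalue` element with `element` and `qualifier` attributes. The function names and the dictionary shape are choices made for this example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_dublin_core_text(xml_text):
    """Collect dcvalue elements into {"element.qualifier": [values]}."""
    root = ET.fromstring(xml_text)
    metadata = {}
    for dcvalue in root.iter("dcvalue"):
        element = dcvalue.get("element", "")
        qualifier = dcvalue.get("qualifier", "none")
        # Unqualified values are stored under the bare element name.
        key = element if qualifier in ("", "none") else f"{element}.{qualifier}"
        metadata.setdefault(key, []).append((dcvalue.text or "").strip())
    return metadata


def parse_dublin_core(path):
    """Read a dublin_core.xml file from disk and parse it."""
    return parse_dublin_core_text(Path(path).read_text(encoding="utf-8"))


sample = """<dublin_core>
  <dcvalue element="title" qualifier="none">Historic Photograph</dcvalue>
  <dcvalue element="date" qualifier="issued">1920</dcvalue>
  <dcvalue element="subject" qualifier="none">Architecture</dcvalue>
  <dcvalue element="subject" qualifier="none">History</dcvalue>
</dublin_core>"""
print(parse_dublin_core_text(sample))
```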

Conclusion

You now have a production-ready Fedora Commons Repository deployment on Klutch.sh. This flexible foundation supports everything from small departmental repositories to large-scale institutional digital preservation systems. Start with basic containers and objects, then expand with advanced features like versioning, access control, and external integrations as your needs grow.