🐸 ClusterF 🐸
The F stands for frog
A self-organizing peer-to-peer distributed file storage cluster with CRDT-based replication.
Features
- Zero-Configuration P2P Architecture: Nodes automatically discover each other via UDP broadcast and form a cluster
- CRDT-Based Replication: Conflict-free replicated data types ensure eventual consistency without coordination
- Configurable Replication Factor: Adjust the replication factor at runtime, from 1 (a single copy) up to full mirroring on every node
- Partition-Based Storage: Files are distributed across partitions with automatic balancing
- HTTP/REST API: Complete programmatic access to cluster operations
- Web UI: Built-in monitoring dashboard, file browser, and cluster visualizer
- WebDAV Server: Mount cluster storage as a network drive
- Full-Text Search: Built-in indexer for finding files by name and metadata
- Media Transcoding: Automatic ffmpeg-based transcoding for streaming
- Local Import/Export: Synchronize between cluster storage and local filesystems
- Simulation Mode: Test cluster behavior with multiple nodes in one process
- Profiling Support: Built-in pprof and flamegraph generation
Installation
go install github.com/donomii/clusterF@latest
Or build from source:
git clone https://github.com/donomii/clusterF
cd clusterF
go build
Quick Start
Start a single node:
./clusterF
The node will:
- Automatically generate a node ID
- Create a data directory (./data/<node-id>)
- Start HTTP API on a random port (typically 30000-60000)
- Begin broadcasting for peer discovery on UDP port 9999
- Open a web dashboard
Access the dashboard at http://localhost:<port>/monitor (port shown in startup output).
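To confirm the node is up, query its status endpoint (using the same port as the dashboard):
curl http://localhost:<port>/status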
Usage Examples
Basic Operations
Start a node with specific configuration:
./clusterF --node-id mynode --data-dir /var/clusterF --http-port 8080
Upload a file:
curl -X PUT --data-binary @photo.jpg http://localhost:8080/api/files/photos/photo.jpg
Download a file:
curl http://localhost:8080/api/files/photos/photo.jpg -o photo.jpg
List files:
curl http://localhost:8080/api/files/photos/
Search for files:
curl "http://localhost:8080/api/search?q=vacation"
Advanced Features
WebDAV Server
Serve cluster files over WebDAV:
./clusterF --webdav /photos
Mount on macOS via Finder: choose Go → Connect to Server (⌘K) and enter:
http://localhost:8080
Import/Export
Mirror cluster files to a local directory:
./clusterF --export-dir /mnt/share --cluster-dir /photos
Import files from local directory to cluster:
./clusterF --import-dir /home/user/photos --cluster-dir /backup
Client Mode
Join cluster without storing data locally:
./clusterF --no-store
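Client mode should combine with the other front-ends; for instance, a sketch of a diskless WebDAV gateway (assuming --no-store and --webdav compose this way):
./clusterF --no-store --webdav /photos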
Simulation Mode
Test cluster with multiple nodes:
./clusterF --sim-nodes 10 --base-port 30000
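Each simulated node exposes its own HTTP port starting at the base port (sequential assignment is an assumption here), so individual nodes can be poked directly:
curl http://localhost:30000/status
curl http://localhost:30001/status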
Architecture
Components
- CRDT Layer (frogpond): Manages distributed state with eventual consistency
- Discovery Manager: UDP broadcast-based peer discovery
- Partition Manager: Distributes files across partitions with configurable replication
- File System: Unified interface for file operations across the cluster
- Indexer: Full-text search and metadata indexing
- File Sync: Bidirectional synchronization with local filesystems
- Thread Manager: Lifecycle management for background subsystems
- Metrics Collector: Performance monitoring and statistics
Storage Options
clusterF currently supports file-based disk storage; files are visible and accessible from the command line. Specialised data stores are possible but not yet integrated.
Select backend with --storage-major:
./clusterF --storage-major bolt
Replication
Files are distributed across partitions based on path hash. Each partition is replicated to RF nodes (default RF=3). The system automatically:
- Detects under-replicated partitions
- Selects replication targets
- Synchronizes partition data between nodes
- Handles node failures gracefully
Adjust replication factor via API:
curl -X PUT -H "Content-Type: application/json" \
-d '{"replication_factor": 5}' \
http://localhost:8080/api/replication-factor
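After raising the factor, re-replication happens in the background; progress can be watched via the under-replicated report:
curl http://localhost:8080/api/under-replicated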
API Reference
File Operations
GET /api/files/<path> - Download file
PUT /api/files/<path> - Upload file
DELETE /api/files/<path> - Delete file
POST /api/files/<path> - Create directory (with X-Create-Directory: true header)
GET /api/metadata/<path> - Get file metadata
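For example, creating a directory uses the documented header:
curl -X POST -H "X-Create-Directory: true" http://localhost:8080/api/files/photos/albums/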
Search
GET /api/search?q=<query> - Search files by name/metadata
Cluster Management
GET /status - Node status and statistics
GET /api/cluster-stats - Cluster-wide statistics
GET /api/partition-stats - Partition distribution
GET /api/replication-factor - Get RF
PUT /api/replication-factor - Set RF
GET /api/under-replicated - List under-replicated partitions
POST /api/integrity-check - Verify stored file integrity
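For example, a quick health pass might trigger an integrity check and then review cluster-wide statistics:
curl -X POST http://localhost:8080/api/integrity-check
curl http://localhost:8080/api/cluster-stats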
Monitoring
GET /monitor - Web-based monitoring dashboard
GET /api/metrics - Prometheus-compatible metrics
GET /cluster-visualizer.html - Network topology visualization
Profiling
GET /profiling - Profiling control panel
GET /flamegraph - CPU flame graph
GET /memorygraph - Memory flame graph
GET /debug/pprof/* - Go pprof endpoints
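The pprof endpoints work with the standard Go tooling; for example, a 30-second CPU profile:
go tool pprof "http://localhost:8080/debug/pprof/profile?seconds=30"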
Configuration
Command-Line Options
--node-id Node identifier (auto-generated if not specified)
--data-dir Base data directory (default: ./data)
--http-port HTTP API port (0 = auto)
--discovery-port UDP discovery port (default: 9999)
--webdav Serve cluster path over WebDAV
--export-dir Mirror cluster files to local directory
--import-dir Import files from local directory
--cluster-dir Cluster path prefix for import/export
--exclude-dirs Comma-separated directories to exclude from import
--no-store Client mode: don't store partitions locally
--storage-major Storage format (extent|bolt|sqlite|rawfile)
--storage-minor Storage format minor version
--encryption-key Encryption key for at-rest encryption
--no-desktop Don't open desktop UI
--debug Enable verbose debug logging
--profiling Enable profiling at startup
--version Print version and exit
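Options combine freely; for example, a fixed-port node with the bolt backend, verbose logging, and no desktop UI:
./clusterF --node-id node1 --data-dir /srv/clusterF --http-port 8080 \
  --storage-major bolt --no-desktop --debug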
Simulation Mode
--sim-nodes Number of nodes to simulate
--base-port Base HTTP port for simulation nodes
Web UI
The web interface provides:
- Dashboard (/monitor): Real-time cluster metrics, peer status, partition distribution
- File Browser (/files/): Navigate and manage cluster files
- Visualizer (/cluster-visualizer.html): Interactive network topology
- CRDT Inspector (/crdt): Examine distributed state
- Metrics (/metrics): Performance graphs and statistics
- Profiling (/profiling): CPU and memory profiling tools
Development
Building
go build
Testing
go test ./...
Run large-scale cluster tests:
go test -run TestLargeCluster -v
Project Structure
clusterF/
├── main.go            # Entry point and cluster lifecycle
├── cluster.go         # Core cluster implementation
├── discovery/         # Peer discovery
├── partitionmanager/  # Partition distribution and replication
├── filesystem/        # File system abstraction
├── filesync/          # Import/export synchronization
├── indexer/           # Search indexing
├── metrics/           # Performance monitoring
├── frontend/          # Web UI
├── webdav/            # WebDAV server
└── types/             # Shared types and interfaces
Performance
- Nodes handle thousands of concurrent connections
- Partitions sync in parallel across multiple nodes
Troubleshooting
Nodes not discovering each other
- Verify UDP port 9999 is not blocked by firewall
- Check nodes are on same subnet for broadcast discovery
- Try an explicit discovery port: --discovery-port 9999
Under-replicated partitions
- Check /api/under-replicated for a report
- Verify sufficient nodes are online
- Increase partition sync interval:
curl -X PUT -d '{"partition_sync_interval_seconds": 30}' http://localhost:8080/api/partition-sync-interval
High memory usage
- Reduce partition sync parallelism (currently hardcoded)
- Enable profiling: --profiling and check /memorygraph
- Consider client mode for some nodes: --no-store
Data directory errors
- Ensure write permissions on data directory
- Storage format is locked after first start (cannot change --storage-major)
- Verify encryption key matches if repository was created with encryption
License
GNU Affero General Public License v3.0 (AGPL-3.0)
See LICENSE file for full text.
Contributing
This project follows strict coding conventions:
Links