
🦀 minikv


A production-ready distributed key-value store written in Rust, featuring Raft consensus and strong consistency.
Version: v0.1.0
Created and maintained by @whispem.

Table of Contents

  • What is minikv?
  • Quick Start
  • Architecture
  • Performance
  • Features
  • Roadmap
  • Design Notes & Limitations
  • The Story
  • Contributing
  • License

🚀 What is minikv?

minikv is a distributed, fault-tolerant key-value store implemented in Rust.
It's designed to provide strong consistency, high availability, horizontal scalability, and crash recovery right out of the box.

This project is the distributed evolution of my previous single-node store (mini-kvstore-v2).
Here's a direct comparison:

Feature          | mini-kvstore-v2 | minikv
-----------------|-----------------|------------------------------
Architecture     | Single-node     | Multi-node cluster
Consensus        | ❌ None          | ✔️ Raft
Replication      | ❌ None          | ✔️ N-way (2PC)
Durability       | ❌ None          | ✔️ WAL + fsync
Sharding         | ❌ None          | ✔️ 256 virtual shards
Lines of Code    | ~1,200          | ~1,800
Development Time | 10 days         | +24 hours
Write Perf.      | 240K ops/sec    | 80K ops/sec (replicated x3)
Read Perf.       | 11M ops/sec     | 8M ops/sec (distributed)

Preserved from v2:

  • βœ”οΈ Segmented append-only logs
  • βœ”οΈ In-memory HashMap index (O(1) lookups)
  • βœ”οΈ Bloom filters for negative lookups
  • βœ”οΈ Index snapshots (fast recovery)
  • βœ”οΈ CRC32 checksums on every record

What's new:

  • βœ”οΈ Raft consensus for coordinator high availability
  • βœ”οΈ 2PC for distributed transactions
  • βœ”οΈ gRPC internal protocol
  • βœ”οΈ WAL for durability
  • βœ”οΈ Dynamic sharding & automatic rebalancing

Note: v0.1.0 is stable enough for production workloads, but several features and automation are still under active development.

⚡ Quick Start

Requirements

  • Rust 1.81+ (install)
  • Docker (optional for cluster deployment)

Build from source

git clone https://github.com/whispem/minikv
cd minikv
cargo build --release

Start a Local Cluster

Recommended:

./scripts/serve.sh 3 3  # 3 coordinators + 3 volumes

Manual setup steps are available under scripts/, along with sample configurations.

CLI Example Usage

# Put a blob (replicated)
./target/release/minikv put my-key --file test.txt

# Get it back
./target/release/minikv get my-key --output out.txt

# Delete
./target/release/minikv delete my-key

HTTP API Example

curl -X PUT http://localhost:5000/my-key --data-binary @file.pdf
curl http://localhost:5000/my-key -o output.pdf
curl -X DELETE http://localhost:5000/my-key
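
The same endpoints can also be driven programmatically. Below is a small, hedged sketch using the reqwest crate (blocking client); the crate choice and the hard-coded port 5000 (taken from the curl examples above) are assumptions, not part of minikv itself.

use reqwest::blocking::Client;

fn main() -> Result<(), reqwest::Error> {
    let client = Client::new();
    let url = "http://localhost:5000/my-key";

    // PUT a value, mirroring the curl example above.
    client.put(url).body("hello minikv").send()?.error_for_status()?;

    // GET it back as text.
    let value = client.get(url).send()?.text()?;
    println!("got: {value}");

    // DELETE it again.
    client.delete(url).send()?.error_for_status()?;
    Ok(())
}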

πŸ—οΈ Architecture

  • Coordinator cluster (Raft): Handles cluster metadata, leader election, write orchestration, and health monitoring.
  • Volume nodes: Store blobs, maintain segmented logs, compacted in-memory index, WAL, and Bloom filters.
  • Replication: Configurable N-way (default 3x), with automatic failover and repair jobs.
  • Sharding: 256 virtual shards for horizontal distribution.
  • Protocols: gRPC for internal RPC, HTTP REST for client interactions.

High-level Flow

[Client] --> [Coordinator (Raft)] --> [Volume nodes]
                      |                     |
                   Metadata        Durable blob storage
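
Replicated writes go through two-phase commit, as noted in the comparison table and feature list. The following is a minimal, hedged sketch of that coordinator-driven flow; the VolumeReplica trait and its method names are hypothetical stand-ins, not minikv's actual gRPC API.

// Hypothetical replica interface; the real volume-node RPC surface will differ.
trait VolumeReplica {
    fn prepare(&mut self, key: &str, blob: &[u8]) -> bool; // stage the write durably (WAL + fsync)
    fn commit(&mut self, key: &str);                        // make the staged write visible
    fn abort(&mut self, key: &str);                         // discard the staged write
}

// Phase 1: ask every replica to prepare; phase 2: broadcast one commit/abort decision.
fn replicate_write<R: VolumeReplica>(replicas: &mut [R], key: &str, blob: &[u8]) -> bool {
    let votes: Vec<bool> = replicas.iter_mut().map(|r| r.prepare(key, blob)).collect();
    let all_prepared = votes.iter().all(|&v| v);
    for r in replicas.iter_mut() {
        if all_prepared { r.commit(key); } else { r.abort(key); }
    }
    all_prepared // true => the blob is durable on every replica
}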

📊 Performance (Preliminary)

Benchmarked on an Apple M4 MacBook (16 GB RAM, NVMe SSD):

  • Write throughput: ~80,000 ops/sec (3x replication, 2PC)
  • Read throughput: ~8,000,000 ops/sec (distributed)
  • Write latency (1MB): p50 β‰ˆ 8ms
  • Read latency (1MB): p50 β‰ˆ 1ms

Note: Results may vary. Performance tuning is ongoing.

✅ Features (v0.1.0)

  • Raft leader election; replicated metadata
  • Atomic distributed writes via Two-Phase Commit (2PC)
  • Log-structured, append-only storage engine
  • WAL and index snapshots (fast recovery)
  • In-memory HashMap index; Bloom filters for fast negative lookups
  • Horizontal scaling via sharding and flexible replica sets
  • Internal gRPC protocol; external HTTP REST API
  • CLI for verification, repair, compaction
  • Docker Compose setup and CI workflow
  • Basic tracing (OpenTelemetry)

Limitations

  • Full multi-node Raft is still in progress (currently basic leader/follower only)
  • Advanced ops (auto-rebalancing, large blob streaming, seamless upgrades) under development
  • Some admin automation still missing
  • Security (TLS, authentication) and cross-datacenter replication are planned

🗺️ Roadmap

Short-term goals:

  • Complete full multi-node Raft implementation
  • Enhance 2PC streaming, error handling
  • Finish cluster rebalancing, ops tooling
  • Add advanced metrics/monitoring (Prometheus)
  • Broaden integration/stress/recovery tests

Long-term plans:

  • Admin dashboard/web UI, range/batch queries
  • Multi-datacenter and S3-compatible API support
  • Zero-copy I/O, data compression, multi-tenancy
  • TLS, authentication, fine-grained access controls

See CHANGELOG.md for details.

⚙️ Design Notes & Limitations

  • Raft selected for consensus simplicity and reliability.
  • 2PC ensures strong consistency (atomic writes) across replicas.
  • Coordinator/Volume separation enables independent scaling.
  • Internal node RPC via gRPC for speed; HTTP REST externally for wide compatibility.
  • 256 virtual shards: efficient data distribution, easy hashing.
  • BLAKE3 hashing for fast, secure sharding.

Challenges and opportunities ahead:
Handling split-brain scenarios, optimizing WAL-based crash recovery, rolling cluster upgrades, and automated self-healing.

📖 The Story

This project started as a personal initiative to learn Rust and distributed systems from zero.
Background: I began learning Rust in October 2025 with no prior software development experience (I studied foreign languages at university).

  • First 2 weeks: Learned Rust basics (ownership, lifetimes, error handling, traits)
  • Weeks 3–5: Developed mini-kvstore-v2 for single-node storage
  • Afterward: Started minikv to explore distributed logic: consensus, replication, atomicity, and scale-out design

minikv is both a real system and a personal learning adventure, open to feedback, contributions, and suggestions.

🀝 Contributing

Contributions are very welcome; see CONTRIBUTING.md for guidance.

Current priorities:

  • Full multi-node Raft, robust consensus
  • Streaming for large blobs, 2PC improvements
  • Integration and stress testing; cluster operations
  • Documentation, benchmarking, cross-node replication

Code of Conduct: respectful, inclusive, constructive. We're learning together.

📜 License

MIT; see LICENSE.

🙏 Acknowledgments


Built in Rust with real passion for distributed systems.
Questions, suggestions, or feedback? Open an issue or reach out: @whispem.
