39 changes: 30 additions & 9 deletions docs/managing-dragonfly/tiering.md
@@ -4,14 +4,12 @@ sidebar_position: 5

# SSD Data Tiering

-Dragonfly v1.21.0 introduces a powerful new feature: SSD data tiering. With it, Dragonfly
+Dragonfly v1.35.0 introduces a powerful new feature: SSD data tiering. With it, Dragonfly
can leverage SSD/NVMe devices as a secondary storage tier that complements
RAM. By intelligently offloading specific data to fast disk storage,
Dragonfly can significantly reduce physical memory usage, potentially
achieving a 2x-5x improvement while maintaining sub-millisecond average latency.

## How It Works

Dragonfly's data tiering focuses on string values exceeding 64 characters in size.
When enabled, these longer strings are offloaded to the SSD tier.
Shorter strings and other data types, along with the primary hashtable index,
@@ -20,13 +18,32 @@ while reducing memory consumption. When accessed, offloaded data is seamlessly r
SSD and integrated back into memory. Write, delete, and expire operations are managed
entirely in-memory, leveraging disk-based keys for efficient operation.


-## Enabling Data tiering
+## Configuring Data tiering
The feature can be enabled by passing the `--tiered_prefix <nvme_path>/<basename>` flag.
Dragonfly will automatically check the free disk space on the partition hosting `<nvme_path>` and
-will deduce the maximum capacity it can use. In order to explicitly set the maximum
-disk space capacity, for data tiering you can use `--tiered_max_file_size=<size>`. For example,
-`--tiered_max_file_size=96G`.
+will deduce the maximum capacity it can use. It will create a storage file for every thread.

Main flags:
* **tiered_experimental_cooling** - whether to use experimental cooling; see below for more details.
* **tiered_offload_threshold** - ratio of free memory below which values are actively offloaded to disk by a background process.
* **tiered_upload_threshold** - ratio of free memory below which values are no longer returned to memory when read.
* **tiered_storage_write_depth** - maximum number of concurrent disk writes, to avoid overloading the disk.
* **tiered_max_file_size** - maximum file size in bytes; must be a multiple of 256MB; usually determined automatically.
* **tiered_min_value_size** - can be used to raise the 64-byte value limit so that only larger values are offloaded.
* **registered_buffer_size** - size of the registered buffers used for zero-copy reads and writes with io_uring.
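
Since **tiered_max_file_size** must be a multiple of 256MB, a size chosen by hand may need rounding. The sketch below rounds a byte count down to the nearest valid multiple. It assumes that "256MB" means 256 × 1024 × 1024 bytes, which the text above does not state explicitly, and the helper name `round_down_to_chunk` is hypothetical rather than part of Dragonfly.

```python
# Assumption: "256MB" is interpreted as 256 * 1024 * 1024 bytes.
CHUNK = 256 * 1024 * 1024


def round_down_to_chunk(size_bytes, chunk=CHUNK):
    """Round size_bytes down to the nearest multiple of chunk."""
    if size_bytes < chunk:
        raise ValueError("size is smaller than a single 256MB chunk")
    return (size_bytes // chunk) * chunk


# A decimal "96 GB" is not a multiple of the chunk size, so it is rounded down.
requested = 96 * 1000 ** 3
aligned = round_down_to_chunk(requested)
print(f"requested={requested} aligned={aligned} chunks={aligned // CHUNK}")
```

In most deployments this should not be needed, because the maximum size is usually determined automatically.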

For example,
```
./dragonfly \
  --tiered_prefix=/mnt/fast-ssd/tiered-file \
  --maxmemory=20G \
  --tiered_offload_threshold=0.4 \
  --tiered_upload_threshold=0.2
```
will configure Dragonfly to run with a memory limit of 20 GB.
* When memory usage is above 16 GB (less than 20% free), values read from disk are no longer returned to memory.
* New values are offloaded to disk immediately, but active background offloading starts only once 12 GB of memory is used (less than 40% free).
* Between 12 GB and 16 GB of memory usage, a continuous uploading/offloading process keeps the most recently used items in memory and older ones on disk.
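
The arithmetic behind these numbers can be sketched in Python. This is only an illustration of how the two ratios map onto absolute memory usage, assuming each threshold is the fraction of `maxmemory` that must remain free. The function name `tiering_watermarks` is hypothetical and not part of Dragonfly.

```python
def tiering_watermarks(maxmemory_gb, offload_threshold, upload_threshold):
    """Return (offload_start_gb, upload_stop_gb) as used-memory watermarks.

    Assumes each threshold is the fraction of maxmemory that must stay free.
    """
    offload_start = maxmemory_gb * (1 - offload_threshold)
    upload_stop = maxmemory_gb * (1 - upload_threshold)
    return offload_start, upload_stop


# Values taken from the example command line above.
offload_start, upload_stop = tiering_watermarks(20, 0.4, 0.2)
print(f"background offloading starts at about {offload_start:.0f} GB used")
print(f"reads stop returning values to memory at about {upload_stop:.0f} GB used")
```

Lowering either threshold moves the corresponding watermark closer to `maxmemory`, so more data stays in RAM before tiering becomes active.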

## Checking Data tiering metrics

@@ -50,6 +67,10 @@ the server is bottlenecked on disk write i/o.
* **tiered_ram_misses:** - how many times an entry lookup resulted in a disk read
* **tiered_ram_cool_hits:** - how many times an entry lookup resulted in cooling buffer hit.

### Experimental cooling

As mentioned above, new values are always placed on disk immediately after creation. When **tiered_experimental_cooling** is enabled, a copy is also kept in memory, so the value exists both on disk and in memory. This allows its state to be switched quickly to either in-memory or on-disk without any pending I/O operations. These duplicated values are counted in both tiered_entries_bytes and used_memory. However, the offload and upload thresholds are evaluated without taking them into account, so to determine the "real" free memory amount, one has to subtract tiered_cold_storage_bytes from used_memory.
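
The subtraction described above can be written out as a short calculation. The sketch below assumes the metric values have already been read from the server as byte counts. The sample numbers and the helper name `effective_free_ratio` are hypothetical, and the two thresholds are the ones from the configuration example.

```python
def effective_free_ratio(used_memory, tiered_cold_storage_bytes, maxmemory):
    """Fraction of maxmemory that is free once cooled copies are excluded."""
    real_used = used_memory - tiered_cold_storage_bytes
    return (maxmemory - real_used) / maxmemory


GIB = 1024 ** 3

# Hypothetical sample values, not measurements from a real server.
ratio = effective_free_ratio(
    used_memory=15 * GIB,
    tiered_cold_storage_bytes=2 * GIB,
    maxmemory=20 * GIB,
)

offload_threshold = 0.4
upload_threshold = 0.2
background_offloading = ratio < offload_threshold  # free memory below 40%
uploads_allowed = ratio >= upload_threshold        # free memory still above 20%
print(f"effective free ratio: {ratio:.2f}")
print(f"background offloading: {background_offloading}, uploads allowed: {uploads_allowed}")
```

With these sample numbers, used_memory alone would suggest 25% free memory, while the adjusted figure is 35%, which is the value to compare against the thresholds.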

## Performance
Performance benchmarks against Elasticache instances and Memcached,
conducted on AWS instances, demonstrate Dragonfly's superior performance.
@@ -91,4 +112,4 @@ limited functionality or stability. If you encounter any issues while using data
please report them by [filing an issue](https://github.com/dragonflydb/dragonfly/issues/).

**Limitations:**
* Data tiering is not currently supported by BITOP and HLL operations.