Spatio-Temporal-Lab/Crush

CRUSH: Lossless Floating-Point Compression via Controlled Realignment with Uniform Shifts

CRUSH is a lossless compression algorithm for floating-point data. It uses controlled realignment with uniform shifts to achieve high compression ratios.
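For floating-point data, "lossless" means the decompressed values match the originals bit for bit, not just within a tolerance. The sketch below shows one way to check that property on any compress/decompress round trip. The helper names `to_bits` and `bit_exact` are hypothetical and are not part of the CRUSH API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: reinterpret a double as its raw 64-bit pattern.
inline std::uint64_t to_bits(double v) {
    std::uint64_t bits;
    std::memcpy(&bits, &v, sizeof bits);
    return bits;
}

// Hypothetical helper: true only if both sequences have the same length
// and every element has an identical bit pattern. Comparing bits rather
// than values distinguishes 0.0 from -0.0, which operator== would not.
inline bool bit_exact(const std::vector<double>& original,
                      const std::vector<double>& restored) {
    if (original.size() != restored.size()) return false;
    for (std::size_t i = 0; i < original.size(); ++i) {
        if (to_bits(original[i]) != to_bits(restored[i])) return false;
    }
    return true;
}
```

Calling `bit_exact(input, output)` after decompression gives a stricter check than comparing values with `==`, which treats 0.0 and -0.0 as equal and never matches NaN.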

🛠️ Setup and Installation

We recommend using Ubuntu 22.04 with the clang compiler for the best compatibility.

  1. Install essential build tools and clang:

    sudo apt-get update
    sudo apt-get install -y build-essential clang cmake pkg-config liblzma-dev

  2. Clone the repository (the SSH URL below requires an SSH key registered with GitHub):

    git clone git@github.com:Spatio-Temporal-Lab/Crush.git
    cd Crush

🏗️ Building the Project

The project uses CMake for building.

  1. Create a build directory and enter it:

    mkdir build && cd build

  2. Configure the project with CMake (from the build directory):

    cmake ..

  3. Compile the project:

    make PerformanceProgram -j4

    This will build the main performance testing executable PerformanceProgram inside the build directory.

🐳 IoTDB Integration

We have integrated CRUSH and other baseline algorithms into Apache IoTDB v1.1 using the Java Native Interface (JNI). The implementation can be found in the jni/ directory.

A pre-built Docker image with these integrations is available on Docker Hub:

  • Image: sagann/iotdb:latest

You can pull the image using the following command:

docker pull sagann/iotdb:latest

📂 Code Structure

The core logic for CRUSH is located in baselines/crush/.

baselines/crush/
├── crush.h                 # Main header for CRUSH
├── CrushCompressor.cpp     # Compression logic
├── CrushDecompressor.cpp   # Decompression logic
├── CrushAbCompressor.cpp   # Ablation study compressor
├── CrushAbDecompressor.cpp # Ablation study decompressor
├── utils.cc                # Utility functions
└── BitStream/              # Bit manipulation utilities

  • crush.h: Declares the interfaces for the CRUSH compressor and decompressor.
  • CrushCompressor.cpp: Implements the CRUSH compression algorithm.
  • CrushDecompressor.cpp: Implements the CRUSH decompression algorithm.
  • Perf_test.cc: The main file for running performance tests (not listed in the tree above).

🚀 Running Tests

The PerformanceProgram executable created in the build directory is used to run the performance evaluations.

cd build
./PerformanceProgram

You can also run individual tests from the Perf suite with Google Test's --gtest_filter flag:

./PerformanceProgram --gtest_filter=Perf.BatchSize
./PerformanceProgram --gtest_filter=Perf.Beta
./PerformanceProgram --gtest_filter=Perf.Ablation

📊 Datasets

The datasets used for evaluation are located in the data_set/ directory. They cover a variety of real-world floating-point data sources, for example:

  • Air-pressure.csv
  • Air-sensor.csv
  • Basel-temp.csv
  • Bitcoin-price.csv
  • ... and many more.
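To experiment with one of these files outside the test harness, the values first have to be loaded into memory. The sketch below is one way to do that, under an assumption this README does not confirm: that each data line ends with the numeric value as its last comma-separated field. Header rows and other non-numeric lines are skipped. The function names `load_values` and `parse_values` are hypothetical and do not come from the repository.

```cpp
#include <cstddef>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse the last comma-separated field of each line
// as a double. Assumes the value column is last; adjust if it is not.
inline std::vector<double> load_values(std::istream& in) {
    std::vector<double> values;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        const std::size_t comma = line.rfind(',');
        const std::string field =
            (comma == std::string::npos) ? line : line.substr(comma + 1);
        if (field.empty()) continue;
        try {
            std::size_t used = 0;
            const double v = std::stod(field, &used);
            // Keep the value only if the whole field was numeric.
            if (used == field.size()) values.push_back(v);
        } catch (const std::exception&) {
            // Not a number (for example a header row): skip the line.
        }
    }
    return values;
}

// Convenience wrapper for in-memory text, handy for quick checks.
inline std::vector<double> parse_values(const std::string& text) {
    std::istringstream in(text);
    return load_values(in);
}
```

For a real file, open it with `std::ifstream` and pass the stream to `load_values`.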
