Collaboration proposal: Rust-powered QVD engine (qvdrs) — up to 350x faster #14

@bintocher

Hi Constantin,

First of all — great work on PyQvd! It's the most well-known Python library for QVD files, and the API design is clean and well-documented.

I'm the author of qvdrs — a QVD library with the core engine written in Rust and Python bindings via PyO3 + Arrow zero-copy bridge (pip install qvdrs). I'd like to explore the possibility of collaboration.

Performance comparison

Benchmarks on the same machine with real QVD files, PyQvd 2.3.2 vs qvdrs 0.5.0:

Read

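| File size | Rows | Cols | PyQvd | qvdrs | Speedup |
|---|---|---|---|---|---|
| 11 KB | 12 | 4 | 0.013s | 0.000s | 29x |
| 62 KB | 125 | 45 | 0.012s | 0.001s | 11x |
| 2.3 MB | 21,523 | 10 | 0.214s | 0.011s | 20x |
| 35 MB | 1,695,048 | 7 | 5.96s | 0.26s | 23x |
| 480 MB | 11,994,296 | 4 | 64.5s | 2.1s | 31x |
| 560 MB | 5,458,618 | 24 | 65.2s | 3.9s | 17x |
| 1.7 GB | 87,617,047 | 8 | >10 min (killed) | 23.4s | >25x |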

Write

| File size | Rows | Cols | PyQvd | qvdrs | Speedup |
|---|---|---|---|---|---|
| 35 MB | 1,695,048 | 7 | 7.8s | 0.022s | 351x |
| 480 MB | 11,994,296 | 4 | 50.9s | 0.61s | 83x |

Features only in qvdrs

| Feature | qvdrs |
|---|---|
| Streaming EXISTS() filtered read (1.7 GB, 87.6M rows -> 20.4M rows x 3 cols) | 9.0s |
| EXISTS() + save to QVD | 13.3s |
| Parquet <-> QVD conversion | yes |
| DuckDB native integration (register QVD as SQL tables) | yes |
| DataFusion SQL queries on QVD | yes |
| Arrow RecordBatch zero-copy (pandas, Polars, DuckDB) | yes |
| CLI tool (convert, inspect, head, filter) | yes |
| Binary-identical output to Qlik Sense (MD5 verified) | yes |
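
To make the "streaming EXISTS() filtered read" row concrete, here is a minimal pure-Python sketch of the idea: rows arrive in chunks, only rows whose key is in a given value set are kept, and only the requested columns are retained. This is an illustration of the concept, not the qvdrs implementation or its API; the function name `exists_filter` and the list-of-dicts row layout are assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List, Set

Row = Dict[str, object]


def exists_filter(
    chunks: Iterable[List[Row]],
    field: str,
    allowed: Set[object],
    columns: List[str],
) -> Iterator[List[Row]]:
    """Yield filtered chunks without ever holding the whole table in memory.

    A row is kept only if row[field] is in `allowed` (EXISTS-style semi-join);
    kept rows are projected onto `columns`. Chunks with no matches are skipped.
    """
    for chunk in chunks:
        kept = [
            {name: row[name] for name in columns}
            for row in chunk
            if row[field] in allowed
        ]
        if kept:
            yield kept


if __name__ == "__main__":
    # Hypothetical in-memory chunks standing in for a streamed file.
    demo = [
        [{"id": 1, "name": "a", "qty": 5}, {"id": 2, "name": "b", "qty": 7}],
        [{"id": 3, "name": "c", "qty": 1}],
    ]
    for part in exists_filter(demo, "id", {1, 3}, ["id", "name"]):
        print(part)
```

Because each chunk is filtered and projected before the next one is read, peak memory depends on the chunk size and the match rate rather than on the full file size.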

What each project brings

PyQvd: mature, clean API with QvdTable (filter_by, join, sort, append, insert), 25 stars, an established user base, good documentation on Read the Docs, and a pure-Python codebase that is easy to understand and debug.

qvdrs: Rust core (11x to 351x faster in the benchmarks above), handles multi-GB files, streaming reader, EXISTS() filter (2.5x faster than Qlik Sense), Parquet/Arrow/DuckDB/DataFusion integration, binary-identical QVD output to Qlik Sense.

Development approach

The qvdrs codebase is developed with the help of Claude (Opus) — Anthropic's AI coding assistant. This significantly accelerates development: implementing new features, writing tests, debugging, and maintaining code quality. If you're open to collaboration, this is a powerful tool that can be used for joint development as well — it handles Rust, Python, and the QVD binary format equally well.

Proposal

A few possible directions:

  1. Contribute to qvdrs — your QVD format expertise and API design skills would be very valuable. We could adopt PyQvd's richer QvdTable API (filter_by, join, sort, etc.) for the Python bindings.
  2. qvdrs as an optional backend for PyQvd — keep PyQvd's API, but optionally use qvdrs for I/O (similar to how pandas uses pyarrow). Users get the familiar API with Rust performance.
  3. Joint development — combine efforts under one project.
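
As a rough illustration of option 2, the optional-backend pattern could look something like the sketch below. It is not actual PyQvd or qvdrs code: the `engine` parameter, the `select_engine` helper, and the placeholder module name `qvd_fast_backend` are all hypothetical, chosen only to show how a fast engine can be picked up when installed while the pure-Python path stays the default fallback.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` is importable here."""
    return importlib.util.find_spec(name) is not None


def select_engine(engine: str = "auto", fast_module: str = "qvd_fast_backend") -> str:
    """Resolve which I/O engine to use.

    "auto"   -> the fast engine if its module is installed, else pure Python
    "python" -> always the pure-Python engine
    "rust"   -> the fast engine, raising ImportError if it is not installed

    `fast_module` is a placeholder name, not the real import name of any package.
    """
    if engine not in {"auto", "python", "rust"}:
        raise ValueError(f"unknown engine: {engine!r}")
    if engine == "python":
        return "python"
    if module_available(fast_module):
        return "rust"
    if engine == "rust":
        raise ImportError(
            f"engine='rust' was requested but {fast_module!r} is not installed"
        )
    return "python"


if __name__ == "__main__":
    # Prints which engine would be used in the current environment.
    print(select_engine("auto"))
```

With this shape, existing users see no change unless the extra package is installed, and anyone who needs reproducible behaviour can pin `engine="python"` or `engine="rust"` explicitly.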

No pressure at all — just reaching out since we're both solving the same problem. Happy to discuss.

Stanislav
https://github.com/bintocher/qvdrs
