
@etseidl
Contributor

@etseidl etseidl commented Dec 4, 2025

Which issue does this PR close?

Rationale for this change

Establishes a performance baseline for measuring future improvements.

What changes are included in this PR?

Adds new benchmarks for reading and writing. They currently use a fixed number of row groups, pages, and rows, and cycle through data types and encodings.
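
To make "cycles through data types and encodings" concrete, here is a minimal sketch of how such a benchmark matrix could be enumerated. It is not the code from this PR: `Case`, `CASES`, and `benchmark_names` are hypothetical names, and the type/encoding pairs are simply copied from the group names in the example run posted in this conversation. It uses only the standard library and does not call the parquet crate or a benchmark harness.

```rust
// Hypothetical sketch: names and structure are illustrative, not taken from the PR.

/// One benchmarked column type plus the encodings exercised for it.
struct Case {
    data_type: &'static str,
    encodings: &'static [&'static str],
}

/// Type/encoding combinations, copied from the group names in the example run.
const CASES: &[Case] = &[
    Case { data_type: "int32", encodings: &["plain", "dict", "delta_binary", "byte_stream_split"] },
    Case { data_type: "int64", encodings: &["plain", "dict", "delta_binary", "byte_stream_split"] },
    Case { data_type: "f32", encodings: &["plain", "dict", "byte_stream_split"] },
    Case { data_type: "f64", encodings: &["plain", "dict", "byte_stream_split"] },
    Case { data_type: "Fixed(2)", encodings: &["plain", "dict", "delta_byte_array", "byte_stream_split"] },
    Case { data_type: "Fixed(16)", encodings: &["plain", "dict", "delta_byte_array", "byte_stream_split"] },
    Case { data_type: "Binary(20)", encodings: &["plain", "dict", "delta_length", "delta_byte_array"] },
    Case { data_type: "Binary(100)", encodings: &["plain", "dict", "delta_length", "delta_byte_array"] },
];

/// Builds one benchmark group name per (operation, data type, encoding) triple.
fn benchmark_names() -> Vec<String> {
    let mut names = Vec::new();
    for op in ["read", "write"] {
        for case in CASES {
            for enc in case.encodings {
                names.push(format!("{op} {} {enc}", case.data_type));
            }
        }
    }
    names
}

fn main() {
    // Print every group name that the matrix produces.
    for name in benchmark_names() {
        println!("{name}");
    }
}
```

In a real benchmark each generated name would label one measured read or write of a file built with that type and encoding; the sketch only shows the enumeration step.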

Are these changes tested?

N/A

Are there any user-facing changes?

No

@github-actions github-actions bot added the parquet Changes to the parquet crate label Dec 4, 2025
@etseidl
Contributor Author

etseidl commented Dec 4, 2025

@alamb I borrowed liberally from your parquet footer code 😉

@etseidl
Contributor Author

etseidl commented Dec 4, 2025

Example run

group                                 base
-----                                 ----
read Binary(100) delta_byte_array     1.00     21.7±0.49ms        ? ?/sec
read Binary(100) delta_length         1.00     11.4±0.18ms        ? ?/sec
read Binary(100) dict                 1.00     12.3±0.16ms        ? ?/sec
read Binary(100) plain                1.00     10.6±0.28ms        ? ?/sec
read Binary(20) delta_byte_array      1.00     12.6±0.18ms        ? ?/sec
read Binary(20) delta_length          1.00      8.3±0.19ms        ? ?/sec
read Binary(20) dict                  1.00      7.5±0.20ms        ? ?/sec
read Binary(20) plain                 1.00      7.4±0.25ms        ? ?/sec
read Fixed(16) byte_stream_split      1.00      6.8±0.38ms        ? ?/sec
read Fixed(16) delta_byte_array       1.00      8.0±0.20ms        ? ?/sec
read Fixed(16) dict                   1.00  1775.6±49.86µs        ? ?/sec
read Fixed(16) plain                  1.00  1757.0±43.28µs        ? ?/sec
read Fixed(2) byte_stream_split       1.00  1770.9±30.40µs        ? ?/sec
read Fixed(2) delta_byte_array        1.00      8.3±0.15ms        ? ?/sec
read Fixed(2) dict                    1.00  1223.6±28.11µs        ? ?/sec
read Fixed(2) plain                   1.00  1229.1±25.39µs        ? ?/sec
read f32 byte_stream_split            1.00      5.2±0.16ms        ? ?/sec
read f32 dict                         1.00      4.2±0.09ms        ? ?/sec
read f32 plain                        1.00      3.1±0.27ms        ? ?/sec
read f64 byte_stream_split            1.00      9.1±0.61ms        ? ?/sec
read f64 dict                         1.00      4.4±0.06ms        ? ?/sec
read f64 plain                        1.00      3.4±0.14ms        ? ?/sec
read int32 byte_stream_split          1.00      5.1±0.14ms        ? ?/sec
read int32 delta_binary               1.00      4.4±0.08ms        ? ?/sec
read int32 dict                       1.00      4.9±0.68ms        ? ?/sec
read int32 plain                      1.00      3.1±0.17ms        ? ?/sec
read int64 byte_stream_split          1.00      9.2±0.67ms        ? ?/sec
read int64 delta_binary               1.00      5.0±0.09ms        ? ?/sec
read int64 dict                       1.00      4.4±0.06ms        ? ?/sec
read int64 plain                      1.00      3.4±0.04ms        ? ?/sec
write Binary(100) delta_byte_array    1.00     64.1±1.77ms        ? ?/sec
write Binary(100) delta_length        1.00     55.6±1.15ms        ? ?/sec
write Binary(100) dict                1.00     39.4±0.82ms        ? ?/sec
write Binary(100) plain               1.00     51.1±1.04ms        ? ?/sec
write Binary(20) delta_byte_array     1.00     31.9±0.35ms        ? ?/sec
write Binary(20) delta_length         1.00     24.7±0.71ms        ? ?/sec
write Binary(20) dict                 1.00     32.4±0.78ms        ? ?/sec
write Binary(20) plain                1.00     24.1±0.22ms        ? ?/sec
write Fixed(16) byte_stream_split     1.00     67.5±0.67ms        ? ?/sec
write Fixed(16) delta_byte_array      1.00    148.0±2.06ms        ? ?/sec
write Fixed(16) dict                  1.00     62.2±0.53ms        ? ?/sec
write Fixed(16) plain                 1.00     62.6±1.38ms        ? ?/sec
write Fixed(2) byte_stream_split      1.00     57.7±0.76ms        ? ?/sec
write Fixed(2) delta_byte_array       1.00    144.4±0.98ms        ? ?/sec
write Fixed(2) dict                   1.00     59.7±0.73ms        ? ?/sec
write Fixed(2) plain                  1.00     59.7±0.63ms        ? ?/sec
write f32 byte_stream_split           1.00     17.6±0.44ms        ? ?/sec
write f32 dict                        1.00     31.5±0.33ms        ? ?/sec
write f32 plain                       1.00     18.0±1.41ms        ? ?/sec
write f64 byte_stream_split           1.00     21.0±0.25ms        ? ?/sec
write f64 dict                        1.00     31.8±0.38ms        ? ?/sec
write f64 plain                       1.00     19.6±0.20ms        ? ?/sec
write int32 byte_stream_split         1.00     21.7±0.32ms        ? ?/sec
write int32 delta_binary              1.00     29.0±0.33ms        ? ?/sec
write int32 dict                      1.00     38.4±2.54ms        ? ?/sec
write int32 plain                     1.00     22.2±1.35ms        ? ?/sec
write int64 byte_stream_split         1.00     21.7±0.47ms        ? ?/sec
write int64 delta_binary              1.00     27.6±0.40ms        ? ?/sec
write int64 dict                      1.00     32.4±0.42ms        ? ?/sec
write int64 plain                     1.00     20.2±0.22ms        ? ?/sec
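
The timings above mix µs and ms, which makes rows awkward to compare by eye. As a rough aid, the following sketch normalizes a `mean±deviation` cell to milliseconds. `parse_ms` is a hypothetical helper written for this illustration and is not part of the PR or of any benchmark tool; it assumes cells look exactly like the ones shown in the table.

```rust
/// Hypothetical helper: parses a cell such as "1775.6±49.86µs" into
/// (mean, deviation), both converted to milliseconds.
/// Returns None for anything that does not match that shape.
fn parse_ms(cell: &str) -> Option<(f64, f64)> {
    // Strip the unit suffix; multi-character units are checked before plain "s".
    let (body, scale) = if let Some(b) = cell.strip_suffix("µs") {
        (b, 1e-3)
    } else if let Some(b) = cell.strip_suffix("ms") {
        (b, 1.0)
    } else if let Some(b) = cell.strip_suffix("ns") {
        (b, 1e-6)
    } else if let Some(b) = cell.strip_suffix('s') {
        (b, 1e3)
    } else {
        return None;
    };
    // Split "mean±deviation" and parse both halves as floats.
    let (mean, dev) = body.split_once('±')?;
    let mean: f64 = mean.trim().parse().ok()?;
    let dev: f64 = dev.trim().parse().ok()?;
    Some((mean * scale, dev * scale))
}

fn main() {
    // Two cells copied from the example run: read Fixed(16) plain vs. delta_byte_array.
    let (plain, _) = parse_ms("1757.0±43.28µs").expect("well-formed cell");
    let (delta, _) = parse_ms("8.0±0.20ms").expect("well-formed cell");
    println!("plain: {plain:.3} ms, delta_byte_array: {delta:.3} ms");
    println!("ratio: {:.2}x", delta / plain);
}
```

With every cell in one unit, it becomes easier to see, for example, how the Fixed-length reads with `plain` or `dict` compare against `delta_byte_array`.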


Development

Successfully merging this pull request may close these issues.

Add round trip benchmark for Parquet writer/reader
