@SkyFan2002 (Member) commented Oct 26, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR improves the runtime filter mechanism with several optimizations to enhance join performance and reduce unnecessary filtering overhead.

Changes

  1. Push down runtime filters to probe key equivalence classes - Extends runtime filter coverage by propagating filters through equivalent expressions, so that more join scenarios benefit from early filtering. See #18857 (Performance Optimization: Propagate Runtime Filters Through Join Equivalence Classes).

  2. Scan waits for runtime filter construction - Synchronizes table scans with runtime filter generation to ensure filters are applied before reading data, maximizing filtering effectiveness.

  3. Runtime-based filter construction decision - Dynamically determines whether to build runtime filters based on runtime statistics, avoiding overhead when filters provide minimal benefit.
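
To illustrate the first change, here is a minimal sketch of how equi-join conditions can be grouped into equivalence classes, so that a runtime filter built for one probe key can also target every column known to be equal to it. This is not the code from this PR: the `EquivalenceClasses` type, the `filter_targets` function, and the column names in the usage note are hypothetical, and a plain union-find over column names is assumed here purely for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not Databend's API): union-find over column names.
#[derive(Default)]
struct EquivalenceClasses {
    /// Maps each column to its parent; a root maps to itself.
    parent: HashMap<String, String>,
}

impl EquivalenceClasses {
    /// Return the representative of `col`, registering the column if unseen.
    fn find(&mut self, col: &str) -> String {
        let parent = self
            .parent
            .entry(col.to_string())
            .or_insert_with(|| col.to_string())
            .clone();
        if parent == col {
            return parent;
        }
        let root = self.find(&parent);
        // Path compression: point `col` directly at the root.
        self.parent.insert(col.to_string(), root.clone());
        root
    }

    /// Record an equi-join condition `a = b`.
    fn union(&mut self, a: &str, b: &str) {
        let ra = self.find(a);
        let rb = self.find(b);
        if ra != rb {
            self.parent.insert(ra, rb);
        }
    }

    /// All columns known to be equal to `col`, sorted for determinism.
    fn members(&mut self, col: &str) -> Vec<String> {
        let root = self.find(col);
        let cols: Vec<String> = self.parent.keys().cloned().collect();
        let mut out: Vec<String> = cols
            .into_iter()
            .filter(|c| self.find(c) == root)
            .collect();
        out.sort();
        out
    }
}

/// Columns that a runtime filter built for `probe_key` could also be applied to
/// (assumption: every member of the key's equivalence class is a valid target).
fn filter_targets(classes: &mut EquivalenceClasses, probe_key: &str) -> Vec<String> {
    classes.members(probe_key)
}
```

For example, after recording `t1.k = t2.k` and `t2.k = t3.k` with `union`, calling `filter_targets(&mut classes, "t1.k")` returns all three columns, so a filter built for `t1.k` could also be pushed to scans of `t2` and `t3`. A column that appears in no join condition forms a class of its own.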

Performance

TPC-DS SF100, Large Warehouse:

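| Query | Baseline time (s) | New time (s) | Change (%) |
|-------|-------------------|--------------|------------|
| Q93   | 3.899             | 2.3          | -41.01     |
| Q95   | 5.431             | 3.925        | -27.73     |
| Q17   | 5.075             | 4.135        | -18.52     |
| Q34   | 2.801             | 2.37         | -15.39     |
| Q6    | 2.754             | 2.422        | -12.06     |
| Q64   | 11.608            | 10.34        | -10.92     |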
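
The change column is consistent with the relative difference between the new and baseline times, rounded to two decimals, so a negative value means the query got faster. The sketch below reproduces that arithmetic; the function names `change_percent` and `round2` are illustrative and not taken from the benchmark tooling.

```rust
/// Relative change in percent; negative means the new run is faster.
fn change_percent(baseline_s: f64, new_s: f64) -> f64 {
    (new_s - baseline_s) / baseline_s * 100.0
}

/// Round to two decimal places, matching the precision used in the table.
fn round2(x: f64) -> f64 {
    (x * 100.0).round() / 100.0
}
```

For Q93, `round2(change_percent(3.899, 2.3))` gives -41.01, matching the first row.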

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Oct 26, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Oct 27, 2025
@SkyFan2002 SkyFan2002 added the ci-cloud Build docker image for cloud test label Oct 28, 2025
@SkyFan2002 SkyFan2002 force-pushed the improve_runtime_filter branch from 4707bac to a8bd6cb Compare October 30, 2025 06:17
@databendlabs databendlabs deleted a comment from github-actions bot Oct 30, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Oct 30, 2025
@SkyFan2002 SkyFan2002 added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Oct 30, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18893-dc39b31-1761813262

note: this image tag is only available for internal use.

@SkyFan2002 SkyFan2002 added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Oct 30, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18893-920dea5-1761845415

note: this image tag is only available for internal use.

@SkyFan2002 SkyFan2002 added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Oct 31, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Oct 31, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-18893-5d4ce35-1761897311

note: this image tag is only available for internal use.

@SkyFan2002
Member Author

Marked as draft due to some tests failing.

@SkyFan2002 SkyFan2002 marked this pull request as ready for review November 6, 2025 01:00
zhang2014 and others added 8 commits November 6, 2025 11:24
# Conflicts:
#	src/query/storages/fuse/src/operations/read/native_data_source_deserializer.rs
#	src/query/storages/fuse/src/operations/read/parquet_data_source_deserializer.rs
# Conflicts:
#	src/query/service/tests/it/sql/planner/optimizer/data/results/obfuscated/01_multi_join_avg_case_expression_physical.txt
#	src/query/service/tests/it/sql/planner/optimizer/data/results/obfuscated/01_multi_join_sum_case_expression_physical.txt
#	src/query/service/tests/it/sql/planner/optimizer/data/results/tpcds/Q01_physical.txt
#	src/query/service/tests/it/sql/planner/optimizer/data/results/tpcds/Q03_physical.txt
#	src/query/storages/fuse/src/operations/read/native_data_source_deserializer.rs
#	src/query/storages/fuse/src/operations/read/parquet_data_source_deserializer.rs
#	tests/sqllogictests/suites/mode/cluster/shuffle.test
#	tests/sqllogictests/suites/mode/standalone/ee/explain_virtual_column.test
#	tests/sqllogictests/suites/mode/standalone/explain/explain.test
#	tests/sqllogictests/suites/mode/standalone/explain/index/explain_agg_index.test
#	tests/sqllogictests/suites/mode/standalone/explain/index/explain_vector_index.test
#	tests/sqllogictests/suites/mode/standalone/explain/lateral.test
#	tests/sqllogictests/suites/mode/standalone/explain/limit.test
#	tests/sqllogictests/suites/mode/standalone/explain/push_down_filter/push_down_filter_eval_scalar.test
#	tests/sqllogictests/suites/mode/standalone/explain/select.test
#	tests/sqllogictests/suites/mode/standalone/explain/subquery.test
#	tests/sqllogictests/suites/mode/standalone/explain_native/limit.test
#	tests/sqllogictests/suites/mode/standalone/explain_native/push_down_filter/push_down_filter_eval_scalar.test
#	tests/sqllogictests/suites/mode/standalone/explain_native/subquery.test
#	tests/sqllogictests/suites/no_table_meta_cache/col_stats_of_all_null.test
@zhang2014 zhang2014 merged commit 3efed0c into databendlabs:main Nov 11, 2025
87 of 88 checks passed

Labels

  • ci-cloud: Build docker image for cloud test
  • pr-feature: this PR introduces a new feature to the codebase

3 participants