-
Notifications
You must be signed in to change notification settings - Fork 318
perf: Implement query chunking for charts #1233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: de6ae37 The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
The latest updates on your projects. Learn more about Vercel for GitHub.
|
E2E Test Results✅ All tests passed • 25 passed • 3 skipped • 228s
|
e077e57
to
5620ca8
Compare
a3b22e9
to
2d0e0b7
Compare
2d0e0b7
to
6229d52
Compare
6229d52
to
3e73289
Compare
943ddb8
to
3e899c8
Compare
3e899c8
to
76d2753
Compare
76d2753
to
ec976e5
Compare
ec976e5
to
ca69e95
Compare
ca69e95
to
61afdb4
Compare
PR Review: Query Chunking for ChartsCritical Issues✅ No critical issues found. Code Quality ObservationsStrong Points:
Minor Recommendations:
Testing Note: The comprehensive test suite is excellent, but consider adding an integration test that verifies actual query results match between chunked and non-chunked modes for the same config. Overall: Well-implemented feature with good architectural decisions. The opt-in design and thorough testing inspire confidence. |
d854c1f
to
e90e957
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the one comment regarding streaming using similar existing mechanisms, this looks really good overall
260ab0f
to
ade5381
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
async function* fetchDataInChunks( | ||
config: ChartConfigWithOptDateRange, | ||
clickhouseClient: ClickhouseClient, | ||
signal: AbortSignal, | ||
disableQueryChunking: boolean = false, | ||
) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style:I suggest converting it into an object when the method takes more than three args
const windows = | ||
!disableQueryChunking && shouldUseChunking(config) | ||
? getGranularityAlignedTimeWindows(config) | ||
: ([undefined] as const); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why we need type casting here
...(window ?? {}), | ||
}; | ||
|
||
const result = await clickhouseClient.queryChartConfig({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double check that we want to abort fetching if queryChartConfig
throws? and what happens on the UI side?
config: ChartConfigWithOptDateRange, | ||
clickhouseClient: ClickhouseClient, | ||
signal: AbortSignal, | ||
disableQueryChunking: boolean = false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's safer to make it disabled by default and opt in feature by feature
Summary
Closes HDX-2310
Closes HDX-2616
This PR implements chunking of chart queries to improve performance of charts on large data sets and long time ranges. Recent data is loaded first, then older data is loaded one-chunk-at-a-time until the full chart date range has been queried.
Screen.Recording.2025-10-03.at.1.11.09.PM.mov
Performance Impacts
Expectations
This change is intended to improve performance in a few ways:
Measured Results
Overall, the results match the expectations outlined above.
Scenarios and data
In each of the following tests:
Scenario: Log Search Histogram in Staging V2, 2 Day Range, No Filter
Scenario: Log Search Histogram in Staging V2, 14 Day Range, No Filter
Scenario: Chart Explorer Line Chart with p90 and p99 trace durations, Staging V2 Traces, Filtering for "GET" spans, 7 Day range
Implementation Notes
When is chunking used?
Chunking is used when all of the following are true:granularity
andtimestampValueExpression
are defined in the config. This ensures that the query is already being bucketed. Without bucketing, chunking would break aggregation queries, since groups can span multiple chunks.dateRange
is defined in the config. Without a date range, we'd need an unbounded set of chunks or the start and end chunks would have to be unbounded at their start and end, respectively.useQueriedChartConfig
does not pass thedisableQueryChunking: true
option. This option is provided to disable chunking when necessary.How are time windows chosen?
Which order are the chunks queried in?
Chunks are queried sequentially, most-recent first, due to the expectation that more recent data is typically more important to the user. Unlike with
useOffsetPaginatedSearch
, we are not paginating the data beyond the chunks, and all data is typically displayed together, so there is no need to support "ascending" order.Does this improve client-side caching behavior?
One theoretical way in which query chunking could improve performance to enable client-side caching of individual chunks, which could then be re-used if the same query is run over a longer time range.
Unfortunately, using streamedQuery, react-query stores the entire time range as one item in the cache, so it does not re-use individual chunks or "pages" from another query.
We could accomplish this improvement by using useQueries instead of streamedQuery or useInfiniteQuery. In that case, we'd treat each chunk as its own query. This would require a number of changes:
enabled
but requires some additional state to figure out whether the previous query is done.