chore(planner): consider limit when planning
#5048
Merged
| Branch | mlaw/planner-limit |
| Testbed | Linux |
| Benchmark | File Size | Benchmark Result kilobytes (KB) (Result Δ%) | Upper Boundary kilobytes (KB) (Limit %) |
|---|---|---|---|
| zero-package.tgz | 📈 view plot 🚷 view threshold | 1,389.43 KB(0.00%)Baseline: 1,389.43 KB | 1,417.22 KB (98.04%) |
| zero.js | 📈 view plot 🚷 view threshold | 228.62 KB(0.00%)Baseline: 228.62 KB | 233.19 KB (98.04%) |
| zero.js.br | 📈 view plot 🚷 view threshold | 63.70 KB(0.00%)Baseline: 63.70 KB | 64.97 KB (98.04%) |
|
| Branch | mlaw/planner-limit |
| Testbed | self-hosted |
| Benchmark | Throughput | Benchmark Result operations / second (ops/s) x 1e3 (Result Δ%) | Lower Boundary operations / second (ops/s) x 1e3 (Limit %) |
|---|---|---|---|
| src/client/custom.bench.ts > big schema | 📈 view plot 🚷 view threshold | 905.89 ops/s x 1e3(+1.66%)Baseline: 891.10 ops/s x 1e3 | 824.86 ops/s x 1e3 (91.05%) |
| src/client/zero.bench.ts > basics > All 1000 rows x 10 columns (numbers) | 📈 view plot 🚷 view threshold | 2.97 ops/s x 1e3(+3.38%)Baseline: 2.88 ops/s x 1e3 | 2.76 ops/s x 1e3 (92.74%) |
| src/client/zero.bench.ts > pk compare > pk = N | 📈 view plot 🚷 view threshold | 47.37 ops/s x 1e3(+4.48%)Baseline: 45.34 ops/s x 1e3 | 43.11 ops/s x 1e3 (91.01%) |
| src/client/zero.bench.ts > with filter > Lower rows 500 x 10 columns (numbers) | 📈 view plot 🚷 view threshold | 3.93 ops/s x 1e3(-5.47%)Baseline: 4.16 ops/s x 1e3 | 3.89 ops/s x 1e3 (98.87%) |
|
| Branch | mlaw/planner-limit |
| Testbed | self-hosted |
🚨 7 Alerts
| Benchmark | Throughput | Benchmark Result operations / second (ops/s) (Result Δ%) | Lower Boundary operations / second (ops/s) (Limit %) |
|---|---|---|---|
| planned: playlist.exists(tracks) | 📈 view plot 🚷 view threshold | 952.28 ops/s(+898.19%)Baseline: 95.40 ops/s | -428.18 ops/s (-44.96%) |
| planned: track.exists(album) OR exists(genre) | 📈 view plot 🚷 view threshold | 22.33 ops/s(-2.16%)Baseline: 22.83 ops/s | 21.93 ops/s (98.20%) |
| planned: track.exists(album) where title="Big Ones" | 📈 view plot 🚷 view threshold | 8,934.04 ops/s(+6.15%)Baseline: 8,416.58 ops/s | 7,717.96 ops/s (86.39%) |
| planned: track.exists(album).exists(genre) | 📈 view plot 🚷 view threshold | 25.45 ops/s(-2.56%)Baseline: 26.12 ops/s | 25.23 ops/s (99.14%) |
| planned: track.exists(album).exists(genre) with filters | 📈 view plot 🚷 view threshold | 5,375.64 ops/s(-1.46%)Baseline: 5,455.44 ops/s | 5,263.10 ops/s (97.91%) |
| planned: track.exists(playlists) | 📈 view plot 🚷 view threshold | 6.40 ops/s(+709.10%)Baseline: 0.79 ops/s | -2.64 ops/s (-41.19%) |
| unplanned: playlist.exists(tracks) | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 916.32 ops/s(-5.01%)Baseline: 964.60 ops/s | 924.74 ops/s (100.92%) |
| unplanned: track.exists(album) OR exists(genre) | 📈 view plot 🚷 view threshold | 22.29 ops/s(-1.92%)Baseline: 22.72 ops/s | 21.84 ops/s (98.02%) |
| unplanned: track.exists(album) where title="Big Ones" | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 33.37 ops/s(-6.09%)Baseline: 35.53 ops/s | 33.77 ops/s (101.21%) |
| unplanned: track.exists(album).exists(genre) | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 20.80 ops/s(-4.39%)Baseline: 21.75 ops/s | 20.96 ops/s (100.80%) |
| unplanned: track.exists(album).exists(genre) with filters | 📈 view plot 🚷 view threshold | 35.17 ops/s(-2.22%)Baseline: 35.97 ops/s | 34.58 ops/s (98.31%) |
| unplanned: track.exists(playlists) | 📈 view plot 🚷 view threshold | 6.36 ops/s(-1.23%)Baseline: 6.44 ops/s | 6.18 ops/s (97.25%) |
| zpg: (pk lookup) select * from track where id = 3163 | 📈 view plot 🚷 view threshold | 1,141.40 ops/s(+4.96%)Baseline: 1,087.51 ops/s | 894.84 ops/s (78.40%) |
| zpg: (secondary index lookup) select * from track where album_id = 248 | 📈 view plot 🚷 view threshold | 1,080.43 ops/s(-4.60%)Baseline: 1,132.47 ops/s | 1,048.97 ops/s (97.09%) |
| zpg: (table scan) select * from album | 📈 view plot 🚷 view threshold | 715.36 ops/s(-4.05%)Baseline: 745.54 ops/s | 675.06 ops/s (94.37%) |
| zpg: OR with empty branch and limit | 📈 view plot 🚷 view threshold | 947.32 ops/s(+4.16%)Baseline: 909.50 ops/s | 730.42 ops/s (77.10%) |
| zpg: OR with empty branch and limit with exists | 📈 view plot 🚷 view threshold | 741.48 ops/s(-4.51%)Baseline: 776.54 ops/s | 701.05 ops/s (94.55%) |
| zpg: all playlists | 📈 view plot 🚷 view threshold | 5.59 ops/s(+111.20%)Baseline: 2.65 ops/s | 0.85 ops/s (15.17%) |
| zpg: scan with one depth related | 📈 view plot 🚷 view threshold | 431.82 ops/s(+54.75%)Baseline: 279.04 ops/s | 180.83 ops/s (41.88%) |
| zql: (pk lookup) select * from track where id = 3163 | 📈 view plot 🚷 view threshold | 120,766.66 ops/s(-11.66%)Baseline: 136,703.84 ops/s | 117,715.25 ops/s (97.47%) |
| zql: (secondary index lookup) select * from track where album_id = 248 | 📈 view plot 🚷 view threshold | 2,046.13 ops/s(-2.07%)Baseline: 2,089.47 ops/s | 1,607.18 ops/s (78.55%) |
| zql: (table scan) select * from album | 📈 view plot 🚷 view threshold | 6,639.40 ops/s(-1.66%)Baseline: 6,751.62 ops/s | 5,987.38 ops/s (90.18%) |
| zql: OR with empty branch and limit | 📈 view plot 🚷 view threshold | 57,344.02 ops/s(-0.53%)Baseline: 57,651.71 ops/s | 48,886.70 ops/s (85.25%) |
| zql: OR with empty branch and limit with exists | 📈 view plot 🚷 view threshold | 12,627.06 ops/s(-1.63%)Baseline: 12,836.70 ops/s | 12,048.09 ops/s (95.41%) |
| zql: all playlists | 📈 view plot 🚷 view threshold | 4.25 ops/s(-5.13%)Baseline: 4.48 ops/s | 4.22 ops/s (99.30%) |
| zql: edit for limited query, inside the bound | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 224,287.41 ops/s(-4.43%)Baseline: 234,682.92 ops/s | 224,517.25 ops/s (100.10%) |
| zql: edit for limited query, outside the bound | 📈 view plot 🚷 view threshold | 230,140.77 ops/s(-5.36%)Baseline: 243,183.15 ops/s | 221,519.32 ops/s (96.25%) |
| zql: push into limited query, inside the bound | 📈 view plot 🚷 view threshold | 108,098.44 ops/s(-6.51%)Baseline: 115,626.54 ops/s | 107,421.94 ops/s (99.37%) |
| zql: push into limited query, outside the bound | 📈 view plot 🚷 view threshold | 440,950.32 ops/s(-4.75%)Baseline: 462,934.62 ops/s | 414,006.01 ops/s (93.89%) |
| zql: push into unlimited query | 📈 view plot 🚷 view threshold | 330,729.14 ops/s(-9.44%)Baseline: 365,220.35 ops/s | 330,166.87 ops/s (99.83%) |
| zql: scan with one depth related | 📈 view plot 🚷 view threshold | 477.23 ops/s(-2.67%)Baseline: 490.33 ops/s | 461.04 ops/s (96.61%) |
| zqlite: (pk lookup) select * from track where id = 3163 | 📈 view plot 🚷 view threshold | 44,710.12 ops/s(-1.23%)Baseline: 45,267.34 ops/s | 41,979.45 ops/s (93.89%) |
| zqlite: (secondary index lookup) select * from track where album_id = 248 | 📈 view plot 🚷 view threshold | 10,771.32 ops/s(-5.61%)Baseline: 11,411.98 ops/s | 10,108.04 ops/s (93.84%) |
| zqlite: (table scan) select * from album | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 1,207.03 ops/s(-11.25%)Baseline: 1,360.07 ops/s | 1,239.37 ops/s (102.68%) |
| zqlite: OR with empty branch and limit | 📈 view plot 🚷 view threshold | 19,160.67 ops/s(+0.75%)Baseline: 19,017.90 ops/s | 17,895.97 ops/s (93.40%) |
| zqlite: OR with empty branch and limit with exists | 📈 view plot 🚷 view threshold | 5,704.73 ops/s(-0.07%)Baseline: 5,708.60 ops/s | 5,201.32 ops/s (91.18%) |
| zqlite: all playlists | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 1.35 ops/s(-8.77%)Baseline: 1.48 ops/s | 1.37 ops/s (102.00%) |
| zqlite: edit for limited query, inside the bound | 📈 view plot 🚷 view threshold | 121,880.77 ops/s(-1.45%)Baseline: 123,673.17 ops/s | 117,084.73 ops/s (96.06%) |
| zqlite: edit for limited query, outside the bound | 📈 view plot 🚷 view threshold | 123,479.29 ops/s(-3.58%)Baseline: 128,063.58 ops/s | 121,495.81 ops/s (98.39%) |
| zqlite: push into limited query, inside the bound | 📈 view plot 🚷 view threshold | 4,206.57 ops/s(-2.35%)Baseline: 4,307.91 ops/s | 4,165.04 ops/s (99.01%) |
| zqlite: push into limited query, outside the bound | 📈 view plot 🚷 view threshold | 141,060.11 ops/s(-5.61%)Baseline: 149,444.44 ops/s | 140,650.11 ops/s (99.71%) |
| zqlite: push into unlimited query | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 117,958.85 ops/s(-10.52%)Baseline: 131,833.20 ops/s | 121,501.50 ops/s (103.00%) |
| zqlite: scan with one depth related | 📈 view plot 🚷 view threshold | 161.54 ops/s(-2.00%)Baseline: 164.85 ops/s | 151.97 ops/s (94.07%) |
limit when planning
|
@copilot - what's with the |
Overview
The current limit algorithm is pretty simple. The estimated scan is the limit divided by the selectivity of the child, where selectivity is:
est_rows_with_filters / est_rows_without_filters
The intuition here is that if a child is very selective, we must iterate more rows on the parent side before finding a match on the child side.
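A minimal sketch of this estimate, assuming the scan estimate is the limit divided by the child's selectivity (as the worked numbers in this description suggest). The type and function names are hypothetical and are not taken from the planner source.

```typescript
// Hypothetical shape for the cost model's row estimates on the child side.
interface ChildEstimate {
  rowsWithFilters: number; // est_rows_with_filters
  rowsWithoutFilters: number; // est_rows_without_filters
}

// Fraction of child rows that survive the child's filters.
function childSelectivity(est: ChildEstimate): number {
  if (est.rowsWithoutFilters <= 0) return 0;
  return est.rowsWithFilters / est.rowsWithoutFilters;
}

// Estimated number of parent rows to scan before `limit` matches are found.
// A selectivity of 0 means no match is expected, so the estimate is unbounded.
function simpleScanEstimate(limit: number, selectivity: number): number {
  if (selectivity <= 0) return Infinity;
  return limit / selectivity;
}

// Child without filters: every row survives, so selectivity is 1.
const unfiltered = simpleScanEstimate(
  10,
  childSelectivity({rowsWithFilters: 1000, rowsWithoutFilters: 1000}),
);

// Child with a filter that keeps 10 of 1,000 rows.
const filtered = simpleScanEstimate(
  10,
  childSelectivity({rowsWithFilters: 10, rowsWithoutFilters: 1000}),
);

console.log({unfiltered, filtered});
```

The lower the selectivity, the larger the estimated scan on the parent side for the same limit.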
Example:
issue.whereExists('creator').limit(10)
Every issue has a creator, so the selectivity of the child is 1. This means we have:
scan_est = 10/1 = 10
I.e., we'll have to scan 10 rows before fulfilling our limit of 10.
If it were:
issue.whereExists('creator', q => q.name('ff'))
The cost model will give us the estimated number of rows for a creator with a given name. Maybe it is 10 out of 1,000. This means our selectivity is 10/1000 = 0.01.
A selectivity of 0.01 means we have:
scan_est = 10/0.01 = 1,000
We will potentially have to scan 1,000 parent rows before finding 10 matches from the child.
Problem
The simple algorithm isn't quite complete. Think of this query:
The selectivity on `comments` may be high, but if users have created many comments, this increases the likelihood of finding a match.
Example
Say we have users
1,2,3,4,5
And a comment table. Below are user ids from the comment table, assuming every user created 2 comments.
If our filters decimated the table to only half the rows:
We have 0.5 selectivity, so an estimated scan of `1/0.5 = 2` over `user` to find a match.
But we see that all user ids are present in the set! We only scan 1 user row to find a matching comment. The matching is faster because the fanout from user->comment is higher than 1.
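To make the gap between row-level selectivity and per-user match rate concrete, here is a small sketch over made-up data. The arrays and the "keep every other row" filter are assumptions chosen for illustration, not data from the PR.

```typescript
// Hypothetical data: five users, two comments each (one user id per comment row).
const userIds = [1, 2, 3, 4, 5];
const commentUserIds = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5];

// Hypothetical filter outcome: every other comment row survives.
const survivingCommentUserIds = commentUserIds.filter((_, i) => i % 2 === 0);

// Row-level selectivity of the filter on the comment table.
const rowSelectivity = survivingCommentUserIds.length / commentUserIds.length;

// Fraction of users that still have at least one surviving comment.
const matchedUsers = new Set(survivingCommentUserIds);
const userMatchRate =
  userIds.filter(id => matchedUsers.has(id)).length / userIds.length;

console.log({rowSelectivity, userMatchRate});
```

This arrangement is the best case: half the comment rows are gone, yet every user still matches. A filter that happened to remove both comments of some users would give a lower match rate, so a planner needs an estimate that sits between the two extremes.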
So we need to do (here with filterSelectivity = 0.5 and fanout = 2):
selectivity = 1 - Math.pow(1 - filterSelectivity, fanout) = 0.75 -> scan_est = 1 / 0.75 = 1.33
1.33 being closer to 1.
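The adjusted formula can be sketched as below. It assumes each of a parent's child rows passes the filter independently with probability `filterSelectivity`; the function names are hypothetical and the planner's actual code may differ.

```typescript
// Probability that a parent row has at least one matching child, assuming each
// of its `fanout` children passes the filter independently.
function fanoutAdjustedSelectivity(
  filterSelectivity: number,
  fanout: number,
): number {
  return 1 - Math.pow(1 - filterSelectivity, fanout);
}

// Estimated parent rows scanned before `limit` matches are found.
function scanEstimate(limit: number, selectivity: number): number {
  return selectivity > 0 ? limit / selectivity : Infinity;
}

// Numbers from the example: filter keeps half the comments, two comments per user.
const naiveScan = scanEstimate(1, 0.5);
const adjustedSelectivity = fanoutAdjustedSelectivity(0.5, 2);
const adjustedScan = scanEstimate(1, adjustedSelectivity);

console.log({naiveScan, adjustedSelectivity, adjustedScan: adjustedScan.toFixed(2)});
```

With a fanout of 1 the adjusted selectivity equals the filter selectivity, so the formula reduces to the simple algorithm; larger fanouts push the selectivity toward 1 and the scan estimate toward the limit itself.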
More on this is documented in packages/zql/src/planner/SELECTIVITY_PLAN.md
Future notes to self:
`parentCost.limit / childCost.selectivity` algorithm. Postgres has a separate cost, "startup_cost", that captures things like creating temp indices.