@samwillis samwillis commented Oct 1, 2025

This fixes this bug report from Discord: https://discord.com/channels/719702312431386674/1369767025723052134/1422935844175614072

That report identified that a self-join query with a where clause on the main alias would result in the join not seeing the whole collection. This PR also tidies up the terminology we use, dropping "table" in favor of "source" and "collection".

This has grown into a larger refactor of the compiler, which now creates one subscription per source alias rather than a single subscription per collection.


Per-Alias Subscriptions: Fix Self-Joins and Enable Independent Filtering

The Problem

Live queries previously subscribed once per collection ID. This caused critical bugs when the same collection appeared multiple times in a query with different aliases:

// This query would FAIL before this PR
const query = db.query((q) => 
  q.from({ employee: employeesCollection })
   .join({ manager: employeesCollection }, 
     ({ employee, manager }) => eq(employee.managerId, manager.id))
   .where(({ employee }) => eq(employee.department, 'Engineering'))
)

What went wrong:

  • Both employee and manager aliases mapped to the same collection
  • The live query layer created only ONE subscription for employeesCollection
  • The WHERE filter meant for employee was incorrectly applied to manager too
  • Lazy loading couldn't distinguish between the two aliases
  • Result: managers were filtered by department when they shouldn't be

The Solution

Subscribe once per source alias, not once per collection.

Now employee and manager each get their own independent subscription with:

  • Their own filters
  • Their own lazy loading state
  • Their own input streams into the query pipeline
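The difference between the two keying strategies can be sketched with a toy model. Everything below (`SourceRef`, `subscribePerCollection`, `subscribePerAlias`) is hypothetical and only illustrates the idea; it is not the actual live query code.

```typescript
// Hypothetical, simplified model; not the actual TanStack DB internals.
type Row = { id: number; department: string; managerId: number | null }
type Filter = (row: Row) => boolean

interface SourceRef {
  alias: string
  collectionId: string
  where?: Filter
}

// Old behaviour (sketch): one subscription slot per collection ID.
function subscribePerCollection(sources: SourceRef[]): Map<string, Filter | undefined> {
  const subscriptions = new Map<string, Filter | undefined>()
  for (const source of sources) {
    // The single slot keeps the first filter registered, so that filter
    // ends up applying to every alias that points at the collection.
    const existing = subscriptions.get(source.collectionId)
    subscriptions.set(source.collectionId, existing ?? source.where)
  }
  return subscriptions
}

// New behaviour (sketch): one subscription slot per source alias.
function subscribePerAlias(sources: SourceRef[]): Map<string, Filter | undefined> {
  const subscriptions = new Map<string, Filter | undefined>()
  for (const source of sources) {
    subscriptions.set(source.alias, source.where)
  }
  return subscriptions
}

const sources: SourceRef[] = [
  { alias: "employee", collectionId: "employees-id", where: (r) => r.department === "Engineering" },
  { alias: "manager", collectionId: "employees-id" },
]

// One slot, carrying the employee filter for both aliases.
const perCollection = subscribePerCollection(sources)
// Two slots; the manager slot has no filter.
const perAlias = subscribePerAlias(sources)
```

In this model the collection-keyed map has a single entry whose filter was meant for `employee` only, while the alias-keyed map keeps `manager` unfiltered.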

Key Changes

1. Compiler Output Tracking

Added two new fields to CompilationResult:

interface CompilationResult {
  // NEW: Maps every alias to its collection ID
  // Example: { employee: 'employees-id', manager: 'employees-id' }
  aliasToCollectionId: Record<string, string>
  
  // NEW: Maps outer subquery aliases to inner aliases
  // Example: { activeUser: 'user' } when join alias differs from subquery's internal alias
  aliasRemapping: Record<string, string>
}
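As a rough illustration of how these two maps could be populated, here is a minimal sketch over a made-up query shape. `QuerySketch`, `Source` and `collectAliases` are assumptions for this example; the real IR and compiler are more involved, and this sketch only records aliases that point directly at a collection.

```typescript
// Hypothetical shapes for illustration; the real query IR differs.
interface CollectionRef { kind: "collection"; id: string }
interface SubqueryRef { kind: "subquery"; query: QuerySketch }
type Source = CollectionRef | SubqueryRef

interface QuerySketch {
  from: { alias: string; source: Source }
  joins: { alias: string; source: Source }[]
}

interface AliasMaps {
  aliasToCollectionId: Record<string, string>
  aliasRemapping: Record<string, string>
}

function collectAliases(
  query: QuerySketch,
  maps: AliasMaps = { aliasToCollectionId: {}, aliasRemapping: {} },
): AliasMaps {
  for (const { alias, source } of [query.from, ...query.joins]) {
    if (source.kind === "collection") {
      maps.aliasToCollectionId[alias] = source.id
    } else {
      // Record the subquery's own aliases, then point the outer alias
      // at the alias used in the subquery's `from` clause.
      collectAliases(source.query, maps)
      maps.aliasRemapping[alias] = source.query.from.alias
    }
  }
  return maps
}

// q.from({ post: posts }).join({ author: activeUsers }, ...)
// where activeUsers is q.from({ user: usersCollection })
const maps = collectAliases({
  from: { alias: "post", source: { kind: "collection", id: "posts-id" } },
  joins: [
    {
      alias: "author",
      source: {
        kind: "subquery",
        query: {
          from: { alias: "user", source: { kind: "collection", id: "users-id" } },
          joins: [],
        },
      },
    },
  ],
})
```

Here `maps.aliasToCollectionId` holds `post` and `user`, and `maps.aliasRemapping` holds `author → user`. Alias name clashes between an outer query and a subquery are ignored in this sketch.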

2. Alias-Keyed Everything

Before:

inputs[collection.id]  // One input per collection
subscriptions[collection.id]  // One subscription per collection

After:

inputs[alias]  // One input per alias: { employee: input1, manager: input2 }
subscriptions[alias]  // One subscription per alias

3. Subquery Alias Resolution

When a subquery uses different aliases than the parent query:

const activeUsers = q.from({ user: usersCollection })
                     .where(({ user }) => eq(user.active, true))

q.from({ post: posts })
 .join({ author: activeUsers }, ...)  // Outer: 'author', Inner: 'user'

The compiler now tracks: aliasRemapping['author'] = 'user' so lazy loading can find the correct subscription.
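A lookup through such a remapping might look like the sketch below. `resolveAlias` is a hypothetical helper, and following a chain of remappings for nested subqueries is an assumption of this example rather than something the PR description states.

```typescript
// Hypothetical helper: follow outer → inner alias remappings until we reach
// an alias that is not remapped (assumed to be the one with a subscription).
function resolveAlias(alias: string, aliasRemapping: Record<string, string>): string {
  const seen = new Set<string>()
  let current = alias
  while (Object.prototype.hasOwnProperty.call(aliasRemapping, current) && !seen.has(current)) {
    seen.add(current) // guard against accidental cycles
    current = aliasRemapping[current] as string
  }
  return current
}

const resolvedAuthor = resolveAlias("author", { author: "user" })
const resolvedPost = resolveAlias("post", { author: "user" })
const resolvedNested = resolveAlias("outer", { outer: "middle", middle: "inner" })
```

With the example above, `author` resolves to `user`, so lazy loading for the join can look up the subscription stored under `user`; an alias without a remapping resolves to itself.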

4. Better Error Messages

// Before
throw new Error(`Input for collection "users" not found`)

// After  
throw new CollectionInputNotFoundError(
  'manager',  // The missing alias
  'employees-col-id',  // The collection it refers to
  ['employee', 'post']  // Available aliases for debugging
)
// Error: Input for alias "manager" (collection "employees-col-id") not found in inputs map. 
//        Available keys: employee, post
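For readers who want to see how an error like this can carry its context, here is a minimal stand-in. It is not the class from packages/db/src/errors.ts: the name `CollectionInputNotFoundErrorSketch`, its fields and the `getInput` helper are assumptions, and only the message format is taken from the example above.

```typescript
// Hypothetical re-creation; the real class in packages/db/src/errors.ts may differ.
class CollectionInputNotFoundErrorSketch extends Error {
  readonly alias: string
  readonly collectionId: string
  readonly availableKeys: string[]

  constructor(alias: string, collectionId: string, availableKeys: string[]) {
    super(
      `Input for alias "${alias}" (collection "${collectionId}") not found in inputs map. ` +
        `Available keys: ${availableKeys.join(", ")}`,
    )
    this.name = "CollectionInputNotFoundErrorSketch"
    this.alias = alias
    this.collectionId = collectionId
    this.availableKeys = availableKeys
  }
}

// Hypothetical lookup helper: inputs are keyed by alias, not by collection ID.
function getInput<T>(inputs: Record<string, T>, alias: string, collectionId: string): T {
  const input = inputs[alias]
  if (input === undefined) {
    throw new CollectionInputNotFoundErrorSketch(alias, collectionId, Object.keys(inputs))
  }
  return input
}

let caughtMessage = ""
try {
  getInput({ employee: 1, post: 2 }, "manager", "employees-col-id")
} catch (error) {
  caughtMessage = error instanceof Error ? error.message : String(error)
}
```

Listing the available keys in the message makes it easier to spot a mismatch between the alias the pipeline asked for and the aliases that were actually wired up.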

5. Terminology Consistency

  • collectionWhereClauses → sourceWhereClauses
  • "table alias" → "source alias" throughout comments
  • tables → sources in pipeline code

6. Code Simplification

Removed unnecessary two-phase compilation that was guarding against optimizer-generated aliases (which never actually happened). The compiler now runs once with cleaner logic.

What Now Works

✅ Self-Joins

// Employees who manage other employees
q.from({ employee: employeesCollection })
 .join({ manager: employeesCollection }, 
   ({ employee, manager }) => eq(employee.managerId, manager.id))
 .where(({ employee }) => eq(employee.department, 'Engineering'))
 .select(({ employee, manager }) => ({
   employeeName: employee.name,
   managerName: manager.name  // ✅ Manager not filtered by department
 }))
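To make the expected result concrete, the following sketch reproduces the semantics with plain arrays. The sample rows and the `joinEmployeesToManagers` helper are invented for illustration, and the helper uses inner-join behaviour for simplicity; it does not use the query builder.

```typescript
interface Employee { id: number; name: string; department: string; managerId: number | null }
type Predicate = (e: Employee) => boolean

// Invented sample data.
const employees: Employee[] = [
  { id: 1, name: "Ada", department: "Executive", managerId: null },
  { id: 2, name: "Grace", department: "Engineering", managerId: 1 },
  { id: 3, name: "Linus", department: "Engineering", managerId: 2 },
  { id: 4, name: "Joan", department: "Sales", managerId: 1 },
]

// Plain-array stand-in for the self-join: each side gets its own filter.
function joinEmployeesToManagers(rows: Employee[], employeeWhere: Predicate, managerWhere: Predicate) {
  const managersById = new Map<number, Employee>()
  for (const row of rows) {
    if (managerWhere(row)) managersById.set(row.id, row)
  }
  const result: { employeeName: string; managerName: string }[] = []
  for (const employee of rows) {
    if (!employeeWhere(employee) || employee.managerId === null) continue
    const manager = managersById.get(employee.managerId)
    if (manager) result.push({ employeeName: employee.name, managerName: manager.name })
  }
  return result
}

const inEngineering: Predicate = (e) => e.department === "Engineering"

// Before: the employee filter leaked onto the manager side.
const buggy = joinEmployeesToManagers(employees, inEngineering, inEngineering)
// After: the manager side sees the whole collection.
const fixed = joinEmployeesToManagers(employees, inEngineering, () => true)
```

With this data, `buggy` loses Grace's row because her manager Ada is not in Engineering, while `fixed` returns both Grace (managed by Ada) and Linus (managed by Grace).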

✅ Multiple Aliases with Independent Filters

q.from({ current: usersCollection })
 .join({ previous: usersCollection }, ...)
 .where(({ current }) => eq(current.active, true))
 .where(({ previous }) => eq(previous.active, false))
 // ✅ Each alias has its own filter

✅ Subquery Alias Resolution

const admins = q.from({ user: usersCollection })
                .where(({ user }) => eq(user.role, 'admin'))

q.from({ post: posts })
 .join({ author: admins }, ...)  
 // ✅ Compiler maps 'author' → 'user' for subscription resolution

✅ Lazy Loading in Self-Joins

// Only loads managers for employees that pass the WHERE clause
q.from({ employee: employeesCollection })
 .join({ manager: employeesCollection }, ...)
 .where(({ employee }) => eq(employee.active, true))
 // ✅ Lazy loads only needed managers, using correct alias subscription
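The idea behind lazy loading the manager side can be sketched as computing the set of join keys that the filtered employee side actually needs. `managerKeysToLoad` and the sample rows are hypothetical; the real implementation works incrementally on the alias's subscription rather than on arrays.

```typescript
interface EmployeeRow { id: number; managerId: number | null; active: boolean }

// Hypothetical helper: collect the manager IDs referenced by employees
// that pass the employee-side filter. Only these keys need to be loaded
// through the manager alias's subscription.
function managerKeysToLoad(rows: EmployeeRow[], employeeWhere: (row: EmployeeRow) => boolean): Set<number> {
  const keys = new Set<number>()
  for (const row of rows) {
    if (employeeWhere(row) && row.managerId !== null) keys.add(row.managerId)
  }
  return keys
}

// Invented sample data.
const employeeRows: EmployeeRow[] = [
  { id: 1, managerId: null, active: true },
  { id: 2, managerId: 1, active: true },
  { id: 3, managerId: 2, active: false },
  { id: 4, managerId: 1, active: true },
  { id: 5, managerId: 3, active: true },
]

const neededManagerIds = managerKeysToLoad(employeeRows, (row) => row.active)
```

In this example only managers 1 and 3 are needed; manager 2 is referenced solely by an inactive employee and is never requested.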

Breaking Changes

None for users. This is entirely internal to the query compilation and live layer.

The only "breaking" aspect is that previously broken queries (self-joins) now work correctly.

Performance Impact

Neutral to positive:

  • Avoids incorrect filter application across aliases
  • Enables more precise lazy loading (only load what each alias needs)
  • Slightly more memory (one subscription per alias vs per collection)
  • Opens door for future optimization: deduplicate identical alias pipelines

Testing

All existing tests pass. The fact that tests passed throughout development confirms:

  • No optimizer actually generates new aliases (validated with logging)
  • All aliases come from user declarations
  • The defensive checks never trigger in practice

Self-join scenarios that previously failed now work correctly.

Files Changed

  • packages/db/src/query/compiler/index.ts - Core compilation changes
  • packages/db/src/query/compiler/joins.ts - Join processing with alias tracking
  • packages/db/src/query/live/collection-config-builder.ts - Per-alias subscription creation
  • packages/db/src/query/live/collection-subscriber.ts - Alias-aware subscriptions
  • packages/db/src/errors.ts - New error classes with better context

Migration Guide

No migration needed. Existing queries work unchanged. Self-joins that were broken now work.

changeset-bot bot commented Oct 1, 2025

🦋 Changeset detected

Latest commit: 48769e0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 12 packages
Name Type
@tanstack/db Patch
@tanstack/angular-db Patch
@tanstack/electric-db-collection Patch
@tanstack/query-db-collection Patch
@tanstack/react-db Patch
@tanstack/rxdb-db-collection Patch
@tanstack/solid-db Patch
@tanstack/svelte-db Patch
@tanstack/trailbase-db-collection Patch
@tanstack/vue-db Patch
todos Patch
@tanstack/db-example-react-todo Patch

@samwillis samwillis changed the title from "change to having a subscription per collection alias rather than collection inside a live query" to "change to a subscription per collection alias rather than collection inside a live query" Oct 1, 2025

pkg-pr-new bot commented Oct 1, 2025

More templates

@tanstack/angular-db

npm i https://pkg.pr.new/@tanstack/angular-db@625

@tanstack/db

npm i https://pkg.pr.new/@tanstack/db@625

@tanstack/db-ivm

npm i https://pkg.pr.new/@tanstack/db-ivm@625

@tanstack/electric-db-collection

npm i https://pkg.pr.new/@tanstack/electric-db-collection@625

@tanstack/query-db-collection

npm i https://pkg.pr.new/@tanstack/query-db-collection@625

@tanstack/react-db

npm i https://pkg.pr.new/@tanstack/react-db@625

@tanstack/rxdb-db-collection

npm i https://pkg.pr.new/@tanstack/rxdb-db-collection@625

@tanstack/solid-db

npm i https://pkg.pr.new/@tanstack/solid-db@625

@tanstack/svelte-db

npm i https://pkg.pr.new/@tanstack/svelte-db@625

@tanstack/trailbase-db-collection

npm i https://pkg.pr.new/@tanstack/trailbase-db-collection@625

@tanstack/vue-db

npm i https://pkg.pr.new/@tanstack/vue-db@625

commit: 48769e0

github-actions bot commented Oct 1, 2025

Size Change: +1.5 kB (+1.94%)

Total Size: 78.4 kB

Filename Size Change
./packages/db/dist/esm/errors.js 3.5 kB +401 B (+12.94%) ⚠️
./packages/db/dist/esm/index.js 1.63 kB +50 B (+3.16%)
./packages/db/dist/esm/query/compiler/index.js 2.19 kB +149 B (+7.3%) 🔍
./packages/db/dist/esm/query/compiler/joins.js 2.63 kB +120 B (+4.79%) 🔍
./packages/db/dist/esm/query/compiler/order-by.js 1.26 kB +45 B (+3.71%)
./packages/db/dist/esm/query/live/collection-config-builder.js 3.87 kB +844 B (+27.88%) 🚨
./packages/db/dist/esm/query/live/collection-subscriber.js 1.81 kB -112 B (-5.84%)
./packages/db/dist/esm/query/optimizer.js 3.26 kB -1 B (-0.03%)
ℹ️ View Unchanged
Filename Size
./packages/db/dist/esm/collection/change-events.js 963 B
./packages/db/dist/esm/collection/changes.js 1.01 kB
./packages/db/dist/esm/collection/events.js 660 B
./packages/db/dist/esm/collection/index.js 3.31 kB
./packages/db/dist/esm/collection/indexes.js 1.16 kB
./packages/db/dist/esm/collection/lifecycle.js 1.8 kB
./packages/db/dist/esm/collection/mutations.js 2.52 kB
./packages/db/dist/esm/collection/state.js 3.79 kB
./packages/db/dist/esm/collection/subscription.js 1.83 kB
./packages/db/dist/esm/collection/sync.js 1.68 kB
./packages/db/dist/esm/deferred.js 230 B
./packages/db/dist/esm/indexes/auto-index.js 794 B
./packages/db/dist/esm/indexes/base-index.js 835 B
./packages/db/dist/esm/indexes/btree-index.js 2 kB
./packages/db/dist/esm/indexes/lazy-index.js 1.21 kB
./packages/db/dist/esm/indexes/reverse-index.js 577 B
./packages/db/dist/esm/local-only.js 967 B
./packages/db/dist/esm/local-storage.js 2.33 kB
./packages/db/dist/esm/optimistic-action.js 294 B
./packages/db/dist/esm/proxy.js 3.86 kB
./packages/db/dist/esm/query/builder/functions.js 615 B
./packages/db/dist/esm/query/builder/index.js 4.04 kB
./packages/db/dist/esm/query/builder/ref-proxy.js 938 B
./packages/db/dist/esm/query/compiler/evaluators.js 1.55 kB
./packages/db/dist/esm/query/compiler/expressions.js 760 B
./packages/db/dist/esm/query/compiler/group-by.js 2.04 kB
./packages/db/dist/esm/query/compiler/select.js 1.28 kB
./packages/db/dist/esm/query/ir.js 785 B
./packages/db/dist/esm/query/live-query-collection.js 340 B
./packages/db/dist/esm/SortedMap.js 1.24 kB
./packages/db/dist/esm/transactions.js 3 kB
./packages/db/dist/esm/utils.js 1.01 kB
./packages/db/dist/esm/utils/browser-polyfills.js 365 B
./packages/db/dist/esm/utils/btree.js 6.01 kB
./packages/db/dist/esm/utils/comparison.js 754 B
./packages/db/dist/esm/utils/index-optimization.js 1.73 kB

github-actions bot commented Oct 1, 2025

Size Change: 0 B

Total Size: 1.46 kB

ℹ️ View Unchanged
Filename Size
./packages/react-db/dist/esm/index.js 152 B
./packages/react-db/dist/esm/useLiveQuery.js 1.31 kB


@samwillis samwillis force-pushed the samwillis/fix-self-join branch from c3e7c93 to 0f7798b Compare October 2, 2025 12:40

@kevin-dp kevin-dp left a comment

I went through the PR and it looks good! It wasn't an easy change, as the subscription and collection ID were spread everywhere; good job refactoring that! Left a few minor comments.

// Track alias remapping for subqueries (outer alias → inner alias)
// e.g., when .join({ activeUser: subquery }) where subquery uses .from({ user: collection })
// we store: aliasRemapping['activeUser'] = 'user'
const aliasRemapping: Record<string, string> = {}
Contributor

Not sure how this works. What if the subquery is also a query with a join?
Does that matter for this alias?

Collaborator Author

The remapping always follows the main path, that is, the subquery's from clause.

id,
autoIndex: `off`,
config: {
autoIndex: `off`,
Contributor

How come we need autoIndex in the top level object and also in the config property?

Collaborator Author

This is a mock of a collection object, and autoIndex is exposed as a prop on the top level as well as in the config prop.

@samwillis samwillis merged commit e52be92 into main Oct 13, 2025
6 checks passed
@samwillis samwillis deleted the samwillis/fix-self-join branch October 13, 2025 11:34
@github-actions github-actions bot mentioned this pull request Oct 13, 2025
@github-actions

🎉 This PR has been released!

Thank you for your contribution!
