Multi-relation support, a pre-requisite for views and sub-queries #136780
Pinging @elastic/es-analytical-engine (Team:Analytics)
aa533ad to 1fc334e
This is extracted from the views prototype, which also includes necessary support for sub-queries.
1fc334e to 9352610
The changes to Analyzer, AnalyzerContext and PreAnalyzer make sense to me. I applied the changes to the subquery PR, and subqueries can be resolved properly, thank you @craigtaverner!
The changes to EsqlSession and EsqlCCSUtils also look fine to me.
; | ||
|
||
country:text |language_name:keyword |MAX(language_code):integer |language_code:integer | ||
country:keyword |language_name:keyword |MAX(language_code):integer |language_code:integer |
I am curious: why is this change happening in this pull request?
} | ||
} | ||
|
||
static void handleFieldCapsFailures( |
Not sure I get this part - why do we need two functions?
} | ||
|
||
private LogicalPlan resolveInsist(Insist insist, List<Attribute> childrenOutput, IndexResolution indexResolution) { | ||
private LogicalPlan resolveInsist(Insist insist, List<Attribute> childrenOutput, Collection<IndexResolution> indexResolution) { |
Maybe we should use plural here if we're passing a collection now?
|
||
// Field is partially unmapped. | ||
if (resolvedCol instanceof FieldAttribute fa && indexResolution.get().isPartiallyUnmappedField(fa.name())) { | ||
// TODO: Should the check for partially unmapped fields be done specific to each sub-query in a fork? |
I'm not sure I understand this part correctly. If multiple resolutions come with subqueries, aren't they scoped within the subqueries? How does it work: if I have a mapping in an index in one subquery, I can't use it in another subquery, can I? What about propagating mappings up/down the subquery tree?
ignoreOrder:true | ||
language_code:integer | country:text | language_name:keyword | ||
2 | [Austria, Germany] | German | ||
2 | [Germany, Austria] | German |
Why the order change?
public record PreAnalysis( | ||
IndexMode indexMode, | ||
IndexPattern indexPattern, | ||
Map<IndexPattern, IndexMode> indexes, |
I think Elastic code uses "indices" not "indexes"?
EsqlFunctionRegistry functionRegistry, | ||
IndexResolution indexResolution, | ||
Map<IndexPattern, IndexResolution> indexResolution, | ||
Map<String, IndexResolution> lookupResolution, |
I think a comment here explaining why we key one map by String and the other by IndexPattern would help. I think I understand why (lookup only uses a single index, while the general case can have arbitrary expressions), but it took me some work to figure it out.
indexMode.set(p.indexMode()); | ||
indexPattern.set(p.indexPattern()); | ||
} else if (indexes.containsKey(p.indexPattern()) == false || indexes.get(p.indexPattern()) == p.indexMode()) { | ||
indexes.put(p.indexPattern(), p.indexMode()); |
Not sure I get why we do the put if we already have exactly the same element. Can't we simplify this as:
} else {
    IndexMode previous = indexes.put(p.indexPattern(), p.indexMode());
    if (previous != null && previous != p.indexMode()) {
        throw new IllegalStateException(
            "index pattern '" + p.indexPattern() + "' found with different index mode: " + previous + " != " + p.indexMode()
        );
    }
}
Also, when exactly can such a situation happen anyway? And why is lookup mode excluded from the check?
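For readers following the suggestion, here is a minimal, self-contained sketch of the put-then-compare idiom. It relies only on the fact that java.util.Map.put returns the previous value (or null). The class IndexModeRegistry, its Mode enum and the use of a String key are hypothetical stand-ins for illustration, not the actual PreAnalyzer, IndexMode or IndexPattern types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the real PreAnalyzer code.
public class IndexModeRegistry {
    // Stand-in for Elasticsearch's IndexMode enum (assumed values).
    public enum Mode { STANDARD, TIME_SERIES, LOOKUP }

    // A String key stands in for IndexPattern to keep the sketch self-contained.
    private final Map<String, Mode> indices = new HashMap<>();

    // Registers a pattern; fails if it was already seen with a different mode.
    public void register(String pattern, Mode mode) {
        // Map.put returns the value previously associated with the key, or null.
        Mode previous = indices.put(pattern, mode);
        if (previous != null && previous != mode) {
            throw new IllegalStateException(
                "index pattern '" + pattern + "' found with different index mode: " + previous + " != " + mode
            );
        }
    }

    public int size() {
        return indices.size();
    }
}
```

One behavioural difference worth noting: with this form the map has already been overwritten by the time the exception is thrown, which only matters if the caller catches the exception and keeps using the map.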
Holder<Boolean> supportsDenseVector = new Holder<>(false); | ||
indexes.forEach((ip, mode) -> { | ||
if (mode == IndexMode.TIME_SERIES) { | ||
supportsAggregateMetricDouble.set(true); |
Here I am not sure how it's supposed to work with subqueries. Should this be enabled for the whole query, or only for the subquery that this particular pattern belongs to? Maybe this is out of scope for this patch, just trying to figure out the concept.
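To make the question concrete, the sketch below shows what a query-wide reading of the snippet would amount to: the flag is switched on as soon as any pattern in the query is in time-series mode, regardless of which subquery it came from. The class FeatureFlags, its Mode enum and the String keys are hypothetical, and this is an assumption about the intended semantics rather than a statement of how the PR behaves.

```java
import java.util.Map;

// Hypothetical illustration of a query-wide flag; not the real EsqlSession code.
public class FeatureFlags {
    // Stand-in for Elasticsearch's IndexMode enum (assumed values).
    public enum Mode { STANDARD, TIME_SERIES, LOOKUP }

    // True if at least one index pattern in the whole query is in time-series mode.
    public static boolean anyTimeSeries(Map<String, Mode> indices) {
        return indices.values().stream().anyMatch(mode -> mode == Mode.TIME_SERIES);
    }
}
```

A per-subquery reading would instead need the flag to be tracked next to each pattern rather than collapsed into a single boolean.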
@@ -0,0 +1,127 @@ | |||
/* |
A bit confused about how UnionAll relates to the rest of the patch. It's here but nothing else is mentioning or using it. Am I missing something in the picture?
This is extracted from the views prototype at #134995. It provides the underlying support for multiple FROM ... commands within the same ES|QL query, something that both non-materialized views and subqueries require.
A key feature of this work is that all UnresolvedRelation instances are treated equivalently, which means they all support CCS and CPS just as before. This implies that sub-queries using this will support CCS/CPS directly, and for views it means:
For reviewing this PR, it should be noted that most of the changes are in tests. Those tests had mistakes that went unnoticed because they only ever expected a single IndexPattern, and so could get away with mocking a single IndexResolution even if it was for a different index pattern than the one expressed in the test. The mismatch in index name between FROM and IndexResolution had no consequences. This is no longer possible, since we need the IndexPattern to differentiate between different FROM commands. If you want to focus on the functional changes in this PR, look mostly at EsqlSession, AnalyzerContext, Analyzer and EsqlCCSUtils.
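As a rough illustration of why the test mocks had to change, the sketch below keys resolutions by pattern, so a resolution registered for one pattern is no longer found when a FROM command names another. ResolutionLookup, Pattern and Resolution are hypothetical names standing in for the map of IndexPattern to IndexResolution seen in the diff; the real classes carry far more state.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not the real AnalyzerContext.
public class ResolutionLookup {
    // Simplified stand-ins for IndexPattern and IndexResolution.
    public record Pattern(String expression) {}
    public record Resolution(String resolvedIndex) {}

    private final Map<Pattern, Resolution> byPattern = new LinkedHashMap<>();

    public void add(Pattern pattern, Resolution resolution) {
        byPattern.put(pattern, resolution);
    }

    // Each FROM command looks up its own pattern; a mismatched mock no longer resolves.
    public Optional<Resolution> find(Pattern pattern) {
        return Optional.ofNullable(byPattern.get(pattern));
    }
}
```

With a single shared resolution, a test that wrote one index name in FROM and mocked another would still pass; with a keyed lookup like this, the second lookup comes back empty.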