@snuyanzin snuyanzin commented Nov 28, 2025

What is the purpose of the change

This PR fixes a StackOverflowError that occurs during validation of queries like

SELECT coalesce(SELECT invalid);

Brief change log

How the StackOverflowError arises:

  1. Validation reaches SqlValidatorImpl#validateNamespace.
  2. Deeper in the stack, it reaches SqlValidatorImpl#inferUnknownTypes.
  3. For functions (here, coalesce), this invokes Flink's TypeInferenceOperandInference#inferOperandTypes.
  4. Further down, it fails in DelegatingScope#fullyQualify with "Column 'invalid' not found in any table".
  5. The problem is that this failure is swallowed in step 3.
  6. So, instead of failing the whole validation, it continues endlessly, failing and swallowing the error each time.
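
The difference between swallowing every failure and letting the "column not found" style error propagate can be illustrated with a minimal, self-contained Java sketch. All names below (PositionalError, InferenceError, OperandInferenceSketch and its methods) are hypothetical stand-ins, not the actual Calcite or Flink classes, and the sketch does not reproduce the real validator recursion; it only shows the general shape of the handling.

```java
/** Hypothetical stand-in for an error tied to a position in the query, e.g. an unknown column. */
class PositionalError extends RuntimeException {
    PositionalError(String message) {
        super(message);
    }
}

/** Hypothetical stand-in for a recoverable failure during operand type inference. */
class InferenceError extends RuntimeException {
    InferenceError(String message) {
        super(message);
    }
}

class OperandInferenceSketch {

    /** Swallows every failure, so the caller never learns that validation cannot succeed. */
    static boolean inferSwallowingAll(Runnable inference) {
        try {
            inference.run();
            return true;
        } catch (RuntimeException e) {
            return false; // the error is lost here
        }
    }

    /** Swallows only inference failures; positional errors abort the whole validation. */
    static boolean inferRethrowingPositional(Runnable inference) {
        try {
            inference.run();
            return true;
        } catch (InferenceError e) {
            return false; // leave it to a later operand check to report this
        } catch (PositionalError e) {
            // Explicit for clarity: propagate so that validation fails immediately.
            throw e;
        }
    }
}
```

In this sketch, inferSwallowingAll hides an unknown-column error from the caller, while inferRethrowingPositional surfaces it and still tolerates ordinary inference failures.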

Verifying this change

Tests have been added.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

flinkbot commented Nov 28, 2025

CI report:


snuyanzin changed the title from "[FLINK-38750][table] Validation of queries with functions erroneously invoked under select fails with StackOverflow" to "<!-- [FLINK-38750][table] Validation of queries with functions erroneously invoked under select fails with StackOverflow" on Nov 28, 2025
snuyanzin changed the title from "[FLINK-38750][table] Validation of queries with functions erroneously invoked under select fails with StackOverflow" to "[FLINK-38750][table] Validation of queries with functions erroneously invoked under SELECT fails with StackOverflow" on Nov 28, 2025
Comment on lines +83 to 84
} catch (ValidationException e) {
    // let operand checker fail
@snuyanzin snuyanzin Nov 29, 2025

Rethrowing here as a CalciteContextException wrapped in a ValidationException will not help in complex cases and will again lead to a StackOverflowError: the exception is first rethrown as a ValidationException and then swallowed.

For that reason, testNestedCoalesceOnInvalidField covers this case with the query

SELECT coalesce(SELECT coalesce(SELECT coalesce(invalid)))

} catch (CalciteContextException e) {
    throw e;
} catch (Throwable t) {
    if (t.getCause() instanceof CalciteContextException) {
@davidradl davidradl Dec 1, 2025

I am curious: do we need to check the whole chain of causes for a match, rather than only the first cause? I am not sure whether we can get into a situation where the cause of the cause is a CalciteContextException or the like. Maybe using this code
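
The code the reviewer pointed to is not preserved in this conversation. As a hypothetical illustration of the idea only, walking the full cause chain instead of checking just the first cause could look like the following Java sketch. The class CauseChain and the helper hasCauseOfType are assumptions made for this example and are not part of Flink or Calcite.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class CauseChain {

    /**
     * Returns true if the given throwable, or any throwable in its cause chain,
     * is an instance of the given type. Note that, unlike a check on
     * t.getCause() alone, this also considers t itself.
     */
    static boolean hasCauseOfType(Throwable t, Class<? extends Throwable> type) {
        // Track visited throwables by identity to guard against cyclic cause chains.
        Set<Throwable> seen =
                Collections.newSetFromMap(new IdentityHashMap<Throwable, Boolean>());
        for (Throwable current = t;
                current != null && seen.add(current);
                current = current.getCause()) {
            if (type.isInstance(current)) {
                return true;
            }
        }
        return false;
    }
}
```

With such a helper, a check like t.getCause() instanceof CalciteContextException would become hasCauseOfType(t.getCause(), CalciteContextException.class), which also matches when the exception sits deeper in the chain.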

@snuyanzin snuyanzin Dec 1, 2025

So far I'm not aware of such cases, and I cannot think of a query that would let me verify that it helps.
However, even if such cases exist, the only thing that might go wrong is a less user-friendly error message, since the user-friendly message would be inside the CalciteContextException.

So it would not result in a StackOverflowError.

Contributor Author

After looking at this once more, I was not able to find a test that enters this code,
so I removed these lines; the PR should be a bit shorter now.
Thanks for drawing attention to these lines.

Labels

community-reviewed PR has been reviewed by the community.
