@devin-ai-integration
Contributor

@devin-ai-integration devin-ai-integration bot commented Nov 26, 2025

fix(low-code): handle all ScannerError exceptions in ConfigComponentsResolver

Summary

This PR fixes a bug where custom GAQL queries containing tab characters (\t) caused discovery to fail with a YAML ScannerError. The _parse_yaml_if_possible method in ConfigComponentsResolver caught only those ScannerError exceptions whose message reported an unexpected % character and re-raised every other ScannerError.

The fix simplifies the exception handling to catch all ScannerError exceptions and return the original value unchanged, which is consistent with how ParserError is already handled.

Root cause: When a string containing tab characters is passed through yaml.safe_load(), the YAML scanner raises a ScannerError because a tab cannot start a token in YAML. The previous code handled only the % character case.
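
The root cause can be reproduced outside the CDK. The following is a minimal sketch, assuming PyYAML is installed; the helper name scanner_problem and the sample strings are illustrative and do not come from the PR:

```python
from typing import Optional

import yaml
from yaml.scanner import ScannerError


def scanner_problem(text: str) -> Optional[str]:
    """Return the ScannerError problem message for `text`, or None if it loads cleanly."""
    try:
        yaml.safe_load(text)
    except ScannerError as exc:
        # `problem` holds the short description, without the position marks.
        return exc.problem
    return None


# Hypothetical inputs: a tab-prefixed query, a strftime-style format, a plain query.
for sample in ("\tSELECT campaign.id FROM campaign", "%Y-%m-%d", "SELECT campaign.id FROM campaign"):
    print(repr(sample), "->", scanner_problem(sample))
```

The first two samples should report the same problem messages that the handler matches on, while the plain query loads as an ordinary string and reports None.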

Related issue: https://github.com/airbytehq/oncall/issues/10280

Review & Testing Checklist for Human

  • Verify broader exception catching is safe: The change catches ALL ScannerError exceptions now instead of just the % character case. Confirm this won't mask legitimate YAML errors that should be surfaced to users.
  • Run the full test suite for test_config_components_resolver.py to ensure no regressions
  • Consider edge cases: Are there other ScannerError scenarios that should be tested beyond % and tab characters?

Recommended test plan:

  1. Run pytest unit_tests/sources/declarative/resolvers/test_config_components_resolver.py -v
  2. Optionally test with a Google Ads connector config containing tab-indented custom GAQL queries

Notes

…Resolver

Previously, the _parse_yaml_if_possible method only caught ScannerError
exceptions containing '%' characters, but re-raised other ScannerError
types. This caused failures when strings containing tab characters were
passed through the YAML parser, as tabs cannot start tokens in YAML.

This fix catches all ScannerError exceptions and returns the original
value unchanged, which is the expected behavior for strings that are
not valid YAML.

Fixes: airbytehq/oncall#10280
Co-Authored-By: unknown <>
@devin-ai-integration
Contributor Author

Original prompt from API User
Issue #10280 by @vikram661: Google Ads: Custom Streams discovery failing

Issue URL: https://github.com/airbytehq/oncall/issues/10280

Please use playbook macro: !issue_triage

PLAYBOOK_md:
# `/ai-triage` Slash Command Playbook

You are AI Triage Devin, an expert at analyzing Airbyte-related issues and providing actionable insights. You are responding to a GitHub slash command request. After reading the provided context, you should post a comment to confirm you understand the request and stating what your next steps will be, along with a link to your session. Once your triage and analysis is complete, update your comment with the full results of your triage. Collapse all of your comments under expandable sections.

IMPORTANT: Expect that your user has no access to the session and cannot talk with you directly. Do not wait for feedback or confirmation on any action.

## Context

You are analyzing the issue provided to you above. You will need to pull comment history on this issue to ensure you have full context.

## Your Task: Static Analysis and Triage

1. **Issue Analysis and Confirmation**: Read the complete issue content including all comments for full context.
   - **Post an initial comment immediately** (within 1-2 minutes) to confirm you understand the assignment and that you are looking into it. Include your session URL.
   - If you are missing any critical information or context (e.g., workspace UUID, connector version, error logs, reproduction steps, customer environment details), include in your initial comment a request for additional context. (Do not block waiting for an answer, but instead continue as if you will not get any more information in your current session.)

2. **Research**: Check the internet for similar errors, symptoms, or issues reported by the community. Look for:
   - Similar error messages or stack traces in Airbyte documentation.
   - Known issues in Airbyte GitHub repositories.
   - Community discussions about related problems.
  ... (8690 chars truncated...)

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions bot added the bug Something isn't working label Nov 26, 2025
@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@devin/1764137058-fix-scanner-error-tab-chars#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch devin/1764137058-fix-scanner-error-tab-chars

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /prerelease - Triggers a prerelease publish with default arguments
  • /poe build - Regenerates git-committed build artifacts, such as the pydantic models generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

Co-Authored-By: unknown <>
@github-actions

github-actions bot commented Nov 26, 2025

PyTest Results (Fast)

3 814 tests  +1   3 802 ✅ +1   6m 11s ⏱️ -20s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 6c4ca2c. ± Comparison against base commit 80b7668.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Nov 26, 2025

PyTest Results (Full)

3 817 tests  +1   3 805 ✅ +1   10m 50s ⏱️ -6s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 6c4ca2c. ± Comparison against base commit 80b7668.

♻️ This comment has been updated with latest results.

Per reviewer feedback, update the fix to specifically handle the tab
character error case rather than catching all ScannerError exceptions.
This maintains the existing pattern of handling specific error cases.

Co-Authored-By: unknown <>
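
The narrowed handling described in this commit can be sketched as a standalone function. This is an illustration under stated assumptions, not the CDK's actual method: the function name, the module-level tuple, and the pass-through for non-string values are hypothetical, while the two message fragments are the ones quoted in the diff under review. It assumes PyYAML is installed.

```python
from typing import Any

import yaml
from yaml.parser import ParserError
from yaml.scanner import ScannerError

# Message fragments quoted from the diff in this PR.
TOLERATED_SCANNER_PROBLEMS = (
    "expected alphabetic or numeric character, but found '%'",
    "found character '\\t' that cannot start any token",
)


def parse_yaml_if_possible(value: Any) -> Any:
    """Parse a string as YAML; return it unchanged for known non-YAML strings."""
    if not isinstance(value, str):  # assumption: non-strings pass through untouched
        return value
    try:
        return yaml.safe_load(value)
    except ParserError:  # not valid YAML: keep the raw string
        return value
    except ScannerError as e:
        # Only the two known cases are tolerated; anything else still surfaces.
        if any(fragment in str(e) for fragment in TOLERATED_SCANNER_PROBLEMS):
            return value
        raise


print(repr(parse_yaml_if_possible("\tSELECT campaign.id FROM campaign")))
print(parse_yaml_if_possible("a: 1"))
```

Keeping the tolerated fragments in one tuple means a further case needs a one-line change, though the check remains tied to PyYAML's exact error wording.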
@tolik0 tolik0 marked this pull request as ready for review November 26, 2025 16:50
Copilot AI review requested due to automatic review settings November 26, 2025 16:50
Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a bug in the ConfigComponentsResolver where YAML ScannerError exceptions caused by tab characters in configuration values (such as custom GAQL queries) would cause discovery to fail. The fix extends the exception handling in _parse_yaml_if_possible to catch tab-related ScannerError exceptions in addition to the existing % character case.

Key changes:

  • Extended ScannerError exception handling to include tab character errors
  • Added test case for tab character scenario
  • Updated inline comment to reflect broader coverage

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
airbyte_cdk/sources/declarative/resolvers/config_components_resolver.py | Added handling for tab character ScannerError with additional condition check
unit_tests/sources/declarative/resolvers/test_config_components_resolver.py | Added test manifest and test case for tab character scenario

Comment on lines +207 to 212
except ScannerError as e:  # "%Y-%m-%d" or strings with tabs - not valid YAML
    if "expected alphabetic or numeric character, but found '%'" in str(e):
        return value
    if "found character '\\t' that cannot start any token" in str(e):
        return value
    raise e

Copilot AI Nov 26, 2025

The pattern of checking specific error message strings and returning value for each case suggests a fragile approach. According to the PR description's root cause analysis, the intent is to catch ALL ScannerError exceptions since they indicate values that cannot be parsed as YAML. The current implementation still only catches two specific cases. Consider simplifying to except ScannerError: return value which would be more maintainable and align with the PR's stated goal of 'catch all ScannerError exceptions'.

Suggested change
- except ScannerError as e:  # "%Y-%m-%d" or strings with tabs - not valid YAML
-     if "expected alphabetic or numeric character, but found '%'" in str(e):
-         return value
-     if "found character '\\t' that cannot start any token" in str(e):
-         return value
-     raise e
+ except ScannerError:  # any ScannerError means not valid YAML
+     return value
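
To weigh this suggestion against the concern about masking legitimate YAML errors, it may help to see what a catch-all handler would swallow. This is a minimal sketch assuming PyYAML is installed; the function name parse_yaml_catch_all and both sample inputs are hypothetical:

```python
import yaml
from yaml.parser import ParserError
from yaml.scanner import ScannerError


def parse_yaml_catch_all(value: str):
    """Broad variant: any scanner or parser failure yields the raw string."""
    try:
        return yaml.safe_load(value)
    except (ParserError, ScannerError):
        return value


# Hypothetical inputs that fail scanning for reasons other than '%' or a tab.
for sample in ("@handle", "'unterminated quote"):
    try:
        yaml.safe_load(sample)
    except ScannerError as exc:
        print(f"{sample!r} fails with: {exc.problem}")
    print("catch-all returns:", repr(parse_yaml_catch_all(sample)))
```

With the message-specific checks, both samples would still raise. With the catch-all, they come back as plain strings. Which behavior is preferable depends on whether such values are more likely to be literal text or mistyped YAML.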

@tolik0 changed the title from "fix(low-code): handle all ScannerError exceptions in ConfigComponentsResolver" to "(do not merge) fix(low-code): handle all ScannerError exceptions in ConfigComponentsResolver" on Nov 26, 2025