Skip to content

Conversation

@gkevinzheng
Copy link
Contributor

Additional system tests for testing shim compatibility with the classic client interface.

Also fixes a bug where encountering a retriable error after another retriable error mid-stream raised an exception instead of retrying.

@gkevinzheng gkevinzheng requested review from a team as code owners October 14, 2025 18:50
@product-auto-label product-auto-label bot added the size: l Pull request size is large. label Oct 14, 2025
@product-auto-label product-auto-label bot added the api: bigtable Issues related to the googleapis/python-bigtable API. label Oct 14, 2025
@gkevinzheng gkevinzheng changed the base branch from main to v3_staging October 15, 2025 14:40
Copy link
Contributor

@daniel-sanche daniel-sanche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple comments for now, but I'll look at the tests in more detail soon

env_vars: {
key: "NOX_SESSION"
value: "system-3.12"
} No newline at end of file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled in main, so this shouldn't be necessary anymore

self._row_merger = _RowMerger(self._row_merger.last_seen_row_key)
self.response_iterator = self.read_method(retry_request)
self.response_iterator = self.read_method(retry_request, retry=self.retry)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this change needed?

Copy link
Contributor Author

@gkevinzheng gkevinzheng Oct 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was needed because the read_rows method that gets passed in gets called with the GAPIC default retry instead of the retry that the user passed in or DEFAULT_READ_RETRY for this instance. If there's an error with this specific instance of self.read_method it doesn't get retried and errors out.

For example, if a retriable InternalError occurred mid-stream, then in this on_error handler the same error occurred, the error in the on_error handler would not be retried because the specific retry was not passed in for that RPC call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit torn, because this does feel like a very real bug. But the code here is also very complex, so I'm hesitant to make changes to the retry policy in case there are side-effects we're not considering

But I guess this is coming in the context of adding more system tests. Do you think the tests we have now are sufficient to catch any errors?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, on second thought, this is a PR into the v3_staging branch, not main. We will be doing much more extensive changes here as part of the shim, before cutting a release. So I think adding this fix makes sense

class SelectiveMethodsErrorInjector(UnaryStreamClientInterceptor):
def __init__(self):
# As long as there are more items on this list, the items on the list
# are as follows:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume after the list is exhausted, it always behaves as normal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should. I'll put a comment for that.

data_table, rows_to_delete, row_keys, columns, CELL_VAL_READ_ROWS_RETRY
)

yield data_table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this re-populating the table before each test function? Is that necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me refactor the read_rows system tests to reconfigure the error injector instead of re-populating the table before each test function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by the organization here. Why are some tests in a class, and others stand-alone? Do we need a separate fixture to populate the table? Do we really want to re-populate the table more than once per run?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe take a look at the metrics tests, and see if you can structure things that way. Instead of building a new data_table_read_rows_with_error_injector for each function, you can build it once and clear its state between functions



def _populate_table(data_table, rows_to_delete, row_keys):
def _add_test_error_handler(retry):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a docstring could be helpful here

It looks like it overrides the on_error function to assert that backoff values are within expected bounds?

self._row_merger = _RowMerger(self._row_merger.last_seen_row_key)
self.response_iterator = self.read_method(retry_request)
self.response_iterator = self.read_method(retry_request, retry=self.retry)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit torn, because this does feel like a very real bug. But the code here is also very complex, so I'm hesitant to make changes to the retry policy in case there are side-effects we're not considering

But I guess this is coming in the context of adding more system tests. Do you think the tests we have now are sufficient to catch any errors?

self._row_merger = _RowMerger(self._row_merger.last_seen_row_key)
self.response_iterator = self.read_method(retry_request)
self.response_iterator = self.read_method(retry_request, retry=self.retry)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, on second thought, this is a PR into the v3_staging branch, not main. We will be doing much more extensive changes here as part of the shim, before cutting a release. So I think adding this fix makes sense

data_table, rows_to_delete, row_keys, columns, CELL_VAL_READ_ROWS_RETRY
)

yield data_table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by the organization here. Why are some tests in a class, and others stand-alone? Do we need a separate fixture to populate the table? Do we really want to re-populate the table more than once per run?

for row in rows_to_delete:
row.clear()
row.delete()
row.commit()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should use a try...finally block to make sure resources are fully deleted, anywhere you spin up new resources. (Especially for clusters and tables)

data_table, rows_to_delete, row_keys, columns, CELL_VAL_READ_ROWS_RETRY
)

yield data_table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe take a look at the metrics tests, and see if you can structure things that way. Instead of building a new data_table_read_rows_with_error_injector for each function, you can build it once and clear its state between functions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: bigtable Issues related to the googleapis/python-bigtable API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants