Fix not claiming cluster handler for some quirks v2 entities #548
Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## dev #548 +/- ##
=======================================
Coverage 96.94% 96.95%
=======================================
Files 63 63
Lines 10518 10525 +7
=======================================
+ Hits 10197 10204 +7
Misses 321 321
Looks good to me! I've left a few comments.
Long-term, I would like to explore a way to get rid of cluster handlers. Maybe we can have class-based entities (internal to ZHA) provide a nicer way to set up binding and reporting config.
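A loose sketch of what such class-based entities could look like — all names here (ReportingConfig, collect_cluster_config, the bind/reporting class attributes) are hypothetical, invented for illustration, not ZHA API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportingConfig:
    """Declarative reporting parameters for one attribute (illustrative)."""

    attribute: str
    min_interval: int = 30
    max_interval: int = 900
    reportable_change: int = 1


class Entity:
    """Hypothetical base class: subclasses declare binding/reporting needs."""

    bind: bool = False
    reporting: tuple = ()


class TemperatureEntity(Entity):
    """Example subclass that needs binding and one reporting config."""

    bind = True
    reporting = (ReportingConfig("measured_value", reportable_change=25),)


def collect_cluster_config(entities):
    """Merge per-entity declarations into one cluster-level config."""
    bind = any(e.bind for e in entities)
    reporting = [cfg for e in entities for cfg in e.reporting]
    return bind, reporting


bind, reporting = collect_cluster_config([TemperatureEntity(), Entity()])
print(bind, [cfg.attribute for cfg in reporting])
```

With something like this, binding and reporting setup could be derived from the entity classes themselves instead of living in separate cluster handlers.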
.command_button(
    FakeManufacturerCluster.ServerCommandDefs.self_test.name,
    OppleCluster.cluster_id,
    command_args=(5,),
Tiny nitpick:
-    command_args=(5,),
+    command_kwargs={"identify_time": 5},
Good catch. Copied that from some other test, but I'll change that.
We'll probably want to adjust this in a lot of existing tests as well at a later date, including (completely) using kwargs for creating quirks v2 entities.
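To illustrate the readability difference between the two call styles, here's a minimal stand-in dispatcher — send_command and self_test are illustrative, not the zigpy API:

```python
def send_command(command, command_args=(), command_kwargs=None):
    """Invoke `command` with positional and/or keyword arguments."""
    return command(*command_args, **(command_kwargs or {}))


def self_test(identify_time: int = 0) -> str:
    """Stand-in for a ZCL command handler; returns a trace of its arguments."""
    return f"self_test(identify_time={identify_time})"


# Positional: the reader has to know the command signature to interpret the 5.
positional = send_command(self_test, command_args=(5,))

# Keyword: the intent is explicit at the call site.
keyword = send_command(self_test, command_kwargs={"identify_time": 5})

assert positional == keyword
print(keyword)
```

Both calls are equivalent, but the kwargs form documents what the argument means without a trip to the command definition.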
FakeManufacturerCluster.ServerCommandDefs.self_test.name should also be changed to OppleCluster[...]. It's the same command name and we never execute this, since we just need it for testing the quirks v2 metadata processing in the device discovery, but it should still be adjusted.
    .add_to_registry()
)

zigpy_device = create_mock_zigpy_device(
Another nitpick, feel free to ignore if it's too much trouble. Is there any way to test using JSON diagnostics from a real device? I'd like to phase out use of create_mock_zigpy_device and other synthetic testing setups in favor of using real devices.
Hmm, I explicitly used create_mock_zigpy_device, as that's what the similar tests in test_sensor use, and especially here, we really want to make sure that we get this exact device with that exact cluster setup.
I feel like we always run a bit of a risk with using JSON diagnostics in tests, as it's not clear how endpoints and clusters are laid out just by looking at the tests. You have to look at the (huge) JSON file. It's also harder to get a device set up exactly the way you want it, without adding unnecessary endpoints/clusters/quirk interference.
Some parts of the device/diagnostics can change over time as well (e.g. ZHA or a quirk doing something different, a firmware update, ...), which we may not want for all tests the diagnostics are used in.
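As a toy illustration of that tradeoff (none of this is ZHA test code): an explicit spec keeps the endpoint/cluster layout visible right in the test, while the same information in a diagnostics-style JSON blob has to be parsed back out before you can see what the fixture actually contains:

```python
import json

# Explicit spec: endpoints and clusters are visible right in the test.
EXPLICIT_SPEC = {
    1: {"in_clusters": [0x0000, 0xFCC0], "out_clusters": [0x0019]},
}

# Diagnostics-style JSON: the same information, but buried in a blob that
# must be read separately to understand the test fixture.
DIAGNOSTICS_JSON = json.dumps(
    {
        "endpoints": {
            "1": {"in_clusters": ["0x0000", "0xfcc0"], "out_clusters": ["0x0019"]}
        }
    }
)


def spec_from_diagnostics(raw: str) -> dict:
    """Recover an explicit spec from diagnostics JSON (illustrative only)."""
    data = json.loads(raw)
    return {
        int(ep_id): {
            "in_clusters": [int(c, 16) for c in ep["in_clusters"]],
            "out_clusters": [int(c, 16) for c in ep["out_clusters"]],
        }
        for ep_id, ep in data["endpoints"].items()
    }


assert spec_from_diagnostics(DIAGNOSTICS_JSON) == EXPLICIT_SPEC
```

Real diagnostics files carry far more than this, which is exactly why the layout is hard to see at a glance in the test itself.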
    [cluster_handler.name],
)

# if the cluster handler is unclaimed, claim it and set BIND accordingly,
I wonder: should we move this to happen before the yield? That way, the cluster handler is fully set up before the entity initialization happens.
We can't really do that in a nice way, since we need to iterate over all quirks v2 entity metadata for a specific cluster handler before we can make the final decision on whether we can set BIND to False.
This is all still done "pre-initialization", since the entities are just added to Device._pending_entities in discovery here:
Lines 939 to 950 in b172549

new_entities = discovery.DEVICE_PROBE.discover_device_entities(self)
# Discover all applicable entities
for entity in new_entities:
    if self._is_entity_removed_by_quirk(entity):
        continue
    # Apply any metadata changes from quirks v2
    self._apply_entity_metadata_changes(entity)
    entity.on_add()
    self._pending_entities.append(entity)
Nothing in entity.on_add() can ever rely on the cluster handler being claimed or not. The actual cluster handler initialization is done way later, where it then matters whether a cluster handler is claimed. I think it's fine to keep it like this, tbh.
self._discover_new_entities() does everything we do here. Only afterwards is endpoint.async_initialize called:
Lines 956 to 969 in b172549

self._discover_new_entities()
await self._zdo_handler.async_initialize(from_cache)
self._zdo_handler.debug("'async_initialize' stage succeeded")
# We intentionally do not use `gather` here! This is so that if, for example,
# three `device.async_initialize()`s are spawned, only three concurrent requests
# will ever be in flight at once. Startup concurrency is managed at the device
# level.
for endpoint in self._endpoints.values():
    try:
        await endpoint.async_initialize(from_cache)
    except Exception:  # pylint: disable=broad-exception-caught
        self.debug("Failed to initialize endpoint", exc_info=True)
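The ordering described here — discovery (including claiming) strictly before endpoint initialization — can be sketched as a standalone simulation; the names are illustrative, not ZHA code:

```python
import asyncio

log: list = []


class Endpoint:
    """Simplified endpoint: claiming is decided before initialization runs."""

    def __init__(self) -> None:
        self.claimed = False

    def discover_entities(self) -> None:
        # Discovery claims the cluster handler; nothing here depends on
        # initialization having run yet.
        self.claimed = True
        log.append("discover")

    async def async_initialize(self) -> None:
        # Initialization is where claimed vs. unclaimed matters: only claimed
        # handlers get reporting configured and attributes read.
        log.append(f"initialize(claimed={self.claimed})")


async def main() -> None:
    endpoint = Endpoint()
    endpoint.discover_entities()  # pre-initialization, like _discover_new_entities()
    await endpoint.async_initialize()


asyncio.run(main())
print(log)
```

Because discovery completes first, initialization always observes the final claimed/BIND decision.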
# if the cluster handler is unclaimed, claim it and set BIND accordingly,
# so ZHA configures the cluster handler: reporting + reads attributes
if (attribute_initialization_found or reporting_found) and (
    cluster_handler not in endpoint.claimed_cluster_handlers.values()
Is there any way for the cluster handler to already be claimed?
The cluster handler can already be claimed by entities from EndpointProbe. Since those ZHA entities may rely on the default of BIND = True, I've added the cluster_handler not in endpoint.claimed_cluster_handlers.values() check to make sure we only set BIND = False for cluster handlers claimed exclusively for quirks v2 entities.
This is also checked by the tests.
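A simplified, standalone model of that guard — the names mirror the PR, but the logic is a sketch, including the assumption that BIND ends up following whether reporting is configured:

```python
def decide_bind(
    claimed_cluster_handlers,
    cluster_handler,
    attribute_initialization_found,
    reporting_found,
):
    """Return a BIND override, or None to leave the handler untouched.

    Simplified stand-in for the PR's guard (not the real ZHA code): a handler
    already claimed elsewhere (e.g. by an EndpointProbe entity that may rely
    on the default BIND = True) is left alone; otherwise it is claimed for
    the quirks v2 entity and BIND is set based on whether reporting is used.
    """
    if cluster_handler in claimed_cluster_handlers.values():
        return None  # claimed by a regular ZHA entity: keep the default BIND
    if attribute_initialization_found or reporting_found:
        # Claim the handler so ZHA configures it during initialization.
        claimed_cluster_handlers[len(claimed_cluster_handlers)] = cluster_handler
        return reporting_found  # attribute-read-only handlers need no binding
    return None


# Unclaimed handler with only an attribute read: claimed, BIND = False.
claimed = {}
assert decide_bind(claimed, "opple", True, False) is False
assert "opple" in claimed.values()

# Handler already claimed by an EndpointProbe entity: left untouched.
assert decide_bind({0: "opple"}, "opple", True, False) is None
```

The exclusivity check is what prevents the quirks v2 path from flipping BIND off under an EndpointProbe entity that depends on it.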
Yeah, I think they're not too bad, but they could definitely use some improvement (or possibly be removed entirely). For quirks v2, we now automatically handle setting up cluster handlers under the hood (claiming + attribute initialization + reporting), so we can definitely change how we do that without breaking the quirks v2 API.
Proposed change
This fixes an issue where ZHA doesn't claim the cluster handler for quirks v2 entities that only require the specified attribute to be read during device initialization (pairing), but not bound. The cluster handler was then never configured and the required attributes were never read, leaving affected quirks v2 entities in an "unknown" state after initial pairing.
Two new tests are added: (1) that the cluster handler is claimed in the above-mentioned scenario, where no binding is configured, and (2) that setting up a quirks v2 "command button" entity doesn't claim the cluster handler unnecessarily.
Note
We currently claim cluster handlers for all quirks v2 metadata that contains
attribute_name, which I believe basically all do, except for command buttons. We don't really need to claim a cluster handler and read the attribute for "write attribute buttons", though. We may wanna change this behavior in this PR.

Additional information

For more information on the underlying issues this fixes, see: zigpy/backlog#52
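A hypothetical sketch of how such a claiming predicate could look — EntityMetadata and needs_cluster_handler_claim are invented for illustration, not the quirks v2 metadata model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityMetadata:
    """Minimal stand-in for quirks v2 entity metadata (illustrative)."""

    kind: str  # e.g. "sensor", "command_button", "write_attr_button"
    attribute_name: Optional[str] = None


def needs_cluster_handler_claim(meta: EntityMetadata) -> bool:
    """Hypothetical predicate: claim + read only when the entity's state
    actually comes from an attribute, not when it merely writes one."""
    if meta.attribute_name is None:
        return False  # e.g. command buttons carry no attribute at all
    if meta.kind == "write_attr_button":
        return False  # writes an attribute but never needs to read it
    return True


assert needs_cluster_handler_claim(EntityMetadata("sensor", "present_value"))
assert not needs_cluster_handler_claim(EntityMetadata("command_button"))
assert not needs_cluster_handler_claim(EntityMetadata("write_attr_button", "mode"))
```

Something along these lines would let write-attribute buttons skip the claim + attribute read that the current attribute_name-based rule triggers.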
Note on tests
The tests may be a bit misplaced in test_sensor and test_button. They do (need to) utilize quirks v2 entities, but it's really the underlying (quirks v2 device) discovery that is tested. For now, I'd leave them in place to minimize the diff, and they do test quirks v2 entities somewhat. In the future, we may wanna move the tests, though.