Fix not claiming cluster handler for some quirks v2 entities #548

TheJulianJES · 2025-10-22T11:34:35Z

Proposed change

This fixes an issue where ZHA doesn't claim the cluster handler for quirks v2 entities that only require the specified attribute to be read during device initialization (pairing), but don't require binding. Because the cluster handler is never claimed, it is never configured, so the required attributes are never read. Affected quirks v2 entities are then left in the "unknown" state after initial pairing.

Two new tests are also added: (1) one checks that the cluster handler is claimed in the scenario above, where no binding is configured, and (2) one checks that setting up a quirks v2 "command button" entity doesn't claim the cluster handler unnecessarily.

Note

We currently claim cluster handlers for all quirks v2 metadata that contains attribute_name, which I believe is basically all of it except for the command button. We don't really need to claim a cluster handler and read the attribute for "write attribute buttons" though. We may want to change this behavior in this PR.
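To make the distinction concrete, here is a minimal sketch of the rule described in this note. The metadata classes and `wants_cluster_handler` are hypothetical stand-ins, not the actual zigpy or ZHA types; the only thing taken from the PR is that the presence of `attribute_name` is what triggers claiming.

```python
from dataclasses import dataclass


@dataclass
class SensorMeta:
    """Hypothetical metadata for an attribute-backed sensor."""
    attribute_name: str


@dataclass
class WriteAttrButtonMeta:
    """Hypothetical metadata for a button that writes a fixed attribute value."""
    attribute_name: str
    attribute_value: int


@dataclass
class CommandButtonMeta:
    """Hypothetical metadata for a button that sends a cluster command."""
    command_name: str


def wants_cluster_handler(meta: object) -> bool:
    """Mirror the current rule: any metadata carrying attribute_name claims."""
    return hasattr(meta, "attribute_name")
```

Under this rule the write-attribute button claims the handler even though it never needs the attribute to be read, which is the behavior questioned in this note.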

Additional information

For more information on the underlying issues this fixes, see: zigpy/backlog#52

Note on tests

The tests may be a bit misplaced in test_sensor and test_button. They do (need to) utilize quirks v2 entities, but it's really the underlying (quirks v2 device) discovery that is tested.

For now, I'd leave them in-place to minimize the diff, and they do test quirks v2 entities somewhat. In the future, we may wanna move the tests though.


codecov · 2025-10-22T16:03:06Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.95%. Comparing base (b172549) to head (bcf7b37).

Additional details and impacted files

@@           Coverage Diff           @@
##              dev     #548   +/-   ##
=======================================
  Coverage   96.94%   96.95%           
=======================================
  Files          63       63           
  Lines       10518    10525    +7     
=======================================
+ Hits        10197    10204    +7     
  Misses        321      321


puddly

Looks good to me! I've left a few comments.

Long-term, I would like to explore a way to get rid of cluster handlers. Maybe we can have class-based entities (internal to ZHA) provide a nicer way to set up binding and reporting config.

puddly · 2025-10-22T16:11:06Z

tests/test_button.py

+        .command_button(
+            FakeManufacturerCluster.ServerCommandDefs.self_test.name,
+            OppleCluster.cluster_id,
+            command_args=(5,),


Tiny nitpick:

Suggested change

-            command_args=(5,),
+            command_kwargs={"identify_time": 5},

Good catch. Copied that from some other test, but I'll change that.
We'll probably want to adjust this in a lot of existing tests as well at a later date, including (completely) using kwargs for creating quirks v2 entities.

FakeManufacturerCluster.ServerCommandDefs.self_test.name should also be changed to OppleCluster[...]. It's the same command name and we never execute this, since we just need it for testing the quirks v2 metadata processing in the device discovery, but it should still be adjusted.
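As a rough illustration of why keyword arguments are the safer choice here, the sketch below resolves both forms against a command signature. `fake_command` and `bind_command` are hypothetical helpers, not zigpy or ZHA APIs; only the parameter name `identify_time` comes from the suggested change.

```python
import inspect


def fake_command(identify_time: int) -> dict:
    """Hypothetical command; only the parameter name comes from the suggestion."""
    return {"identify_time": identify_time}


def bind_command(func, args=(), kwargs=None):
    """Resolve positional/keyword arguments against the command's signature."""
    bound = inspect.signature(func).bind(*args, **(kwargs or {}))
    return dict(bound.arguments)


# Both forms resolve to the same call...
positional = bind_command(fake_command, args=(5,))
keyword = bind_command(fake_command, kwargs={"identify_time": 5})

# ...but a misspelled keyword fails loudly instead of silently shifting position.
try:
    bind_command(fake_command, kwargs={"identfy_time": 5})
    misspelling_rejected = False
except TypeError:
    misspelling_rejected = True
```

The keyword form also documents in the test itself what the value `5` means.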

puddly · 2025-10-22T16:13:00Z

tests/test_button.py

+        .add_to_registry()
+    )
+
+    zigpy_device = create_mock_zigpy_device(


Another nitpick, feel free to ignore if it's too much trouble. Is there any way to test using JSON diagnostics from a real device? I'd like to phase out use of create_mock_zigpy_device and other synthetic testing setups in favor of using real devices.

Hmm, I explicitly used create_mock_zigpy_device, as that's what the similar tests in test_sensor used and especially here, we really want to make sure that we get this exact device with that exact cluster setup.

I feel like we always run a bit of a risk when using JSON diagnostics in tests, as it's not clear how endpoints and clusters are laid out just by looking at the tests. You have to look at the (huge) JSON file. It's also harder to get a device set up exactly the way you want it, without adding unnecessary endpoints/clusters/quirk interference.

Some parts of the device/diagnostics can also change over time (e.g. ZHA or a quirk doing something different, a firmware update, ...), and we may not want those changes to affect every test that uses the diagnostics.

puddly · 2025-10-22T16:32:09Z

zha/application/discovery.py

                    [cluster_handler.name],
                )

+            # if the cluster handler is unclaimed, claim it and set BIND accordingly,


I wonder: should we move this to happen before the yield? That way, the cluster handler is fully set up before the entity initialization happens.

We can't really do that in a nice way, since we need to iterate over all quirks v2 entity metadata for a specific cluster handler before we can make the final decision on whether we can set BIND to False.

This is all still done "pre initialization", since they're just added to Device._pending_entities in discovery here:

zha/zha/zigbee/device.py

Lines 939 to 950 in b172549

    new_entities = discovery.DEVICE_PROBE.discover_device_entities(self)

    # Discover all applicable entities
    for entity in new_entities:
        if self._is_entity_removed_by_quirk(entity):
            continue

        # Apply any metadata changes from quirks v2
        self._apply_entity_metadata_changes(entity)

        entity.on_add()
        self._pending_entities.append(entity)

Also, nothing in entity.on_add() can ever rely on whether the cluster handler is claimed.
The actual cluster handler initialization happens much later, and only then does it matter whether a cluster handler is claimed. I think it's fine to keep it like this tbh.

self._discover_new_entities() does everything we do here. Only afterwards is endpoint.async_initialize called:

zha/zha/zigbee/device.py

Lines 956 to 969 in b172549

    self._discover_new_entities()

    await self._zdo_handler.async_initialize(from_cache)
    self._zdo_handler.debug("'async_initialize' stage succeeded")

    # We intentionally do not use `gather` here! This is so that if, for example,
    # three `device.async_initialize()`s are spawned, only three concurrent requests
    # will ever be in flight at once. Startup concurrency is managed at the device
    # level.
    for endpoint in self._endpoints.values():
        try:
            await endpoint.async_initialize(from_cache)
        except Exception:  # pylint: disable=broad-exception-caught
            self.debug("Failed to initialize endpoint", exc_info=True)
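The ordering argument can be sketched with a small, self-contained model: discovery (and therefore claiming) runs first, then endpoints are initialized one at a time without `asyncio.gather`, and a failing endpoint does not stop the rest. `FakeEndpoint` and `initialize_device` are hypothetical names for illustration only, not ZHA code.

```python
import asyncio


class FakeEndpoint:
    """Hypothetical endpoint that records when its initialization runs."""

    def __init__(self, endpoint_id: int, fail: bool = False) -> None:
        self.endpoint_id = endpoint_id
        self.fail = fail

    async def async_initialize(self, log: list) -> None:
        log.append(f"start {self.endpoint_id}")
        await asyncio.sleep(0)  # yield to the event loop, as a real request would
        if self.fail:
            raise RuntimeError("simulated failure")
        log.append(f"done {self.endpoint_id}")


async def initialize_device(endpoints: list) -> list:
    log = ["discover entities"]  # discovery and claiming happen before any init
    for endpoint in endpoints:  # deliberately sequential, no asyncio.gather
        try:
            await endpoint.async_initialize(log)
        except Exception:  # one failing endpoint must not stop the others
            log.append(f"failed {endpoint.endpoint_id}")
    return log


log = asyncio.run(
    initialize_device([FakeEndpoint(1), FakeEndpoint(2, fail=True), FakeEndpoint(3)])
)
```

In this model each endpoint finishes (or fails) before the next one starts, so whatever was claimed during discovery is already settled by the time initialization looks at it.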

puddly · 2025-10-22T16:33:15Z

zha/application/discovery.py

+            # if the cluster handler is unclaimed, claim it and set BIND accordingly,
+            # so ZHA configures the cluster handler: reporting + reads attributes
+            if (attribute_initialization_found or reporting_found) and (
+                cluster_handler not in endpoint.claimed_cluster_handlers.values()


Is there any way for the cluster handler to already be claimed?

The cluster handler can already be claimed by entities from EndpointProbe. Since those ZHA entities may rely on the default of BIND = True, I've added the cluster_handler not in endpoint.claimed_cluster_handlers.values() check to make sure we only ever set BIND = False for cluster handlers that are claimed solely for quirks v2 entities.

This is also checked by the tests.
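A simplified model of this decision, for readers who don't have the diff open: all metadata entries for a cluster handler are inspected first, and only then is an unclaimed handler claimed and its BIND flag set. The classes and `claim_for_quirks_v2` are hypothetical stand-ins, and setting BIND to the value of `reporting_found` is an assumption based on the discussion here, not a copy of the actual implementation.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class FakeClusterHandler:
    """Hypothetical cluster handler; BIND defaults to True as noted in the thread."""
    name: str
    BIND: bool = True


@dataclass
class FakeEntityMeta:
    """Hypothetical quirks v2 entity requirements for one cluster handler."""
    needs_initialization: bool = False  # attribute must be read at pairing
    needs_reporting: bool = False  # attribute reporting must be configured


def claim_for_quirks_v2(handler, entries, claimed):
    """Claim an unclaimed handler once all metadata entries have been inspected."""
    attribute_initialization_found = any(m.needs_initialization for m in entries)
    reporting_found = any(m.needs_reporting for m in entries)

    if (attribute_initialization_found or reporting_found) and (
        handler not in claimed.values()
    ):
        claimed[handler.name] = handler
        # Assumption: binding is only kept when some entity needs reporting.
        handler.BIND = reporting_found
```

A handler that was already claimed elsewhere is left untouched, so its default of BIND = True survives even if the quirks v2 entities on it only need an initial read.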

TheJulianJES · 2025-10-22T16:58:10Z

> Long-term, I would like to explore a way to get rid of cluster handlers. Maybe we can have class-based entities (internal to ZHA) provide a nicer way to set up binding and reporting config.

Yeah, I think they're not too bad, but they could definitely use some improvement (or yeah, possibly be removed entirely).

For quirks v2, we now automatically handle setting up cluster handlers under the hood (claiming + attribute initialization + reporting), so we can definitely change how we do that, without breaking the quirks v2 API.
As a first step, we might be able to introduce something similar for the "ZHA entities", so that relevant attributes are automatically read and configured for reporting, just by defining the entity alone.
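Purely as a thought experiment, such a declaration-driven setup could look something like the sketch below. Nothing here exists in ZHA: `AttrConfig`, `FakeTemperatureSensor`, `ATTRIBUTES` and `derive_setup` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttrConfig:
    """Hypothetical per-attribute declaration on an entity class."""
    report: bool = False  # configure attribute reporting (implies binding)


class FakeTemperatureSensor:
    """Hypothetical entity that only declares what it needs."""

    ATTRIBUTES = {
        "measured_value": AttrConfig(report=True),
        "tolerance": AttrConfig(),  # read once at initialization
    }


def derive_setup(entity_cls):
    """Derive the read list, reporting list and bind flag from the declaration."""
    attrs = entity_cls.ATTRIBUTES
    reporting = [name for name, cfg in attrs.items() if cfg.report]
    return {"read": list(attrs), "report": reporting, "bind": bool(reporting)}


setup = derive_setup(FakeTemperatureSensor)
```

The point of the design is that the entity definition is the single source of truth, so binding and reporting config never have to be spelled out separately.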

TheJulianJES added 9 commits October 22, 2025 03:26

Add test to check for proper implementation

37390cf

WIP: first basic implementation

0ebbce8

Rework implementation

f98323d

Change test to use existing attribute

1a7d243

Test command button doesn't claim cluster handler unnecessarily

19c9279

Remove comment about write_attr_button causing claimage

a9f8141

Temporarily add comment about "write attr button" claiming cluster handler

fc53e7a

Change comment to emphasize that BIND = True by default already

4911051

Adjust three device tests to represent v2 entities initializing chs

bcf7b37

puddly approved these changes Oct 22, 2025

This was referenced Oct 23, 2025

Add Frient EMI 2 LED pulse configuration zigpy/zha-device-handlers#4421

Draft

EMIZB-141 - Frient Electricity Meter Interface 2 LED home-assistant/core#152075

Open
