Skip to content

[MSD-494][feat] driver tmcm: if device fails to reply, immediately retry#3418

Draft
pieleric wants to merge 1 commit intodelmic:masterfrom
pieleric:feat-driver-tmcm-if-device-fails-to-reply-immediately-retry
Draft

[MSD-494][feat] driver tmcm: if device fails to reply, immediately retry#3418
pieleric wants to merge 1 commit intodelmic:masterfrom
pieleric:feat-driver-tmcm-if-device-fails-to-reply-immediately-retry

Conversation

@pieleric
Copy link
Member

@pieleric pieleric commented Mar 24, 2026

In some very rare cases, mainly during referencing, the TMCM-6110 fails
to reply to instructions. When this happens, typically the next or
second next instruction work again. It's not clear why this happens, but instead
of completely failing the action (eg, referencing), detect such
communication issue and retry.

Example error that this solves:

Traceback (most recent call last):
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 2100, in _doReference
    self._waitReferencing(aid)  # block until it's over
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 1604, in _waitReferencingStd
    self._releaseRefSwitch(axis)
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 1507, in _releaseRefSwitch
    self.SetIO(2, refswitch, 0)
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 1071, in SetIO
    self.SendInstruction(14, port, bank, value)
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 928, in SendInstruction
    raise IOError("Received only %d bytes after %s" %
OSError: Received only 0 bytes after 8, 14, 0, 2, 0 (24)

Copilot AI review requested due to automatic review settings March 24, 2026 15:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves resilience of the TMCM-6110 serial driver by detecting the “0 bytes received” transient failure mode and retrying communication before failing higher-level operations (e.g., referencing).

Changes:

  • Refactors SendInstruction into a single-attempt helper (_send_instruction_once) plus a 0-byte recovery path that pings GetVersion and retries once.
  • Adds unit tests covering the new 0-byte retry/ping behavior using the simulator and mocking.
  • Updates the serial timeout comment to clarify expected device response time.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
src/odemis/driver/tmcm.py Adds 0-byte detection, ping-based recovery, and a one-time retry path for TMCL instructions.
src/odemis/driver/test/tmcm_test.py Adds tests validating retry success/failure behavior and ensuring partial replies are not retried.

raise IOError("Failed to read from TMCM, restarted serial connection.")
if len(res) == 0: # Something went wrong, maybe the device discarded the message
raise TimeoutError("No bytes received after %s" % self._instr_to_str(msg))
elif len(res) < 9: # Only part of a message was received, sign something ever more wrong
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling/grammar: “sign something ever more wrong” should be “signs something even more wrong” (or similar).

Suggested change
elif len(res) < 9: # Only part of a message was received, sign something ever more wrong
elif len(res) < 9: # Only part of a message was received, a sign something is even more wrong

Copilot uses AI. Check for mistakes.
def _make_mock_once(self, fail_non_ping_count: int, fail_ping_count: int = 0):
"""
Returns a replacement for _send_instruction_once that raises
IOError("Received only 0 bytes ...") for the first N calls of each
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _make_mock_once docstring says it raises IOError("Received only 0 bytes ..."), but the mock actually raises TimeoutError. Update the docstring to match the behavior to avoid confusion.

Suggested change
IOError("Received only 0 bytes ...") for the first N calls of each
TimeoutError("Received only 0 bytes ...") for the first N calls of each

Copilot uses AI. Check for mistakes.
Comment on lines +921 to +923
# Failure here should mean that the device didn't get the (complete)
# instruction, so it's safe to send the command again.
return self.SendInstruction(n, typ, mot, val)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_send_instruction_once is documented as having “no retry logic”, but on serial write IOError it calls SendInstruction(), which re-enters the retry/ping logic and introduces recursion. Consider keeping _send_instruction_once side-effect free (raise a dedicated exception or re-raise) and performing the reconnect/resend logic in SendInstruction using an explicit bounded loop to avoid recursion.

Suggested change
# Failure here should mean that the device didn't get the (complete)
# instruction, so it's safe to send the command again.
return self.SendInstruction(n, typ, mot, val)
# The caller is responsible for retrying the instruction if desired.
raise IOError("Failed to send command to TMCM, restarted serial connection.")

Copilot uses AI. Check for mistakes.
:param mot: motor/bank number (0-255)
:param val: value to send (-2**31 to 2**31-1)
:returns: reply value
:raises IOError: if problem with sending/receiving data over the serial port
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SendInstruction can still raise TimeoutError (e.g., if the post-ping retry receives 0 bytes), but the docstring only lists IOError/TMCLError. Please update the docstring (and/or explicitly wrap TimeoutError) so callers know what to expect.

Suggested change
:raises IOError: if problem with sending/receiving data over the serial port
:raises IOError: if problem with sending/receiving data over the serial port
:raises TimeoutError: if the controller does not respond even after recovery attempts

Copilot uses AI. Check for mistakes.
Comment on lines +998 to +1002
except IOError: # includes TimeoutError (which derive from IOError)
logging.warning("Ping attempt %d/%d failed after 0-byte response",
ping_attempt + 1, MAX_PING_ATTEMPTS)
else:
raise IOError(f"Device not responding to instructions")
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When all ping attempts fail, this raises a generic IOError without including the last failure details. Capturing the last exception and chaining it (raise ... from last_exc) and/or including the instruction id and attempt count in the message would make diagnosis easier.

Copilot uses AI. Check for mistakes.
@pieleric pieleric changed the title [feat] driver tmcm: if device fails to reply, immediately retry [MSD-494][feat] driver tmcm: if device fails to reply, immediately retry Mar 24, 2026
@pieleric pieleric marked this pull request as draft March 24, 2026 16:26
In some very rare cases, mainly during referencing, the TMCM-6110 fails
to reply to instructions. When this happens, typically the next or
second next instruction work again. It's not clear why this happens, but instead
of completely failing the action (eg, referencing), detect such
communication issue and retry.

Example error that this solves:
Traceback (most recent call last):
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 2100, in _doReference
    self._waitReferencing(aid)  # block until it's over
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 1604, in _waitReferencingStd
    self._releaseRefSwitch(axis)
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 1507, in _releaseRefSwitch
    self.SetIO(2, refswitch, 0)
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 1071, in SetIO
    self.SendInstruction(14, port, bank, value)
  File "/home/meteor/development/odemis/src/odemis/driver/tmcm.py", line 928, in SendInstruction
    raise IOError("Received only %d bytes after %s" %
OSError: Received only 0 bytes after 8, 14, 0, 2, 0 (24)
@pieleric pieleric force-pushed the feat-driver-tmcm-if-device-fails-to-reply-immediately-retry branch from 9ee503b to e5ca90c Compare March 26, 2026 15:45
@coderabbitai
Copy link

coderabbitai bot commented Mar 26, 2026

📝 Walkthrough

Walkthrough

The changes refactor the TMCL serial command handling by splitting the monolithic SendInstruction() method into two private helpers: _send_query() for building and sending instructions, and _receive_answer() for reading and validating device replies. The refactored SendInstruction() introduces a new retry mechanism that issues TimeoutError-triggered recoveries using up to three GetVersion (ping) attempts before retrying the original instruction. Concurrently, a comprehensive test suite (TestSendInstructionRetry) validates this retry behavior across success paths, failure cases, and edge scenarios involving ping failures and partial replies.

Sequence Diagram(s)

sequenceDiagram
    actor Caller
    participant SI as SendInstruction()
    participant SQ as _send_query()
    participant RA as _receive_answer()
    participant Dev as Device/Socket

    Caller->>SI: SendInstruction(n, typ, mot, val)
    SI->>SQ: _send_query(n, typ, mot, val)
    SQ->>Dev: write 9-byte instruction
    SQ-->>SI: return message buffer
    SI->>RA: _receive_answer(n, msg)
    RA->>Dev: read reply
    RA-->>SI: return status (success)
    SI-->>Caller: return status

    rect rgba(255, 200, 100, 0.5)
    Note over SI,Dev: Retry Path (on TimeoutError)
    SI->>SI: attempt_count < limit
    SI->>SQ: _send_query(136, 0, 0, 0) [ping]
    SQ->>Dev: write GetVersion (ping)
    SQ-->>SI: return msg
    SI->>RA: _receive_answer(136, msg)
    RA->>Dev: read ping reply
    alt Ping succeeds
        RA-->>SI: return status
    else Ping timeout
        SI->>SI: retry ping (up to 3 times)
    end
    SI->>SQ: _send_query(n, typ, mot, val) [retry original]
    SQ->>Dev: write instruction (retry)
    SQ-->>SI: return msg
    SI->>RA: _receive_answer(n, msg)
    RA->>Dev: read reply
    RA-->>SI: return status
    SI-->>Caller: return status/raise error
    end
Loading
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.59% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the main change: implementing retry logic for failed device replies in the TMCM driver, which matches the core objective of handling communication failures during instruction transmission.
Description check ✅ Passed The description is directly related to the changeset, explaining the problem (rare device reply failures during referencing), the solution (retry mechanism), and providing a concrete error example from the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
src/odemis/driver/test/tmcm_test.py (1)

162-166: Annotate and document the new fixture hooks.

setUp() and tearDown() are new functions in a Python file, but they still miss -> None and docstrings.

Proposed fix
-    def setUp(self):
+    def setUp(self) -> None:
+        """Create a simulator-backed controller for each retry test."""
         self.dev = CLASS(**KWARGS_SIM)
@@
-    def tearDown(self):
+    def tearDown(self) -> None:
+        """Terminate the simulator-backed controller created in :meth:`setUp`."""
         self.dev.terminate()

As per coding guidelines "Always use type hints for function parameters and return types in Python code" and "Include docstrings for all functions and classes, following the reStructuredText style guide (without type information)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/odemis/driver/test/tmcm_test.py` around lines 162 - 166, Add explicit
return type hints and reStructuredText-style docstrings to the new test fixture
hooks: annotate both setUp(self) and tearDown(self) with "-> None" and add short
docstrings explaining their roles (e.g., setUp creates the test device from
CLASS using KWARGS_SIM and assigns it to self.dev; tearDown terminates/cleans up
self.dev). Update the methods in src/odemis/driver/test/tmcm_test.py (setUp,
tearDown, and references to CLASS/KWARGS_SIM/self.dev) to include these
additions so they follow the project's typing and docstring guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/odemis/driver/tmcm.py`:
- Around line 935-938: The current logic that treats a 0-byte read as a timeout
and then auto-resends the instruction (the branch guarding rn != n and the retry
path used by SendInstruction()) can double-apply non-idempotent commands like
MoveRelPos(); change SendInstruction() so it only retries automatically for a
small whitelist of safe/idempotent operations (e.g., read-only or
explicit-idempotent commands) or, alternatively, before resending attempt to
recover a late original reply from the incoming buffer/ping path (consume and
validate any pending reply matching the original sequence number rn) and only
resend if no valid late reply is found; update the rn != n handling and the
retry decision in SendInstruction() to implement one of these two behaviors and
ensure MoveRelPos() and other mutating commands are excluded from blind
auto-retry.
- Around line 876-897: The functions _send_query and _receive_answer use the
undefined string type annotation "Buffer"; replace that with the concrete
numpy.ndarray type (e.g. change return type of _send_query from "Buffer" to
numpy.ndarray and the msg parameter of _receive_answer from "Buffer" to
numpy.ndarray) so annotations are valid under Python 3.10+, keeping existing
numpy usage and without adding new unrelated imports; update both function
signatures and any other occurrences of "Buffer" in this module to
numpy.ndarray.

---

Nitpick comments:
In `@src/odemis/driver/test/tmcm_test.py`:
- Around line 162-166: Add explicit return type hints and reStructuredText-style
docstrings to the new test fixture hooks: annotate both setUp(self) and
tearDown(self) with "-> None" and add short docstrings explaining their roles
(e.g., setUp creates the test device from CLASS using KWARGS_SIM and assigns it
to self.dev; tearDown terminates/cleans up self.dev). Update the methods in
src/odemis/driver/test/tmcm_test.py (setUp, tearDown, and references to
CLASS/KWARGS_SIM/self.dev) to include these additions so they follow the
project's typing and docstring guidelines.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: acb04166-ca45-4387-9ad5-313e2ea8f634

📥 Commits

Reviewing files that changed from the base of the PR and between f9f275c and e5ca90c.

📒 Files selected for processing (2)
  • src/odemis/driver/test/tmcm_test.py
  • src/odemis/driver/tmcm.py

Comment on lines +876 to +897
def _send_query(self, n: int, typ: int, mot: int, val: int) -> "Buffer":
"""
Sends one query (instruction) without waiting for the reply and without retry logic.
The caller must already hold self._ser_access.

:param n: instruction ID (0-255)
:param typ: instruction type (0-255)
:param mot: motor/bank number (0-255)
:param val: value to send (-2**31 to 2**31-1)
:return: the message sent (9 bytes)
:raises IOError: if the serial write fails or the reply is incomplete
:raises TMCLError: if the device reports a bad status
"""
msg = numpy.empty(9, dtype=numpy.uint8)
struct.pack_into('>BBBBiB', msg, 0, self._target, n, typ, mot, val, 0)
msg[-1] = numpy.sum(msg[:-1], dtype=numpy.uint8)
logging.debug("Sending %s", self._instr_to_str(msg))
self._serial.write(msg)
self._serial.flush() # Wait for the data to be full sent (to detect any connection error)
return msg

# One possible reason is that the device disappeared because the
# cable was pulled out, or the power got cut (unlikely, as it's
# powered via 2 sources).
def _receive_answer(self, n: int, msg: "Buffer") -> int:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check the imports at the top of the file
head -50 src/odemis/driver/tmcm.py

Repository: delmic/odemis

Length of output: 2084


🏁 Script executed:

# Search for 'Buffer' definition or import
rg "Buffer" src/odemis/driver/tmcm.py -n

Repository: delmic/odemis

Length of output: 202


🏁 Script executed:

# Look at the specific lines mentioned (876-897)
sed -n '870,905p' src/odemis/driver/tmcm.py

Repository: delmic/odemis

Length of output: 1714


🏁 Script executed:

# Check Python version support and typing imports
rg "from typing|import typing|from collections.abc" src/odemis/driver/tmcm.py

Repository: delmic/odemis

Length of output: 95


🏁 Script executed:

# Check if numpy is imported
rg "import numpy" src/odemis/driver/tmcm.py -n | head -5

Repository: delmic/odemis

Length of output: 72


🏁 Script executed:

# Verify the full context around line 876-897 to ensure numpy.ndarray is the right type
sed -n '1,100p' src/odemis/driver/tmcm.py | rg "import numpy"

Repository: delmic/odemis

Length of output: 69


🏁 Script executed:

# Search for any 'Buffer' definition anywhere in the codebase
rg "^Buffer\s*=" src/odemis/driver/tmcm.py

Repository: delmic/odemis

Length of output: 39


🏁 Script executed:

# Check if Buffer is defined as a type alias elsewhere in the file
rg "Buffer.*=" src/odemis/driver/tmcm.py

Repository: delmic/odemis

Length of output: 39


🏁 Script executed:

# Verify the actual return value and parameter type by examining how these functions are used
rg "_send_query\|_receive_answer" src/odemis/driver/tmcm.py -A 2 | head -40

Repository: delmic/odemis

Length of output: 39


🏁 Script executed:

# Check if there are any type aliases defined at module level
sed -n '50,150p' src/odemis/driver/tmcm.py | grep -E "^[A-Za-z_].*\s*=" | head -20

Repository: delmic/odemis

Length of output: 720


🏁 Script executed:

# Look at what _send_query actually returns
sed -n '876,897p' src/odemis/driver/tmcm.py | tail -5

Repository: delmic/odemis

Length of output: 268


🏁 Script executed:

# Check if there's a 'Buffer' in typing or collections.abc for Python 3.10
python3 <<'EOF'
# Check Python 3.10+ typing module for Buffer
import sys
print(f"Python version: {sys.version}")

# Check if Buffer exists in typing module
try:
    from typing import Buffer
    print("Buffer found in typing module")
except ImportError:
    print("Buffer NOT in typing module")

# Check if it's in collections.abc
try:
    from collections.abc import Buffer
    print("Buffer found in collections.abc")
except ImportError:
    print("Buffer NOT in collections.abc")

# Check numpy.ndarray
import numpy
print(f"numpy.ndarray exists: {hasattr(numpy, 'ndarray')}")
EOF

Repository: delmic/odemis

Length of output: 207


🏁 Script executed:

# Search for any other "Buffer" references in the file
rg '"Buffer"' src/odemis/driver/tmcm.py

Repository: delmic/odemis

Length of output: 194


🏁 Script executed:

# Check what Python version this file targets by looking for any setup.py or pyproject.toml
find . -maxdepth 2 -name "setup.py" -o -name "pyproject.toml" | xargs grep -l "python_requires\|requires-python" 2>/dev/null

Repository: delmic/odemis

Length of output: 41


Replace undefined "Buffer" type annotation with numpy.ndarray.

Buffer is not imported and is not a valid type in Python 3.10 or later. The functions return and accept numpy.ndarray objects, so use that concrete type instead of the undefined string annotation.

Proposed fix
-    def _send_query(self, n: int, typ: int, mot: int, val: int) -> "Buffer":
+    def _send_query(self, n: int, typ: int, mot: int, val: int) -> numpy.ndarray:
@@
-    def _receive_answer(self, n: int, msg: "Buffer") -> int:
+    def _receive_answer(self, n: int, msg: numpy.ndarray) -> int:

This violates the coding guideline "Ensure code is valid for Python 3.10 and above" and "Always use type hints for function parameters and return types in Python code".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _send_query(self, n: int, typ: int, mot: int, val: int) -> "Buffer":
"""
Sends one query (instruction) without waiting for the reply and without retry logic.
The caller must already hold self._ser_access.
:param n: instruction ID (0-255)
:param typ: instruction type (0-255)
:param mot: motor/bank number (0-255)
:param val: value to send (-2**31 to 2**31-1)
:return: the message sent (9 bytes)
:raises IOError: if the serial write fails or the reply is incomplete
:raises TMCLError: if the device reports a bad status
"""
msg = numpy.empty(9, dtype=numpy.uint8)
struct.pack_into('>BBBBiB', msg, 0, self._target, n, typ, mot, val, 0)
msg[-1] = numpy.sum(msg[:-1], dtype=numpy.uint8)
logging.debug("Sending %s", self._instr_to_str(msg))
self._serial.write(msg)
self._serial.flush() # Wait for the data to be full sent (to detect any connection error)
return msg
# One possible reason is that the device disappeared because the
# cable was pulled out, or the power got cut (unlikely, as it's
# powered via 2 sources).
def _receive_answer(self, n: int, msg: "Buffer") -> int:
def _send_query(self, n: int, typ: int, mot: int, val: int) -> numpy.ndarray:
"""
Sends one query (instruction) without waiting for the reply and without retry logic.
The caller must already hold self._ser_access.
:param n: instruction ID (0-255)
:param typ: instruction type (0-255)
:param mot: motor/bank number (0-255)
:param val: value to send (-2**31 to 2**31-1)
:return: the message sent (9 bytes)
:raises IOError: if the serial write fails or the reply is incomplete
:raises TMCLError: if the device reports a bad status
"""
msg = numpy.empty(9, dtype=numpy.uint8)
struct.pack_into('>BBBBiB', msg, 0, self._target, n, typ, mot, val, 0)
msg[-1] = numpy.sum(msg[:-1], dtype=numpy.uint8)
logging.debug("Sending %s", self._instr_to_str(msg))
self._serial.write(msg)
self._serial.flush() # Wait for the data to be full sent (to detect any connection error)
return msg
def _receive_answer(self, n: int, msg: numpy.ndarray) -> int:
🧰 Tools
🪛 Ruff (0.15.7)

[error] 876-876: Undefined name Buffer

(F821)


[error] 897-897: Undefined name Buffer

(F821)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/odemis/driver/tmcm.py` around lines 876 - 897, The functions _send_query
and _receive_answer use the undefined string type annotation "Buffer"; replace
that with the concrete numpy.ndarray type (e.g. change return type of
_send_query from "Buffer" to numpy.ndarray and the msg parameter of
_receive_answer from "Buffer" to numpy.ndarray) so annotations are valid under
Python 3.10+, keeping existing numpy usage and without adding new unrelated
imports; update both function signatures and any other occurrences of "Buffer"
in this module to numpy.ndarray.

Comment on lines +935 to +938
if rn != n:
logging.info("Skipping a message about instruction %d (waiting for %d)",
rn, n)
continue
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't auto-resend every timed-out instruction.

A 0-byte read here only proves that nothing arrived before the 100 ms serial timeout at Line 2306, not that the controller discarded the command. If the original reply is merely late, the ping path will consume it through the rn != n skip logic and then resend the instruction anyway. Since SendInstruction() also backs non-idempotent operations like MoveRelPos(), that can double-apply hardware actions. Please limit the retry to an explicit safe/idempotent subset, or recover the late original reply before reissuing the command.

Also applies to: 992-1005

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/odemis/driver/tmcm.py` around lines 935 - 938, The current logic that
treats a 0-byte read as a timeout and then auto-resends the instruction (the
branch guarding rn != n and the retry path used by SendInstruction()) can
double-apply non-idempotent commands like MoveRelPos(); change SendInstruction()
so it only retries automatically for a small whitelist of safe/idempotent
operations (e.g., read-only or explicit-idempotent commands) or, alternatively,
before resending attempt to recover a late original reply from the incoming
buffer/ping path (consume and validate any pending reply matching the original
sequence number rn) and only resend if no valid late reply is found; update the
rn != n handling and the retry decision in SendInstruction() to implement one of
these two behaviors and ensure MoveRelPos() and other mutating commands are
excluded from blind auto-retry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants