-
Notifications
You must be signed in to change notification settings - Fork 17
docs: improve docstrings for clarity and consistency across client.py #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: improve docstrings for clarity and consistency across client.py #5
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughA new synchronous wrapper class Changes
Sequence DiagramsequenceDiagram
participant Client as Application Code
participant SyncClient as TransferQueueClient
participant AsyncClient as AsyncTransferQueueClient
Client->>SyncClient: put(data, metadata, global_step)
SyncClient->>AsyncClient: run async_put()
AsyncClient-->>SyncClient: result
SyncClient-->>Client: return
Client->>SyncClient: get_meta(data_fields, batch_size, ...)
SyncClient->>AsyncClient: run async_get_meta()
AsyncClient-->>SyncClient: BatchMeta
SyncClient-->>Client: BatchMeta
Client->>SyncClient: get_data(metadata)
SyncClient->>AsyncClient: run async_get_data()
AsyncClient-->>SyncClient: TensorDict
SyncClient-->>Client: TensorDict
Client->>SyncClient: clear(global_step)
SyncClient->>AsyncClient: run async_clear()
AsyncClient-->>SyncClient: result
SyncClient-->>Client: return
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes The changes follow a consistent wrapper pattern with straightforward documentation enhancements. Four similar synchronous methods delegate to existing async implementations, and docstring additions across multiple methods are repetitive and low-complexity improvements. Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enhances the documentation quality in transfer_queue/client.py by standardizing docstrings across all classes and methods. The changes focus on improving clarity, consistency, and completeness of API documentation while removing outdated TODO comments.
Key Changes
- Removed outdated TODO comment about docstring review
- Added comprehensive class and method docstrings following Google style guide conventions
- Reformatted examples to use more concise and readable syntax
- Standardized parameter descriptions and added missing Returns/Raises sections
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
|
||
| def process_zmq_server_info( | ||
| handlers: dict[Any, "TransferQueueController | TransferQueueStorageManager | SimpleStorageUnit"], | ||
| handlers: dict[Any, TransferQueueController | TransferQueueStorageManager | SimpleStorageUnit], |
Copilot
AI
Oct 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The quotes around the type annotation have been removed, but the imports for TransferQueueController, TransferQueueStorageManager, and SimpleStorageUnit may not be available in this module's scope. If these types are not imported, this will cause a NameError at runtime. Verify that these types are properly imported or revert to using string literals for forward references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
transfer_queue/client.py (2)
160-163: Simplify redundant socket close operations.The socket is closed twice with different linger settings. The first close (line 162) sets
LINGERto -1, while the second close (line 163) setslinger=0. This redundancy may cause unexpected behavior.Apply this diff to simplify the cleanup:
try: - if not sock.closed: - sock.setsockopt(zmq.LINGER, -1) - sock.close() - sock.close(linger=0) + if not sock.closed: + sock.close(linger=0)
390-402: Improve return type description formatting.Line 398-399 has awkward phrasing. The return description should be more consistent with other docstrings in the file.
Apply this diff to improve clarity:
Returns: - BatchMeta Records the metadata of a batch of data samples. + Batch metadata for the specified global step to be cleared
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
transfer_queue/client.py(12 hunks)
🔇 Additional comments (17)
transfer_queue/client.py (17)
48-52: LGTM! Clear and concise class documentation.The class docstring effectively describes the purpose and key capabilities of the AsyncTransferQueueClient.
59-64: LGTM! Clear constructor documentation.The docstring appropriately documents the initialization parameters.
74-79: LGTM! Clear method documentation.The docstring appropriately documents the storage manager initialization.
86-90: LGTM! Clear method documentation.The docstring appropriately documents the controller registration process.
109-122: LGTM! Excellent decorator documentation.The docstring thoroughly explains the decorator's purpose, lifecycle management, and usage requirements.
185-227: LGTM! Comprehensive method documentation with helpful examples.The docstring is thorough, documenting all parameters, return values, exceptions, and providing clear usage examples for different modes.
268-301: LGTM! Comprehensive documentation with valuable warning.The docstring effectively documents the method with clear examples. The warning about potential data overwriting in Example 2 is particularly valuable.
324-350: LGTM! Clear documentation with helpful context.The docstring effectively documents the method and the Note section provides valuable context about the
global_indexesfield's purpose.
359-366: LGTM! Clear and concise documentation.The docstring appropriately documents the clear operation.
423-432: LGTM! Clear documentation.The docstring appropriately documents the controller clear operation.
482-485: LGTM! Clear class documentation.The synchronous wrapper class is well-documented, clearly indicating its purpose as a convenience wrapper.
492-497: LGTM! Clear constructor documentation.The docstring appropriately documents the synchronous client initialization.
504-510: LGTM! Clear method documentation.The docstring appropriately documents the synchronous put operation.
521-532: LGTM! Clear method documentation.The docstring appropriately documents the synchronous get_meta operation.
544-551: LGTM! Clear method documentation.The docstring appropriately documents the synchronous get_data operation.
555-559: LGTM! Clear method documentation.The docstring appropriately documents the synchronous clear operation.
566-573: LGTM! Clear function documentation.The docstring appropriately documents the utility function for extracting ZMQ server information.
transfer_queue/client.py
Outdated
| def check_current_step_consumption(self, task_name: str, global_step: int): | ||
| """Check if all samples for current step have been consumed. | ||
| Args: | ||
| task_name: Name of the task to check consumption for | ||
| global_step: Step to check consumption status for | ||
| """ | ||
| # TODO: Implement this method to check if all samples for the current step has been consumed | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Missing async keyword causes decorator incompatibility.
The method is decorated with @dynamic_socket which expects an async function (see line 154 where it calls await func(...)), but the method is not declared as async. This will cause a runtime error when the method is called.
Apply this diff to fix the issue:
@dynamic_socket(socket_name="request_handle_socket")
- def check_current_step_consumption(self, task_name: str, global_step: int):
+ async def check_current_step_consumption(self, task_name: str, global_step: int):
"""Check if all samples for current step have been consumed.
Args:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def check_current_step_consumption(self, task_name: str, global_step: int): | |
| """Check if all samples for current step have been consumed. | |
| Args: | |
| task_name: Name of the task to check consumption for | |
| global_step: Step to check consumption status for | |
| """ | |
| # TODO: Implement this method to check if all samples for the current step has been consumed | |
| pass | |
| async def check_current_step_consumption(self, task_name: str, global_step: int): | |
| """Check if all samples for current step have been consumed. | |
| Args: | |
| task_name: Name of the task to check consumption for | |
| global_step: Step to check consumption status for | |
| """ | |
| # TODO: Implement this method to check if all samples for the current step has been consumed | |
| pass |
🤖 Prompt for AI Agents
In transfer_queue/client.py around lines 459 to 467, the method
check_current_step_consumption is missing the async keyword but is decorated
with @dynamic_socket which awaits the function; change the definition to "async
def check_current_step_consumption(self, task_name: str, global_step: int):"
(keep the docstring and body), and ensure any internal calls use await as needed
so the function is an async coroutine compatible with the decorator.
transfer_queue/client.py
Outdated
| def check_current_step_production(self, data_fields: list[str], global_step: int): | ||
| """Check if all samples for current step are ready for consumption. | ||
| Args: | ||
| data_fields: Data fields to check production status for | ||
| global_step: Step to check production status for | ||
| """ | ||
| # TODO: Implement this method to check if all samples for the current step is ready for consumption | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Missing async keyword causes decorator incompatibility.
The method is decorated with @dynamic_socket which expects an async function (see line 154 where it calls await func(...)), but the method is not declared as async. This will cause a runtime error when the method is called.
Apply this diff to fix the issue:
@dynamic_socket(socket_name="request_handle_socket")
- def check_current_step_production(self, data_fields: list[str], global_step: int):
+ async def check_current_step_production(self, data_fields: list[str], global_step: int):
"""Check if all samples for current step are ready for consumption.
Args:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def check_current_step_production(self, data_fields: list[str], global_step: int): | |
| """Check if all samples for current step are ready for consumption. | |
| Args: | |
| data_fields: Data fields to check production status for | |
| global_step: Step to check production status for | |
| """ | |
| # TODO: Implement this method to check if all samples for the current step is ready for consumption | |
| pass | |
| @dynamic_socket(socket_name="request_handle_socket") | |
| async def check_current_step_production(self, data_fields: list[str], global_step: int): | |
| """Check if all samples for current step are ready for consumption. | |
| Args: | |
| data_fields: Data fields to check production status for | |
| global_step: Step to check production status for | |
| """ | |
| # TODO: Implement this method to check if all samples for the current step is ready for consumption | |
| pass |
🤖 Prompt for AI Agents
In transfer_queue/client.py around lines 470 to 478, the method definition must
be declared async to be compatible with the @dynamic_socket decorator that
awaits the function; change the signature from "def
check_current_step_production(...)" to "async def
check_current_step_production(...)" and leave the docstring/logic intact (or
implement awaited calls) so the decorator can correctly await the coroutine.
Co-authored-by: Copilot <[email protected]>
| def __init__( | ||
| self, | ||
| client_id: str, | ||
| controller_infos: ZMQServerInfo | dict[Any, ZMQServerInfo], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to init the client with all controller_infos now? maybe we can simplify it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we still need it.. clear data needs to notify all the controllers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's time to make a decision: whether we need multiple controllers
| Args: | ||
| client_id: Unique identifier for this client instance | ||
| controller_infos: Single controller info or dictionary mapping controller IDs to their ZMQ server information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
| manager_type: str, | ||
| config: dict[str, Any], | ||
| ): | ||
| """Initialize the storage manager. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
describe all the supported manager_type and the detailed config they needed here
|
|
||
| def _register_controllers( | ||
| self, | ||
| server_infos: ZMQServerInfo | dict[Any, ZMQServerInfo], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
transfer_queue/client.py
Outdated
| metadata (BatchMeta, optional): Optional metadata containing index and storage unit information | ||
| global_step (int, optional): Current step (required if no metadata is provided) | ||
| data: Data to write as TensorDict | ||
| metadata: Optional metadata containing index and storage unit information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to point out that this is only for the initial data fields for a global batch, not a general usage
| >>> # This will create metadata in "insert" mode internally. | ||
| >>> asyncio.run(client.async_put(data=prompts_repeated_batch, global_step=current_step)) | ||
| >>> # Example 2: Initial data insertion without pre-existing metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add back some comments for this example?
transfer_queue/client.py
Outdated
| - "global_indexes" key: Maps each sample to its original global index. | ||
| TensorDict containing: | ||
| - Requested data fields (e.g., "prompts", "attention_mask") | ||
| - "global_indexes" field mapping samples to original global indexes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not being used. refer to #8
transfer_queue/client.py
Outdated
| >>> # The order of samples in the TensorDict matches the order of global_indexes in batch_meta | ||
| >>> # TensorDict with fields "prompts", "attention_mask", and sample order matching metadata global_indexes | ||
| Note: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See whether this is valid
f741665
into
0oshowero0:han/unified_storage_abstract
Summary by CodeRabbit
Release Notes
New Features
Documentation