Add RFC for Qwen-Code CLI Output Formats and IPC Stream JSON Capability #810
This commit introduces a new RFC document that outlines the structured input/output capabilities for the Qwen-Code CLI. It adds the `--input-format` and `--output-format` flags, details the supported formats (`text`, `stream-json`, `stream-chunk-json`), and describes the integration scenarios, design goals, and error semantics. The document aims to facilitate programmatic integration with third-party systems and improve the overall automation experience.
Signed-off-by: x22x22 <[email protected]>
          
 Let's chat here. It's more convenient to view the documents this way. This PR doesn't need to be merged.  | 
    
- Introduced structured input/output capabilities at the CLI layer to support third-party integrations.
- Added `--input-format` and `--output-format` options with support for `text`, `stream-json`, and `stream-chunk-json`.
- Defined JSON Lines output, error semantics, and session metadata.
- Enhanced the heartbeat mechanism for monitoring subprocess health.
- Implemented real-time interrupt capabilities for cancelling ongoing requests.
- Updated documentation to reflect changes and provide examples for usage.
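For readers integrating against the JSON Lines output mentioned in this commit, here is a minimal sketch of how a consumer might buffer stdout chunks and decode one JSON value per line. `JsonLinesDecoder` is a hypothetical helper, and the `type` values in the sample data are assumptions for illustration, not the schema defined by the RFC.

```typescript
// Hypothetical helper (not part of this PR): buffers text chunks and
// returns one parsed JSON value per complete line.
class JsonLinesDecoder {
  private buffer = "";

  // Feed a chunk of text; returns the values for every line this chunk completed.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended with "\n").
    this.buffer = lines.pop() ?? "";
    const values: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue; // skip blank lines
      values.push(JSON.parse(trimmed)); // throws SyntaxError on malformed input
    }
    return values;
  }
}

// A line split across two chunks is only emitted once it is complete.
// The "type" values here are assumed sample data.
const decoder = new JsonLinesDecoder();
const first = decoder.push('{"type":"system"}\n{"type":"assi');
const second = decoder.push('stant"}\n');
```

Buffering by line rather than by chunk matters because a subprocess pipe gives no guarantee that chunk boundaries line up with message boundaries.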
…ify module design overview
…nent descriptions
- Updated the Chinese RFC to include new bidirectional control channel events and detailed event mechanism classification.
- Removed the English RFC file as it is no longer needed.
- Enhanced design requirements for In-Process MCP Server support and clarified event handling for third-party integrations.
- Improved clarity on structured input/output capabilities and error handling semantics.
          
 The RFC is done. Please take a look and we can discuss it in the DingTalk group. Thank you!  | 
    
          
 Thanks for your contribution! We'll try to review as soon as possible (hopefully within this week) and then start a draft for the feature so that we can move it forward faster.  | 
    
| 
           Hello, when will this be implemented, and when is the capability expected to be available?  | 
    
          
 While I was at it, I also translated it into English with AI. The complete list of documents is as follows:  | 
    
          
 I will enter the development phase now, because our team is very anxious to use this framework. But it's okay, I will make adjustments according to your suggestions if you have new ideas after you have reviewed it. The final PR submission will be based on your ideas.  | 
    
          
 
 I am currently in the development phase; however, the project maintainers require further review. Once their review is completed, I will revise the code according to their feedback and proceed with the pull request. We anticipate that the JSON input/output functionality and the Python SDK will be available within approximately one week.  | 
    
…tion details in Qwen-Code Agent framework documentation
…iling core components, dependencies, and operational guidelines
…ore functionalities and communication protocols
…ystem details and logging conventions
…ed core components, responsibilities, and event flow specifications
…utput specifications and integration scenarios
… field and clarify continuation behavior
…on scenarios, consumption strategies, and command coordination guidelines
…tails and command handling improvements
- Updated session scheduling terminology from "子代理" to "子 Agent" for consistency.
- Added detailed sections on Agentic session capabilities, including session loop, context management, message stream handling, dynamic control interfaces, and transport layer extensions.
- Included examples for quick start, multi-Agent orchestration, embedded MCP tools, and integrated session scaffolding.
- Expanded on debugging and environment injection strategies, emphasizing error handling and permission decision-making.
- Clarified communication patterns and MCP capabilities, with a focus on permission updates and hook configurations.

- Updated core component and responsibilities descriptions for better clarity.
- Improved language for core functions and objectives to enhance readability.
- Revised architecture overview table for consistency and clarity.
- Enhanced event flow descriptions for better understanding of interactions.
- Expanded capability mapping section to clarify current and future capabilities.
- Introduced detailed agent session capabilities, including session loop and context management.
- Added examples for dynamic control interfaces and transport layer extensions.
- Improved logging and observability sections for better clarity on SDK behavior.
- Enhanced configuration injection and settings management details.
- Updated integration model section with clearer examples and descriptions.

- Added support for 'stream-json' input and output formats in the CLI.
- Introduced StreamJsonWriter for handling structured output.
- Enhanced non-interactive CLI to process stream-json formatted messages.
- Implemented parsing of stream-json input and control requests.
- Added tests for stream-json functionality, including user messages and control responses.
- Updated configuration to include input/output format options.
- Improved error handling for invalid stream-json input.
          
 I have completed the initial version of stream-json. You can try it out by following "docs/cli/index.md". I will go on to implement the remaining components outlined in the RFC.  | 
    
          
 I've reviewed all the RFCs and commits, and there may be some divergences: 
 We plan to first implement a minimum viable version based on our own needs, and then iterate to add support for more features. Given this, you can continue implementing the SDK part in your fork to meet your business needs, and let this PR focus on implementing   | 
    
          
 
 
 
 I agree with you. I didn't actually plan to submit any SDK-related content in this pull request (apart from the Python call examples). @pomelo-nwu mentioned in an earlier issue that he wanted to hear my thoughts on the SDK, so I added the corresponding RFC here. Also, my new ref_claude is designed with reference to Claude's data structures, so all the code submitted so far stays as close as possible to Claude's messages (not OpenAI's). Hooks are an important and useful mechanism that can be implemented in another pull request. Without a hook mechanism, mechanisms like authorization can't be used; for now, you can run it in YOLO mode.  | 
    
… management
- Introduced StreamJsonControlContext to manage hook callbacks and MCP clients.
- Implemented a prompt queue to handle user prompts asynchronously.
- Refactored handleUserPrompt to support aborting and managing tool calls.
- Added control request handling for permission modes, model setting, and tool usage.
- Enhanced StreamJsonWriter to include session ID in system messages and emit detailed result envelopes.
- Added tests for StreamJsonWriter to verify result emissions and session ID inclusion.
- Updated non-interactive tool executor to support output updates and completion handlers.
          
 Regarding your previous suggestion to establish a DingTalk group for more efficient communication and collaborative development, may I inquire if it is now feasible to proceed with creating the group?  | 
    
| 
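           How long until this code is merged, and when will the first usable version be released? I can't wait to use it. I've already tried the code from this PR, and everything appears to run fine 👍  | 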
    
    }

    return aggregate;
    };
This looks like a highly intrusive change to non-interactive mode.
My understanding is that it is necessary to some extent, but could we keep just the core stream JSON and control_request/response parts first, to avoid landing too many changes at once? That would also help us think about how to split them into modules with clearer semantics as we iterate.
    export type StreamJsonEnvelope =
      | StreamJsonOutputEnvelope
      | StreamJsonInputEnvelope;
Perhaps we could organize these message types as a separate protocol so that future SDKs can align with them.
Although the TS version of the Claude Code Agent SDK does not explicitly depend on the CLI package, it has the lower-level @anthropic-ai/sdk as its source of truth (SoT). Our situation is different.
I'd love to hear your thoughts.
Also, maybe we could define them as *Message rather than *Envelope?
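To make the suggestion concrete, here is a rough sketch of what a standalone protocol module with `*Message` naming could look like as a discriminated union. Every interface and field name below (`UserMessage`, `request_id`, `is_error`, and so on) is an assumption for illustration and is not taken from the PR's actual types.

```typescript
// Hypothetical protocol module: names and fields are illustrative only,
// not the schema used by this PR.
interface UserMessage {
  type: "user";
  content: string;
}

interface ControlRequestMessage {
  type: "control_request";
  request_id: string;
  request: { subtype: string };
}

interface ResultMessage {
  type: "result";
  is_error: boolean;
}

// Naming follows the review suggestion: *Message rather than *Envelope.
type StreamJsonInputMessage = UserMessage | ControlRequestMessage;
type StreamJsonOutputMessage = ResultMessage;
type StreamJsonMessage = StreamJsonInputMessage | StreamJsonOutputMessage;

// Exhaustive switch on the discriminant: adding a new union member makes
// the `never` assignment fail to compile until a case is added for it.
function describeMessage(message: StreamJsonMessage): string {
  switch (message.type) {
    case "user":
      return `user: ${message.content}`;
    case "control_request":
      return `control_request(${message.request.subtype}) #${message.request_id}`;
    case "result":
      return message.is_error ? "result: error" : "result: ok";
    default: {
      const unreachable: never = message;
      return unreachable;
    }
  }
}
```

Keeping the union in its own module would let the CLI and any future SDK import the same types, which is the alignment the comment above asks about.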
      hookCallbacks: new Map<string, HookCallbackRegistration>(),
      registeredHookEvents: new Set<string>(),
      mcpClients: new Map<string, { client: Client; config: MCPServerConfig }>(),
    };
Love the thought of controlContext:)
    function writeEnvelope(envelope: StreamJsonOutputEnvelope): void {
      process.stdout.write(`${serializeStreamJsonEnvelope(envelope)}\n`);
    }
input.ts looks like a utility for the input stream.
However, readStreamJsonInput and its dependency parseStreamJsonInputFromIterable are never used.
handleControlRequest is also defined a second time in session.ts.
writeEnvelope is already covered by the writer.
extractUserMessageText appears to be the only code that is actually in use.
Given this, I think we could combine it with the writer into a stream-manipulation I/O module that session depends on, responsible for read/write and parse/wrap.
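One possible shape for such an I/O module, as a sketch only: a single class that owns both parsing and writing, with the output sink injected so it can be tested without touching stdout. `StreamJsonIO`, `Sink`, `StreamMessage`, and `ParseResult` are hypothetical names, and the validation rule (an object with a string `type` field) is an assumption.

```typescript
// Hypothetical I/O module: session code would depend on this instead of
// touching raw text. None of these names exist in the PR.
type Sink = (text: string) => void;

interface StreamMessage {
  type: string;
  [key: string]: unknown;
}

type ParseResult =
  | { ok: true; message: StreamMessage }
  | { ok: false; error: string };

class StreamJsonIO {
  private readonly sink: Sink;

  // In a CLI the sink could be (text) => process.stdout.write(text).
  constructor(sink: Sink) {
    this.sink = sink;
  }

  // parse: one input line -> validated message, or a structured error.
  parse(line: string): ParseResult {
    let value: unknown;
    try {
      value = JSON.parse(line);
    } catch {
      return { ok: false, error: "invalid JSON" };
    }
    if (typeof value !== "object" || value === null || Array.isArray(value)) {
      return { ok: false, error: "expected a JSON object" };
    }
    const record = value as Record<string, unknown>;
    if (typeof record["type"] !== "string") {
      return { ok: false, error: "missing string field: type" };
    }
    return { ok: true, message: record as StreamMessage };
  }

  // write: message -> exactly one newline-terminated JSON line.
  write(message: StreamMessage): void {
    this.sink(`${JSON.stringify(message)}\n`);
  }
}

// Usage with an in-memory sink.
const written: string[] = [];
const io = new StreamJsonIO((text) => {
  written.push(text);
});
io.write({ type: "result" });
const parsed = io.parse('{"type":"user","content":"hi"}');
const rejected = io.parse("not json");
```

Returning a `ParseResult` instead of throwing keeps malformed input on the same code path as valid input, so the session can decide how to report it.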
      writer: StreamJsonWriter,
      controlContext: StreamJsonControlContext,
    ): Promise<boolean> {
      const subtype = envelope.request?.subtype;
I see that session does a lot of message routing, so I'll leave a question here in case I'm missing something: since we already have controller.ts for handling control_request, would it be clearer to implement the routing there?
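As an illustration of the question, routing by `subtype` could live in a controller that exposes a handler registry, leaving session to forward each control_request. This is a sketch under assumptions: `ControlRequestController`, the `request_id` and `success` fields, and the `interrupt` subtype are hypothetical and may not match the PR's controller.ts.

```typescript
// Hypothetical controller: subtype names and the response shape are
// illustrative only.
interface ControlRequest {
  request_id: string;
  request: { subtype: string; [key: string]: unknown };
}

interface ControlResponse {
  request_id: string;
  success: boolean;
  error?: string;
}

type ControlHandler = (request: ControlRequest) => void;

class ControlRequestController {
  private readonly handlers = new Map<string, ControlHandler>();

  register(subtype: string, handler: ControlHandler): void {
    this.handlers.set(subtype, handler);
  }

  // Session forwards every control_request here instead of branching on
  // subtype itself.
  handle(request: ControlRequest): ControlResponse {
    const handler = this.handlers.get(request.request.subtype);
    if (handler === undefined) {
      return {
        request_id: request.request_id,
        success: false,
        error: `unsupported subtype: ${request.request.subtype}`,
      };
    }
    handler(request);
    return { request_id: request.request_id, success: true };
  }
}

// Usage: register one handler, then dispatch a known and an unknown subtype.
const controller = new ControlRequestController();
let interrupted = false;
controller.register("interrupt", () => {
  interrupted = true;
});
const okResponse = controller.handle({ request_id: "r1", request: { subtype: "interrupt" } });
const unknownResponse = controller.handle({ request_id: "r2", request: { subtype: "unknown" } });
```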
Thanks for your contribution!
#838 is syncing upstream updates and is expected to be a fairly large change; we can merge this PR once that sync is complete.
This PR also contains quite a lot of content. We suggest keeping the core stream JSON functionality first and splitting the large PR into several iterations of more manageable size.
- Please clean up the docs; we can update the documentation once the functionality is stable.
- Examples can be submitted in separate PRs; these PRs will not be closed.
- The RFC docs can be archived in Discussions, if you prefer.
- The current PR still needs some refactoring.

google-gemini/gemini-cli#8016 Non-interactive mode, as a prerequisite for the SDK, still has some issues, and our implementation has diverged from upstream, so we still need to monitor and fix them.
          
 
 Sure, then I'll come back to refactor this PR after your upstream work is merged.  | 
    
          
 May I ask whether your upstream sync has been completed? May I continue working on this PR?  | 
    
          
 May I ask if it's possible to create a DingTalk group for efficient communication?  | 
    
| 
           @x22x22 welcome!  | 
    
