haixuanTao
Collaborator

This PR introduces a dora-openai-websocket node that makes it possible to connect an OpenAI client to dora in real time.

The server can spawn and connect to dataflows, which allows the user's client to customize the dataflow.

Getting started

See README.md at examples/openai-realtime/README.md

@haixuanTao haixuanTao force-pushed the make-qwen-llm-configurable branch from 3f0d534 to fa888e8 Compare July 30, 2025 10:26
Collaborator

@phil-opp phil-opp left a comment

Going through the CLI crate and opening a new connection to the coordinator seems a bit complicated, given that we already have a connection to a daemon through the node API. The daemon is already connected to the coordinator, so it could forward some start message to it. This way, we don't need to add the heavy CLI dependency and the code would work in distributed dataflows too.

So how about the following approach:

  • We add a new DaemonRequest::StartDataflow variant
    • First field is dataflow, i.e. the path to the dataflow YAML file
    • The second field is uv (optional)
    • Third field is name (optional)
    • No need for coordinator addr and port fields.
  • We add a new DoraNode::start_dataflow method that sends a DaemonRequest::StartDataflow request to the connected daemon
  • The daemon handles the StartDataflow request by reading the dataflow descriptor and session file and then sending a new CoordinatorRequest::StartDataflow request to the coordinator
  • The coordinator handles this CoordinatorRequest::StartDataflow request in a similar way as the ControlRequest::Start that is sent by the CLI
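The message flow proposed above can be sketched roughly as follows. This is only an illustration under assumptions: the enums below are stand-ins defined locally, not the real dora message types, the field types (`PathBuf`, `Option<bool>`, `Option<String>`) are guesses based on the field descriptions in the list, and `forward_start` and its `read_descriptor` parameter are hypothetical names. Reading the session file is left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical node -> daemon request with the fields proposed in the review.
#[derive(Debug, Clone, PartialEq)]
pub enum DaemonRequest {
    StartDataflow {
        /// Path to the dataflow YAML file.
        dataflow: PathBuf,
        /// Optional `uv` setting (type is an assumption).
        uv: Option<bool>,
        /// Optional dataflow name.
        name: Option<String>,
    },
}

/// Hypothetical daemon -> coordinator request. Note that it carries no
/// coordinator address or port: the daemon is already connected.
#[derive(Debug, Clone, PartialEq)]
pub enum CoordinatorRequest {
    StartDataflow {
        /// Contents of the dataflow descriptor, read by the daemon.
        descriptor: String,
        uv: bool,
        name: Option<String>,
    },
}

/// Daemon-side step: read the descriptor and build the request that would be
/// forwarded to the coordinator. `read_descriptor` is injected so the sketch
/// does not depend on the file system.
pub fn forward_start(
    request: DaemonRequest,
    read_descriptor: impl Fn(&Path) -> std::io::Result<String>,
) -> std::io::Result<CoordinatorRequest> {
    match request {
        DaemonRequest::StartDataflow { dataflow, uv, name } => {
            let descriptor = read_descriptor(&dataflow)?;
            Ok(CoordinatorRequest::StartDataflow {
                descriptor,
                // Treat a missing `uv` value as "disabled" (an assumption).
                uv: uv.unwrap_or(false),
                name,
            })
        }
    }
}
```

In a real implementation, the closure would probably be replaced by something like `std::fs::read_to_string`, and the resulting request would be sent over the daemon's existing coordinator connection.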

Comment on lines 290 to 300
dora_cli::command::Command::Start(Start {
    dataflow,
    name: Some(node_id.to_string()),
    coordinator_addr: IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)),
    coordinator_port: DORA_COORDINATOR_PORT_CONTROL_DEFAULT,
    attach: false,
    detach: true,
    hot_reload: false,
    uv: true,
})
.execute()
Collaborator

This is a synchronous call, which might block the async task. This is not recommended since it blocks a full thread of the tokio runtime.

Collaborator Author

I agree, although this is more of a proof of concept. I think it's fine to leave this sync call for now and rework it later.
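For the later rework, the general idea is to move the blocking call off the runtime's worker threads. In tokio, `tokio::task::spawn_blocking` is the usual tool for this. The standard-library-only sketch below shows the same idea with a dedicated thread and a channel; `run_off_thread` is a hypothetical helper name, and the closure stands in for the synchronous `.execute()` call.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs a blocking closure on a dedicated thread and returns a receiver
/// that yields its result once the closure has finished.
pub fn run_off_thread<T, F>(blocking: F) -> mpsc::Receiver<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone; in that case the result is dropped.
        let _ = tx.send(blocking());
    });
    rx
}
```

Calling `recv()` on the returned receiver still blocks the caller, so inside an async task one would rather await the `JoinHandle` returned by `spawn_blocking` (or use an async-aware channel) than wait on this receiver directly.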

Comment on lines 293 to 294
coordinator_addr: IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)),
coordinator_port: DORA_COORDINATOR_PORT_CONTROL_DEFAULT,
Collaborator

So this only works when the node is running on the same machine as the coordinator. This seems quite limiting.

Collaborator Author

Agreed, I will make this configurable.
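One possible shape for that configurability, as a rough sketch rather than what the PR ends up doing: accept optional address and port overrides and fall back to localhost and a caller-supplied default port. The function name `coordinator_endpoint` and the idea of feeding it from environment variables are assumptions for illustration only.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Resolves the coordinator endpoint from optional string overrides.
/// Falls back to 127.0.0.1 and `default_port` when an override is absent,
/// and returns an error message when an override cannot be parsed.
pub fn coordinator_endpoint(
    addr: Option<&str>,
    port: Option<&str>,
    default_port: u16,
) -> Result<(IpAddr, u16), String> {
    let addr = match addr {
        Some(raw) => raw
            .parse::<IpAddr>()
            .map_err(|e| format!("invalid coordinator address {:?}: {}", raw, e))?,
        None => IpAddr::V4(Ipv4Addr::LOCALHOST),
    };
    let port = match port {
        Some(raw) => raw
            .parse::<u16>()
            .map_err(|e| format!("invalid coordinator port {:?}: {}", raw, e))?,
        None => default_port,
    };
    Ok((addr, port))
}
```

The overrides could come from anywhere, for example `std::env::var(...).ok().as_deref()` with variable names chosen by the node, with `DORA_COORDINATOR_PORT_CONTROL_DEFAULT` passed as `default_port`.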

@phil-opp
Collaborator

> Going through the CLI crate and opening a new connection to the coordinator seems a bit complicated, given that we already have a connection to a daemon through the node API. The daemon is already connected to the coordinator, so it could forward some start message to it.

The drawback of this approach is that this adds more functionality to the dora-node-api. So this only makes sense if we think that a functionality like this is common enough and also useful for other nodes.

> This way, we don't need to add the heavy CLI dependency

An alternative way to make the CLI dependency less heavy could be to make the dora daemon/dora coordinator/dora runtime commands optional features. Nodes like this one don't require it, so this could make the dependency more lightweight.

@haixuanTao
Collaborator Author

> Going through the CLI crate and opening a new connection to the coordinator seems a bit complicated, given that we already have a connection to a daemon through the node API. The daemon is already connected to the coordinator, so it could forward some start message to it. This way, we don't need to add the heavy CLI dependency and the code would work in distributed dataflows too.
>
> So how about the following approach:
>
>   • We add a new DaemonRequest::StartDataflow variant
>     • First field is dataflow, i.e. the path to the dataflow YAML file
>     • The second field is uv (optional)
>     • Third field is name (optional)
>     • No need for coordinator addr and port fields.
>   • We add a new DoraNode::start_dataflow method that sends a DaemonRequest::StartDataflow request to the connected daemon
>   • The daemon handles the StartDataflow request by reading the dataflow descriptor and session file and then sending a new CoordinatorRequest::StartDataflow request to the coordinator
>   • The coordinator handles this CoordinatorRequest::StartDataflow request in a similar way as the ControlRequest::Start that is sent by the CLI

So I understand where this is coming from, and I agree that it is annoying to have to link the whole CLI library.

I think the thing is that implementing what you're mentioning is a lot of work as we would need to reimplement the node - daemon connection which I don't want to do at this moment.

I would rather keep the easy fix of just making those functions public for now...

@phil-opp
Collaborator

I'm fine with making the functions public for now, but if this functionality is something that will also be needed by other nodes, I'd prefer a proper solution. The main limitation I'd like to avoid is that the coordinator currently has to run on the same machine as the node. Going through the daemon instead of the CLI would enable this functionality also in distributed settings.

> I think the thing is that implementing what you're mentioning is a lot of work as we would need to reimplement the node - daemon connection which I don't want to do at this moment.

I don't think that it's too much work. Let me try to draft a PR against your PR.

@haixuanTao haixuanTao force-pushed the make-qwen-llm-configurable branch 2 times, most recently from afd751b to 1a626ad Compare August 29, 2025 09:25
@phil-opp
Collaborator

phil-opp commented Sep 3, 2025

What's the status of this? How does this PR relate to #1122?

@phil-opp phil-opp added the waiting-for-author The pull request requires adjustments by the PR author. label Sep 3, 2025
@haixuanTao haixuanTao force-pushed the make-qwen-llm-configurable branch from 1df28d1 to 42a4c61 Compare September 10, 2025 19:17
@haixuanTao haixuanTao force-pushed the make-qwen-llm-configurable branch from 2c24307 to bc38f19 Compare September 17, 2025 07:29