feat(iroh)!: Configurable path selection #4232

Open
rklaehn wants to merge 10 commits into main from configurable-path-selection

Contributor

@rklaehn rklaehn commented May 5, 2026

Description

An attempt to make path selection more flexible for complex custom transport use cases.

Path selection now goes through a dyn-compatible trait, PathSelector. Its only method, select, takes a PathSelectionContext that exposes the current path as well as a way to iterate over a flat list of PathSelectionData structs.

PathSelectionData currently just has the address and the stats, but in the future it could be extended with more detailed information, such as the congestion controller metrics (noq_proto::ControllerMetrics). I think cc metrics would be quite helpful for making good path selection decisions.

select returns a PathSelection struct, which is currently a newtype over an Option. add is first-writer-wins: all subsequent calls ignore the addr and emit a warning. We can change this in the future to allow multiple addrs.

The entire mechanism is only pub under the unstable-custom-transports flag, so we can change it in the future. There are also some additive changes we could make without breakage: for example, we could allow PathSelection to also keep track of paths to be closed.
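
To make the shape of the API concrete, here is a minimal sketch of a selector written against simplified stand-ins. It is not the code from this PR: Addr, the rtt field, the PathSelectionContext::new constructor, the primary accessor and the LowestRtt selector are all assumptions made for illustration, and the exact signature of select (here &self plus a borrowed context) is a guess. Only the names PathSelector, PathSelectionContext, PathSelectionData, PathSelection, select, current, paths, addr and add come from the description and diff.

```rust
use std::fmt::Debug;
use std::time::Duration;

/// Stand-in for `transports::Addr`; the real type is not reproduced here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Addr(pub String);

/// Stand-in for one candidate path: its address plus a single stat (RTT).
#[derive(Debug, Clone, Copy)]
pub struct PathSelectionData<'a> {
    addr: &'a Addr,
    rtt: Duration,
}

impl<'a> PathSelectionData<'a> {
    pub fn addr(&self) -> &'a Addr {
        self.addr
    }
    /// Hypothetical accessor; the PR exposes full path stats instead.
    pub fn rtt(&self) -> Duration {
        self.rtt
    }
}

/// Stand-in for the context handed to `select`.
pub struct PathSelectionContext<'a> {
    current: Option<&'a Addr>,
    paths: &'a [(Addr, Duration)],
}

impl<'a> PathSelectionContext<'a> {
    /// Hypothetical constructor so the sketch can be exercised on its own.
    pub fn new(current: Option<&'a Addr>, paths: &'a [(Addr, Duration)]) -> Self {
        Self { current, paths }
    }
    pub fn current(&self) -> Option<&'a Addr> {
        self.current
    }
    pub fn paths(&self) -> impl Iterator<Item = PathSelectionData<'a>> + '_ {
        self.paths
            .iter()
            .map(|(addr, rtt)| PathSelectionData { addr, rtt: *rtt })
    }
}

/// Stand-in for the result type: a newtype over `Option`, first writer wins.
#[derive(Debug, Default)]
pub struct PathSelection(Option<Addr>);

impl PathSelection {
    pub fn add(&mut self, addr: Addr) {
        if self.0.is_none() {
            self.0 = Some(addr);
        } else {
            // The PR emits a warning log here; eprintln! stands in for it.
            eprintln!("ignoring additional path {addr:?}");
        }
    }
    /// Hypothetical accessor for the single selected path.
    pub fn primary(&self) -> Option<&Addr> {
        self.0.as_ref()
    }
}

pub trait PathSelector: Send + Sync + Debug + 'static {
    fn select(&self, ctx: &PathSelectionContext<'_>) -> PathSelection;
}

/// Hypothetical selector: always picks the candidate with the lowest RTT.
#[derive(Debug, Default)]
pub struct LowestRtt;

impl PathSelector for LowestRtt {
    fn select(&self, ctx: &PathSelectionContext<'_>) -> PathSelection {
        let mut selection = PathSelection::default();
        if let Some(best) = ctx.paths().min_by_key(|path| path.rtt()) {
            selection.add(best.addr().clone());
        }
        selection
    }
}
```

Because the trait is dyn-compatible, a selector like this can be stored as a Box<dyn PathSelector> or Arc<dyn PathSelector> and swapped without touching the caller.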

Breaking Changes

Removes some public methods on the endpoint builder that are too opinionated if you want to make path selection fully generic:

endpoint::Builder::transport_bias(kind, bias): removed
endpoint::transports::TransportBias: removed

It also changes parts of the API gated behind the unstable-custom-transports feature.

Notes & open questions

Note: the new signature tries to anticipate that in the future we might want to select multiple transports, but for now any additional selections are simply ignored.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.
    • List all breaking changes in the above "Breaking Changes" section.
    • Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are:

@n0bot n0bot Bot added this to iroh May 5, 2026
@github-project-automation github-project-automation Bot moved this to 🚑 Needs Triage in iroh May 5, 2026

github-actions Bot commented May 6, 2026

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/4232/docs/iroh/

Last updated: 2026-05-06T10:15:36Z


github-actions Bot commented May 6, 2026

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 11eceea

@rklaehn rklaehn requested a review from Frando May 6, 2026 09:18
Contributor Author

rklaehn commented May 6, 2026

@mcginty could you take a look at this? The current path selection is a bit limited, especially once you have custom transports, but also in more complex scenarios involving just IP and relay transports.

This is my attempt to make the selection API more flexible.

@rklaehn rklaehn changed the title feat(iroh)! Configurable path selection (WIP) feat(iroh)! Configurable path selection May 6, 2026
@rklaehn rklaehn force-pushed the configurable-path-selection branch from 0a79c3c to 7f35601 Compare May 6, 2026 09:49
@rklaehn rklaehn changed the title feat(iroh)! Configurable path selection feat(iroh)!: Configurable path selection May 6, 2026
// Apply our new path
if let Some((addr, rtt)) = selected_path {
// Apply the selector's primary path. Multi-path selection is not yet
// supported on the lifecycle side; only the primary is honoured.
Member

The comment sounds a bit weird with "lifecycle side". Let's just say we only support a single selected path atm.

/// has chosen. When iroh wants to make this configurable per non-relay transport, this
/// will move to a method on [`PathSelector`] with a default body that matches the rule
/// here.
fn path_status_for(addr: &transports::Addr) -> noq_proto::PathStatus {
Member

I still think this is the wrong logic, see #4233


#[cfg_attr(not(feature = "unstable-custom-transports"), allow(unreachable_pub))]
impl<'a> PathSelectionContext<'a> {
/// Module-private — only `select_path` in this file builds one, and the parameter
Member

Remove the comment, we never do comments like this

}

/// The path currently considered the preferred path to the remote endpoint, if any.
pub fn current(&self) -> Option<&'a transports::Addr> {
Member

Maybe call this previous? Not sure, current works too I guess.

///
/// The same address may appear more than once when it is a path on multiple
/// connections to the remote. Selectors that care should aggregate as appropriate.
pub fn paths(&self) -> impl Iterator<Item = PathSelectionData<'a>> + '_ {
Member

So this iterates over all connections, then over all paths, and yields a PathSelectionData for each. This is a bit weird: you can get the same path multiple times with different stats but no further details. Maybe we should just deduplicate and take the min RTT among all path stats for a transport addr?
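
A rough sketch of that deduplication, assuming each candidate can be reduced to an (address, RTT) pair. The function name min_rtt_per_addr and the generic address type A are made up for this example and are not part of the PR; the real data carries full path stats rather than a bare Duration.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::Duration;

/// Collapses repeated addresses, keeping the smallest RTT seen for each.
///
/// `A` stands in for the transport address type; any hashable key works.
pub fn min_rtt_per_addr<A, I>(paths: I) -> HashMap<A, Duration>
where
    A: Hash + Eq,
    I: IntoIterator<Item = (A, Duration)>,
{
    let mut best: HashMap<A, Duration> = HashMap::new();
    for (addr, rtt) in paths {
        best.entry(addr)
            // Address already seen on another connection: keep the lower RTT.
            .and_modify(|known| *known = (*known).min(rtt))
            // First time this address shows up.
            .or_insert(rtt);
    }
    best
}
```

With this, a selector would see each address once, at the cost of losing the per-connection breakdown.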

#[cfg_attr(not(feature = "unstable-custom-transports"), allow(unreachable_pub))]
impl<'a> PathSelectionData<'a> {
/// The address of the candidate path.
pub fn addr(&self) -> &'a transports::Addr {
Member

We usually expose this as TransportAddr but then we'd have to clone, so it's fine I think.

}
}

impl std::fmt::Debug for PathSelectionData<'_> {
Member

use derive_more::Debug

self.addr
}

/// QUIC path statistics: rtt, cwnd, loss, mtu, etc.
Member

Let's not have this list of abbreviations there, instead link to [PathStats]

/// in real networks is non-trivial.
#[cfg_attr(not(feature = "unstable-custom-transports"), allow(unreachable_pub))]
pub trait PathSelector: Send + Sync + std::fmt::Debug + 'static {
/// Picks a path among the candidates known for a remote endpoint.
Member

"Pick the selected path to carry application data among the currently open network paths to the remote endpoint"


/// The set of paths a [`PathSelector`] has chosen.
///
/// Today this holds at most one path. Build via [`PathSelection::default`] +
Member

"Currently only one path is supported. Additional paths will be ignored and emit a warn log" or such.

}

/// All paths in this selection (today: 0 or 1).
#[allow(dead_code)] // only reached via the public re-export behind the unstable feature
Member

Remove dead code, and use allow(unreachable_pub) or gate the dead_code with cfg_attr

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum TransportType {
/// A primary path: used whenever available.
Primary = 0,
Member

Why the manual discriminants?
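
For context, a small sketch of why they are likely unnecessary: derived PartialOrd and Ord on an enum order variants by discriminant, and the default discriminants follow declaration order. Only Primary appears in the diff; the Fallback variant, the best_key helper and the reading of the i128 as a lower-is-better score are assumptions for illustration.

```rust
/// Hypothetical two-variant version of the enum from the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TransportType {
    /// A primary path: used whenever available.
    Primary,
    /// Used only when no primary path exists (hypothetical variant).
    Fallback,
}

/// Picks the smallest key, comparing the transport class first and the
/// numeric score second (tuples compare lexicographically).
pub fn best_key(keys: &[(TransportType, i128)]) -> Option<(TransportType, i128)> {
    keys.iter().copied().min()
}
```

The ordering then depends only on the order in which the variants are written, so reordering variants is the one thing to watch out for.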

let mut best: Option<(&Addr, (TransportType, i128))> = None;
let mut current_key: Option<(TransportType, i128)> = None;

for psd in ctx.paths() {
Member

What is psd? Why not path?

not(feature = "unstable-custom-transports"),
allow(unreachable_pub, unused_imports)
)]
pub use self::remote_state::{
Member

Maybe do a pub use .. gated behind the feature flag and a use .. gated behind not(feature) instead, would be more explicit IMO
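
Roughly what that could look like, as a self-contained sketch. The remote_state module here contains a single placeholder type and default_selection_name is a made-up helper; the real re-export list is not reproduced.

```rust
mod remote_state {
    /// Placeholder standing in for the re-exported selector types.
    #[derive(Debug, Default)]
    pub struct PathSelection;
}

// Public re-export, compiled only when the feature is enabled.
#[cfg(feature = "unstable-custom-transports")]
pub use self::remote_state::PathSelection;

// Private import for every other build, so internal code keeps working.
#[cfg(not(feature = "unstable-custom-transports"))]
use self::remote_state::PathSelection;

/// Internal code can name the type the same way in both configurations.
pub fn default_selection_name() -> String {
    format!("{:?}", PathSelection::default())
}
```

This trades one cfg_attr with two allow lints for two short, explicit items, which makes it obvious what is public in each configuration.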

Comment thread iroh/src/socket.rs
hooks: Default::default(),
transport_bias: Default::default(),
path_selector: Arc::new(
crate::socket::biased_rtt_path_selector::BiasedRttPathSelector::default(),
Member

import instead of full path at call site

Member

@Frando Frando left a comment

I like it. This is cleaner and more extensible than the previous transport bias solution. Some comments inline, mostly nits.
