fix: nat lost in some p2p apps #2216
Conversation
What does "losing NAT" mean?
Not sure why, but I can't comment. This PR has several serious bugs that I can't attach comments to.
}
}
// Try to reuse an existing QUIC connection for this peer
if let Some(conn) = self.conn_map.get(&dst_peer_id) {
quinn::Connection is just a handle; it should simply be Clone'd directly.
Also, idle Connections still need to be cleaned up.
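(For context: quinn documents Connection as a cheaply cloneable handle onto shared connection state, so a cached value can simply be cloned per use. A minimal sketch of that pattern, with a hypothetical helper name:)

use quinn::{Connection, ConnectionError, RecvStream, SendStream};

// Hypothetical helper: open a fresh bidirectional stream on a cached
// connection. Cloning copies only the handle; both clones drive the
// same underlying QUIC connection.
async fn open_stream(cached: &Connection) -> Result<(SendStream, RecvStream), ConnectionError> {
    let conn = cached.clone();
    conn.open_bi().await
}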
Replaced it with moka::sync::Cache, which solves both problems:
Cache::builder()
    .max_capacity(4096)
    .time_to_idle(Duration::from_secs(600))
    .build(),
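A sketch of how the cached map might be declared (assumptions: moka's future Cache, since the retry code later in this thread calls try_get_with(...).await, and a placeholder PeerId type):

use std::time::Duration;
use moka::future::Cache;

type PeerId = u32; // placeholder; easytier's real peer-id type differs

// Hypothetical constructor for the per-peer connection map.
fn build_conn_map() -> Cache<PeerId, quinn::Connection> {
    Cache::builder()
        .max_capacity(4096)                     // bound the number of cached conns
        .time_to_idle(Duration::from_secs(600)) // evict conns idle for 10 minutes
        .build()
}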
Looking at it purely from the QUIC side, the existing connection should indeed be reused. I really wasn't familiar enough with QUIC when I first wrote this.
As for transport_config, the QUIC tunnel and the QUIC proxy probably need different parameters; that still needs testing.
Traffic inside the container is forwarded through gost (its relay protocol, which encapsulates traffic as TCP, similar to vless); easytier acts only as a network relay and handles only TCP requests.
.context("quic write_chunk failed")?;

// Store the connection for future reuse
self.conn_map.insert(dst_peer_id, connection);
With concurrent connects, every caller reaches this point, so later ones evict the earlier ones.
Changed as requested; conn_locks limits concurrent creation of the conn on first use.
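(For reference, moka's get_with family already provides this per-key dedup: concurrent callers for the same key wait on a single init future rather than racing to insert. A standalone sketch of that behavior, not easytier code:)

use moka::future::Cache;

#[tokio::main]
async fn main() {
    let cache: Cache<u32, String> = Cache::new(16);
    // Both calls target key 1; moka runs only one init future and the
    // other caller waits for it, so neither result evicts the other.
    let (a, b) = tokio::join!(
        cache.get_with(1, async { "first".to_string() }),
        cache.get_with(1, async { "second".to_string() }),
    );
    assert_eq!(a, b);
}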
| "quic connect: reused write_header failed peer={:?}, creating new", | ||
| dst_peer_id | ||
| ); | ||
| self.conn_map.remove(&dst_peer_id); |
Force-pushed from 8dee0fc to b16fec3.
Force-pushed from 73c1356 to a8ab9ab.
let connection = match get_or_create_conn(dst_peer_id).await {
    Ok(conn) => conn,
    Err(e) => {
        if attempt == 0 {
            debug!(
                "quic connect attempt 0 failed={}, retrying after delay...",
                e
            );
            tokio::time::sleep(Duration::from_millis(300)).await;
            continue;
        }
        return Err(anyhow!("quic connect failed after retry: {}", e).into());
    }
};
The point of fast fallback is to attempt several connections at the same time and take whichever succeeds, so if you want it, it has to go inside get_with. But I don't think QUIC really needs it; one attempt is enough.
Reworked it; there is no loop anymore. It retries manually once, specifically for the case where conn.open_bi() fails.
My mistake, sorry. I thought you wanted to keep happy eyeballs. The previous logic was fine as it was; it just needed to be more concise:
for attempt in 0..2 {
    let endpoint = self.endpoint.clone();
    // try_get_with runs the init future at most once per key; concurrent
    // callers wait for the first result instead of racing to insert.
    let connection = self
        .conn_map
        .try_get_with(dst_peer_id, async move {
            anyhow::Ok(endpoint.connect(addr, "")?.await?)
        })
        .await
        .map_err(|e| anyhow!("failed to get or create quic connection: {}", e))?;
    let stream = async {
        let mut stream: QuicStream = connection.open_bi().await?.into();
        stream.writer_mut().write_chunk(header.clone()).await?;
        anyhow::Ok(stream)
    }
    .await;
    match stream {
        Ok(stream) => return Ok(stream),
        Err(error) => {
            debug!(?dst_peer_id, attempt, ?error, "quic connect: stream setup failed");
        }
    }
    if attempt == 0 {
        // The cached connection is likely dead; evict it, wait, retry once.
        self.conn_map.invalidate(&dst_peer_id).await;
        tokio::time::sleep(Duration::from_millis(300)).await;
    }
}
Err(anyhow!("quic connect failed after retry").into())

.max_capacity(4096)
.time_to_idle(Duration::from_secs(600))
The capacity (resp. TTI) should roughly match the connection-count limit (resp. connection idle timeout) specified in quinn's transport_config; it would be good to add a comment explaining that.
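A sketch of what that alignment might look like (IdleTimeout::try_from(Duration) is real quinn API; the values are illustrative, not easytier's actual configuration):

use std::time::Duration;
use quinn::{IdleTimeout, TransportConfig};

// Keep quinn's per-connection idle timeout roughly in line with the
// cache's time_to_idle (600 s above), so the cache does not hand out
// connections that quinn has already timed out.
fn make_transport_config() -> TransportConfig {
    let mut cfg = TransportConfig::default();
    cfg.max_idle_timeout(Some(
        IdleTimeout::try_from(Duration::from_secs(600)).expect("fits in VarInt"),
    ));
    cfg
}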
Also, does the problem occur when only enable_kcp_proxy is enabled and enable_quic_proxy is not?
I'm no longer sure whether KCP has the problem; it needs more observation.
…ix nat lost problem
reuse conn by dst_peer_id, every peer use only 1 quic conn, to fix nat lost problem
I ran into the NAT-loss problem. The scenario is a transparent proxy: all data is sent through a tun device (built into gost) to the remote end (gost's own relay protocol, based on TCP).
Running a p2p application (erigon) then hits the "0 caplin peers" problem.
After debugging, using a single QUIC conn to handle all connections (via open_bi) solves it; that is roughly what this PR changes.
If the scenario's connection count is very high, you can locally change max_concurrent_bidi_streams in easytier/src/tunnel/quic.rs to 2000 (the default is 256); see the sketch below.
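A sketch of that local tweak (the knob is quinn's TransportConfig::max_concurrent_bidi_streams; the 256 default and the file location are taken from the PR text above):

use quinn::{TransportConfig, VarInt};

// Raise the per-connection cap on concurrent bidirectional streams
// from the 256 mentioned above to 2000.
fn bump_stream_limit(cfg: &mut TransportConfig) {
    cfg.max_concurrent_bidi_streams(VarInt::from_u32(2000));
}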