diff --git a/.github/workflows/integration_tests.yml b/.github/workflows/integration_tests.yml index a7403a071..d45bc691b 100644 --- a/.github/workflows/integration_tests.yml +++ b/.github/workflows/integration_tests.yml @@ -49,7 +49,7 @@ jobs: run: sudo apt-get install -y --no-install-recommends build-essential patchelf pkg-config net-tools - name: Install libkrunfw - run: curl -L -o /tmp/libkrunfw-4.9.0-x86_64.tgz https://github.com/containers/libkrunfw/releases/download/v4.9.0/libkrunfw-4.9.0-x86_64.tgz && mkdir tmp && tar xf /tmp/libkrunfw-4.9.0-x86_64.tgz -C tmp && sudo mv tmp/lib64/* /lib/x86_64-linux-gnu + run: curl -L -o /tmp/libkrunfw-5.0.0-x86_64.tgz https://github.com/containers/libkrunfw/releases/download/v5.0.0/libkrunfw-5.0.0-x86_64.tgz && mkdir tmp && tar xf /tmp/libkrunfw-5.0.0-x86_64.tgz -C tmp && sudo mv tmp/lib64/* /lib/x86_64-linux-gnu - name: Integration tests run: RUST_LOG=trace KRUN_ENOMEM_WORKAROUND=1 KRUN_NO_UNSHARE=1 make test diff --git a/README.md b/README.md index f9689b337..56532b63e 100644 --- a/README.md +++ b/README.md @@ -58,11 +58,48 @@ Each variant generates a dynamic library with a different name (and ```soname``` ## Networking -In ```libkrun```, networking is provided by two different, mutually exclusive techniques: +In ```libkrun```, networking is provided by two different, mutually exclusive techniques: **virtio-vsock + TSI** and **virtio-net + passt/gvproxy**. -- **virtio-vsock + TSI**: A novel technique called **Transparent Socket Impersonation** which allows the VM to have network connectivity without a virtual interface. This technique supports both outgoing and incoming connections. It's possible for userspace applications running in the VM to transparently connect to endpoints outside the VM and receive connections from the outside to ports listening inside the VM. Requires a custom kernel (like the one bundled in **libkrunfw**) and it's limited to AF_INET SOCK_DGRAM and SOCK_STREAM sockets. 
+### virtio-vsock + TSI -- **virtio-net + passt/gvproxy**: A conventional virtual interface that allows the guest to communicate with the outside through the VMM using a supporting application like [passt](https://passt.top/passt/about/) or [gvproxy](https://github.com/containers/gvisor-tap-vsock). +This is a novel technique called **Transparent Socket Impersonation** which allows the VM to have network connectivity without a virtual interface. This technique supports both outgoing and incoming connections. It's possible for userspace applications running in the VM to transparently connect to endpoints outside the VM and receive connections from the outside to ports listening inside the VM. + +#### Enabling TSI + +TSI for AF_INET and AF_INET6 is automatically enabled when no network interface is added to the VM. TSI for AF_UNIX is enabled when, in addition to the previous condition, `krun_set_root` has been used to set `/` as the root filesystem. + +#### Known limitations + +- Requires a custom kernel (like the one bundled in **libkrunfw**). +- It's limited to SOCK_DGRAM and SOCK_STREAM sockets and AF_INET, AF_INET6 and AF_UNIX address families (for instance, raw sockets aren't supported). +- Listening on SOCK_DGRAM sockets from the guest is not supported. +- When TSI is enabled for AF_UNIX sockets, only absolute paths are supported as addresses. + +### **virtio-net + passt/gvproxy** + +A conventional virtual interface that allows the guest to communicate with the outside through the VMM using a supporting application like [passt](https://passt.top/passt/about/) or [gvproxy](https://github.com/containers/gvisor-tap-vsock). + +#### Enabling virtio-net + +Use `krun_add_net_unixstream` and/or `krun_add_net_unixdgram` to add a virtio-net interface connected to the userspace network proxy. + +## Security model + +The libkrun security model is primarily defined by the consideration that both the guest and the VMM pertain to the same security context. 
For many operations, the VMM acts as a proxy for the guest within the host. Host resources that are accessible to the VMM can potentially be accessed by the guest through it. + +While defining the security implementation of your environment, you should think about the guest and the VMM as a single entity. To prevent the guest from accessing the host's resources, you need to use the host's OS security features to run the VMM inside an isolated context. On Linux, the primary mechanism to be used for this purpose is namespaces. Single-user systems may have a more relaxed security policy and just ensure the VMM runs with a particular UID/GID. + +While most virtio devices allow the guest to access resources from the host, two of them require special consideration when used: virtio-fs and virtio-vsock+TSI. + +### virtio-fs + +When exposing a directory in a filesystem from the host to the guest through virtio-fs devices configured with `krun_set_root` and/or `krun_add_virtiofs`, libkrun **does not** provide any protection against the guest attempting to access other directories in the same filesystem, or even other filesystems in the host. + +A mount point isolation mechanism from the host should be used in combination with virtio-fs. + +### virtio-vsock + TSI + +When TSI is enabled, the VMM acts as a proxy for AF_INET, AF_INET6 and AF_UNIX sockets, for both incoming and outgoing connections. For all practical purposes, the VMM and the guest should be considered to be running in the same network context. As such, you should apply to the VMM whatever restrictions you want to apply to the guest. 
## Building and installing diff --git a/src/devices/src/virtio/vsock/device.rs b/src/devices/src/virtio/vsock/device.rs index 7018423a8..35b80b1ae 100644 --- a/src/devices/src/virtio/vsock/device.rs +++ b/src/devices/src/virtio/vsock/device.rs @@ -52,6 +52,8 @@ impl Vsock { host_port_map: Option>, queues: Vec, unix_ipc_port_map: Option>, + enable_tsi: bool, + enable_tsi_unix: bool, ) -> super::Result { let mut queue_events = Vec::new(); for _ in 0..queues.len() { @@ -64,7 +66,13 @@ impl Vsock { Ok(Vsock { cid, - muxer: VsockMuxer::new(cid, host_port_map, unix_ipc_port_map), + muxer: VsockMuxer::new( + cid, + host_port_map, + unix_ipc_port_map, + enable_tsi, + enable_tsi_unix, + ), queue_rx, queue_tx, queues, @@ -82,12 +90,21 @@ impl Vsock { cid: u64, host_port_map: Option>, unix_ipc_port_map: Option>, + enable_tsi: bool, + enable_tsi_unix: bool, ) -> super::Result { let queues: Vec = defs::QUEUE_SIZES .iter() .map(|&max_size| VirtQueue::new(max_size)) .collect(); - Self::with_queues(cid, host_port_map, queues, unix_ipc_port_map) + Self::with_queues( + cid, + host_port_map, + queues, + unix_ipc_port_map, + enable_tsi, + enable_tsi_unix, + ) } pub fn id(&self) -> &str { @@ -102,7 +119,7 @@ impl Vsock { /// have pending. Return `true` if descriptors have been added to the used ring, and `false` /// otherwise. pub fn process_stream_rx(&mut self) -> bool { - debug!("vsock: process_stream_rx()"); + debug!("process_stream_rx()"); let mem = match self.device_state { DeviceState::Activated(ref mem, _) => mem, // This should never happen, it's been already validated in the event handler. 
@@ -111,10 +128,10 @@ impl Vsock { let mut have_used = false; - debug!("vsock: process_rx before while"); + debug!("process_rx before while"); let mut queue_rx = self.queue_rx.lock().unwrap(); while let Some(head) = queue_rx.pop(mem) { - debug!("vsock: process_rx inside while"); + debug!("process_rx inside while"); let used_len = match VsockPacket::from_rx_virtq_head(&head) { Ok(mut pkt) => { if self.muxer.recv_pkt(&mut pkt).is_ok() { @@ -127,12 +144,12 @@ impl Vsock { } } Err(e) => { - warn!("vsock: RX queue error: {e:?}"); + warn!("RX queue error: {e:?}"); 0 } }; - debug!("vsock: process_rx: something to queue"); + debug!("process_rx: something to queue"); have_used = true; if let Err(e) = queue_rx.add_used(mem, head.index, used_len) { error!("failed to add used elements to the queue: {e:?}"); @@ -145,7 +162,7 @@ impl Vsock { /// Walk the driver-provided TX queue buffers, package them up as vsock packets, and process /// them. Return `true` if descriptors have been added to the used ring, and `false` otherwise. pub fn process_stream_tx(&mut self) -> bool { - debug!("vsock::process_stream_tx()"); + debug!("process_stream_tx()"); let mem = match self.device_state { DeviceState::Activated(ref mem, _) => mem, // This should never happen, it's been already validated in the event handler. 
@@ -159,7 +176,7 @@ impl Vsock { let pkt = match VsockPacket::from_tx_virtq_head(&head) { Ok(pkt) => pkt, Err(e) => { - error!("vsock: error reading TX packet: {e:?}"); + error!("error reading TX packet: {e:?}"); have_used = true; if let Err(e) = queue_tx.add_used(mem, head.index, 0) { error!("failed to add used elements to the queue: {e:?}"); @@ -169,13 +186,13 @@ impl Vsock { }; if pkt.type_() == uapi::VSOCK_TYPE_DGRAM { - debug!("vsock::process_stream_tx() is DGRAM"); + debug!("process_stream_tx() is DGRAM"); if self.muxer.send_dgram_pkt(&pkt).is_err() { queue_tx.undo_pop(); break; } } else { - debug!("vsock::process_stream_tx() is STREAM"); + debug!("process_stream_tx() is STREAM"); if self.muxer.send_stream_pkt(&pkt).is_err() { queue_tx.undo_pop(); break; @@ -235,7 +252,7 @@ impl VirtioDevice for Vsock { byte_order::write_le_u32(data, ((self.cid() >> 32) & 0xffff_ffff) as u32) } _ => warn!( - "vsock: virtio-vsock received invalid read request of {} bytes at offset {}", + "virtio-vsock received invalid read request of {} bytes at offset {}", data.len(), offset ), @@ -244,7 +261,7 @@ impl VirtioDevice for Vsock { fn write_config(&mut self, offset: u64, data: &[u8]) { warn!( - "vsock: guest driver attempted to write device config (offset={:x}, len={:x})", + "guest driver attempted to write device config (offset={:x}, len={:x})", offset, data.len() ); diff --git a/src/devices/src/virtio/vsock/event_handler.rs b/src/devices/src/virtio/vsock/event_handler.rs index f6b8f5474..b7fb15799 100644 --- a/src/devices/src/virtio/vsock/event_handler.rs +++ b/src/devices/src/virtio/vsock/event_handler.rs @@ -15,11 +15,11 @@ use crate::virtio::VirtioDevice; impl Vsock { pub(crate) fn handle_rxq_event(&mut self, event: &EpollEvent) -> bool { - debug!("vsock: RX queue event"); + debug!("RX queue event"); let event_set = event.event_set(); if event_set != EventSet::IN { - warn!("vsock: rxq unexpected event {event_set:?}"); + warn!("rxq unexpected event {event_set:?}"); return 
false; } @@ -33,11 +33,11 @@ impl Vsock { } pub(crate) fn handle_txq_event(&mut self, event: &EpollEvent) -> bool { - debug!("vsock: TX queue event"); + debug!("TX queue event"); let event_set = event.event_set(); if event_set != EventSet::IN { - warn!("vsock: txq unexpected event {event_set:?}"); + warn!("txq unexpected event {event_set:?}"); return false; } @@ -57,11 +57,11 @@ impl Vsock { } fn handle_evq_event(&mut self, event: &EpollEvent) -> bool { - debug!("vsock: event queue event"); + debug!("event queue event"); let event_set = event.event_set(); if event_set != EventSet::IN { - warn!("vsock: evq unexpected event {event_set:?}"); + warn!("evq unexpected event {event_set:?}"); return false; } @@ -72,7 +72,7 @@ impl Vsock { } fn handle_activate_event(&self, event_manager: &mut EventManager) { - debug!("vsock: activate event"); + debug!("activate event"); if let Err(e) = self.activate_evt.read() { error!("Failed to consume vsock activate event: {e:?}"); } @@ -147,7 +147,7 @@ impl Subscriber for Vsock { self.device_state.signal_used_queue(); } } else { - warn!("Vsock: The device is not yet activated. Spurious event received: {source:?}"); + warn!("The device is not yet activated. Spurious event received: {source:?}"); } } diff --git a/src/devices/src/virtio/vsock/mod.rs b/src/devices/src/virtio/vsock/mod.rs index 49917c5bf..b2d3d2648 100644 --- a/src/devices/src/virtio/vsock/mod.rs +++ b/src/devices/src/virtio/vsock/mod.rs @@ -14,10 +14,10 @@ mod muxer_thread; mod packet; mod proxy; mod reaper; -mod tcp; #[cfg(target_os = "macos")] mod timesync; -mod udp; +mod tsi_dgram; +mod tsi_stream; mod unix; pub use self::defs::uapi::VIRTIO_ID_VSOCK as TYPE_VSOCK; @@ -59,6 +59,11 @@ mod defs { pub const TSI_ACCEPT: u32 = 1030; pub const TSI_PROXY_RELEASE: u32 = 1031; + // Linux definitions that we need for cross-platform compatibility. 
+ pub const LINUX_AF_UNIX: u16 = 1; + pub const LINUX_AF_INET: u16 = 2; + pub const LINUX_AF_INET6: u16 = 10; + pub mod uapi { /// Virtio feature flags. diff --git a/src/devices/src/virtio/vsock/muxer.rs b/src/devices/src/virtio/vsock/muxer.rs index c63fc12c7..0f91882dc 100644 --- a/src/devices/src/virtio/vsock/muxer.rs +++ b/src/devices/src/virtio/vsock/muxer.rs @@ -11,10 +11,10 @@ use super::muxer_thread::MuxerThread; use super::packet::{TsiConnectReq, TsiGetnameRsp, VsockPacket}; use super::proxy::{Proxy, ProxyRemoval, ProxyUpdate}; use super::reaper::ReaperThread; -use super::tcp::TcpProxy; #[cfg(target_os = "macos")] use super::timesync::TimesyncThread; -use super::udp::UdpProxy; +use super::tsi_dgram::TsiDgramProxy; +use super::tsi_stream::TsiStreamProxy; use super::unix::UnixProxy; use super::VsockError; use crossbeam_channel::{unbounded, Sender}; @@ -22,7 +22,7 @@ use utils::epoll::{ControlOperation, Epoll, EpollEvent, EventSet}; use vm_memory::GuestMemoryMmap; use crate::virtio::InterruptTransport; -use std::net::Ipv4Addr; +use std::net::{Ipv4Addr, SocketAddrV4}; pub type ProxyMap = Arc>>>>; @@ -106,6 +106,8 @@ pub struct VsockMuxer { proxy_map: ProxyMap, reaper_sender: Option>, unix_ipc_port_map: Option>, + enable_tsi: bool, + enable_tsi_unix: bool, } impl VsockMuxer { @@ -113,6 +115,8 @@ impl VsockMuxer { cid: u64, host_port_map: Option>, unix_ipc_port_map: Option>, + enable_tsi: bool, + enable_tsi_unix: bool, ) -> Self { VsockMuxer { cid, @@ -125,6 +129,8 @@ impl VsockMuxer { proxy_map: Arc::new(RwLock::new(HashMap::new())), reaper_sender: None, unix_ipc_port_map, + enable_tsi, + enable_tsi_unix, } } @@ -170,7 +176,7 @@ impl VsockMuxer { } pub(crate) fn recv_pkt(&mut self, pkt: &mut VsockPacket) -> super::Result<()> { - debug!("vsock: recv_stream_pkt"); + debug!("recv_stream_pkt"); if self.rxq.lock().unwrap().is_empty() { return Err(VsockError::NoData); } @@ -182,6 +188,38 @@ impl VsockMuxer { Ok(()) } + fn push_packet(&self, rx: MuxerRx) { + let mem = 
match self.mem.as_ref() { + Some(m) => m, + None => { + error!("proxy creation without mem"); + return; + } + }; + let queue_mutex = match self.queue.as_ref() { + Some(q) => q, + None => { + error!("stream proxy creation without stream queue"); + return; + } + }; + + let mut queue = queue_mutex.lock().unwrap(); + if let Some(head) = queue.pop(mem) { + if let Ok(mut pkt) = VsockPacket::from_rx_virtq_head(&head) { + rx_to_pkt(self.cid, rx, &mut pkt); + if let Err(e) = queue.add_used(mem, head.index, pkt.hdr().len() as u32 + pkt.len()) + { + error!("failed to add used elements to the queue: {e:?}"); + } + } + } else { + error!("couldn't push pkt to queue, adding it to rxq"); + drop(queue); + self.rxq.lock().unwrap().push(rx); + } + } + pub fn update_polling(&self, id: u64, fd: RawFd, evset: EventSet) { debug!("update_polling id={id} fd={fd:?} evset={evset:?}"); let _ = self @@ -223,10 +261,10 @@ impl VsockMuxer { } fn process_proxy_create(&self, pkt: &VsockPacket) { - debug!("vsock: proxy create request"); + debug!("proxy create request"); if let Some(req) = pkt.read_proxy_create() { debug!( - "vsock: proxy create request: peer_port={}, type={}", + "proxy create request: peer_port={}, type={}", req.peer_port, req._type ); let mem = match self.mem.as_ref() { @@ -245,11 +283,16 @@ impl VsockMuxer { }; match req._type { defs::SOCK_STREAM => { - debug!("vsock: proxy create stream"); + debug!("proxy create stream"); let id = ((req.peer_port as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - match TcpProxy::new( + if req.family as i32 == libc::AF_UNIX && !self.enable_tsi_unix { + warn!("rejecting tcp unix proxy because tsi_unix is disabled"); + return; + } + match TsiStreamProxy::new( id, self.cid, + req.family, defs::TSI_PROXY_PORT, req.peer_port, pkt.src_port(), @@ -267,11 +310,16 @@ impl VsockMuxer { } } defs::SOCK_DGRAM => { - debug!("vsock: proxy create dgram"); + debug!("proxy create dgram"); let id = ((req.peer_port as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - 
match UdpProxy::new( + if req.family as i32 == libc::AF_UNIX && !self.enable_tsi_unix { + warn!("rejecting udp unix proxy because tsi_unix is disabled"); + return; + } + match TsiDgramProxy::new( id, self.cid, + req.family, req.peer_port, mem.clone(), queue.clone(), @@ -286,49 +334,58 @@ impl VsockMuxer { Err(e) => debug!("error creating udp proxy: {e}"), } } - _ => debug!("vsock: unknown type on connection request"), + _ => debug!("unknown type on connection request"), }; } } fn process_connect(&self, pkt: &VsockPacket) { - debug!("vsock: proxy connect request"); + debug!("proxy connect request"); if let Some(req) = pkt.read_connect_req() { let id = ((req.peer_port as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - debug!("vsock: proxy connect request: id={id}"); - let update = self - .proxy_map - .read() - .unwrap() - .get(&id) - .map(|proxy| proxy.lock().unwrap().connect(pkt, req)); - - if let Some(update) = update { - self.process_proxy_update(id, update); + debug!("proxy connect request: id={id}"); + match self.proxy_map.read().unwrap().get(&id) { + Some(proxy) => { + self.process_proxy_update(id, proxy.lock().unwrap().connect(pkt, req)); + } + None => self.push_packet(MuxerRx::ConnResponse { + local_port: pkt.dst_port(), + peer_port: pkt.src_port(), + result: -libc::ECONNREFUSED, + }), } } } fn process_getname(&self, pkt: &VsockPacket) { - debug!("vsock: new getname request"); + debug!("new getname request"); if let Some(req) = pkt.read_getname_req() { let id = ((req.peer_port as u64) << 32) | (req.local_port as u64); debug!( - "vsock: new getname request: id={}, peer_port={}, local_port={}", + "new getname request: id={}, peer_port={}, local_port={}", id, req.peer_port, req.local_port ); - if let Some(proxy) = self.proxy_map.read().unwrap().get(&id) { - proxy.lock().unwrap().getpeername(pkt); + match self.proxy_map.read().unwrap().get(&id) { + Some(proxy) => proxy.lock().unwrap().getpeername(pkt), + None => self.push_packet(MuxerRx::GetnameResponse { + 
local_port: pkt.dst_port(), + peer_port: pkt.src_port(), + data: TsiGetnameRsp { + result: -libc::EINVAL, + addr_len: 0, + addr: SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 0).into(), + }, + }), } } } fn process_sendto_addr(&self, pkt: &VsockPacket) { - debug!("vsock: new DGRAM sendto addr: src={}", pkt.src_port()); + debug!("new DGRAM sendto addr: src={}", pkt.src_port()); if let Some(req) = pkt.read_sendto_addr() { let id = ((req.peer_port as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - debug!("vsock: new DGRAM sendto addr: id={id}"); + debug!("new DGRAM sendto addr: id={id}"); let update = self .proxy_map .read() @@ -344,54 +401,53 @@ impl VsockMuxer { fn process_sendto_data(&self, pkt: &VsockPacket) { let id = ((pkt.src_port() as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - debug!("vsock: DGRAM sendto data: id={} src={}", id, pkt.src_port()); + debug!("DGRAM sendto data: id={} src={}", id, pkt.src_port()); if let Some(proxy) = self.proxy_map.read().unwrap().get(&id) { proxy.lock().unwrap().sendto_data(pkt); } } fn process_listen_request(&self, pkt: &VsockPacket) { - debug!("vsock: DGRAM listen request: src={}", pkt.src_port()); + debug!("DGRAM listen request: src={}", pkt.src_port()); if let Some(req) = pkt.read_listen_req() { let id = ((req.peer_port as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - debug!("vsock: DGRAM listen request: id={id}"); - let update = self - .proxy_map - .read() - .unwrap() - .get(&id) - .map(|proxy| proxy.lock().unwrap().listen(pkt, req, &self.host_port_map)); - - if let Some(update) = update { - self.process_proxy_update(id, update); - } + debug!("DGRAM listen request: id={id}"); + match self.proxy_map.read().unwrap().get(&id) { + Some(proxy) => self.process_proxy_update( + id, + proxy.lock().unwrap().listen(pkt, req, &self.host_port_map), + ), + None => self.push_packet(MuxerRx::ListenResponse { + local_port: pkt.dst_port(), + peer_port: pkt.src_port(), + result: -libc::EPERM, + }), + }; } } fn process_accept_request(&self, 
pkt: &VsockPacket) { - debug!("vsock: DGRAM accept request: src={}", pkt.src_port()); + debug!("DGRAM accept request: src={}", pkt.src_port()); if let Some(req) = pkt.read_accept_req() { let id = ((req.peer_port as u64) << 32) | (defs::TSI_PROXY_PORT as u64); - debug!("vsock: DGRAM accept request: id={id}"); - let update = self - .proxy_map - .read() - .unwrap() - .get(&id) - .map(|proxy| proxy.lock().unwrap().accept(req)); - - if let Some(update) = update { - self.process_proxy_update(id, update); + debug!("DGRAM accept request: id={id}"); + match self.proxy_map.read().unwrap().get(&id) { + Some(proxy) => self.process_proxy_update(id, proxy.lock().unwrap().accept(req)), + None => self.push_packet(MuxerRx::AcceptResponse { + local_port: pkt.dst_port(), + peer_port: pkt.src_port(), + result: -libc::EINVAL, + }), } } } fn process_proxy_release(&self, pkt: &VsockPacket) { - debug!("vsock: DGRAM release request: src={}", pkt.src_port()); + debug!("DGRAM release request: src={}", pkt.src_port()); if let Some(req) = pkt.read_release_req() { let id = ((req.peer_port as u64) << 32) | (req.local_port as u64); debug!( - "vsock: DGRAM release request: id={} local_port={} peer_port={}", + "DGRAM release request: id={} local_port={} peer_port={}", id, req.local_port, req.peer_port ); let update = if let Some(proxy) = self.proxy_map.read().unwrap().get(&id) { @@ -410,49 +466,46 @@ impl VsockMuxer { } } debug!( - "vsock: DGRAM release request: proxies={}", + "DGRAM release request: proxies={}", self.proxy_map.read().unwrap().len() ); } fn process_dgram_rw(&self, pkt: &VsockPacket) { - debug!("vsock: DGRAM OP_RW"); + debug!("DGRAM OP_RW"); let id = ((pkt.src_port() as u64) << 32) | (defs::TSI_PROXY_PORT as u64); if let Some(proxy_lock) = self.proxy_map.read().unwrap().get(&id) { - debug!("vsock: DGRAM allowing OP_RW for {}", pkt.src_port()); + debug!("DGRAM allowing OP_RW for {}", pkt.src_port()); let mut proxy = proxy_lock.lock().unwrap(); let update = proxy.sendmsg(pkt); 
self.process_proxy_update(id, update); } else { - debug!("vsock: DGRAM ignoring OP_RW for {}", pkt.src_port()); + debug!("DGRAM ignoring OP_RW for {}", pkt.src_port()); } } pub(crate) fn send_dgram_pkt(&mut self, pkt: &VsockPacket) -> super::Result<()> { debug!( - "vsock: send_dgram_pkt: src_port={} dst_port={}", + "send_dgram_pkt: src_port={} dst_port={}", pkt.src_port(), pkt.dst_port() ); if pkt.dst_cid() != uapi::VSOCK_HOST_CID { - debug!( - "vsock: dropping guest packet for unknown CID: {:?}", - pkt.hdr() - ); + debug!("dropping guest packet for unknown CID: {:?}", pkt.hdr()); return Ok(()); } match pkt.dst_port() { - defs::TSI_PROXY_CREATE => self.process_proxy_create(pkt), - defs::TSI_CONNECT => self.process_connect(pkt), - defs::TSI_GETNAME => self.process_getname(pkt), - defs::TSI_SENDTO_ADDR => self.process_sendto_addr(pkt), - defs::TSI_SENDTO_DATA => self.process_sendto_data(pkt), - defs::TSI_LISTEN => self.process_listen_request(pkt), - defs::TSI_ACCEPT => self.process_accept_request(pkt), - defs::TSI_PROXY_RELEASE => self.process_proxy_release(pkt), + defs::TSI_PROXY_CREATE if self.enable_tsi => self.process_proxy_create(pkt), + defs::TSI_CONNECT if self.enable_tsi => self.process_connect(pkt), + defs::TSI_GETNAME if self.enable_tsi => self.process_getname(pkt), + defs::TSI_SENDTO_ADDR if self.enable_tsi => self.process_sendto_addr(pkt), + defs::TSI_SENDTO_DATA if self.enable_tsi => self.process_sendto_data(pkt), + defs::TSI_LISTEN if self.enable_tsi => self.process_listen_request(pkt), + defs::TSI_ACCEPT if self.enable_tsi => self.process_accept_request(pkt), + defs::TSI_PROXY_RELEASE if self.enable_tsi => self.process_proxy_release(pkt), _ => { if pkt.op() == uapi::VSOCK_OP_RW { self.process_dgram_rw(pkt); @@ -466,7 +519,7 @@ impl VsockMuxer { } fn process_op_request(&mut self, pkt: &VsockPacket) { - debug!("vsock: OP_REQUEST"); + debug!("OP_REQUEST"); let id: u64 = ((pkt.src_port() as u64) << 32) | (pkt.dst_port() as u64); let mut proxy_map = 
self.proxy_map.write().unwrap(); @@ -479,7 +532,7 @@ impl VsockMuxer { let mem = self.mem.as_ref().unwrap(); let queue = self.queue.as_ref().unwrap(); if *listen { - warn!("vsock: Attempting to connect a socket that is listening, sending rst"); + warn!("Attempting to connect a socket that is listening, sending rst"); let rx = MuxerRx::Reset { local_port: pkt.dst_port(), peer_port: pkt.src_port(), @@ -502,8 +555,7 @@ impl VsockMuxer { .unwrap(); let tsi = TsiConnectReq { peer_port: 0, - addr: Ipv4Addr::new(0, 0, 0, 0), - port: 0, + addr: SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 0).into(), }; let update = unix.connect(pkt, tsi); unix.confirm_connect(pkt); @@ -514,7 +566,7 @@ impl VsockMuxer { } fn process_op_response(&self, pkt: &VsockPacket) { - debug!("vsock: OP_RESPONSE"); + debug!("OP_RESPONSE"); let id: u64 = ((pkt.src_port() as u64) << 32) | (pkt.dst_port() as u64); let update = self .proxy_map @@ -539,7 +591,7 @@ impl VsockMuxer { } fn process_op_shutdown(&self, pkt: &VsockPacket) { - debug!("vsock: OP_SHUTDOWN"); + debug!("OP_SHUTDOWN"); let id: u64 = ((pkt.src_port() as u64) << 32) | (pkt.dst_port() as u64); if let Some(proxy) = self.proxy_map.read().unwrap().get(&id) { proxy.lock().unwrap().shutdown(pkt); @@ -547,7 +599,7 @@ impl VsockMuxer { } fn process_op_credit_update(&self, pkt: &VsockPacket) { - debug!("vsock: OP_CREDIT_UPDATE"); + debug!("OP_CREDIT_UPDATE"); let id: u64 = ((pkt.src_port() as u64) << 32) | (pkt.dst_port() as u64); let update = self .proxy_map @@ -561,11 +613,11 @@ impl VsockMuxer { } fn process_stream_rw(&self, pkt: &VsockPacket) { - debug!("vsock: OP_RW"); + debug!("OP_RW"); let id: u64 = ((pkt.src_port() as u64) << 32) | (pkt.dst_port() as u64); if let Some(proxy_lock) = self.proxy_map.read().unwrap().get(&id) { debug!( - "vsock: allowing OP_RW: src={} dst={}", + "allowing OP_RW: src={} dst={}", pkt.src_port(), pkt.dst_port() ); @@ -573,7 +625,7 @@ impl VsockMuxer { let update = proxy.sendmsg(pkt); self.process_proxy_update(id, 
update); } else { - debug!("vsock: invalid OP_RW for {}, sending reset", pkt.src_port()); + debug!("invalid OP_RW for {}, sending reset", pkt.src_port()); let mem = match self.mem.as_ref() { Some(m) => m, None => { @@ -599,11 +651,11 @@ impl VsockMuxer { } fn process_stream_rst(&self, pkt: &VsockPacket) { - debug!("vsock: OP_RST"); + debug!("OP_RST"); let id: u64 = ((pkt.src_port() as u64) << 32) | (pkt.dst_port() as u64); if let Some(proxy_lock) = self.proxy_map.read().unwrap().get(&id) { debug!( - "vsock: allowing OP_RST: id={} src={} dst={}", + "allowing OP_RST: id={} src={} dst={}", id, pkt.src_port(), pkt.dst_port() @@ -612,23 +664,20 @@ impl VsockMuxer { let update = proxy.release(); self.process_proxy_update(id, update); } else { - debug!("vsock: invalid OP_RST for {}", pkt.src_port()); + debug!("invalid OP_RST for {}", pkt.src_port()); } } pub(crate) fn send_stream_pkt(&mut self, pkt: &VsockPacket) -> super::Result<()> { debug!( - "vsock: send_pkt: src_port={} dst_port={}, op={}", + "send_pkt: src_port={} dst_port={}, op={}", pkt.src_port(), pkt.dst_port(), pkt.op() ); if pkt.dst_cid() != uapi::VSOCK_HOST_CID { - debug!( - "vsock: dropping guest packet for unknown CID: {:?}", - pkt.hdr() - ); + debug!("dropping guest packet for unknown CID: {:?}", pkt.hdr()); return Ok(()); } diff --git a/src/devices/src/virtio/vsock/muxer_thread.rs b/src/devices/src/virtio/vsock/muxer_thread.rs index 13987736f..1a215887e 100644 --- a/src/devices/src/virtio/vsock/muxer_thread.rs +++ b/src/devices/src/virtio/vsock/muxer_thread.rs @@ -8,7 +8,7 @@ use super::super::Queue as VirtQueue; use super::muxer::{push_packet, MuxerRx, ProxyMap}; use super::muxer_rxq::MuxerRxQ; use super::proxy::{NewProxyType, Proxy, ProxyRemoval, ProxyUpdate}; -use super::tcp::TcpProxy; +use super::tsi_stream::TsiStreamProxy; use crate::virtio::vsock::defs; use crate::virtio::vsock::unix::{UnixAcceptorProxy, UnixProxy}; @@ -106,14 +106,15 @@ impl MuxerThread { let mut should_signal = 
update.signal_queue; - if let Some((peer_port, accept_fd, proxy_type)) = update.new_proxy { + if let Some((peer_port, accept_fd, family, proxy_type)) = update.new_proxy { let local_port: u32 = thread_rng.random_range(1024..u32::MAX); let new_id: u64 = ((peer_port as u64) << 32) | (local_port as u64); let new_proxy: Box = match proxy_type { - NewProxyType::Tcp => Box::new(TcpProxy::new_reverse( + NewProxyType::Tcp => Box::new(TsiStreamProxy::new_reverse( new_id, self.cid, id, + family, local_port, peer_port, accept_fd, @@ -197,7 +198,7 @@ impl MuxerThread { } } Err(e) => { - debug!("vsock: failed to consume muxer epoll event: {e}"); + debug!("failed to consume muxer epoll event: {e}"); } } } diff --git a/src/devices/src/virtio/vsock/packet.rs b/src/devices/src/virtio/vsock/packet.rs index b903bb9ba..c9cc0d527 100644 --- a/src/devices/src/virtio/vsock/packet.rs +++ b/src/devices/src/virtio/vsock/packet.rs @@ -17,10 +17,15 @@ /// to temporary buffers, before passing it on to the vsock backend. 
use std::convert::TryInto; use std::ffi::CStr; -use std::net::Ipv4Addr; +use std::net::{Ipv4Addr, SocketAddrV4}; +#[cfg(target_os = "macos")] +use std::net::{Ipv6Addr, SocketAddrV6}; use std::os::raw::c_char; use std::result; +#[cfg(target_os = "linux")] +use nix::sys::socket::{sockaddr, AddressFamily}; +use nix::sys::socket::{SockaddrLike, SockaddrStorage}; use utils::byte_order; use vm_memory::{self, Address, GuestAddress, GuestMemory, GuestMemoryError}; @@ -96,14 +101,14 @@ const HDROFF_FWD_CNT: usize = 40; #[repr(C)] pub struct TsiProxyCreate { pub peer_port: u32, + pub family: u16, pub _type: u16, } #[repr(C)] pub struct TsiConnectReq { pub peer_port: u32, - pub addr: Ipv4Addr, - pub port: u16, + pub addr: SockaddrStorage, } #[repr(C)] @@ -121,17 +126,19 @@ pub struct TsiGetnameReq { #[repr(C)] #[derive(Debug)] pub struct TsiGetnameRsp { - pub addr: Ipv4Addr, - pub port: u16, pub result: i32, + pub addr_len: u32, + pub addr: SockaddrStorage, } impl Default for TsiGetnameRsp { fn default() -> Self { + let addr: SockaddrStorage = SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 0).into(); TsiGetnameRsp { - addr: Ipv4Addr::new(0, 0, 0, 0), - port: 0, result: -1, + // It's fine to unwrap here since we've just created the SocketAddrV4 above. + addr_len: addr.as_sockaddr_in().unwrap().len(), + addr, } } } @@ -140,18 +147,16 @@ impl Default for TsiGetnameRsp { #[derive(Debug)] pub struct TsiSendtoAddr { pub peer_port: u32, - pub addr: Ipv4Addr, - pub port: u16, + pub addr: SockaddrStorage, } #[repr(C)] #[derive(Debug)] pub struct TsiListenReq { pub peer_port: u32, - pub addr: Ipv4Addr, - pub port: u16, pub vm_port: u32, pub backlog: i32, + pub addr: SockaddrStorage, } #[repr(C)] @@ -479,31 +484,91 @@ impl VsockPacket { } } + #[cfg(target_os = "linux")] + fn parse_address(buf: &[u8], addr_len: u32) -> Option { + let sockaddr: SockaddrStorage = unsafe { + SockaddrStorage::from_raw(&buf[0] as *const _ as *const sockaddr, Some(addr_len))? 
+ }; + + match sockaddr.family() { + Some(AddressFamily::Inet) => debug!("parse_address: AF_INET"), + Some(AddressFamily::Inet6) => debug!("parse_address: AF_INET6"), + Some(AddressFamily::Unix) => debug!("parse_address: AF_UNIX"), + _ => { + if let Some(family) = sockaddr.family() { + warn!("parse_address: unsupported family {family:?}"); + } else { + warn!("parse_address: error parsing family"); + } + return None; + } + } + + Some(sockaddr) + } + + #[cfg(target_os = "macos")] + fn parse_address(buf: &[u8], _addr_len: u32) -> Option { + let family: u16 = byte_order::read_le_u16(&buf[0..2]); + + match family { + defs::LINUX_AF_INET => { + debug!("parse_address: AF_INET"); + let in_port: u16 = byte_order::read_be_u16(&buf[2..4]); + let in_addr = Ipv4Addr::new(buf[4], buf[5], buf[6], buf[7]); + Some(SocketAddrV4::new(in_addr, in_port).into()) + } + defs::LINUX_AF_INET6 => { + debug!("parse_address: AF_INET6"); + let in_port: u16 = byte_order::read_be_u16(&buf[2..4]); + let flowinfo: u32 = byte_order::read_be_u32(&buf[4..8]); + let in6_addr = Ipv6Addr::new( + byte_order::read_be_u16(&buf[8..10]), + byte_order::read_be_u16(&buf[10..12]), + byte_order::read_be_u16(&buf[12..14]), + byte_order::read_be_u16(&buf[14..16]), + byte_order::read_be_u16(&buf[16..18]), + byte_order::read_be_u16(&buf[18..20]), + byte_order::read_be_u16(&buf[20..22]), + byte_order::read_be_u16(&buf[22..24]), + ); + let scope_id: u32 = byte_order::read_be_u32(&buf[24..28]); + Some(SocketAddrV6::new(in6_addr, in_port, flowinfo, scope_id).into()) + } + defs::LINUX_AF_UNIX => { + // On macOS, SockaddrStorage doesn't implement `from_raw` for + // Unix sockets, nor a way to cast an UnixPath to it. 
+ error!("AF_UNIX sockets aren't yet supported on macOS"); + None + } + _ => None, + } + } + pub fn read_proxy_create(&self) -> Option { if self.buf_size >= 6 { let peer_port: u32 = byte_order::read_le_u32(&self.buf().unwrap()[0..]); - let _type: u16 = byte_order::read_le_u16(&self.buf().unwrap()[4..]); + let family: u16 = byte_order::read_le_u16(&self.buf().unwrap()[4..]); + let _type: u16 = byte_order::read_le_u16(&self.buf().unwrap()[6..]); - Some(TsiProxyCreate { peer_port, _type }) + Some(TsiProxyCreate { + peer_port, + family, + _type, + }) } else { None } } pub fn read_connect_req(&self) -> Option { - if self.buf_size >= 10 { - let peer_port: u32 = byte_order::read_le_u32(&self.buf().unwrap()[0..]); - let port: u16 = byte_order::read_be_u16(&self.buf().unwrap()[8..]); - - let ptr = &self.buf().unwrap()[4]; - let slice = unsafe { std::slice::from_raw_parts(ptr as *const u8, 4) }; - let addr = Ipv4Addr::new(slice[0], slice[1], slice[2], slice[3]); + if self.buf_size >= 4 { + let buf = self.buf().unwrap(); + let peer_port: u32 = byte_order::read_le_u32(&buf[0..]); + let addr_len: u32 = byte_order::read_le_u32(&buf[4..]); + let addr = Self::parse_address(&buf[8..], addr_len)?; - Some(TsiConnectReq { - peer_port, - addr, - port, - }) + Some(TsiConnectReq { peer_port, addr }) } else { None } @@ -533,54 +598,46 @@ impl VsockPacket { } pub fn write_getname_rsp(&mut self, rsp: TsiGetnameRsp) { - if self.buf_size >= 10 { + if self.buf_size >= 132 { if let Some(buf) = self.buf_mut() { - for (i, b) in rsp.addr.octets().iter().enumerate() { - buf[i] = *b; - } - byte_order::write_be_u16(&mut buf[4..], rsp.port); - byte_order::write_le_u32(&mut buf[6..], rsp.result as u32); + byte_order::write_le_u32(&mut buf[0..], rsp.result as u32); + byte_order::write_le_u32(&mut buf[4..], rsp.addr_len); + let addr_ptr = rsp.addr.as_ptr(); + let slice = unsafe { + std::slice::from_raw_parts(addr_ptr as *const u8, rsp.addr.len() as usize) + }; + buf[8..(rsp.addr.len() + 8) as 
usize].copy_from_slice(slice); } } } pub fn read_sendto_addr(&self) -> Option { - if self.buf_size >= 10 { - let peer_port: u32 = byte_order::read_le_u32(&self.buf().unwrap()[0..]); - let port: u16 = byte_order::read_be_u16(&self.buf().unwrap()[8..]); - - let ptr = &self.buf().unwrap()[4]; - let slice = unsafe { std::slice::from_raw_parts(ptr as *const u8, 4) }; - let addr = Ipv4Addr::new(slice[0], slice[1], slice[2], slice[3]); + if self.buf_size >= 4 { + let buf = self.buf().unwrap(); + let peer_port: u32 = byte_order::read_le_u32(&buf[0..]); + let addr_len: u32 = byte_order::read_le_u32(&buf[4..]); + let addr = Self::parse_address(&buf[8..], addr_len)?; - Some(TsiSendtoAddr { - peer_port, - addr, - port, - }) + Some(TsiSendtoAddr { peer_port, addr }) } else { None } } pub fn read_listen_req(&self) -> Option { - if self.buf_size >= 18 { - let peer_port: u32 = byte_order::read_le_u32(&self.buf().unwrap()[0..]); - - let ptr = &self.buf().unwrap()[4]; - let slice = unsafe { std::slice::from_raw_parts(ptr as *const u8, 4) }; - let addr = Ipv4Addr::new(slice[0], slice[1], slice[2], slice[3]); - - let port: u16 = byte_order::read_be_u16(&self.buf().unwrap()[8..]); - let vm_port: u32 = byte_order::read_le_u32(&self.buf().unwrap()[10..]); - let backlog: u32 = byte_order::read_le_u32(&self.buf().unwrap()[14..]); + if self.buf_size >= 12 { + let buf = self.buf().unwrap(); + let peer_port: u32 = byte_order::read_le_u32(&buf[0..]); + let vm_port: u32 = byte_order::read_le_u32(&buf[4..]); + let backlog: u32 = byte_order::read_le_u32(&buf[8..]); + let addr_len: u32 = byte_order::read_le_u32(&buf[12..]); + let addr = Self::parse_address(&buf[16..], addr_len)?; Some(TsiListenReq { peer_port, - addr, - port, vm_port, backlog: backlog as i32, + addr, }) } else { None diff --git a/src/devices/src/virtio/vsock/proxy.rs b/src/devices/src/virtio/vsock/proxy.rs index cead8a8b0..254f4b98c 100644 --- a/src/devices/src/virtio/vsock/proxy.rs +++ b/src/devices/src/virtio/vsock/proxy.rs @@ 
-5,6 +5,7 @@ use std::os::unix::io::{AsRawFd, RawFd}; use super::muxer::MuxerRx; use super::packet::{TsiAcceptReq, TsiConnectReq, TsiListenReq, TsiSendtoAddr, VsockPacket}; +use nix::sys::socket::AddressFamily; use utils::epoll::EventSet; #[derive(Debug)] @@ -19,6 +20,8 @@ pub enum RecvPkt { #[derive(Debug)] pub enum ProxyError { CreatingSocket(nix::errno::Errno), + InvalidFamily, + SettingReuseAddr(nix::errno::Errno), SettingReusePort(nix::errno::Errno), } @@ -54,7 +57,7 @@ pub struct ProxyUpdate { pub signal_queue: bool, pub remove_proxy: ProxyRemoval, pub polling: Option<(u64, RawFd, EventSet)>, - pub new_proxy: Option<(u32, OwnedFd, NewProxyType)>, + pub new_proxy: Option<(u32, OwnedFd, AddressFamily, NewProxyType)>, pub push_accept: Option<(u64, u64)>, pub push_credit_req: Option, } diff --git a/src/devices/src/virtio/vsock/udp.rs b/src/devices/src/virtio/vsock/tsi_dgram.rs similarity index 83% rename from src/devices/src/virtio/vsock/udp.rs rename to src/devices/src/virtio/vsock/tsi_dgram.rs index 634874b1c..896f539b8 100644 --- a/src/devices/src/virtio/vsock/udp.rs +++ b/src/devices/src/virtio/vsock/tsi_dgram.rs @@ -1,5 +1,5 @@ use std::collections::HashMap; -use std::net::SocketAddrV4; +use std::net::{Ipv4Addr, SocketAddrV4}; use std::num::Wrapping; use std::os::fd::OwnedFd; use std::os::unix::io::{AsRawFd, RawFd}; @@ -8,7 +8,7 @@ use std::sync::{Arc, Mutex}; use nix::fcntl::{fcntl, FcntlArg, OFlag}; use nix::sys::socket::{ bind, connect, getpeername, recv, send, sendto, socket, AddressFamily, MsgFlags, SockFlag, - SockType, SockaddrIn, + SockType, SockaddrIn, SockaddrLike, SockaddrStorage, }; #[cfg(target_os = "macos")] @@ -26,14 +26,14 @@ use utils::epoll::EventSet; use vm_memory::GuestMemoryMmap; -pub struct UdpProxy { +pub struct TsiDgramProxy { pub id: u64, cid: u64, local_port: u32, peer_port: u32, fd: OwnedFd, pub status: ProxyStatus, - sendto_addr: Option, + sendto_addr: Option, listening: bool, mem: GuestMemoryMmap, queue: Arc>, @@ -44,22 +44,26 @@ 
pub struct UdpProxy { peer_fwd_cnt: Wrapping, } -impl UdpProxy { +impl TsiDgramProxy { pub fn new( id: u64, cid: u64, + family: u16, peer_port: u32, mem: GuestMemoryMmap, queue: Arc>, rxq: Arc>, ) -> Result { - let fd = socket( - AddressFamily::Inet, - SockType::Datagram, - SockFlag::empty(), - None, - ) - .map_err(ProxyError::CreatingSocket)?; + let family = match family { + defs::LINUX_AF_INET => AddressFamily::Inet, + defs::LINUX_AF_INET6 => AddressFamily::Inet6, + #[cfg(target_os = "linux")] + defs::LINUX_AF_UNIX => AddressFamily::Unix, + _ => return Err(ProxyError::InvalidFamily), + }; + + let fd = socket(family, SockType::Datagram, SockFlag::empty(), None) + .map_err(ProxyError::CreatingSocket)?; // macOS forces us to do this here instead of just using SockFlag::SOCK_NONBLOCK above. match fcntl(&fd, FcntlArg::F_GETFL) { @@ -89,7 +93,7 @@ impl UdpProxy { }; } - Ok(UdpProxy { + Ok(TsiDgramProxy { id, cid, local_port: 0, @@ -110,7 +114,7 @@ impl UdpProxy { fn init_pkt(&self, pkt: &mut VsockPacket) { debug!( - "udp: init_pkt: id={}, src_port={}, dst_port={}", + "init_pkt: id={}, src_port={}, dst_port={}", self.id, self.local_port, self.peer_port ); pkt.set_op(uapi::VSOCK_OP_RW) @@ -161,7 +165,7 @@ impl UdpProxy { match recv(self.fd.as_raw_fd(), &mut buf[..max_len], MsgFlags::empty()) { Ok(cnt) => { - debug!("vsock: udp: recv cnt={cnt}"); + debug!("recv cnt={cnt}"); if cnt > 0 { RecvPkt::Read(cnt) } else { @@ -169,12 +173,12 @@ impl UdpProxy { } } Err(e) => { - debug!("vsock: udp: recv_pkt: recv error: {e:?}"); + debug!("recv_pkt: recv error: {e:?}"); RecvPkt::Error } } } else { - debug!("vsock: udp: recv_pkt: pkt without buf"); + debug!("recv_pkt: pkt without buf"); RecvPkt::Error } } @@ -204,7 +208,7 @@ impl UdpProxy { RecvPkt::Error => 0, }, Err(e) => { - debug!("vsock: tcp: recv_pkt: RX queue error: {e:?}"); + debug!("recv_pkt: RX queue error: {e:?}"); 0 } }; @@ -214,19 +218,19 @@ impl UdpProxy { break; } else { have_used = true; - debug!("vsock: udp: 
recv_pkt: pushing packet with {len} bytes"); + debug!("recv_pkt: pushing packet with {len} bytes"); if let Err(e) = queue.add_used(&self.mem, head.index, len as u32) { error!("failed to add used elements to the queue: {e:?}"); } } } - debug!("vsock: udp: recv_pkt: have_used={have_used}"); + debug!("recv_pkt: have_used={have_used}"); (have_used, wait_credit) } } -impl Proxy for UdpProxy { +impl Proxy for TsiDgramProxy { fn id(&self) -> u64 { self.id } @@ -236,18 +240,15 @@ impl Proxy for UdpProxy { } fn connect(&mut self, pkt: &VsockPacket, req: TsiConnectReq) -> ProxyUpdate { - debug!("vsock: udp: connect: addr={}, port={}", req.addr, req.port); - let res = match connect( - self.fd.as_raw_fd(), - &SockaddrIn::from(SocketAddrV4::new(req.addr, req.port)), - ) { + debug!("connect: addr={}", req.addr); + let res = match connect(self.fd.as_raw_fd(), &req.addr) { Ok(()) => { - debug!("vsock: connect: Connected"); + debug!("connect: Connected"); self.status = ProxyStatus::Connected; 0 } Err(e) => { - debug!("vsock: UdpProxy: Error connecting: {e}"); + debug!("Error connecting: {e}"); #[cfg(target_os = "macos")] let errno = -linux_errno_raw(e as i32); #[cfg(target_os = "linux")] @@ -275,14 +276,26 @@ impl Proxy for UdpProxy { } fn getpeername(&mut self, pkt: &VsockPacket) { - debug!("vsock: udp: process_getpeername"); + debug!("process_getpeername"); + + let (result, addr): (i32, SockaddrStorage) = match getpeername(self.fd.as_raw_fd()) { + Ok(name) => (0, name), + Err(e) => { + #[cfg(target_os = "macos")] + let errno = -linux_errno_raw(e as i32); + #[cfg(target_os = "linux")] + let errno = -(e as i32); + ( + errno, + SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 0).into(), + ) + } + }; - let name = getpeername::(self.fd.as_raw_fd()).unwrap(); - let addr = name.ip(); let data = TsiGetnameRsp { + result, + addr_len: addr.len(), addr, - port: name.port(), - result: 0, }; // This response goes to the connection. 
@@ -295,7 +308,7 @@ impl Proxy for UdpProxy { } fn sendmsg(&mut self, pkt: &VsockPacket) -> ProxyUpdate { - debug!("vsock: udp_proxy: sendmsg"); + debug!("sendmsg"); let ret = if let Some(buf) = pkt.buf() { #[cfg(target_os = "macos")] @@ -314,27 +327,24 @@ impl Proxy for UdpProxy { -libc::EINVAL }; - debug!("vsock: udp_proxy: sendmsg ret={ret}"); + debug!("sendmsg ret={ret}"); ProxyUpdate::default() } fn sendto_addr(&mut self, req: TsiSendtoAddr) -> ProxyUpdate { - debug!( - "vsock: udp_proxy: sendto_addr: addr={}, port={}", - req.addr, req.port - ); + debug!("sendto_addr: addr={}", req.addr); let mut update = ProxyUpdate::default(); - self.sendto_addr = Some(SockaddrIn::from(SocketAddrV4::new(req.addr, req.port))); + self.sendto_addr = Some(req.addr); if !self.listening { match bind(self.fd.as_raw_fd(), &SockaddrIn::new(0, 0, 0, 0, 0)) { Ok(_) => { self.listening = true; update.polling = Some((self.id, self.fd.as_raw_fd(), EventSet::IN)); } - Err(e) => debug!("vsock: udp_proxy: couldn't bind socket: {e}"), + Err(e) => debug!("couldn't bind socket: {e}"), } } @@ -342,7 +352,7 @@ impl Proxy for UdpProxy { } fn sendto_data(&mut self, pkt: &VsockPacket) { - debug!("vsock: udp_proxy: sendto_data"); + debug!("sendto_data"); self.peer_buf_alloc = pkt.buf_alloc(); self.peer_fwd_cnt = Wrapping(pkt.fwd_cnt()); @@ -361,10 +371,10 @@ impl Proxy for UdpProxy { Err(err) => debug!("error in sendto: {err}"), } } else { - debug!("vsock: udp_proxy: sendto_data pkt without buffer"); + debug!("sendto_data pkt without buffer"); } } else { - debug!("vsock: udp_proxy: sendto_data without sendto_addr"); + debug!("sendto_data without sendto_addr"); } } @@ -383,7 +393,7 @@ impl Proxy for UdpProxy { fn update_peer_credit(&mut self, pkt: &VsockPacket) -> ProxyUpdate { debug!( - "vsock: udp_proxy: update_credit: buf_alloc={} rx_cnt={} fwd_cnt={}", + "update_credit: buf_alloc={} rx_cnt={} fwd_cnt={}", pkt.buf_alloc(), self.rx_cnt, pkt.fwd_cnt() @@ -447,14 +457,14 @@ impl Proxy for UdpProxy { } 
if evset.contains(EventSet::OUT) { - error!("vsock::udp: EventSet::OUT unexpected"); + error!("EventSet::OUT unexpected"); } update } } -impl AsRawFd for UdpProxy { +impl AsRawFd for TsiDgramProxy { fn as_raw_fd(&self) -> RawFd { self.fd.as_raw_fd() } diff --git a/src/devices/src/virtio/vsock/tcp.rs b/src/devices/src/virtio/vsock/tsi_stream.rs similarity index 76% rename from src/devices/src/virtio/vsock/tcp.rs rename to src/devices/src/virtio/vsock/tsi_stream.rs index 4b52d9668..434264fae 100644 --- a/src/devices/src/virtio/vsock/tcp.rs +++ b/src/devices/src/virtio/vsock/tsi_stream.rs @@ -1,15 +1,23 @@ use std::collections::HashMap; -use std::net::{Ipv4Addr, SocketAddrV4}; +use std::fs; +use std::net::{Ipv4Addr, SocketAddrV4, SocketAddrV6}; use std::num::Wrapping; use std::os::fd::{FromRawFd, OwnedFd}; +use std::os::unix::fs::FileTypeExt; use std::os::unix::io::{AsRawFd, RawFd}; +use std::path::PathBuf; +use std::str::FromStr; use std::sync::{Arc, Mutex}; +#[cfg(target_os = "linux")] +use libc::EINVAL; +#[cfg(target_os = "macos")] +use libc::EINVAL; use nix::errno::Errno; use nix::fcntl::{fcntl, FcntlArg, OFlag}; use nix::sys::socket::{ accept, bind, connect, getpeername, listen, recv, send, setsockopt, shutdown, socket, sockopt, - AddressFamily, Backlog, MsgFlags, Shutdown, SockFlag, SockType, SockaddrIn, + AddressFamily, Backlog, MsgFlags, Shutdown, SockFlag, SockType, SockaddrLike, SockaddrStorage, }; #[cfg(target_os = "macos")] @@ -29,10 +37,11 @@ use utils::epoll::EventSet; use vm_memory::GuestMemoryMmap; -pub struct TcpProxy { +pub struct TsiStreamProxy { id: u64, cid: u64, parent_id: u64, + family: AddressFamily, local_port: u32, peer_port: u32, control_port: u32, @@ -48,13 +57,15 @@ pub struct TcpProxy { peer_fwd_cnt: Wrapping, push_cnt: Wrapping, pending_accepts: u64, + unixsock_path: Option, } -impl TcpProxy { +impl TsiStreamProxy { #[allow(clippy::too_many_arguments)] pub fn new( id: u64, cid: u64, + family: u16, local_port: u32, peer_port: u32, 
control_port: u32, @@ -62,13 +73,15 @@ impl TcpProxy { queue: Arc>, rxq: Arc>, ) -> Result { - let fd = socket( - AddressFamily::Inet, - SockType::Stream, - SockFlag::empty(), - None, - ) - .map_err(ProxyError::CreatingSocket)?; + let family = match family { + defs::LINUX_AF_INET => AddressFamily::Inet, + defs::LINUX_AF_INET6 => AddressFamily::Inet6, + #[cfg(target_os = "linux")] + defs::LINUX_AF_UNIX => AddressFamily::Unix, + _ => return Err(ProxyError::InvalidFamily), + }; + let fd = socket(family, SockType::Stream, SockFlag::empty(), None) + .map_err(ProxyError::CreatingSocket)?; // macOS forces us to do this here instead of just using SockFlag::SOCK_NONBLOCK above. match fcntl(&fd, FcntlArg::F_GETFL) { @@ -83,7 +96,12 @@ impl TcpProxy { Err(e) => error!("couldn't obtain fd flags id={id}, err={e}"), }; - setsockopt(&fd, sockopt::ReusePort, &true).map_err(ProxyError::SettingReusePort)?; + if family == AddressFamily::Unix { + setsockopt(&fd, sockopt::ReuseAddr, &true).map_err(ProxyError::SettingReuseAddr)?; + } else { + setsockopt(&fd, sockopt::ReusePort, &true).map_err(ProxyError::SettingReusePort)?; + } + #[cfg(target_os = "macos")] { // nix doesn't provide an abstraction for SO_NOSIGPIPE, fall back to libc. 
@@ -99,10 +117,11 @@ impl TcpProxy { }; } - Ok(TcpProxy { + Ok(TsiStreamProxy { id, cid, parent_id: 0, + family, local_port, peer_port, control_port, @@ -118,6 +137,7 @@ impl TcpProxy { peer_fwd_cnt: Wrapping(0), push_cnt: Wrapping(0), pending_accepts: 0, + unixsock_path: None, }) } @@ -126,6 +146,7 @@ impl TcpProxy { id: u64, cid: u64, parent_id: u64, + family: AddressFamily, local_port: u32, peer_port: u32, fd: OwnedFd, @@ -134,10 +155,11 @@ impl TcpProxy { rxq: Arc>, ) -> Self { debug!("new_reverse: id={id} local_port={local_port} peer_port={peer_port}"); - TcpProxy { + TsiStreamProxy { id, cid, parent_id, + family, local_port, peer_port, control_port: 0, @@ -153,12 +175,13 @@ impl TcpProxy { peer_fwd_cnt: Wrapping(0), push_cnt: Wrapping(0), pending_accepts: 0, + unixsock_path: None, } } fn init_data_pkt(&self, pkt: &mut VsockPacket) { debug!( - "tcp: init_data_pkt: id={}, local_port={}, peer_port={}", + "init_data_pkt: id={}, local_port={}, peer_port={}", self.id, self.local_port, self.peer_port ); pkt.set_op(uapi::VSOCK_OP_RW) @@ -176,30 +199,56 @@ impl TcpProxy { return 0; } - let port = if let Some(port_map) = host_port_map { - if let Some(port) = port_map.get(&req.port) { - *port + let addr: SockaddrStorage = if let Some(port_map) = host_port_map { + if let Some(sin) = req.addr.as_sockaddr_in() { + debug!("sockaddr is ipv4"); + if let Some(port) = port_map.get(&sin.port()) { + SocketAddrV4::new(sin.ip(), *port).into() + } else { + req.addr + } + } else if let Some(sin6) = req.addr.as_sockaddr_in6() { + debug!("sockaddr is ipv6"); + if let Some(port) = port_map.get(&sin6.port()) { + SocketAddrV6::new(sin6.ip(), *port, sin6.flowinfo(), sin6.scope_id()).into() + } else { + req.addr + } + } else if req.addr.as_unix_addr().is_some() { + debug!("sockaddr is unix"); + req.addr } else { - return -libc::EPERM; + return -libc::EINVAL; } } else { - req.port + req.addr }; - match bind( - self.fd.as_raw_fd(), - &SockaddrIn::from(SocketAddrV4::new(req.addr, port)), - ) {
+ let unixsock_path = self.get_unixsock_path(&addr); + // If the userspace process in the guest has already created the socket, + // we need to unlink it to take ownership of the node in the filesystem. + if let Some(path) = &unixsock_path { + if let Err(e) = fs::remove_file(path) { + debug!("error removing socket: {e}"); + } + } + + match bind(self.fd.as_raw_fd(), &addr) { Ok(_) => { debug!("tcp bind: id={}", self.id); + + // For unix sockets we need to unlink the path on Drop, since + // it's possible the userspace application can't do it itself. + self.unixsock_path = unixsock_path; + match Backlog::new(req.backlog) { Ok(backlog) => match listen(&self.fd, backlog) { Ok(_) => { - debug!("tcp: proxy: id={}", self.id); + debug!("proxy: id={}", self.id); 0 } Err(e) => { - warn!("tcp: proxy: id={} err={}", self.id, e); + warn!("proxy: id={} err={}", self.id, e); #[cfg(target_os = "macos")] let errno = -linux_errno_raw(e as i32); #[cfg(target_os = "linux")] @@ -208,7 +257,7 @@ impl TcpProxy { } }, Err(e) => { - warn!("tcp: proxy: id={} err={}", self.id, e); + warn!("proxy: id={} err={}", self.id, e); #[cfg(target_os = "macos")] let errno = -linux_errno_raw(e as i32); #[cfg(target_os = "linux")] @@ -254,21 +303,21 @@ impl TcpProxy { MsgFlags::MSG_DONTWAIT, ) { Ok(cnt) => { - debug!("vsock: tcp: recv cnt={cnt}"); + debug!("recv cnt={cnt}"); if cnt > 0 { - debug!("vsock: tcp: recv rx_cnt={}", self.rx_cnt); + debug!("recv rx_cnt={}", self.rx_cnt); RecvPkt::Read(cnt) } else { RecvPkt::Close } } Err(e) => { - debug!("vsock: tcp: recv_pkt: recv error: {e:?}"); + debug!("recv_pkt: recv error: {e:?}"); RecvPkt::Error } } } else { - debug!("vsock: tcp: recv_pkt: pkt without buf"); + debug!("recv_pkt: pkt without buf"); RecvPkt::Error } } @@ -298,7 +347,7 @@ impl TcpProxy { RecvPkt::Error => 0, }, Err(e) => { - debug!("vsock: tcp: recv_pkt: RX queue error: {e:?}"); + debug!("recv_pkt: RX queue error: {e:?}"); 0 } }; @@ -310,7 +359,7 @@ impl TcpProxy { have_used = true; 
self.push_cnt += Wrapping(len as u32); debug!( - "vsock: tcp: recv_pkt: pushing packet with {} bytes, push_cnt={}", + "recv_pkt: pushing packet with {} bytes, push_cnt={}", len, self.push_cnt ); if let Err(e) = queue.add_used(&self.mem, head.index, len as u32) { @@ -319,7 +368,7 @@ impl TcpProxy { } } - debug!("vsock: tcp: recv_pkt: have_used={have_used}"); + debug!("recv_pkt: have_used={have_used}"); (have_used, wait_credit) } @@ -366,9 +415,46 @@ impl TcpProxy { Err(e) => error!("couldn't obtain fd flags id={}, err={}", self.id, e), }; } + + fn get_addr_len(&self, addr: &SockaddrStorage) -> Option { + let addr_len = match self.family { + AddressFamily::Inet => addr.as_sockaddr_in()?.len(), + AddressFamily::Inet6 => addr.as_sockaddr_in6()?.len(), + AddressFamily::Unix => addr.as_unix_addr()?.len(), + _ => 0, + }; + + Some(addr_len) + } + + fn get_unixsock_path(&self, addr: &SockaddrStorage) -> Option { + if let Some(addr) = addr.as_unix_addr() { + if let Some(path) = addr.path() { + // SockaddrStorage doesn't clean up NULLs. This is fine when + // using addr with other nix methods, but we need to clean them + // up to be able to treat it as a path with other Rust crates. 
+ let path_str = path.to_str()?.replace("\0", ""); + debug!("unix socket path_str={path_str}"); + + match fs::metadata(&path_str) { + Ok(metadata) => { + if metadata.file_type().is_socket() { + debug!("unix socket path is socket"); + return PathBuf::from_str(&path_str).ok(); + } else { + debug!("unix socket path is NOT a socket"); + } + } + Err(e) => debug!("metadata failed with {e}"), + } + } + } + + None + } } -impl Proxy for TcpProxy { +impl Proxy for TsiStreamProxy { fn id(&self) -> u64 { self.id } @@ -380,22 +466,19 @@ impl Proxy for TcpProxy { fn connect(&mut self, _pkt: &VsockPacket, req: TsiConnectReq) -> ProxyUpdate { let mut update = ProxyUpdate::default(); - let result = match connect( - self.fd.as_raw_fd(), - &SockaddrIn::from(SocketAddrV4::new(req.addr, req.port)), - ) { + let result = match connect(self.fd.as_raw_fd(), &req.addr) { Ok(()) => { - debug!("vsock: connect: Connected"); + debug!("connect: Connected"); self.switch_to_connected(); 0 } Err(nix::errno::Errno::EINPROGRESS) => { - debug!("vsock: connect: Connecting"); + debug!("connect: Connecting"); self.status = ProxyStatus::Connecting; 0 } Err(e) => { - debug!("vsock: TcpProxy: Error connecting: {e}"); + debug!("TcpProxy: Error connecting: {e}"); #[cfg(target_os = "macos")] let errno = -linux_errno_raw(Errno::last_raw()); #[cfg(target_os = "linux")] @@ -418,7 +501,7 @@ impl Proxy for TcpProxy { fn confirm_connect(&mut self, pkt: &VsockPacket) -> Option { debug!( - "tcp: confirm_connect: local_port={} peer_port={}, src_port={}, dst_port={}", + "confirm_connect: local_port={} peer_port={}, src_port={}, dst_port={}", pkt.dst_port(), pkt.src_port(), self.local_port, @@ -449,21 +532,37 @@ impl Proxy for TcpProxy { fn getpeername(&mut self, pkt: &VsockPacket) { debug!("getpeername: id={}", self.id); - let (result, addr, port) = match getpeername::(self.fd.as_raw_fd()) { - Ok(name) => { - let addr = name.ip(); - (0, addr, name.port()) - } - Err(e) => { - #[cfg(target_os = "macos")] - let errno = 
-linux_errno_raw(e as i32); - #[cfg(target_os = "linux")] - let errno = -(e as i32); - (errno, Ipv4Addr::new(0, 0, 0, 0), 0) - } - }; + let (result, addr_len, addr): (i32, u32, SockaddrStorage) = + match getpeername(self.fd.as_raw_fd()) { + Ok(addr) => { + if let Some(addr_len) = self.get_addr_len(&addr) { + (0, addr_len, addr) + } else { + #[cfg(target_os = "macos")] + let errno = -linux_errno_raw(EINVAL); + #[cfg(target_os = "linux")] + let errno = -EINVAL; + (errno, 0, addr) + } + } + Err(e) => { + #[cfg(target_os = "macos")] + let errno = -linux_errno_raw(e as i32); + #[cfg(target_os = "linux")] + let errno = -(e as i32); + ( + errno, + 0, + SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 0).into(), + ) + } + }; - let data = TsiGetnameRsp { addr, port, result }; + let data = TsiGetnameRsp { + result, + addr_len, + addr, + }; debug!("getpeername: reply={data:?}"); @@ -477,7 +576,7 @@ impl Proxy for TcpProxy { } fn sendmsg(&mut self, pkt: &VsockPacket) -> ProxyUpdate { - debug!("vsock: tcp_proxy: sendmsg"); + debug!("sendmsg"); let mut update = ProxyUpdate::default(); @@ -525,7 +624,7 @@ impl Proxy for TcpProxy { update.signal_queue = true; } - debug!("vsock: tcp_proxy: sendmsg ret={ret}"); + debug!("sendmsg ret={ret}"); update } @@ -540,8 +639,8 @@ impl Proxy for TcpProxy { host_port_map: &Option>, ) -> ProxyUpdate { debug!( - "listen: id={} addr={}, port={}, vm_port={} backlog={}", - self.id, req.addr, req.port, req.vm_port, req.backlog + "listen: id={} addr={}, vm_port={} backlog={}", + self.id, req.addr, req.vm_port, req.backlog ); let mut update = ProxyUpdate::default(); @@ -752,17 +851,15 @@ impl Proxy for TcpProxy { Ok(accept_fd) => { // Safe because we've just obtained the FD from the `accept` call above. 
let new_fd = unsafe { OwnedFd::from_raw_fd(accept_fd) }; - update.new_proxy = Some((self.peer_port, new_fd, NewProxyType::Tcp)); + update.new_proxy = + Some((self.peer_port, new_fd, self.family, NewProxyType::Tcp)); } Err(e) => warn!("error accepting connection: id={}, err={}", self.id, e), }; update.signal_queue = true; return update; } else { - debug!( - "vsock::tcp: EventSet::IN while not connected: {:?}", - self.status - ); + debug!("EventSet::IN while not connected: {:?}", self.status); } } @@ -776,7 +873,7 @@ impl Proxy for TcpProxy { // OP_REQUEST and the vsock transport is fully established. update.polling = Some((self.id(), self.fd.as_raw_fd(), EventSet::empty())); } else { - error!("vsock::tcp: EventSet::OUT while not connecting"); + error!("EventSet::OUT while not connecting"); } } @@ -784,8 +881,16 @@ impl Proxy for TcpProxy { } } -impl AsRawFd for TcpProxy { +impl AsRawFd for TsiStreamProxy { fn as_raw_fd(&self) -> RawFd { self.fd.as_raw_fd() } } + +impl Drop for TsiStreamProxy { + fn drop(&mut self) { + if let Some(path) = &self.unixsock_path { + _ = fs::remove_file(path); + } + } +} diff --git a/src/devices/src/virtio/vsock/unix.rs b/src/devices/src/virtio/vsock/unix.rs index 688528534..88bf89931 100644 --- a/src/devices/src/virtio/vsock/unix.rs +++ b/src/devices/src/virtio/vsock/unix.rs @@ -225,21 +225,21 @@ impl UnixProxy { MsgFlags::MSG_DONTWAIT, ) { Ok(cnt) => { - debug!("vsock: unix: recv cnt={cnt}"); + debug!("recv cnt={cnt}"); if cnt > 0 { - debug!("vsock: tcp: recv rx_cnt={}", self.rx_cnt); + debug!("recv rx_cnt={}", self.rx_cnt); RecvPkt::Read(cnt) } else { RecvPkt::Close } } Err(e) => { - debug!("vsock: tcp: recv_pkt: recv error: {e:?}"); + debug!("recv_pkt: recv error: {e:?}"); RecvPkt::Error } } } else { - debug!("vsock: tcp: recv_pkt: pkt without buf"); + debug!("recv_pkt: pkt without buf"); RecvPkt::Error } } @@ -269,7 +269,7 @@ impl UnixProxy { RecvPkt::Error => 0, }, Err(e) => { - debug!("vsock: tcp: recv_pkt: RX queue error: {e:?}"); 
+ debug!("recv_pkt: RX queue error: {e:?}"); 0 } }; @@ -281,7 +281,7 @@ impl UnixProxy { have_used = true; self.push_cnt += Wrapping(len as u32); debug!( - "vsock: tcp: recv_pkt: pushing packet with {} bytes, push_cnt={}", + "recv_pkt: pushing packet with {} bytes, push_cnt={}", len, self.push_cnt ); if let Err(e) = queue.add_used(&self.mem, head.index, len as u32) { @@ -290,13 +290,13 @@ impl UnixProxy { } } - debug!("vsock: tcp: recv_pkt: have_used={have_used}"); + debug!("recv_pkt: have_used={have_used}"); (have_used, wait_credit) } fn init_data_pkt(&self, pkt: &mut VsockPacket) { debug!( - "tcp: init_data_pkt: id={}, local_port={}, peer_port={}", + "init_data_pkt: id={}, local_port={}, peer_port={}", self.id, self.local_port, self.peer_port ); @@ -327,17 +327,17 @@ impl Proxy for UnixProxy { let result = match connect(self.fd.as_raw_fd(), &addr) { Ok(()) => { - debug!("vsock: connect: Connected"); + debug!("connect: Connected"); self.switch_to_connected(); 0 } Err(nix::errno::Errno::EINPROGRESS) => { - debug!("vsock: connect: Connecting"); + debug!("connect: Connecting"); self.status = ProxyStatus::Connecting; 0 } Err(e) => { - debug!("vsock: UnixProxy: Error connecting: {e}"); + debug!("Error connecting: {e}"); #[cfg(target_os = "macos")] let errno = -linux_errno_raw(Errno::last_raw()); #[cfg(target_os = "linux")] @@ -360,7 +360,7 @@ impl Proxy for UnixProxy { fn confirm_connect(&mut self, pkt: &VsockPacket) -> Option { debug!( - "tcp: confirm_connect: local_port={} peer_port={}, src_port={}, dst_port={}", + "confirm_connect: local_port={} peer_port={}, src_port={}, dst_port={}", pkt.dst_port(), pkt.src_port(), self.local_port, @@ -437,7 +437,7 @@ impl Proxy for UnixProxy { update.signal_queue = true; } - debug!("vsock: tcp_proxy: sendmsg ret={ret}"); + debug!("sendmsg ret={ret}"); update } @@ -595,10 +595,7 @@ impl Proxy for UnixProxy { update.polling = Some((self.id(), self.fd.as_raw_fd(), EventSet::empty())); } } else { - debug!( - "vsock::tcp: EventSet::IN 
while not connected: {:?}", - self.status - ); + debug!("EventSet::IN while not connected: {:?}", self.status); } } @@ -610,7 +607,7 @@ impl Proxy for UnixProxy { update.signal_queue = true; update.polling = Some((self.id(), self.fd.as_raw_fd(), EventSet::IN)); } else { - error!("vsock::tcp: EventSet::OUT while not connecting"); + error!("EventSet::OUT while not connecting"); } } @@ -704,7 +701,12 @@ impl Proxy for UnixAcceptorProxy { Ok(accept_fd) => { // Safe because we've just obtained the FD from the `accept` call above. let new_fd = unsafe { OwnedFd::from_raw_fd(accept_fd) }; - update.new_proxy = Some((self.peer_port, new_fd, NewProxyType::Unix)); + update.new_proxy = Some(( + self.peer_port, + new_fd, + AddressFamily::Unix, + NewProxyType::Unix, + )); } Err(e) => warn!("error accepting connection: id={}, err={}", self.id, e), }; diff --git a/src/libkrun/src/lib.rs b/src/libkrun/src/lib.rs index 55f3986dd..1fe8cee2a 100644 --- a/src/libkrun/src/lib.rs +++ b/src/libkrun/src/lib.rs @@ -76,13 +76,13 @@ const MAX_ARGS: usize = 4096; // krunfw library name for each context #[cfg(all(target_os = "linux", not(feature = "tee")))] -const KRUNFW_NAME: &str = "libkrunfw.so.4"; +const KRUNFW_NAME: &str = "libkrunfw.so.5"; #[cfg(all(target_os = "linux", feature = "amd-sev"))] -const KRUNFW_NAME: &str = "libkrunfw-sev.so.4"; +const KRUNFW_NAME: &str = "libkrunfw-sev.so.5"; #[cfg(all(target_os = "linux", feature = "tdx"))] -const KRUNFW_NAME: &str = "libkrunfw-tdx.so.4"; +const KRUNFW_NAME: &str = "libkrunfw-tdx.so.5"; #[cfg(target_os = "macos")] -const KRUNFW_NAME: &str = "libkrunfw.4.dylib"; +const KRUNFW_NAME: &str = "libkrunfw.5.dylib"; // Path to the init binary to be executed inside the VM. 
const INIT_PATH: &str = "/init.krun"; @@ -2384,16 +2384,20 @@ pub extern "C" fn krun_start_enter(ctx_id: u32) -> i32 { guest_cid: 3, host_port_map: None, unix_ipc_port_map: None, + enable_tsi: false, + enable_tsi_unix: false, }; #[cfg(feature = "net")] if ctx_cfg.vmr.net.list.is_empty() && ctx_cfg.legacy_net_cfg.is_none() { vsock_config.host_port_map = ctx_cfg.tsi_port_map; + vsock_config.enable_tsi = true; vsock_set = true; } #[cfg(not(feature = "net"))] { vsock_config.host_port_map = ctx_cfg.tsi_port_map; + vsock_config.enable_tsi = true; vsock_set = true; } @@ -2403,6 +2407,15 @@ pub extern "C" fn krun_start_enter(ctx_id: u32) -> i32 { } if vsock_set { + if vsock_config.enable_tsi { + // We only support using TSI for AF_UNIX in a containerized context, + // so only enable it when we have a single virtio-fs device pointing + // to root. + #[cfg(not(feature = "tee"))] + if ctx_cfg.vmr.fs.len() == 1 && ctx_cfg.vmr.fs[0].shared_dir == "/" { + vsock_config.enable_tsi_unix = true; + } + } ctx_cfg.vmr.set_vsock_device(vsock_config).unwrap(); } diff --git a/src/vmm/src/vmm_config/vsock.rs b/src/vmm/src/vmm_config/vsock.rs index 3868640ca..732505fb9 100644 --- a/src/vmm/src/vmm_config/vsock.rs +++ b/src/vmm/src/vmm_config/vsock.rs @@ -40,6 +40,10 @@ pub struct VsockDeviceConfig { pub host_port_map: Option>, /// An optional map of guest port to host UNIX domain sockets for IPC. pub unix_ipc_port_map: Option>, + /// Whether to enable TSI + pub enable_tsi: bool, + /// Whether to enable TSI for AF_UNIX + pub enable_tsi_unix: bool, } struct VsockWrapper { @@ -78,6 +82,8 @@ impl VsockBuilder { u64::from(cfg.guest_cid), cfg.host_port_map, cfg.unix_ipc_port_map, + cfg.enable_tsi, + cfg.enable_tsi_unix, ) .map_err(VsockConfigError::CreateVsockDevice) } @@ -115,6 +121,8 @@ pub(crate) mod tests { guest_cid: 3, host_port_map: None, unix_ipc_port_map: None, + enable_tsi: false, + enable_tsi_unix: false, } }