| 1 | +# Per-Message GSO and GRO on Linux |
| 2 | + |
| 3 | +Use Generic Segmentation Offload (GSO) and Generic Receive Offload (GRO) on a per-message basis for fine-grained control over UDP datagram segmentation and aggregation. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +Generic Segmentation Offload (GSO) and Generic Receive Offload (GRO) are Linux kernel features that enable efficient handling of UDP datagrams by offloading segmentation and aggregation work to the kernel or network interface card (NIC). |
| 8 | + |
| 9 | +SwiftNIO provides per-message APIs for both features, allowing dynamic control over segmentation and aggregation on a datagram-by-datagram basis. This offers more flexibility than channel-wide configuration, which requires static segment sizes known ahead of time. |
| 10 | + |
| 11 | +### What is Generic Segmentation Offload (GSO)? |
| 12 | + |
| 13 | +GSO allows you to send a single large buffer (a "superbuffer") that the kernel automatically splits into multiple UDP datagrams of a specified segment size. Instead of your application creating many small datagrams, you write one superbuffer and let the kernel handle the segmentation efficiently. |
| 14 | + |
| 15 | +**Benefits:** |
| 16 | +- Reduced application overhead by avoiding manual buffer segmentation |
| 17 | +- Lower CPU usage as the kernel or NIC performs the segmentation |
| 18 | +- Improved throughput for high-volume UDP applications |
| 19 | + |
| 20 | +### What is Generic Receive Offload (GRO)? |
| 21 | + |
| 22 | +GRO is the reverse of GSO: the kernel aggregates multiple received UDP datagrams into a single larger buffer (again, a "superbuffer") before delivering it to your application. When per-message segment size reporting is enabled, each aggregated buffer also carries the size of the original segments in its metadata. |
| 23 | + |
| 24 | +**Benefits:** |
| 25 | +- Fewer read syscalls and event loop iterations |
| 26 | +- Reduced per-packet processing overhead |
| 27 | +- Better performance for applications receiving many small datagrams |
| 28 | + |
| 29 | +## Per-Message GSO |
| 30 | + |
| 31 | +The per-message GSO API allows you to specify segmentation parameters for each datagram write, rather than configuring a static segment size for the entire channel. |
| 32 | + |
| 33 | +### Enabling Per-Message GSO |
| 34 | + |
| 35 | +To use per-message GSO, set the `segmentSize` field in `AddressedEnvelope.Metadata` when writing datagrams: |
| 36 | + |
| 37 | +```swift |
| 38 | +import NIOCore |
| 39 | +import NIOPosix |
| 40 | + |
| 41 | +// Create a large buffer to send (10 segments of 1000 bytes each) |
| 42 | +let segmentSize = 1000 |
| 43 | +let segmentCount = 10 |
| 44 | +var largeBuffer = channel.allocator.buffer(capacity: segmentSize * segmentCount) |
| 45 | +largeBuffer.writeRepeatingByte(1, count: segmentSize * segmentCount) |
| 46 | + |
| 47 | +// Write with per-message GSO metadata |
| 48 | +let envelope = AddressedEnvelope( |
| 49 | +    remoteAddress: destinationAddress, |
| 50 | +    data: largeBuffer, |
| 51 | +    metadata: .init( |
| 52 | +        ecnState: .transportNotCapable, |
| 53 | +        packetInfo: nil, |
| 54 | +        segmentSize: segmentSize  // Enable GSO with 1000-byte segments |
| 55 | +    ) |
| 56 | +) |
| 57 | + |
| 58 | +try await channel.writeAndFlush(envelope) |
| 59 | +``` |
| 60 | + |
| 61 | +The kernel will automatically split `largeBuffer` into 10 separate UDP datagrams of 1000 bytes each. |
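As a rough sketch of the arithmetic involved, the sizes of the resulting datagrams can be computed as below. The helper `datagramSizes` is hypothetical and not part of SwiftNIO. It assumes the Linux UDP GSO behaviour in which every segment has the requested size except possibly the last, which may be shorter when the superbuffer is not an exact multiple of the segment size.

```swift
/// Hypothetical helper (not a SwiftNIO API): returns the payload sizes of the
/// datagrams produced when `totalBytes` is split into `segmentSize`-byte segments.
func datagramSizes(totalBytes: Int, segmentSize: Int) -> [Int] {
    precondition(totalBytes >= 0, "totalBytes must not be negative")
    precondition(segmentSize > 0, "segmentSize must be positive")

    let fullSegments = totalBytes / segmentSize
    let remainder = totalBytes % segmentSize

    // Every full segment has exactly `segmentSize` bytes.
    var sizes = Array(repeating: segmentSize, count: fullSegments)

    // Any leftover bytes form one final, shorter datagram.
    if remainder > 0 {
        sizes.append(remainder)
    }
    return sizes
}

print(datagramSizes(totalBytes: 10_000, segmentSize: 1000).count)  // prints 10
print(datagramSizes(totalBytes: 10_500, segmentSize: 1000).last!)  // prints 500
```

For the example above, 10,000 bytes at a segment size of 1000 yields exactly 10 datagrams with no short trailing segment.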
| 62 | + |
| 63 | +### Mixing GSO and Non-GSO Writes |
| 64 | + |
| 65 | +You can freely mix writes with and without per-message GSO on the same channel: |
| 66 | + |
| 67 | +```swift |
| 68 | +// Write with GSO |
| 69 | +let gsoEnvelope = AddressedEnvelope( |
| 70 | +    remoteAddress: destinationAddress, |
| 71 | +    data: largeBuffer, |
| 72 | +    metadata: .init(ecnState: .transportNotCapable, packetInfo: nil, segmentSize: 1000) |
| 73 | +) |
| 74 | + |
| 75 | +// Write without GSO (normal datagram) |
| 76 | +let normalEnvelope = AddressedEnvelope( |
| 77 | +    remoteAddress: destinationAddress, |
| 78 | +    data: smallBuffer |
| 79 | +) |
| 80 | + |
| 81 | +let write1 = channel.write(gsoEnvelope) |
| 82 | +let write2 = channel.write(normalEnvelope) |
| 83 | +channel.flush() |
| 84 | +``` |
| 85 | + |
| 86 | +## Per-Message GRO |
| 87 | + |
| 88 | +The per-message GRO API provides segment size information for each received aggregated datagram through the same `AddressedEnvelope.Metadata.segmentSize` field used for GSO. |
| 89 | + |
| 90 | +### Enabling Per-Message GRO |
| 91 | + |
| 92 | +To enable per-message GRO segment size reporting, you must: |
| 93 | + |
| 94 | +1. Enable channel-level GRO using `ChannelOptions.datagramReceiveOffload` |
| 95 | +2. Enable per-message segment size reporting using `ChannelOptions.datagramReceiveSegmentSize` |
| 96 | +3. Configure an appropriate receive buffer allocator to accommodate aggregated datagrams |
| 97 | + |
| 98 | +```swift |
| 99 | +import NIOCore |
| 100 | +import NIOPosix |
| 101 | + |
| 102 | +// Enable GRO on the channel |
| 103 | +try await channel.setOption(.datagramReceiveOffload, value: true) |
| 104 | + |
| 105 | +// Enable per-message segment size reporting |
| 106 | +try await channel.setOption(.datagramReceiveSegmentSize, value: true) |
| 107 | + |
| 108 | +// Configure a larger receive buffer to accommodate aggregated datagrams |
| 109 | +let largeBufferAllocator = FixedSizeRecvByteBufferAllocator(capacity: 65536) |
| 110 | +try await channel.setOption(.recvAllocator, value: largeBufferAllocator) |
| 111 | +``` |
| 112 | + |
| 113 | +### Reading Segment Size from Received Datagrams |
| 114 | + |
| 115 | +When you receive an aggregated datagram, the `segmentSize` field in the metadata contains the original segment size: |
| 116 | + |
| 117 | +```swift |
| 118 | +// In your channel handler |
| 119 | +func channelRead(context: ChannelHandlerContext, data: NIOAny) { |
| 120 | +    let envelope = self.unwrapInboundIn(data) |
| 121 | + |
| 122 | +    // Check if this is an aggregated datagram |
| 123 | +    if let segmentSize = envelope.metadata?.segmentSize { |
| 124 | +        print("Received aggregated datagram:") |
| 125 | +        print("  Total size: \(envelope.data.readableBytes) bytes") |
| 126 | +        print("  Original segment size: \(segmentSize) bytes") |
| 127 | +        // Round up: the final segment may be shorter than segmentSize |
| 128 | +        print("  Segment count: \((envelope.data.readableBytes + segmentSize - 1) / segmentSize)") |
| 128 | +    } else { |
| 129 | +        print("Received normal datagram: \(envelope.data.readableBytes) bytes") |
| 130 | +    } |
| 131 | +} |
| 132 | +``` |
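Once the segment size is known, the aggregated buffer can be cut back into the original datagrams. The following is a minimal sketch, not part of the SwiftNIO API: `splitSegments` is a hypothetical helper, and it assumes that every segment except possibly the last is exactly `segmentSize` bytes long.

```swift
import NIOCore

/// Hypothetical helper (not a SwiftNIO API): splits an aggregated buffer into
/// slices of `segmentSize` bytes. The final slice may be shorter.
func splitSegments(_ superbuffer: ByteBuffer, segmentSize: Int) -> [ByteBuffer] {
    precondition(segmentSize > 0, "segmentSize must be positive")

    var remaining = superbuffer
    var segments: [ByteBuffer] = []
    segments.reserveCapacity((remaining.readableBytes + segmentSize - 1) / segmentSize)

    while remaining.readableBytes > 0 {
        // Take a full segment, or whatever is left for the last one.
        let length = min(segmentSize, remaining.readableBytes)
        guard let segment = remaining.readSlice(length: length) else { break }
        segments.append(segment)
    }
    return segments
}
```

Inside `channelRead`, a handler could call `splitSegments(envelope.data, segmentSize: segmentSize)` and process each returned slice as an individual datagram. `readSlice(length:)` shares the underlying storage with the original buffer, so splitting does not copy the payload.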
| 133 | + |
| 134 | +### Buffer Allocator Considerations |
| 135 | + |
| 136 | +When using GRO, ensure your receive buffer allocator provides buffers large enough to hold aggregated datagrams. The default datagram channel allocator uses 2048-byte fixed buffers, which may be too small: |
| 137 | + |
| 138 | +```swift |
| 139 | +// Instead of the default 2048-byte buffers, use larger buffers |
| 140 | +let allocator = FixedSizeRecvByteBufferAllocator(capacity: 65536) // 64KB buffers |
| 141 | +try await channel.setOption(.recvAllocator, value: allocator) |
| 142 | +``` |
| 143 | + |
| 144 | +If the receive buffer is too small, the kernel will not be able to aggregate as many datagrams, reducing the effectiveness of GRO. |
| 145 | + |
| 146 | +## Platform Requirements and Limitations |
| 147 | + |
| 148 | +### Linux-Only Feature |
| 149 | + |
| 150 | +Per-message GSO and GRO are only supported on Linux. Attempting to use these features on other platforms will result in errors: |
| 151 | + |
| 152 | +- **GSO**: Writing an envelope with `segmentSize` set will fail the write promise with `ChannelError.operationUnsupported` |
| 153 | +- **GRO**: Setting `ChannelOptions.datagramReceiveSegmentSize` will fail with `ChannelError.operationUnsupported` |
| 154 | + |
| 155 | +### Kernel Version Requirements |
| 156 | + |
| 157 | +- **GSO**: Requires Linux kernel 4.18 or newer |
| 158 | +- **GRO**: Requires Linux kernel 5.10 or newer |
| 159 | + |
| 160 | +### Runtime Support Detection |
| 161 | + |
| 162 | +Check for GSO and GRO support at runtime using the `System` APIs: |
| 163 | + |
| 164 | +```swift |
| 165 | +import NIOPosix |
| 166 | + |
| 167 | +if System.supportsUDPSegmentationOffload { |
| 168 | +    print("GSO is supported on this platform") |
| 169 | +    // Use per-message GSO |
| 170 | +} else { |
| 171 | +    print("GSO is not supported, falling back to normal writes") |
| 172 | +} |
| 173 | + |
| 174 | +if System.supportsUDPReceiveOffload { |
| 175 | +    print("GRO is supported on this platform") |
| 176 | +    // Enable per-message GRO |
| 177 | +} else { |
| 178 | +    print("GRO is not supported") |
| 179 | +} |
| 180 | +``` |
| 181 | + |
| 182 | +### Error Handling |
| 183 | + |
| 184 | +Error handling with GSO and GRO is largely the same as without them, with one important difference: a single promise cannot report individual errors for the datagrams within a superbuffer. The kernel returns only one result code for a given send or receive, and that code covers multiple datagrams within the superbuffer. It may not cover _all_ of them, however, because the write of a superbuffer can be split into several writes. As a result, if the _final_ write for a superbuffer completes successfully, the promise is succeeded even if errors occurred earlier. |
| 187 | + |
| 188 | +If you skip the runtime support check described above and attempt to use GSO on a platform that does not support it, the write promise fails with `ChannelError.operationUnsupported` and no writes are attempted. |