Our support team has encountered a case where ibmveth + openvswitch + bnx2x has led to some issues, which IBM should probably be aware of before turning on large segments in more places. Here's a summary from support for that issue:

==========

[Issue: we see a firmware assertion from an IBM-branded bnx2x card. Decoding the dump with the help of upstream shows that the assert is caused by a packet with GSO on and gso_size > ~9700 bytes being passed to the card. We traced the packets through the system and came up with this root cause. The system uses ibmveth to talk to AIX LPARs, a bnx2x network card to talk to the world, and Open vSwitch to tie them together. There is no VIOS involvement: the card is attached to the Linux partition.]

The packets causing the issue come through the ibmveth interface, from the AIX LPAR. The veth protocol is 'special': communication between LPARs on the same chassis can use very large (64k) frames to reduce overhead. Normal networks cannot handle such large packets, so traditionally the VIOS partition would signal to the AIX partitions that it was 'special', and AIX would send regular, Ethernet-sized packets to VIOS, which VIOS would then send out.

This signalling between VIOS and AIX is done in a way that is not standards-compliant, and so it was never made part of Linux. Instead, the Linux driver has always understood large frames and passed them up the network stack. In some cases (e.g. with TCP), multiple TCP segments are coalesced into one large packet. In Linux, this goes through the generic receive offload (GRO) code, using a mechanism similar to GSO. These coalesced packets can carry very large segments, which presents as a very large MSS (maximum segment size), or gso_size.

Normally, the large packet is simply passed to whatever network application on Linux is going to consume it, and everything is OK. In this case, however, the packets go through Open vSwitch and are then passed to the bnx2x driver. The bnx2x driver/hardware supports TSO and GSO, but with a restriction: the maximum segment size is limited to around 9700 bytes. Normally this is more than adequate, as jumbo frames are limited to 9000 bytes. However, if a large packet with large (> ~9700 byte) TCP segments arrives through ibmveth and is passed to bnx2x, the card firmware asserts and the card crashes.

Turning off TSO prevents the crash, as the kernel resegments the data and assembles the packets in software, but this has a performance cost.

Clearly, at the very least, bnx2x should not crash in this case, and I am working towards a patch for that (rough sketches of the options are at the end of this mail). However, this still leaves us with some issues. The only thing the bnx2x driver can sensibly do with such a packet is drop it, which will prevent the crash. But there will still be problems with large packets: when they are dropped, the other side will eventually realise that the data is missing and ask for a retransmit, and the retransmit may well also be too big. There is no way of signalling back to the AIX LPAR that it should reduce the MSS. So even if the data eventually gets through, there will be a latency/throughput hit.

==========

Seeing as IBM seems to be in active development in this area (indeed, this code explicitly deals with ibmveth + OVS), could someone from IBM review this?
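
For illustration, here is a minimal sketch of the drop-based mitigation mentioned above. BNX2X_MAX_GSO_SIZE is an assumed name for the ~9700-byte hardware limit; this is just the shape of the check, not the actual patch:

/* Sketch only: refuse oversized-GSO skbs at the top of the xmit
 * routine instead of letting the card firmware assert.
 * BNX2X_MAX_GSO_SIZE is an assumed name for the ~9700 byte limit. */
#include <linux/netdevice.h>
#include <linux/skbuff.h>

#define BNX2X_MAX_GSO_SIZE 9700

static netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb,
				    struct net_device *dev)
{
	if (unlikely(skb_is_gso(skb) &&
		     skb_shinfo(skb)->gso_size > BNX2X_MAX_GSO_SIZE)) {
		/* The hardware cannot TSO a segment this large; dropping
		 * avoids the firmware assertion but loses the data. */
		dev_kfree_skb_any(skb);
		dev->stats.tx_dropped++;
		return NETDEV_TX_OK;
	}
	/* ... existing transmit path continues here ... */
	return NETDEV_TX_OK;
}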
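
A gentler option than dropping would be to mask off the GSO feature bits for the affected skbs in an .ndo_features_check handler; the core stack then segments them in software before they reach the driver, so only the oversized packets pay the software-segmentation cost, rather than all traffic as happens when TSO is disabled wholesale. Again, a sketch under the same assumed constant, not a tested patch:

static netdev_features_t
bnx2x_features_check(struct sk_buff *skb, struct net_device *dev,
		     netdev_features_t features)
{
	/* If the segments are too big for the hardware, report no GSO
	 * support for this skb; the stack will segment it in software
	 * before handing it to our xmit routine. */
	if (skb_is_gso(skb) &&
	    skb_shinfo(skb)->gso_size > BNX2X_MAX_GSO_SIZE)
		features &= ~NETIF_F_GSO_MASK;

	return features;
}

This would be wired up through the .ndo_features_check member of the driver's struct net_device_ops.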