lxc network.mtu setting not set consistently across hosts
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
juju-core | Fix Released | High | Unassigned |
1.23 | Fix Released | High | Unassigned |
1.24 | Fix Released | High | Unassigned |
Bug Description
We have noticed the following in one of our deployments. The issue is that Juju sets network.mtu for LXC containers to a value taken from "the first nic to come up that is not loopback" (from a #juju conversation). The motivation for Juju to provide this feature is not clear, but it is valuable and necessary for a specific reason, if and only if it is done properly.
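A quick way to see which value Juju would pick up on a given host is to list every interface's MTU, for example with something like the following (the prompt is just a placeholder):

root@host:~# ip -o link | awk '{print $2, $5}'

A host whose first non-loopback NIC reports 9180 ends up with lxc.network.mtu = 9180 in its containers, while a host whose first NIC reports 1500 ends up with 1500, as the two container configs further down show.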
Currently we are noticing UDP fragmentation in LXC containers: an application inside the container expects to be able to transmit at 1500 bytes (the default MTU), but for an as yet unknown reason LXC adds some extra bytes to each packet, resulting in fragmentation of packets close to 1500 bytes in size. So, while setting the LXC veth MTU is not the perfect solution, it is a solution, but only if done properly, and that means doing it deterministically and consistently on all hosts. The second part is that if Juju picks a NIC at random to get its MTU, that NIC may not have the MTU we actually want to use. The solutions I propose are:
1. Allow Juju to accept an lxc-mtu config option (see the sketch after this list);
or
2. Always set lxc.network.mtu to something like 1600, which is sufficient to avoid any fragmentation as we have seen it.
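To illustrate option 1, a rough sketch of how such a setting might look in environments.yaml (lxc-mtu here is only the name proposed above, not an existing Juju option):

environments:
  maas:
    type: maas
    # ... usual provider settings ...
    lxc-mtu: 1600    # proposed/hypothetical: written out as lxc.network.mtu in every container config

Either way, every container would then get the same, explicitly chosen MTU regardless of which host NIC happens to come up first.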
---- ---- ---- ---- ----
root@szg-
# Template used to create this container: /usr/share/
# Parameters passed to the template: --debug --userdata /var/lib/
# For additional config options, please look at lxc.container.
# Common configuration
lxc.include = /usr/share/
# Container specific configuration
lxc.rootfs = /var/lib/
lxc.mount = /var/lib/
lxc.utsname = juju-trusty-
lxc.arch = amd64
# Network configuration
lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:7c:68:64
lxc.network.flags = up
lxc.network.link = juju-br0
lxc.network.mtu = 9180
root@szg-
# Template used to create this container: /usr/share/
# Parameters passed to the template: --debug --userdata /var/lib/
# For additional config options, please look at lxc.container.
# Common configuration
lxc.include = /usr/share/
# Container specific configuration
lxc.rootfs = /var/lib/
lxc.mount = /var/lib/
lxc.utsname = juju-trusty-
lxc.arch = amd64
# Network configuration
lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:54:a4:76
lxc.network.flags = up
lxc.network.link = juju-br0
lxc.network.mtu = 1500
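For reference, a one-liner like this, run on each host, makes the inconsistency easy to spot (assuming the default /var/lib/lxc container path used above):

root@host:~# grep -H 'lxc.network.mtu' /var/lib/lxc/*/config

It prints one lxc.network.mtu line per container: 9180 on the first host above, 1500 on the second.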
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → 1.24-alpha1
tags: added: addressability lxc network

Changed in juju-core:
importance: High → Critical
no longer affects: juju-core/1.22

Changed in juju-core:
importance: Critical → High

Changed in juju-core:
milestone: 1.24-alpha1 → 1.25.0

Changed in juju-core:
status: Triaged → In Progress
assignee: nobody → Dimiter Naydenov (dimitern)

Changed in juju-core:
status: Fix Committed → Fix Released

tags: added: canonical-bootstack

Changed in juju-core:
assignee: Dimiter Naydenov (dimitern) → nobody
I'm not sure what you're seeing or what your environment actually looks like, but veth devices DO NOT alter the packets in any way, shape, or form, so your assertion that LXC appends anything to the packet is just wrong.
Here is proof, showing a simple test with the no-fragment flag set (which is the easiest way to test this kind of issue).
Sending a ping with a packet of exactly 1500 from one container to another (no NAT, same subnet, same bridge):
root@trusty01:/# ping -M do 10.0.3.115 -s 1472 -c 5
PING 10.0.3.115 (10.0.3.115) 1472(1500) bytes of data.
1480 bytes from 10.0.3.115: icmp_seq=1 ttl=64 time=0.046 ms
1480 bytes from 10.0.3.115: icmp_seq=2 ttl=64 time=0.056 ms
1480 bytes from 10.0.3.115: icmp_seq=3 ttl=64 time=0.047 ms
1480 bytes from 10.0.3.115: icmp_seq=4 ttl=64 time=0.099 ms
1480 bytes from 10.0.3.115: icmp_seq=5 ttl=64 time=0.049 ms
--- 10.0.3.115 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 3996ms
rtt min/avg/max/mdev = 0.046/0.059/0.099/0.021 ms
Sending a ping with a packet of exactly 1501 from one container to another (no NAT, same subnet, same bridge):
root@trusty01:/# ping -M do 10.0.3.115 -s 1473 -c 5
PING 10.0.3.115 (10.0.3.115) 1473(1501) bytes of data.
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
--- 10.0.3.115 ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 3999ms
Sending a ping with a packet of exactly 1500 from one container to a public service (NAT):
root@trusty01:/# ping -M do 8.8.8.8 -s 1472 -c 5
PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data.
1480 bytes from 8.8.8.8: icmp_seq=1 ttl=54 time=11.1 ms
1480 bytes from 8.8.8.8: icmp_seq=2 ttl=54 time=11.5 ms
1480 bytes from 8.8.8.8: icmp_seq=3 ttl=54 time=10.8 ms
1480 bytes from 8.8.8.8: icmp_seq=4 ttl=54 time=10.9 ms
1480 bytes from 8.8.8.8: icmp_seq=5 ttl=54 time=12.7 ms
--- 8.8.8.8 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4004ms
rtt min/avg/max/mdev = 10.813/11.446/12.760/0.714 ms
Sending a ping with a packet of exactly 1501 from one container to a public service (NAT):
root@trusty01:/# ping -M do 8.8.8.8 -s 1473 -c 5
PING 8.8.8.8 (8.8.8.8) 1473(1501) bytes of data.
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
ping: local error: Message too long, mtu=1500
--- 8.8.8.8 ping statistics ---
5 packets transmitted, 0 received, +5 errors, 100% packet loss, time 3999ms
An ICMP echo request carries 28 bytes of header overhead (a 20-byte IPv4 header plus an 8-byte ICMP header), so 1472 bytes of data results in exactly 1500 bytes on the network, which is the absolute maximum you can send without fragmentation. The -M do flag forces the packets to go unfragmented, causing the expected failure once the size reaches MTU+1.
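Spelled out, the arithmetic is: 1472 (ICMP payload) + 8 (ICMP header) + 20 (IPv4 header) = 1500 bytes on the wire, exactly the interface MTU; one more payload byte forces fragmentation, or, with -M do, an outright refusal to send.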
tracepath can also be used to detect PMTU as can be seen here:
root@trusty01:/# tracepath -n vorash.stgraber.org
1?: [LOCALHOST] pmtu 1500
1: 10.0.3.1 ...