Path MTU Discovery (PMTUD)

The IP MTU (Maximum Transmission Unit) is the largest IP datagram that may be transferred over a specific data link connection. The MTU value is a design parameter of a LAN and is a mutually agreed value (i.e. both ends of a link agree to use the same specific value) for most wide area network links. The size of the MTU may vary greatly between different links. Note that people who design lower-layer networks (below IP) often define the MTU in a different way to the IP-oriented people. This can catch you out if you are not careful.

IP fragmentation could be used to break larger packets into a sequence of smaller packets for transmission across the network. This, however, lowers the efficiency and reliability of Internet communication. The loss of a single fragment results in the loss of the entire fragmented packet: even if all other fragments are received correctly, the original packet cannot be reassembled and delivered, and has to be resent. This is one example where router IP fragmentation can be considered harmful.
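
The cost of losing a fragment can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes a 20-byte IPv4 header with no options and that fragments are lost independently of one another, and the function names are made up for this example.

```python
import math

IPV4_HEADER = 20  # bytes; assumes the header carries no IP options


def fragment_count(datagram_size: int, mtu: int) -> int:
    """Number of IPv4 fragments needed to carry a datagram across a link."""
    if datagram_size <= mtu:
        return 1  # fits as it is, no fragmentation needed
    payload = datagram_size - IPV4_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


def delivery_probability(fragments: int, loss_rate: float) -> float:
    """Chance that all fragments arrive, assuming independent losses."""
    return (1.0 - loss_rate) ** fragments
```

For instance, a 4000-byte datagram crossing a link with a 1500-byte MTU needs three fragments. With a 1% loss rate per packet, about 97% of such datagrams would be delivered, compared with 99% for a packet that is not fragmented.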

In addition, some Network Address Translators (NATs) and firewalls drop IP fragments. Often the network address translation performed by a NAT only operates on complete IP packets. Some firewall policies also require inspection of a complete IP packet. Rather than reassembling fragments to meet these requirements, some NATs and firewalls simply do not implement the necessary reassembly functionality and instead drop all fragments. These fundamental issues with fragmentation exist for both IPv4 and IPv6.

Instead of making routers fragment packets, an end system could try to find out the largest IP packet that may be sent to a specific destination. The Path MTU Discovery algorithm operates at the sender at the boundary of the Transport and Network Layers, generating probe messages and responding to ICMP error reports that indicate a low MTU. In Intermediate Systems (IS), i.e. routers, it operates in the Network Layer, returning ICMP error messages based on the link-layer configuration.

The way in which an end system finds out the largest packet size associated with a specific path (i.e. the series of links used to reach a destination) is to send a large IP packet (up to the MTU of the link to which it is connected). The packet is sent with the Don't Fragment (DF) flag set in the IP protocol header.
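
To show where the DF flag sits, the following sketch packs a 20-byte IPv4 header (no options) with the flag set. It is an illustration rather than a way to send real traffic: the helper names, the default protocol and TTL values, and the documentation addresses used in the usage note are all assumptions made for this example.

```python
import socket
import struct

DF_FLAG = 0x4000  # Don't Fragment bit in the 16-bit flags/fragment-offset field


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum used by IPv4 and ICMP."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(total_length: int, src: str, dst: str,
                dont_fragment: bool = True,
                protocol: int = 17, ttl: int = 64) -> bytes:
    """Build an IPv4 header without options; protocol 17 is UDP."""
    version_ihl = (4 << 4) | 5            # IPv4, header length of 5 x 32-bit words
    flags_frag = DF_FLAG if dont_fragment else 0
    header = struct.pack("!BBHHHBBH4s4s",
                         version_ihl, 0, total_length,
                         0, flags_frag,
                         ttl, protocol, 0,  # checksum is zero while computing it
                         socket.inet_aton(src), socket.inet_aton(dst))
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]


def has_df(header: bytes) -> bool:
    """True if the Don't Fragment flag is set in an IPv4 header."""
    (flags_frag,) = struct.unpack("!H", header[6:8])
    return bool(flags_frag & DF_FLAG)
```

Calling ipv4_header(1500, "192.0.2.1", "198.51.100.7") gives a header for a 1500-byte probe for which has_df() returns True. In practice an application would normally ask the operating system to set the flag rather than build headers by hand.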

ICMP-based PMTUD

The simplest form of PMTUD relies on the ICMP implementation that executes on all IP end systems and all IP intermediate systems (i.e. routers). If a router receives a packet that is less than or equal to the MTU of the link that it wishes to forward the packet to, then it sends the packet. If, however, a router finds that the packet size exceeds the MTU of the next link and the DF flag is set, the flag tells the router not to fragment the packet. Instead, the router will discard the packet. An Internet Control Message Protocol (ICMP) message is returned by the router (R1 in the example below) back to the sender (H0), with a code saying the packet has been discarded. Importantly, this message also gives the reason (i.e. that fragmentation would have been required) and indicates the maximum MTU allowed (in this case the MTU of the link between R1 and R2).
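
The forwarding decision made by a router such as R1 can be sketched in a few lines. This is a simplified model, not router code: the Packet class, the function name and the dictionary standing in for the ICMP error are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int             # total IP datagram size in bytes
    dont_fragment: bool   # state of the DF flag


def router_forward(packet: Packet, next_hop_mtu: int):
    """Return (action, icmp_error) for a packet about to leave on the next link."""
    if packet.size <= next_hop_mtu:
        return ("forward", None)
    if packet.dont_fragment:
        # Too big and DF set: discard, and report the MTU of the next link
        # in an ICMP Destination Unreachable (type 3, code 4) message.
        return ("drop", {"type": 3, "code": 4, "next_hop_mtu": next_hop_mtu})
    # Too big but DF clear: the router is allowed to fragment the packet.
    return ("fragment", None)
```

With a 1500-byte packet, DF set and a 1400-byte next link, the function returns the "drop" action together with an error that reports 1400 as the next-hop MTU.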

Occasionally the end system will generate a large packet, just to see if a new Internet path has been found (i.e. a different route). The new path may allow a larger P-MTU.

If an IPv4 end system receives an ICMP message saying a packet is too large (type=3, code=4), it sets a variable called the Path-MTU (P-MTU) to the appropriate maximum size and then itself fragments the packet to make sure it will not be discarded next time. The end system keeps (caches) a set of P-MTU values, one for each destination IP address in use.
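
A minimal sketch of such a cache is given below. It assumes the ICMP message arrives as raw bytes (the 8-byte ICMP header followed by the quoted IP header of the discarded packet), does not verify the ICMP checksum, and simply ignores reports below 68 bytes, the floor set by RFC 1191. The class and method names are hypothetical.

```python
import socket
import struct

MIN_IPV4_PMTU = 68  # RFC 1191: never reduce the estimate below 68 bytes


class PathMtuCache:
    """Per-destination P-MTU estimates held by a sending host."""

    def __init__(self, link_mtu: int):
        self.link_mtu = link_mtu   # MTU of the directly attached link
        self._pmtu = {}            # destination address -> P-MTU

    def get(self, destination: str) -> int:
        # Until told otherwise, assume the path supports the local link MTU.
        return self._pmtu.get(destination, self.link_mtu)

    def on_icmp(self, message: bytes) -> None:
        """Update the cache from a raw ICMP message, if it is type 3, code 4."""
        if len(message) < 28:
            return  # too short to hold the ICMP header and a quoted IP header
        icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack(
            "!BBHHH", message[:8])
        if (icmp_type, code) != (3, 4):
            return
        if next_hop_mtu < MIN_IPV4_PMTU:
            return  # e.g. zero from a router that does not report the MTU
        # The quoted IPv4 header starts at byte 8 of the message; its
        # destination address occupies bytes 16-19 of that header.
        destination = socket.inet_ntoa(message[24:28])
        self._pmtu[destination] = min(self.get(destination), next_hop_mtu)
```

A host attached to a 1500-byte link would create PathMtuCache(1500) and pass each received ICMP error to on_icmp(). Taking the minimum with the current estimate means a report can only lower the cached value, never raise it.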

When there is a series of links along the path, each with a progressively smaller MTU, the above process may take place a number of times before the sender finally determines the minimum value of the P-MTU. Once the P-MTU has been found, all packets are segmented so that none exceeds this new value.
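
The repeated rounds can be illustrated with a small simulation. It models the path as a list of link MTUs, with the first entry being the sender's own link, and assumes every router reliably returns the ICMP error; the function name and the example values are invented.

```python
def discover_path_mtu(link_mtus):
    """Simulate ICMP-based PMTUD; return (P-MTU, list of MTUs reported)."""
    pmtu = link_mtus[0]   # start with the MTU of the sender's own link
    reports = []          # next-hop MTU carried in each ICMP error received
    while True:
        for mtu in link_mtus[1:]:
            if pmtu > mtu:
                # The router in front of this link drops the packet and
                # reports the link MTU; the sender lowers its estimate.
                reports.append(mtu)
                pmtu = mtu
                break
        else:
            # The packet crossed every link, so the estimate is the P-MTU.
            return pmtu, reports
```

For a path with MTUs of 1500, 1400, 1280 and 576 bytes, the sender receives three reports (1400, 1280, then 576) before settling on a P-MTU of 576. If the smallest link comes first, as in 1500, 576, 1400, a single report is enough.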

In this model, routers therefore do not have to do any additional processing for these packets. This is much more efficient than router fragmentation. However, practical problems are being experienced within the current Internet, caused by systems (e.g. some firewalls) that do not return the required ICMP messages back to the sender. The widely deployed version of PMTUD, specified by RFC 1191, relies on receiving these ICMP messages.

PMTUD without relying on ICMP

There are issues with deploying PMTUD in practical networks (RFC 2923), so you may need to think harder; the best method is to avoid having to rely on receiving ICMP messages. One standard method is to send probes and validate at the transport layer which get through. This is known as Packetization Layer PMTUD (PLPMTUD) and is standardised in RFC 4821. PLPMTUD can of course also utilise ICMP messages (if it wishes) as a part of the process of learning what size of MTU will work across an entire Internet path.
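
The probing idea can be sketched as a search over packet sizes. The code below is a simplified illustration and not the RFC 4821 procedure itself: probe is a hypothetical callable that sends one packet of the given size and returns True if the transport layer sees it acknowledged, low is a size already known to work, and high is the MTU of the local link.

```python
def search_path_mtu(probe, low: int, high: int) -> int:
    """Largest size in [low, high] for which probe(size) succeeds.

    Assumes that a packet of size low gets through and that any size up to
    the true path MTU succeeds while any larger size fails.
    """
    while low < high:
        mid = (low + high + 1) // 2   # round up so the loop always progresses
        if probe(mid):
            low = mid                 # the probe got through: search higher
        else:
            high = mid - 1            # no acknowledgement: search lower
    return low
```

A real implementation has to be more careful, because a probe can also be lost through congestion, so one missing acknowledgement does not prove the probe was too large.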

PMTUD with IPv6

RFC 4443, which defines ICMPv6, specifies a "Packet Too Big" (type 2, code 0) error message that is analogous to the "fragmentation needed and DF bit set" (type 3, code 4) error message of ICMP for IPv4. RFC 1981 defines the Path MTU Discovery mechanism for IP Version 6, which makes use of these messages to determine the MTU of an arbitrary internet path. PLPMTUD also works with IPv6.
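
As a sketch of how a sender might read such a message, the code below unpacks the fixed part of an ICMPv6 Packet Too Big message (type, code, checksum and a 32-bit MTU field) and applies the report to the current estimate. It assumes the message is available as raw bytes, skips checksum verification, and uses illustrative function names. The estimate is never taken below 1280 bytes, the minimum link MTU required for IPv6.

```python
import struct

IPV6_MIN_MTU = 1280       # minimum link MTU that every IPv6 link must support
ICMPV6_PACKET_TOO_BIG = 2


def parse_packet_too_big(message: bytes):
    """Return the MTU reported by a Packet Too Big message, else None."""
    if len(message) < 8:
        return None
    icmp_type, _code, _checksum, mtu = struct.unpack("!BBHI", message[:8])
    if icmp_type != ICMPV6_PACKET_TOO_BIG:
        return None
    # The code field is sent as zero and is not checked by the receiver.
    return mtu


def updated_pmtu(current: int, reported: int) -> int:
    """New P-MTU estimate: only ever lowered, and never below 1280 bytes."""
    return max(IPV6_MIN_MTU, min(current, reported))
```

For example, a report of 1400 lowers an estimate of 1500 to 1400, while a report of 1000 leaves the estimate at 1280.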


See also

ICMP Type and Code Values

Kent, C. and J. Mogul, "Fragmentation considered harmful", Proc. SIGCOMM '87 vol. 17, No. 5, October 1987.

Postel, J., "Internet Control Message Protocol", RFC 792, September 1981.

Mogul, J. and S. Deering, "Path MTU Discovery", RFC 1191, November 1990.

Lahey, K., "TCP Problems with Path MTU Discovery", RFC 2923, September 2000.

McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery for IP version 6", RFC 1981, August 1996.

Conta, A., Deering, S., and M. Gupta, Ed., "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specification", RFC 4443, March 2006.

Mathis, M. and J. Heffner, "Packetization Layer Path MTU Discovery", RFC 4821, March 2007.


Gorry Fairhurst - Date: 28/4/2011 EG3567