I recently worked on a customer issue in which a VPN was dropping intermittently. The VPN was riding over several Internet providers. The screenshots below are examples, not the actual views from this customer's case.
The customer ran traceroutes (which was a good thing) and noticed packet loss at one of the hops along the path (orange/yellow box below). What stands out is the zero packet loss at the hops before and after it. It looked something like the following:
The orange/yellow box shows the observed packet loss. The blue arrow points to the following hop, which shows zero packet loss. The significance is that the loss measured at that hop reflects the router’s ability to respond directly (originate a packet back from itself), not its ability to forward traffic. If that response is slow or never sent, the host running the traceroute counts it as packet loss because no reply came back. The zero loss at the hop marked by the blue arrow shows that transit traffic through the router is not experiencing packet loss.
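The reasoning above can be sketched in a few lines of Python. This is only an illustration, not part of any traceroute tool: the function name `loss_is_local_to_hop`, the input format (a list holding one loss percentage per hop, in path order), and the sample numbers are all made up for the example.

```python
def loss_is_local_to_hop(loss_by_hop, hop_index):
    """Return True when the loss reported at hop_index does not carry
    forward to any later hop, meaning only the router's own replies are
    being dropped or delayed while transit traffic is unaffected.

    loss_by_hop: loss percentages, one per hop, in path order
                 (hypothetical input format for this sketch).
    """
    if loss_by_hop[hop_index] == 0:
        return False  # no loss at this hop, nothing to explain
    later_hops = loss_by_hop[hop_index + 1:]
    if not later_hops:
        return False  # last hop: nothing downstream to compare against
    return all(loss == 0 for loss in later_hops)


# Made-up sample: the fourth hop reports 40% loss, every later hop reports 0%.
sample = [0, 0, 0, 40, 0, 0, 0]
print(loss_is_local_to_hop(sample, 3))
```

The strict "exactly zero" comparison is a simplification. On a real path, a small tolerance for stray lost probes would likely be more practical.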
According to RFC 1393, when a packet arrives with an IP Traceroute option, the router should respond to the host originating the traceroute.
“When a router forwards a packet with an IP Traceroute option, it should send an ICMP Traceroute message to the host in the Originator IP Address field of the option.”
The keyword there is should.
So the question must be asked: how can traceroute show packet loss that is actually impacting the IP flow? Here’s another screen capture that provides an example.
The orange/yellow box above surrounds successive counts of packet loss across almost all hops. When loss carries through like this, traffic passing through the hop where the loss first appears is being impacted. The investigation should start at that hop and work backwards. In the example above, the source of the problem turned out to be a bad RG device.
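As a rough sketch of how to find where to start looking, the Python below walks backwards from the destination to find the hop where sustained loss begins. The function name `first_hop_of_sustained_loss`, the input format (one loss percentage per hop, in path order), and the sample values are assumptions for illustration, not output from a real trace.

```python
def first_hop_of_sustained_loss(loss_by_hop):
    """Return the 0-based index of the hop where loss begins and then
    persists through the final hop, or None if the final hop is clean.

    loss_by_hop: loss percentages, one per hop, in path order
                 (hypothetical input format for this sketch).
    """
    if not loss_by_hop or loss_by_hop[-1] == 0:
        return None  # the destination sees no loss, so the flow is not impacted
    start = len(loss_by_hop) - 1
    # Walk backwards from the destination while the preceding hop also shows loss.
    while start > 0 and loss_by_hop[start - 1] > 0:
        start -= 1
    return start


# Made-up sample: loss appears at the third hop and continues to the end.
sample = [0, 0, 12, 15, 10, 14, 13]
print(first_hop_of_sustained_loss(sample))
```

The hop index returned is where the investigation would begin, working backwards from there toward the source.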
It is important to keep in mind that traceroute is a tool that provides clues to a network issue. Understanding the tool and how it works makes it more useful when reading its output.