Traceroute is a tool used to discover the links along a path. It exists in several variants: classic traceroute (UDP-based) and tcptraceroute, a traceroute implementation that uses TCP SYN packets instead of the more traditional UDP or ICMP ECHO packets. The latter may be more useful than plain traceroute when you need to investigate how your ISP treats a particular service.
While path discovery is the first step in investigating a path's behavior and performance, it is useful for other tasks as well. Classic traceroute generates UDP packets for tracing the flow of traffic through a network. By default it starts at a high destination port (typically 33434) and increments the port number by one with each packet. The starting port is configurable via the -p option.
The traceroute program was written by Van Jacobson and others. It is based on a clever use of the Time-To-Live (TTL) field in the IP packet's header. The TTL field, described briefly in the last chapter, is used to limit the life of a packet. When a router fails or is misconfigured, a routing loop or circular path may result. The TTL field prevents packets from remaining on a network indefinitely should such a routing loop occur. A packet's TTL field is decremented each time the packet crosses a router on its way through a network. When its value reaches 0, the packet is discarded rather than forwarded. When discarded, an ICMP TIME_EXCEEDED message is sent back to the packet's source to inform the source that the packet was discarded. By manipulating the TTL field of the original packet, the program traceroute uses information from these ICMP messages to discover paths through a network.
In other words traceroute uses the TTL field of the IP header to force each hop along the path to return an ICMP Time Exceeded message. The destination host is recognized because it returns an ICMP Destination Unreachable message. The first set of packets is sent with a TTL of 1, which times out at the first router. The second set of packets has a TTL of 2 and times out at the second router. This pattern is followed until the destination host is reached. As each packet is sent, the results are displayed.
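The TTL mechanism described above can be illustrated with a small simulation. This is only a sketch: it sends no packets, and the function name `simulate_traceroute` and the router names in the usage note are hypothetical. It assumes every router decrements the TTL and answers with TIME_EXCEEDED, and that the destination answers with PORT_UNREACHABLE.

```python
def simulate_traceroute(path, max_hops=30):
    """Simulate TTL-limited probing along a hypothetical path.

    `path` lists the routers in order, followed by the destination host.
    Returns (ttl, responder, icmp_message) tuples, one per TTL value tried.
    No packets are sent; this only models the bookkeeping.
    """
    results = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        for index, node in enumerate(path):
            if index == len(path) - 1:
                # The probe reached the destination; the unused UDP port
                # makes the host answer with PORT_UNREACHABLE.
                results.append((ttl, node, "PORT_UNREACHABLE"))
                return results
            remaining -= 1  # each router decrements the TTL by one
            if remaining == 0:
                # The TTL expired here, so this router reports back.
                results.append((ttl, node, "TIME_EXCEEDED"))
                break
    return results
```

For example, `simulate_traceroute(["router-a", "router-b", "target"])` yields one TIME_EXCEEDED entry for each router and ends with a PORT_UNREACHABLE entry for the target, mirroring the order in which real traceroute prints its lines.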
Just as many packet filters are configured to block ping, many are configured to stop traceroute from working as well. Although this limits the usefulness of traceroute for end-to-end troubleshooting, traceroute can still provide useful information about the path followed between the endpoints of the connections.
Path discovery is also an essential step in diagnosing routing problems. While you may fully understand the structure of your network and know what path you want your packets to take through your network, the real path may come as a surprise.
Once packets leave your network, you have almost no control over the path they actually take to their destination. You may know very little about the structure of adjacent networks. Path discovery can provide a way to discover which ISP an adjacent network uses, how your ISP is connected to the world, and other information such as peering arrangements. traceroute is the tool of choice for collecting this kind of information.
Typically, when the probe packets finally have an adequate TTL and reach their destination, they will be discarded and an ICMP PORT_UNREACHABLE message will be returned. This happens because traceroute sends all its probe packets with what should be invalid port numbers, i.e., port numbers that aren't usually used. To do this, traceroute starts with a very large port number, typically 33434, and increments this value with each subsequent packet. Thus, each of the three packets in a set will have three different unlikely port numbers. The receipt of ICMP PORT_UNREACHABLE messages is the signal that the end of the path has been reached. Here is a simple example of using traceroute:
bsd1# traceroute 188.8.131.52
traceroute to 184.108.40.206 (220.127.116.11), 30 hops max, 40 byte packets
1 18.104.22.168 (22.214.171.124) 1.162 ms 1.068 ms 1.025 ms
2 cisco (126.96.36.199) 4.249 ms 4.275 ms 4.256 ms
3 188.8.131.52 (184.108.40.206) 4.433 ms 4.521 ms 4.450 ms
4 e0.r01.ia-gnwd.Infoave.Net (220.127.116.11) 5.178 ms 5.173 ms 5.140 ms
5 18.104.22.168 (22.214.171.124) 13.171 ms 13.277 ms 13.352 ms
6 126.96.36.199 (188.8.131.52) 18.395 ms 18.238 ms 18.210 ms
7 atm12-0-10-mp.r01.ia-clma.infoave.net (184.108.40.206) 18.816 ms 18.934 ms
8 Serial5-1-1.GW1.RDU1.ALTER.NET (220.127.116.11) 26.658 ms 26.484 ms 26.855
9 Fddi12-0-0.GW2.RDU1.ALTER.NET (18.104.22.168) 26.692 ms 26.697 ms 26.490
10 smatnet-gw2.customer.ALTER.NET (22.214.171.124) 27.736 ms 28.101 ms 27.738
11 rcmt1-S10-1-1.sprintsvc.net (126.96.36.199) 33.539 ms 33.219 ms 32.446 m
12 rcmt3-FE0-0.sprintsvc.net (188.8.131.52) 32.641 ms 32.724 ms 32.898 ms
13 gwd1-S3-7.sprintsvc.net (184.108.40.206) 46.026 ms 50.724 ms 45.960 ms
14 gateway.ais-gwd.com (220.127.116.11) 47.828 ms 50.912 ms 47.823 ms
15 pm3-02.ais-gwd.com (18.104.22.168) 63.786 ms 48.432 ms 48.113 ms
16 user58.ais-gwd.com (22.214.171.124) 200.910 ms 184.587 ms 202.771 ms
The results should be fairly self-explanatory. This particular path was 16 hops long. Reverse name lookup is attempted for the IP address of each device, and, if successful, these names are reported in addition to IP addresses. Times are reported for each of the three probes sent. They are interpreted in the same way as times with ping. (However, if you just want times for one hop, ping is generally a better choice.)
Although no packets were lost in this example, should a packet be lost, an asterisk is printed in the place of the missing time. In some cases, all three times may be replaced with asterisks. This can happen for several reasons. First, the router at this hop may not return ICMP TIME_EXCEEDED messages. Second, some older routers may incorrectly forward packets even though the TTL is 0. A third possibility is that ICMP messages may be given low priority and may not be returned in a timely manner. Finally, beyond some point of the path, ICMP packets may be blocked.
Other routing problems may exist as well. In some instances traceroute will append additional messages to the end of lines in the form of an exclamation point and a letter. !H, !N, and !P indicate, respectively, that the host, network, or protocol is unreachable. !F indicates that fragmentation is needed. !S indicates a source route failure.
Two options control how much information is printed. Name resolution can be disabled with the -n option. This can be useful if name resolution fails for some reason or if you just don't want to wait on it. The -v option is the verbose flag. With this flag set, the source and packet sizes of the probes will be reported for each packet. If other ICMP messages are received, they will also be reported, so this can be an important option when troubleshooting.
Several options may be used to alter the behavior of traceroute, but most are rarely needed. An example is the -m option. The TTL field is an 8-bit number allowing a maximum of 255 hops. Most implementations of traceroute default to trying only 30 hops before halting. The -m option can be used to change the maximum number of hops tested to any value up to 255.
As noted earlier, traceroute usually receives a PORT_UNREACHABLE message when it reaches its final destination because it uses a series of unusually large port numbers as the destination ports. Should the number actually match a port that has a running service, the PORT_UNREACHABLE message will not be returned. This is rarely a problem since three packets are sent with different port numbers, but, if it is, the -p option lets you specify a different starting port so these ports can be avoided.
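The port numbering can be sketched as follows. This is an illustration, assuming the base port 33434 and the one-port-per-probe increment described in the text; the exact starting offset may differ between implementations, and the function name `probe_ports` is hypothetical.

```python
BASE_PORT = 33434  # typical default; changed with -p


def probe_ports(max_ttl, probes_per_hop=3, base_port=BASE_PORT):
    """Return {ttl: [destination ports]} for each set of probes.

    Every probe gets the next port number, so the probes within one
    set use consecutive, unlikely-to-be-used destination ports.
    """
    ports = {}
    next_port = base_port
    for ttl in range(1, max_ttl + 1):
        ports[ttl] = [next_port + i for i in range(probes_per_hop)]
        next_port += probes_per_hop
    return ports
```

If one of the computed ports happens to belong to a running service on the target, passing a different `base_port` shifts the whole range, which is what the -p option does.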
Normally, traceroute sends three probe packets for each TTL value with a timeout of three seconds for replies. The default number of packets per set can be changed with the -q option. The default timeout can be changed with the -w option.
Additional options control how packets are routed. See the manpage for details on these if needed.
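The options discussed above (-n, -m, -q, -w and -p) can be combined on one command line. The helper below is a sketch that only assembles the argument list; the function name `build_traceroute_command` and its parameter names are made up for illustration, and it assumes a Unix traceroute that accepts these flags as described in the text.

```python
def build_traceroute_command(host, numeric=False, max_hops=None,
                             probes=None, wait=None, base_port=None):
    """Assemble a traceroute argument list from the options in the text.

    -n  skip name resolution      -m  maximum number of hops (TTL)
    -q  probes per hop            -w  seconds to wait for a reply
    -p  starting destination port
    """
    if max_hops is not None and not 1 <= max_hops <= 255:
        # The TTL field is 8 bits wide, so 255 is the upper limit.
        raise ValueError("max_hops must be between 1 and 255")
    command = ["traceroute"]
    if numeric:
        command.append("-n")
    options = (("-m", max_hops), ("-q", probes), ("-w", wait), ("-p", base_port))
    for flag, value in options:
        if value is not None:
            command += [flag, str(value)]
    command.append(host)
    return command
```

The resulting list can be handed to `subprocess.run` from the standard library if you want to execute the trace from a script.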
Complications with traceroute
The information traceroute supplies has its limitations. In some situations, the results returned by traceroute have a very short shelf life. This is particularly true for long paths crossing several networks and ISPs.
You should also recall that a router, by definition, is a computer with multiple network interfaces, each with a different IP address. This raises an obvious question: which IP address should be returned for a router? For traceroute, the answer is dictated by the mechanism it uses to discover the route. It can report only the address of the interface receiving the packet. This means a quite different path will be reported if traceroute is run in the reverse direction.
Here is the output when the previous example is run again from what was originally the destination to what was originally the source, i.e., with the source and destination exchanged:
Tracing route to 126.96.36.199 over a maximum of 30 hops
1 132 ms 129 ms 129 ms pm3-02.ais-gwd.com [188.8.131.52]
2 137 ms 130 ms 129 ms sprint-cisco-01.ais-gwd.com [184.108.40.206]
3 136 ms 129 ms 139 ms 220.127.116.11
4 145 ms 150 ms 140 ms rcmt3-S4-5.sprintsvc.net [18.104.22.168]
5 155 ms 149 ms 149 ms sl-gw2-rly-5-0-0.sprintlink.net [22.214.171.124]
6 165 ms 149 ms 149 ms sl-bb11-rly-2-1.sprintlink.net [126.96.36.199]
7 465 ms 449 ms 399 ms sl-gw11-dc-8-0-0.sprintlink.net [188.8.131.52]
8 155 ms 159 ms 159 ms sl-infonet-2-0-0-T3.sprintlink.net [184.108.40.206]
9 164 ms 159 ms 159 ms atm4-0-10-mp.r01.ia-gnvl.infoave.net [220.127.116.11]
10 164 ms 169 ms 169 ms atm4-0-30.r1.scgnvl.infoave.net [18.104.22.168]
11 175 ms 179 ms 179 ms 22.214.171.124
12 184 ms 189 ms 195 ms e0.r02.ia-gnwd.Infoave.Net [126.96.36.199]
13 190 ms 179 ms 180 ms 188.8.131.52
14 185 ms 179 ms 179 ms 184.108.40.206
15 174 ms 179 ms 179 ms 220.127.116.11
There are several obvious differences. First, the format is slightly different because this example was run using Microsoft's implementation of traceroute, tracert. This, however, should present no difficulty.
A closer examination shows that there are more fundamental differences. The second trace is not simply the first trace in reverse order. The IP addresses are not the same, and the number of hops is different.
There are two things going on here. First, as previously mentioned, traceroute reports the IP address of the interface where the packet arrives. The reverse path will use different interfaces on each router, so different IP addresses will be reported. While this can be a bit confusing at first glance, it can be useful. By running traceroute at each end of a connection, a much more complete picture of the connection can be created.
Traceroute is the program that shows you the route over the network between two systems, listing all the intermediate routers a connection must pass through to get to its destination. It can help you determine why your connections to a given server might be poor, and can often help you figure out where exactly the problem is. It also shows you how systems are connected to each other, letting you see how your ISP connects to the Internet as well as how the target system is connected.
This tutorial was written for users of premium Usenet services, but can be useful for anyone wanting to learn to use traceroute.
- Running a traceroute
- Reading the output
- The reverse route
- Tracing from elsewhere
- Finding the problem: timeouts
- Finding the problem: long routes
- Finding the problem: high round-trip times
- Finding the problem: routing weirdness
- Finding the problem: using ping
- Under the hood
Running a traceroute
The traceroute program is available on most computers which support networking, including most Unix systems, Mac OS X, and Windows 95 and later.
On a Unix system, including Mac OS X, run a traceroute at the command line like this:
traceroute news.server.name
If the traceroute command is not found, it may be present but not in your shell's search path. On some systems, traceroute can be found in /usr/sbin, which is often not in the default user path. In this case, run it with the full path:
/usr/sbin/traceroute news.server.name
On Mac OS X, if you would rather not open a terminal and use the command line, a GUI front-end for traceroute (and several other utilities) called Network Utility can be found in the Utilities folder within the Applications folder. Run it, click the “Traceroute” tab, and enter an address to run a trace to.
MTR is an alternate implementation of traceroute for Unix. It combines a trace with continuing pings of each hop to provide a more complete report all at once.
If you're stuck with Windows, the command is called tracert. Open a DOS window and enter the command:
tracert news.server.name
You can also download VisualRoute, a graphical traceroute program available for Windows, Sparc Solaris, and Linux. VisualRoute helps you analyze the traceroute, and provides a nifty world map showing you where your packets are going (it's not always geographically accurate).
Reading the output
Here is some example traceroute output, from a Unix system:
traceroute to library.airnews.net (18.104.22.168), 30 hops max, 40 byte packets
1 rbrt3 (22.214.171.124) 4.867 ms 4.893 ms 3.449 ms
2 519.Hssi2-0-0.GW1.EWR1.ALTER.NET (126.96.36.199) 6.918 ms 8.721 ms 16.476 ms
3 113.ATM3-0.XR2.EWR1.ALTER.NET (188.8.131.52) 6.323 ms 6.123 ms 7.011 ms
4 192.ATM2-0.TR2.EWR1.ALTER.NET (184.108.40.206) 6.955 ms 15.400 ms 6.684 ms
5 105.ATM6-0.TR2.DFW4.ALTER.NET (220.127.116.11) 49.105 ms 49.921 ms 47.371 ms
6 298.ATM7-0.XR2.DFW4.ALTER.NET (18.104.22.168) 48.162 ms 48.052 ms 47.565 ms
7 194.ATM9-0-0.GW1.DFW1.ALTER.NET (22.214.171.124) 47.886 ms 47.380 ms 50.690 ms
8 iadfw3-gw.customer.ALTER.NET (126.96.36.199) 69.827 ms 68.112 ms 66.859 ms
9 library.airnews.net (188.8.131.52) 174.853 ms 163.945 ms 147.501 ms
Here, I am tracing the route to library.airnews.net, the news server name at Airnews. The first line of output is information about what I'm doing; it shows the target system, that system's IP address, the maximum number of hops that will be allowed, and the size of the packets being sent.
Then we have one line for each system or router in the path between me and the target system. Each line shows the name of the system (as determined from DNS), the system's IP address, and three round trip times in milliseconds. The round trip times (or RTTs) tell us how long it took a packet to get from me to that system and back again, called the latency between the two systems. By default, three packets are sent to each system along the route, so we get three RTTs.
Sometimes, a line in the output may have one or two of the times missing, with an asterisk where it should be:
9 host230-1.com (184.108.40.206) 12.619 ms * *
In this case, the machine is up and responding, but for whatever reason it did not respond to the second and third packets. This does not necessarily indicate a problem; in fact, it is usually normal, and just means that the system discarded the packet for some reason. Many systems do this normally. These are most often computers, rather than dedicated routers. Systems running Solaris routinely show an asterisk instead of the second RTT.
It's important to remember that timeouts are not necessarily an indication of packet loss. This is a common misconception, but since there are only three probes, dropping one response is no big deal.
Sometimes you will see an entry with just an IP address and no name:
1 220.127.116.11 (18.104.22.168) 0.858 ms 1.003 ms 1.152 ms
This simply means that a reverse DNS lookup on the address failed, so the name of the system could not be determined.
If your trace ends in all timeouts, like this:
12 al-fa3-0-0.austtx.ixcis.net (22.214.171.124) 84.585 ms 92.399 ms 87.805 ms
13 * * *
14 * * *
15 * * *
This means that the target system could not be reached. More accurately, it means that the packets could not make it there and back; they may actually be reaching the target system but encountering problems on the return trip (more on this later). This is possibly due to some kind of problem, but it may also be an intentional block due to a firewall or other security measures, and the block may affect traceroute but not actual server connections.
A trace can end with one of several error indications showing why the trace cannot proceed. In this example, the router is indicating that it has no route to the target host:
4 rbrt3.exit109.com (126.96.36.199) 35.931 ms !H * 39.970 ms !H
The !H is a “host unreachable” error message (it indicates that an ICMP error message was received). The trace will stop at this point. Possible ICMP error messages of this nature include:
- !H: Host unreachable. The router has no route to the target system.
- !N: Network unreachable.
- !P: Protocol unreachable.
- !S: Source route failed. You tried to use source routing, but the router is configured to block source-routed packets.
- !F: Fragmentation needed. This indicates that the router is misconfigured.
- !X: Communication administratively prohibited. The network administrator has blocked traceroute at this router.
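When scanning many traces, it can help to translate these annotations automatically. The sketch below assumes the flag letters used by common Unix traceroute implementations (!H, !N, !P, !S, !F, !X); the function name `explain_flags` is hypothetical, and other implementations may print additional or differently formatted flags.

```python
import re

# Flag letters as printed by common Unix traceroute implementations.
FLAG_MEANINGS = {
    "H": "host unreachable",
    "N": "network unreachable",
    "P": "protocol unreachable",
    "S": "source route failed",
    "F": "fragmentation needed",
    "X": "communication administratively prohibited",
}

FLAG_PATTERN = re.compile(r"!([A-Z])")


def explain_flags(line):
    """Return the meanings of the !-flags on one hop line, without repeats."""
    meanings = []
    for letter in FLAG_PATTERN.findall(line):
        meaning = FLAG_MEANINGS.get(letter, "unknown flag !" + letter)
        if meaning not in meanings:
            meanings.append(meaning)
    return meanings
```

Applied to the `!H` example line shown earlier, the function returns a single entry, "host unreachable", because the repeated flag is reported only once.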
Sometimes, with some versions of traceroute, you will see TTL warnings after the times:
6 qwest-nyc-oc12.above.net (188.8.131.52) 90.0 ms (ttl=251!) 90.0 ms (ttl=251!) 90.0 ms (ttl=251!)
This merely indicates that the TTL (time-to-live) value on the reply packet was different from what was expected. This probably means that your route is asymmetric (see below). This is not shown by all versions of traceroute, and can be safely ignored.
The output of the Windows version of traceroute is slightly different from the Unix examples (I have censored my router's name and IP address from the listing):
Tracing route to news-east.usenetserver.com [184.108.40.206] over a maximum of 30 hops:
1 3 ms 3 ms 2 ms my.router [xxx.xxx.xx.xxx]
2 35 ms 36 ms 35 ms rbtserv5.exit109.com [220.127.116.11]
3 36 ms 37 ms 36 ms rbrt3.exit109.com [18.104.22.168]
4 41 ms 40 ms 41 ms 571.Hssi5-0.GW1.EWR1.ALTER.NET [22.214.171.124]
5 42 ms 44 ms 52 ms 113.ATM2-0.XR1.EWR1.ALTER.NET [126.96.36.199]
6 43 ms 41 ms 41 ms 193.at-1-0-0.XR1.NYC9.ALTER.NET [188.8.131.52]
7 61 ms 41 ms 41 ms 181.ATM6-0.BR2.NYC9.ALTER.NET [184.108.40.206]
8 41 ms 42 ms 47 ms 220.127.116.11
9 47 ms 42 ms 42 ms so-6-0-0.mp2.NewYork1.level3.net [18.104.22.168]
10 65 ms 63 ms 68 ms loopback0.hsipaccess1.Atlanta1.Level3.net [22.214.171.124]
11 104 ms 68 ms 80 ms news-east.usenetserver.com [126.96.36.199]
Trace complete.
The Windows version does not show ICMP error messages in the manner described above. Errors are shown as (possibly ambiguous or confusing) text. For example, a “host unreachable” error will be shown as “Destination net unreachable” on Windows.
The rest of the examples will be in Unix format.
The reverse route
Any connection over the Internet actually depends on two routes: the route from your system to the server, and the route from that server back to your system. These routes may be (and often are) completely different (asymmetric). If they differ, a problem in your connection could be a problem with either the route to the server, or with the route back from the server. A problem reflected in a traceroute output may actually not lie with the obvious system in your trace; it may rather be with some other system on the reverse route back from the system that looks, from the trace, to be the cause of the problem.
So a traceroute from you to the server is only showing you half of the picture. The other half is the return route or reverse route. So how can you see that route?
In the good old days, you could use source routing with traceroute to see the reverse trace back to you from a host. The idea is to specify what is called a loose source route, which specifies a system your packets should pass through before proceeding on to their destination.
The ability to use loose source routing to see the reverse route could be pretty handy. Unfortunately, source routing has a great potential for abuse, and therefore most network administrators block all source-routed packets at their border routers. So, in practice, loose source routes aren't going to work.
These days, the only hope you likely have of running a reverse traceroute is if the system you want to trace from has a traceroute facility on their web site. Many systems, and Usenet providers in particular, have a web page where you can run a traceroute from their system back to yours. In combination with your trace to their system, this can give you the other half of the picture.
Tracing from elsewhere
It can also be useful to see the result of a traceroute from somewhere else on the net. There are many public traceroute pages available which let you trace from those systems to other systems or back to your own system. There is an exhaustive list at www.traceroute.org.
Since many systems are multi-homed (have more than one connection to the Internet), you may have to run traces to a system from multiple locations in order to “see” all of its connections. In addition to diagnosing technical problems, this can be useful to determine what kind of connections a system has to the Internet.
Finding the problem: timeouts
If your trace to a system ends in timeouts, and never completes, there could be a problem. (The other explanation is that a system is blocking traceroute attempts, either by filtering all ICMP messages or by other means.) Your next step is to figure out where the problem is.
Well, obviously, if the trace stops at a particular system and can't go any further, then that system is where the problem lies, right? Possibly, but not necessarily.
If your traceroute ends in timeouts at a certain system, it's likely that either the connection between that system and the next system on the route, or the next system itself, is the source of the problem. The system may be down, or the network connecting them may be down. You may just have to wait for the problem to be fixed, especially if the problem system is not at your ISP and thus you aren't a paying customer of that network.
The problem could, however, not be with that system. Recall that the packets must travel from your system to the router and back again before you can see the results, and that the return route may be different from the forward route. Thus, the problem could lie somewhere on the return route between the system giving the timeouts and your own system, and that problem may not be reflected in the previous parts of the trace because the route may be entirely different.
Let's say you have a timeout like this:
16 c1-pos5-3.snjsca1.home.net (188.8.131.52) 136.612 ms 129.795 ms 129.133 ms
17 bb1-pos6-0-0.rdc1.sfba.home.net (184.108.40.206) 130.473 ms 137.609 ms 134.162 ms
18 * * *
The last reachable system on the route is at hop 17. The problem may be with the system at hop 18, or with the network connection between hops 17 and 18. Or it may be on the return route. It's very possible that the routers at hop 17 and hop 18 have different return routes to your system. The return route from 17 may work just fine, while the return route from 18 has a problem. That problem could be with that system, or it could be a totally different system, many hops away. It could even be a problem at your own ISP. The only way to tell is to see the reverse trace. A reverse trace from hop 17 would be useful here as well, to verify that the routes are indeed different. Of course, it may be difficult or impossible to obtain traceroutes from those systems, because the network administrator at home.net would have to run them for you, and is probably too busy to worry about such a request.
In this case, you can try running traces to the target system from various other places (use the list at traceroute.org) to see if it is reachable from elsewhere. In the above example, if you knew what router was normally at hop 18 (from seeing it in previous traces), you could try a trace to that router from another site.
Finding the problem: long routes
If your route to a server is very long, performance is going to suffer. A long route can be due to less-than-optimal configuration within some network along the way. Take a look at this route:
traceroute to 220.127.116.11 (18.104.22.168), 30 hops max, 40 byte packets
1 main2-249-97.iad.above.net (22.214.171.124) 1.143 ms 0.559 ms 0.382 ms
2 core1-main2-oc3-1.iad.above.net (126.96.36.199) 0.574 ms 0.886 ms 0.429 ms
3 sjc-iad-oc12-1.sjc.above.net (188.8.131.52) 82.134 ms 82.537 ms 82.158 ms
4 sl-gw8-sj-0-1.sprintlink.net (184.108.40.206) 82.523 ms 82.383 ms 82.949 ms
5 sl-bb12-sj-6-0.sprintlink.net (220.127.116.11) 82.348 ms 82.762 ms 83.029 ms
6 sl-bb10-sj-8-0.sprintlink.net (18.104.22.168) 83.346 ms 83.012 ms 83.006 ms
7 sl-bb10-rly-6-0.sprintlink.net (22.214.171.124) 136.004 ms 135.804 ms 136.274 ms
8 sl-bb6-dc-0-0-0.sprintlink.net (126.96.36.199) 137.625 ms 137.204 ms 136.794 ms
9 gip-dc-2-fddi1-0.gip.net (188.8.131.52) 137.344 ms 138.156 ms 139.390 ms
10 gip-arch-1-atm2-0-0-132-atm.gip.net (184.108.40.206) 311.850 ms 325.246 ms 285.607 ms
11 gip-telehouse-1-atm0-0-0-333-atm.gip.net (220.127.116.11) 281.472 ms 291.957 ms 314.661 ms
12 gip-linx-fddi0.gip.net (18.104.22.168) 277.425 ms 297.364 ms 248.030 ms
13 linx-gw1.UK.EU.net (22.214.171.124) 291.800 ms 213.447 ms 221.377 ms
14 Nyk-nr01.NY.US.EU.net (126.96.36.199) 266.863 ms 301.220 ms 320.008 ms
15 nyc-core-02.inet.qwest.net (188.8.131.52) 206.191 ms 233.207 ms *
16 nyc-core-03.inet.qwest.net (184.108.40.206) 235.085 ms 270.805 ms 252.668 ms
17 nyc-core-01.inet.qwest.net (220.127.116.11) 281.931 ms 277.519 ms 278.152 ms
18 wdc-core-02.inet.qwest.net (18.104.22.168) 265.548 ms 233.789 ms 219.698 ms
19 wdc-core-03.inet.qwest.net (22.214.171.124) 200.913 ms 225.456 ms 246.335 ms
20 atl-core-01.inet.qwest.net (126.96.36.199) 237.049 ms 253.304 ms 215.435 ms
21 atl-edge-04.inet.qwest.net (188.8.131.52) 234.406 ms 289.490 ms 300.829 ms
22 184.108.40.206 (220.127.116.11) 296.876 ms 333.235 ms 272.397 ms
23 Adelphia-pvc55-t3-gw.aibusiness.net (18.104.22.168) 287.180 ms 268.736 ms 276.649 ms
24 surf4-145-237.pbc.adelphia.net (22.214.171.124) 382.868 ms 420.165 ms 393.398 ms
In this example, both the source and destination of the trace are in the United States. However, note that between hops 11 and 14, the route goes to London and back (LINX is the London Internet Exchange). Obviously, this is a problem; there are two transatlantic hops here which are completely unnecessary. Sprintlink is handing the traffic off to gip.net, which is taking it across the ocean before giving it to Qwest.
Finding the problem: high latency
Recall that the three numbers given on each line of output show the round trip times (latency) in milliseconds. Smaller numbers generally mean better connections. As the latency of a connection increases, interactive response suffers. Download speed can also suffer as a result of high latency (due to TCP windowing), or as a result of whatever is actually causing that high latency.
Typically, a modem connection's inherent latency will be around 120-130ms. The latency on an ISDN line is usually around 40-45ms. If you use a connection of this type, you won't see any better than these numbers.
If you see, in a trace output, a large “jump” in latency from one hop to the next, that could indicate a problem. It could be a saturated (overused) network link; a slow network link; an overloaded router; or some other problem at that hop. Of course, it could also be a problem anywhere on the return route from the high-latency hop as well. You can use the ping program (described below) to get a better idea of the latency as well as the packet loss to a given site or router; traceroute only does three probes per router (by default), which isn't a very good sample on its own.
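Spotting a latency jump can be automated for Unix-format output. The following is a rough sketch, not a diagnostic tool: the names `parse_hop` and `find_latency_jumps` and the 50 ms threshold are arbitrary choices for illustration, and it compares the median RTT of each hop with that of the previous responding hop.

```python
import re
from statistics import median

RTT_PATTERN = re.compile(r"(\d+(?:\.\d+)?) ms")
HOP_PATTERN = re.compile(r"^\s*(\d+)\s")


def parse_hop(line):
    """Return (hop_number, [rtts in ms]) or None if this is not a hop line."""
    hop_match = HOP_PATTERN.match(line)
    if hop_match is None:
        return None
    rtts = [float(value) for value in RTT_PATTERN.findall(line)]
    return int(hop_match.group(1)), rtts


def find_latency_jumps(lines, threshold_ms=50.0):
    """List (hop, increase_ms) where the median RTT rose by more than the threshold."""
    jumps = []
    previous = None
    for line in lines:
        parsed = parse_hop(line)
        if parsed is None or not parsed[1]:
            continue  # header line, or a hop that only timed out
        hop, rtts = parsed
        typical = median(rtts)
        if previous is not None and typical - previous > threshold_ms:
            jumps.append((hop, typical - previous))
        previous = typical
    return jumps
```

With only three probes per hop the median is a weak estimate, so a reported jump is a hint about where to point ping next rather than proof of a fault.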
A jump in latency can also indicate a long hop, such as a cross-country link or one that crosses an ocean. A long line is naturally going to have higher latency than a short one. For example:
4 core1.telehouse.level3.net (126.96.36.199) 2.355 ms 4.932 ms 3.473 ms
5 core1.London1.Level3.net (188.8.131.52) 2.550 ms 1.934 ms 3.110 ms
6 atm10-0-100.core1.NewYork1.Level3.net (184.108.40.206) 77.629 ms 75.664 ms 75.351 ms
The link between hops 5 and 6 is transatlantic, and thus is adding more than 70ms to the latency. This is normal.
Finding the problem: routing weirdness
One example of “weirdness” that you might see in traceroute output is exposure of private address space. Certain ranges of IP addresses are reserved for private, non-Internet use. These address ranges are not assigned to anyone, and are open for use by any system. They cannot be routed over the Internet, and thus are for internal use only. Sending traffic between private address space and outside networks must be done via internal routing or address translation.
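The reserved private address ranges are:
- 10.0.0.0 through 10.255.255.255 (10.0.0.0/8)
- 172.16.0.0 through 172.31.255.255 (172.16.0.0/12)
- 192.168.0.0 through 192.168.255.255 (192.168.0.0/16)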
Private addresses should never be visible over the Internet. But, sometimes you will see them in traceroute output. If they appear within your local network, this is okay; private addresses inside your own network can be visible to you. If, however, they appear within someone else's network, this can be problematic:
10 ebay-2-gw.customer.ALTER.NET (220.127.116.11) 114.204 ms 123.232 ms 120.957 ms
11 10.1.2.5 (10.1.2.5) 110.693 ms 114.475 ms 107.747 ms
12 * * *
13 * * *
The private address 10.1.2.5 within another network should not be visible to us. In this case, though, it is the last visible address before the trace ends in timeouts.
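Checking hop addresses against the private ranges is easy to script. This sketch uses Python's standard `ipaddress` module and tests explicitly against the three RFC 1918 blocks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); the function names `is_rfc1918` and `private_hops` are hypothetical.

```python
import ipaddress

# The three private IPv4 blocks reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the IPv4 address lies in a private RFC 1918 block."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def private_hops(hops):
    """Filter (hop_number, address) pairs down to those using private space."""
    return [(hop, address) for hop, address in hops if is_rfc1918(address)]
```

For the trace above, `private_hops([(11, "10.1.2.5")])` returns the hop unchanged, flagging it as private address space.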
Visibility of private IP addresses doesn't necessarily (or even usually) mean that the route does not work. It is often simply the way the administrators of the target network have set up their system. In fact, the output above, despite the private IP address and the timeouts, shows a route that works perfectly well for web access.
However, a route which includes private addresses is difficult to troubleshoot. You can't ping the private routers to see if there is any packet loss. You can't trace directly to them from other sites. And in general, they show a certain level of cluelessness in how the network is set up.
Here is another example of routing weirdness:
11 USW-phx-gw.customer.ALTER.NET (18.104.22.168) 142.840 ms 151.245 ms 129.564 ms
12 22.214.171.124 (126.96.36.199) 127.569 ms vdsla121.phnx.uswest.net (188.8.131.52) 185.214 ms *
13 vdsla121.phnx.uswest.net (184.108.40.206) 442.912 ms 205.956 ms 221.537 ms
14 vdsla121.phnx.uswest.net (220.127.116.11) 164.728 ms 186.997 ms 190.414 ms
15 vdsla121.phnx.uswest.net (18.104.22.168) 306.964 ms 189.152 ms 221.288 ms
All looks well until hop 12. At that hop, the first packet is replied to from 22.214.171.124, but the second and third (which should be coming from the same place) are being returned from a different address, and timing out, respectively. After that, hops 13, 14, and 15 are all showing the same address! Since the response times are actually different, though, we can guess that they are, in reality, different systems. The trace ends normally at hop 15.
So what the heck is going on here? US West says this is a security measure, to hide the details of their internal network. The last few hops all return the address of the end-user's ADSL line, rather than their actual address. I'm not entirely sure what kind of “security” this is meant to provide.
Obviously, this makes any kind of troubleshooting of this connection next to impossible. If you encounter problems in this situation, the best you can do is contact the network provider and let them deal with it.
Sometimes you might see a route start “looping” back and forth between two routers, until the 30-hop limit is reached. This is a routing loop. This usually means that one router has lost communication (BGP) with another, and thus has dropped that route. Since the router has lost the route it needs, it sends the packet back where it came from, thinking maybe that is the best route. That router knows better and sends it back to the other one, over and over. Here's an example of a loop:
14 hou-core-03.inet.qwest.net (126.96.36.199) 165.484 ms 164.335 ms 175.928 ms
15 hou-core-02.inet.qwest.net (188.8.131.52) 162.291 ms 172.713 ms 171.532 ms
16 kcm-core-01.inet.qwest.net (184.108.40.206) 212.967 ms 193.454 ms 199.457 ms
17 dal-core-01.inet.qwest.net (220.127.116.11) 206.296 ms 212.383 ms 189.592 ms
18 kcm-core-01.inet.qwest.net (18.104.22.168) 210.201 ms 225.674 ms 208.124 ms
19 dal-core-01.inet.qwest.net (22.214.171.124) 189.089 ms 201.505 ms 201.659 ms
20 kcm-core-01.inet.qwest.net (126.96.36.199) 334.19 ms 320.39 ms 245.182 ms
21 dal-core-01.inet.qwest.net (188.8.131.52) 218.519 ms 210.519 ms 246.635 ms
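Spotting a loop by eye is easy in a short trace, but the same check can be automated. The following is a minimal sketch, not part of any traceroute tool: the function name `find_routing_loop` and the idea of passing in one responder name per hop are assumptions made for illustration. A router that answers at several consecutive hops (as in the US West example) is deliberately not treated as a loop; only a router that reappears after a different one has answered is.

```python
def find_routing_loop(hops):
    """Return (first_index, repeat_index) for the first router that reappears
    after a different router has answered, or None if no loop is visible.

    `hops` holds one responder name or address per TTL, in trace order.
    Use None for hops that timed out ("* * *").
    """
    first_seen = {}   # responder -> index where it first answered
    previous = None   # last responder that actually answered
    for index, hop in enumerate(hops):
        if hop is None:
            continue
        if hop in first_seen and hop != previous:
            return first_seen[hop], index
        first_seen.setdefault(hop, index)
        previous = hop
    return None
```

Fed the router names from hops 14 through 21 of the trace above, this reports the pair of positions where kcm-core-01 first appears and where it comes back after dal-core-01.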
Finding the problem: using ping
The ping program is used to determine whether a route is experiencing packet loss, and to measure latency.
On a Unix SVR4 system (such as Solaris), use the command:
ping -s news.server.name
On BSD Unix, Mac OS X, or Linux, use:
ping news.server.name
And if you're stuck with Windows, open a DOS window and type:
ping -t news.server.name
The output will consist of one line per ping (one per second), giving you the round-trip response time (RTT, or latency). The lower, the better. Note that if you can't traceroute to a system due to administrative blocking, you may not be able to ping it either.
Let the pings go for a while, then press control-C to stop it. You'll see a summary like this, on Unix:
----usenet73.supernews.com PING Statistics----
76 packets transmitted, 76 packets received, 0% packet loss
round-trip (ms) min/avg/max = 138/144/179
Or like this, on Windows:
Ping statistics for 184.108.40.206:
Packets: Sent = 73, Received = 73, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 132ms, Maximum = 164ms, Average = 139ms
First you see an indication of packet loss. The more loss you see, the worse your connection will be, because every lost packet on a data connection must be retransmitted. If you see 20% packet loss, it's going to be painful. This number is more meaningful if you let ping run for a while; if you only do five pings, 20% packet loss means it dropped one packet, which could be no big deal. Let it go for a while.
Latency times are important for performance; the lower the better. If you play online games like Quake you are probably familiar with this concept. For Usenet reading, this will matter most if you read news online, interactively, staying connected to the server the whole time. If you use an offline newsreader which downloads articles all at once and lets you read them from your local disk, latency is much less important (it can affect sustained download speeds, but that is beyond the scope of this document). What the output is showing you is the minimum, average, and maximum latency times seen during the ping run. A few systems may include a fourth number showing the standard deviation.
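If you collect ping summaries from several runs, it can help to pull the numbers out automatically. The sketch below assumes the summary looks like the Unix sample shown earlier ("N packets transmitted, M packets received" and "min/avg/max = a/b/c"); other ping versions word these lines differently, and the function name `parse_ping_summary` is made up for this example.

```python
import re


def parse_ping_summary(text):
    """Extract packet counts, loss percentage and RTT figures from a
    Unix-style ping summary. RTT keys are present only if the
    min/avg/max line is found."""
    counts = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", text
    )
    if counts is None:
        raise ValueError("no packet counts found in ping summary")
    sent, received = int(counts.group(1)), int(counts.group(2))
    # Recompute the loss from the counts rather than trusting the rounded figure.
    loss = 100.0 * (sent - received) / sent if sent else 0.0
    result = {"sent": sent, "received": received, "loss_percent": loss}

    rtt = re.search(r"min/avg/max = ([\d.]+)/([\d.]+)/([\d.]+)", text)
    if rtt:
        result["min_ms"] = float(rtt.group(1))
        result["avg_ms"] = float(rtt.group(2))
        result["max_ms"] = float(rtt.group(3))
    return result
```

Keeping the raw `sent` count alongside the percentage makes it easy to see when a scary-looking loss figure comes from a run of only a handful of pings.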
If you see packet loss on a connection, you can use ping with your traceroute output to find the source of the loss. Start by pinging the next to last router in the trace. If you still see packet loss, ping the one before that. Eventually the packet loss will disappear, and you have found the part of the path where the problem begins.
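The backward walk just described can be written down as a short routine. This is only a sketch of the procedure: `measure_loss` is a hypothetical callable that you would supply (for example, a wrapper that pings a host and returns the percentage of packets lost), and `first_lossy_hop` is a name invented here.

```python
def first_lossy_hop(hops, measure_loss, threshold=0.0):
    """Walk back from the next-to-last hop and return the earliest hop
    that still shows packet loss.

    hops         -- responders in trace order, destination last
    measure_loss -- hypothetical callable: hop -> loss in percent
    threshold    -- loss at or below this value counts as "no loss"

    Assumes loss has already been observed when pinging the destination,
    so the destination is the initial suspect.
    """
    suspect = hops[-1]
    for hop in reversed(hops[:-1]):
        if measure_loss(hop) > threshold:
            suspect = hop   # loss is still visible this far back
        else:
            break           # loss has disappeared; stop walking back
    return suspect
```

A small nonzero `threshold` is worth considering in practice, since some routers give low priority to answering pings addressed to themselves and may show minor loss that does not affect forwarded traffic.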
Note, however, that as with other problems, the cause of the loss could be the first router on the path showing packet loss, or it could be anywhere on the return path from that router. Remember that the return path can be totally different from what you see in your trace output. But, this gives you a good place to start pointing fingers.
Under the hood
You don't need to worry about the low-level details of how traceroute works in order to use it. But, if you're interested, here they are.
Traceroute works by causing each router along a network path to return an ICMP (Internet Control Message Protocol) error message. An IP packet contains a time-to-live (TTL) value which specifies how long it can go on its search for a destination before being discarded. Each time a packet passes through a router, its TTL value is decremented by one; when it reaches zero, the packet is dropped, and an ICMP Time-To-Live Exceeded error message is returned to the sender.
The traceroute program sends its first group of packets with a TTL value of one. The first router along the path will therefore discard the packet (its TTL is decremented to zero) and return the TTL Exceeded error. Thus, we have found the first router on the path. Packets can then be sent with a TTL of two, and then three, and so on, causing each router along the path to return an error, identifying it to us. Eventually either the final destination is reached, or the maximum value (default is 30) is reached and the traceroute ends.
At the final destination, a different error is returned. Most traceroute programs work by sending UDP datagrams to some random high-numbered port where nothing is likely to be listening. When that final system is reached, since nothing is answering on that port, an ICMP Port Unreachable error message is returned, and we are finished.
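The TTL mechanism is easier to follow in a small simulation. The sketch below does not send any real packets (that would need raw sockets and usually root privileges); it models a path as a plain list of names, and both `send_probe` and `trace` are illustrative names rather than parts of any real traceroute.

```python
def send_probe(path, ttl):
    """Simulate one probe sent with the given TTL.

    `path` lists the routers in order, with the destination host last.
    Returns (responder, message), mimicking the ICMP error that would
    come back to the sender.
    """
    for router in path[:-1]:
        ttl -= 1                      # each router decrements the TTL
        if ttl == 0:
            return router, "time exceeded"
    # The probe reached the destination; nothing listens on the UDP port.
    return path[-1], "port unreachable"


def trace(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers
    or max_ttl is reached. Returns a list of (ttl, responder, message)."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = send_probe(path, ttl)
        hops.append((ttl, responder, message))
        if message == "port unreachable":
            break
    return hops
```

Calling `trace(["r1", "r2", "dest"])` yields one entry per TTL: the two routers each answer with "time exceeded", and the destination ends the trace with "port unreachable". A real traceroute sends several probes per TTL (three by default) and times each reply, which the simulation leaves out.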
The Windows version of traceroute uses ICMP Echo Request packets (ping packets) rather than UDP datagrams. In practice, this seems to make little difference in the outcome, unless a system along the route is blocking one type of traffic but not the other.
In the unlikely event that some program happens to be listening on the UDP port that traceroute is trying to contact, the trace will fail at the last hop. You can run another trace using ICMP Echo Requests, which will probably succeed, or specify a different target port for the UDP datagrams.
A few versions of traceroute, such as the one on Solaris, allow you to choose either method (high-port UDP or ICMP echo requests).
Jan 6, 2007 (SecurityFocus) Michal Zalewski (lcamtuf dione ids pl) (4 replies)
I'd like to announce the availability of a free security reconnaissance / firewall bypassing tool called 0trace. This tool enables the user to perform hop enumeration ("traceroute") within an established TCP connection, such as a HTTP or SMTP session. This is opposed to sending stray packets, as traceroute-type tools usually do.
The important benefit of using an established connection and matching TCP packets to send a TTL-based probe is that such traffic is happily allowed through by many stateful firewalls and other defenses without further inspection (since it is related to an entry in the connection table).
I'm not aware of any public implementations of this technique, even though the concept itself is making rounds since 2000 or so; because of this, I thought it might be a good idea to give it a try.
[ Of course, I might be wrong, but Google seems to agree with my assessment. A related use of this idea is 'firewalk' by Schiffman and Goldsmith, a tool to probe firewall ACLs; another utility called 'tcptraceroute' by Michael C. Toren implements TCP SYN probes, but since the tool does not ride an existing connection, it is less likely to succeed (sometimes a handshake must be completed with the NAT device before any traffic is forwarded). ]
A good example of the difference is www.ebay.com (220.127.116.11) - a regular UDP/ICMP traceroute and tcptraceroute both end like this:
14 as-0-0.bbr1.SanJose1.Level3.net (18.104.22.168) ...
15 ae-12-53.car2.SanJose1.Level3.net (22.214.171.124) ...
16 * * *
17 * * *
18 * * *
Let's do the same using 0trace: we first manually telnet to 126.96.36.199 to port 80, then execute: './0trace.sh eth0 188.8.131.52', and finally enter 'GET / HTTP/1.0' (followed by a single, not two newlines) to solicit some client-server traffic but keep the session alive for the couple of seconds 0trace needs to complete the probe.
The output is as follows:
17 184.108.40.206 <---
18 10.6.1.166 <--- new data
19 10.6.1.70 <---
The last three lines reveal firewalled infrastructure, including private addresses used on the inside of the company. This is obviously an important piece of information as far as penetration testing is concerned.
Of course, 0trace won't work everywhere and all the time. The tool will not produce interesting results in the following situations:
- Target's firewall drops all outgoing ICMP messages,
- Target's firewall does TTL or full-packet rewriting,
- There's an application layer proxy / load balancer in the way (Akamai, in-house LBs, etc),
- There's no notable layer 3 infrastructure behind the firewall.
The tool also has a fairly distinctive TCP signature, and as such, it can be detected by IDS/IPS systems.
Enough chatter - the tool is available here (Linux version):
Note: this is a 30-minute hack that involves C code coupled with a cheesy shellscript. It may not work on non-Linux systems, and may fail on some Linuxes, too. It could be improved in a number of ways - so if you like it, rewrite it.
Many thanks for Robert Swiecki (www.swiecki.net) for forcing me to finally give this idea some thought and develop this piece.
LFT, short for Layer Four Traceroute, is a sort of 'traceroute' that often works much faster (than the commonly-used Van Jacobson method) and goes through many configurations of packet-filters (firewalls). More importantly, LFT implements numerous other features including AS number lookups through several reliable sources, loose source routing, netblock name lookups, et al. What makes LFT unique? LFT is the all-in-one traceroute tool because it can launch a variety of different probes using both UDP and TCP layer-4 protocols. For example, rather than only launching UDP probes in an attempt to elicit ICMP "TTL exceeded" from hosts in the path, LFT can send TCP SYN or FIN probes to target arbitrary services. Then, LFT listens for "TTL exceeded" messages, TCP RST (reset), and various other interesting heuristics from firewalls or other gateways in the path. LFT also distinguishes between TCP-based protocols (source and destination), which make its statistics slightly more realistic, and gives a savvy user the ability to trace protocol routes, not just layer-3 (IP) hops. With LFT's verbose output, much can be discovered about a target network.
WhoB is a likable whois client (see whois(1)) designed to provide everything a network engineer needs to know about a routed IP address by typing one line and reading one line. But even so, it's worth typing a few more lines because WhoB can do lots of other cool things for you! It can display the origin-ASN based on the global routing table at that time (according to Prefix WhoIs, RIPE NCC, or Cymru), the 'origin' ASN registered in the RADB (IRR), the netname and orgname, etc. By querying pWhoIs, WhoB can even show you all prefixes being announced by a specific Origin-ASN. WhoB performs the lookups quickly, the output is easily parsed by automated programs, and it's included as part of the Layer Four Traceroute (LFT) software package. LFT uses WhoB as a framework (and you can too, quite easily--see whois.h). Recent LFT releases (as of version 2.5) include WhoB functionality through a standalone "whob" client/command placed in the LFT binary directory.
LFT and WhoB continue to evolve and provide more and more useful data to network engineers and to anyone else that cares how IP datagrams are being routed. With the advent of smarter firewalls, traffic engineering, QoS, and per-protocol packet forwarding, LFT and WhoB have become invaluable tools for many network managers worldwide.
LFT and WhoB are released under our open source license.
You can also download VisualRoute, a graphical traceroute program available for Windows, Sparc Solaris, and Linux. VisualRoute helps you analyze the traceroute, and provides a nifty world map showing you where your packets are going (it's not always geographically accurate). View a screenshot (I have obscured my local addresses).
Comments: 1/17/2007 09:14:44 PM, Maxim Veksler said...
The reason you were not able to traceroute all the way to yahoo.com is because people block ICMP traffic in their firewalls, while permitting TCP80 (HTTP). This means that ping won't work but firefox will.
At 1/19/2007 07:50:45 AM, devnet said...
That's why you can run traceroute with the -p option. You can then specify port number.
At 1/20/2007 09:00:09 AM, Anonymous said...
1.) yahoo.com is pingable for me (in the US) just fine. yahoo is probably blocking pings/ICMP traffic from your network.
2.) "dig yahoo.com" returns two ip addresses for me - 220.127.116.11 and 18.104.22.168. It would have been a good idea to try both.
At 1/20/2007 09:00:56 AM, Anonymous said...
Take a look at the 'mtr' utility too. Works in console as well, is somehow graphic and interactive and has more options. Basic usage is the same - 'mtr [ip_or_hostname]'.
At 1/20/2007 09:08:04 AM, Kagehi said...
Actually, no. ICMP packets, while directed at ports, are never the less *not* the same as normal TCP/IP packets. If the router/firewall is blocking them right, then changing the port *won't* have any effect on if it works (or at least shouldn't, since control packets are "control packets", not "data packets"). Now, its almost standard for Linux systems to be using TCP/IP to do it. In both cases you are waiting for some sort of message. ICMP can be specifically asked to give some information that TCP/IP "can't", which may make the trace "slightly" faster. I think you could say, "Hello C, I want to send something to Z, tell me how long it took for the packet to be received by D.", where the path is A->B->C->D->E->etc. With TCP/IP all you can do is keep lengthening the time out, then wait to see if the returning data says it got some place the last one didn't. In principle, its the same result, but the method is very different.
That said, if you are not using Linux, you are screwed with a internet router, because most have firewalls, and most block ICMP by default. Tracert for Windows, up to XP Pro, has "no" ability to change ports and uses ICMP, which is blocked. In fact, since ICMP is a control protocol, I am not sure its even "possible" when using it to set a different port, since only the routers know what to do with them and all should be using the same port for that operation, unlike TCP/IP, which, as I said, is a *completely different* protocol. Heck, even Windows Firewall has no specific "Block port X, which is used by ICMP." It specifically has entirely separate settings "just for" ICMP and what its allowed to do, if it can be received or sent, by how to where, etc.
Wish I could find a "free" TCP/IP based one for Windows though. I have to help some people out once in a while running Windows and its impossible to tell the difference between the modem disconnecting from the ISP and a firewall blocking the ICMP based trace... :(
Copyright © 1996-2018 by Dr. Nikolai Bezroukov. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.
This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...
The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: September 12, 2017