Troubleshooting 101: Packet Captures and TFTP
By Jin Qian
QA engineers like myself spend a lot of time validating our own products, but in our jobs we also end up troubleshooting plenty of other network and data center devices. When you’ve been doing this kind of work long enough, the process of solving these problems becomes much more automatic, but I thought it might be useful for some readers to walk through the steps I took during a recent troubleshooting job on a piece of network equipment we use in our lab. That way, you can see the logic I followed and the tools I used.
You’ll also see that troubleshooting is recursive: you follow the same basic steps over and over as you solve parts of the problem, but you keep combining the steps in different ways based on the information you collect and what your knowledge and experience tell you. Maybe my approach will help you the next time you’re faced with a networking problem that’s giving you trouble.
Try Something, Collect Information, Think, Repeat
When a problem happens, the first question to ask is which network element is responsible for the problem. To do this, you want to capture and analyze data to figure out what went wrong, and then think about how to correct the fault or work around it. For simple network problems, you need a packet-capture tool; my favorite is Wireshark.
In this case, I used Wireshark to troubleshoot a Trivial File Transfer Protocol (TFTP) failure. I had the task of bringing up a specialized embedded device that we use for QA and development. All I had was a .zip file with various OS kernel images for the device to use. I didn’t know what IP address the device would use, whether it would use TFTP or FTP to get the images, or what server IP address it would look for.
Round 1
- Try something: I hooked up my laptop to the device with an Ethernet cable, started Wireshark, and then powered up the device.
- Collect information: From Wireshark, I could clearly see that the device used the IP address 192.168.0.5, and that it was trying to use Address Resolution Protocol (ARP) to find the address 192.168.0.1.
- Think: How can I use this data to make the device do what I want? It was easy to assign the correct IP address to my laptop’s Ethernet port.
- Result: I could see that the ARP completed, and that the device then used TFTP to ask for a file.
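The ARP request in Round 1 carries both addresses that mattered here: the sender's IP (the device) and the target IP (the server it expects). As a rough sketch of what Wireshark decodes from that packet, the Python below parses a 28-byte IPv4-over-Ethernet ARP payload. Only the two IP addresses come from the capture; the MAC address, the function name, and the sample packet are made up for illustration.

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"  # 28 bytes for IPv4 over Ethernet

def parse_arp(payload: bytes) -> dict:
    """Decode the fields of an IPv4-over-Ethernet ARP payload."""
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:28]
    )
    return {
        "operation": {1: "request", 2: "reply"}.get(oper, str(oper)),
        "sender_mac": sha.hex(":"),
        "sender_ip": socket.inet_ntoa(spa),
        "target_ip": socket.inet_ntoa(tpa),
    }

# Build a sample request like the one in the capture (the MAC is hypothetical).
sample = struct.pack(
    ARP_FORMAT,
    1,                                # hardware type: Ethernet
    0x0800,                           # protocol type: IPv4
    6, 4,                             # MAC and IPv4 address lengths
    1,                                # operation: request
    bytes.fromhex("020000000005"),    # sender MAC (made up)
    socket.inet_aton("192.168.0.5"),  # sender IP: the device
    bytes(6),                         # target MAC is unknown in a request
    socket.inet_aton("192.168.0.1"),  # target IP: the server it wants
)

info = parse_arp(sample)
print(info["sender_ip"], "is asking who has", info["target_ip"])
```

The target IP is the address the laptop has to take on before the device will get an ARP reply and move on to the file transfer.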
Round 2
- Try something: I used Google to find a TFTP/FTP server from 3Com, then downloaded it and pointed the server to the directory that held all the kernel images (which I had unzipped). I thought that would be the end of it — easiest troubleshooting job ever! — but to my surprise the TFTP server did not respond to the TFTP read request. Hmm . . .
- Collect information: Wireshark showed that the TFTP server didn’t send any packets back to the TFTP client when the read request came through.
- Think: What could be holding things up? Possibility #1: Maybe the TFTP server isn’t listening on the right UDP port.
- Try something/collect information: Using the command “netstat -an” I checked my Windows 7 laptop to make sure that the TFTP server was indeed listening on 0.0.0.0:69. It was.
- Think: What could be holding things up? Possibility #2: Maybe a firewall is getting in the way?
- Try something/collect information: I tried to turn off the Windows firewall, but I couldn’t get it to turn off. A lot of Googling turned up plenty of advice for disabling it, but none of it worked.
- Result: Finally I had to give up on that machine because of the firewall problem, so I started up my old Windows XP laptop.
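For reference, the read request that went unanswered in Round 2 is a very small packet: a 2-byte opcode, the file name, and the transfer mode, each string ending in a zero byte (the layout defined in RFC 1350). The sketch below builds and parses one in Python. The file name "kernel.img" and the helper names are hypothetical, since the post doesn't say which file the device asked for.

```python
import struct

OP_RRQ = 1  # read request opcode in RFC 1350

def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode, filename, NUL, mode, NUL."""
    return (
        struct.pack("!H", OP_RRQ)
        + filename.encode("ascii") + b"\0"
        + mode.encode("ascii") + b"\0"
    )

def parse_rrq(packet: bytes) -> tuple:
    """Return (filename, mode) from a TFTP read request."""
    (opcode,) = struct.unpack("!H", packet[:2])
    if opcode != OP_RRQ:
        raise ValueError(f"not a read request (opcode {opcode})")
    fields = packet[2:].split(b"\0")
    filename, mode = fields[0], fields[1]
    # The mode string is case-insensitive, so normalize it.
    return filename.decode("ascii"), mode.decode("ascii").lower()

request = build_rrq("kernel.img")  # hypothetical file name
print(parse_rrq(request))
```

A working server answers a request like this with DATA block 1 or an error packet. Seeing nothing at all come back, as in the capture, suggests the request never reached the server process, which is why the listening port and the firewall were the first things to check.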
Round 3
- Try something: I repeated the same steps as before: assigning the IP address, installing the TFTP server, pointing it to the directory containing the kernel images, and turning off the firewall. (Since I knew exactly what I needed to do this time, all of this went much faster.) Now I could see the TFTP server responding to the TFTP client’s request. This made me happy, but not for long, because the transfer would stop halfway through one big file. I tried it a couple of times and always got the same result.
- Collect information: Wireshark showed that after transmitting 32,767 data blocks, the TFTP server kept retransmitting the 32,768th block, even though the TFTP client correctly ACKed that block.
- Think: At this point, I was convinced that it was a server-side problem, and started to think there might be a signed short vs. unsigned short issue (from experience, I remembered that 32,767 is the largest value a 16-bit signed short can hold, while TFTP block numbers are 16-bit unsigned values). This made me suspect that there was a bug in the TFTP server’s code.
- Result: I went to look for a better TFTP server.
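To make the Round 3 suspicion concrete, here is a small Python sketch of how a signed 16-bit block counter could produce exactly this symptom. It is only a guess at the mechanism, not the server's actual code: the function names and the comparison logic are assumptions. TFTP block numbers are unsigned 16-bit values on the wire, so a counter stored as a signed short wraps to a negative number at block 32,768 and no longer matches the number in the client's ACK.

```python
import struct

def as_signed_16(value: int) -> int:
    """Reinterpret the low 16 bits of value as a signed short."""
    (signed,) = struct.unpack("!h", struct.pack("!H", value & 0xFFFF))
    return signed

def ack_matches(server_counter: int, acked_block: int, buggy: bool) -> bool:
    """Compare the server's block counter with the block number in an ACK.

    With buggy=True the counter is held in a signed short, which is the
    suspected (not confirmed) fault in the server.
    """
    stored = as_signed_16(server_counter) if buggy else server_counter & 0xFFFF
    return stored == acked_block

for block in (32767, 32768):
    print(
        block,
        "signed counter:", as_signed_16(block),
        "buggy match:", ack_matches(block, block, buggy=True),
        "correct match:", ack_matches(block, block, buggy=False),
    )
```

Under these assumptions the ACK for block 32,767 is accepted, but the ACK for block 32,768 never matches the wrapped counter, so the server would keep retransmitting that block, which is the behavior seen in the capture.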
Round 4
- Try something: I Googled some more and got an updated version of the 3Com TFTP server, which fixed the signed-short problem.
- Collect information: Now a new problem presented itself: the TFTP transfer would start, but it would not continue. Once again, I used packet capture to figure out that the TFTP server sent each data block as fragmented IP packets. The embedded device obviously had trouble handling the fragments, since it didn’t reply with ACKs.
- Think: What’s causing the fragmentation? Maybe the maximum transmission unit (MTU) on the laptop?
- Collect information: I realized that the MTU was set to 1360 bytes rather than the usual Ethernet default of 1500 bytes.
- Take action: The solution was easy: edit the Windows registry to change the MTU on the laptop’s network interface controller (NIC), then reboot the laptop.
- Result: Voila! The TFTP process completed beautifully, and the embedded device booted right up and got on with its business.
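The arithmetic behind the Round 4 fragmentation can be sketched in a few lines of Python. One caveat: the post doesn't say what block size the transfer used. The classic 512-byte TFTP block fits easily within a 1360-byte MTU, so this sketch assumes a larger negotiated block size. The value 1468 is hypothetical, chosen because it exactly fills a 1500-byte MTU once the 20-byte IPv4, 8-byte UDP, and 4-byte TFTP headers are added.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
UDP_HEADER = 8    # bytes
TFTP_HEADER = 4   # bytes: 2-byte opcode + 2-byte block number

def fragment_sizes(block_size: int, mtu: int) -> list:
    """Return the IP payload bytes carried by each fragment of one DATA packet."""
    remaining = UDP_HEADER + TFTP_HEADER + block_size
    max_payload = mtu - IP_HEADER
    if remaining <= max_payload:
        return [remaining]  # fits in a single, unfragmented packet
    # Fragment offsets are counted in 8-byte units, so every fragment
    # except the last must carry a multiple of 8 bytes.
    per_fragment = max_payload - (max_payload % 8)
    sizes = []
    while remaining > max_payload:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)
    return sizes

ASSUMED_BLOCK_SIZE = 1468  # hypothetical; not stated in the post

for mtu in (1500, 1360):
    sizes = fragment_sizes(ASSUMED_BLOCK_SIZE, mtu)
    print(f"MTU {mtu}: {len(sizes)} packet(s), IP payloads {sizes}")
```

With these assumed numbers, each data block goes out as one packet at an MTU of 1500 but is split into two fragments at 1360, which would match the capture and explain why restoring the default MTU let the device receive whole blocks again.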
This was a fairly straightforward troubleshooting problem, but the same pattern of thought and action that helped me solve this one can help with even the toughest problems. The key is to collect relevant information, use it to help you isolate one step of the problem, and then use your training and experience to solve that one step. As you repeat the process, you shrink the big problem down to size.
Please feel free to share your favorite troubleshooting tips in the comments.