Who wants to help me troubleshoot a slow network connection?

[Solved! See the update at the bottom of the post.]

Over the years, I’ve spent more hours than I care to think about delving into the innards of network connections. Sometimes, the solution to slow network throughput is as simple as swapping a cable or updating a driver. But sometimes the problem is more baffling.

It’s so baffling, in fact, that I’m posting this here in hopes that a networking expert (maybe even someone from HP, Intel, or Microsoft) will be able to explain exactly what’s happening.

Over the weekend, I picked up a new HP Pavilion Elite m9600t with a Core i7-920 processor. I wiped away the messy Windows Vista installation and replaced it with a clean copy of Windows 7 Ultimate. After a few updates everything appeared to be working fine, until I tried to download a few large files from a server on my local network and discovered that the onboard Intel 82567V-2 Gigabit Ethernet adapter was delivering truly abysmal speeds.

Copying files from the new PC to any other network location was impressively fast. Here’s what the file transfer dialog box looked like for a file copy to the Public folder on that Windows Home Server box:

fast_throughput

That’s truly impressive throughput, with that 4.36GB file (a recorded TV program) copying in under 80 seconds.

But when I reversed the operation and tried to copy that same file to the local PC, the throughput dropped by more than 97%, to roughly 2 MB/sec. I tried different files and folders on different PCs, with similarly depressing results. In some cases transfer speeds were slower than I get on an Internet connection. Yikes!

slow_throughput
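For a rough sense of scale, the arithmetic behind those two transfers can be sketched in a few lines of Python. The function names are illustrative, and the sketch assumes 1 GB = 1024 MB (the way Windows reports file sizes). Because the fast copy finished in "under 80 seconds," the fast rate computed here is a floor, not an exact figure.

```python
def throughput_mb_per_s(size_gb: float, seconds: float) -> float:
    """Average throughput in MB/s, assuming 1 GB = 1024 MB."""
    return size_gb * 1024 / seconds


def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Time needed to move size_gb at a steady rate_mb_per_s."""
    return size_gb * 1024 / rate_mb_per_s


def percent_drop(fast_rate: float, slow_rate: float) -> float:
    """How much slower slow_rate is than fast_rate, as a percentage."""
    return (1 - slow_rate / fast_rate) * 100


fast = throughput_mb_per_s(4.36, 80)          # floor for the fast direction
slow_minutes = transfer_seconds(4.36, 2) / 60  # same file at roughly 2 MB/sec

print(f"{fast:.1f} MB/s")                      # 55.8 MB/s
print(f"{slow_minutes:.1f} minutes")           # 37.2 minutes
print(f"{percent_drop(fast, 2):.1f}% slower")  # 96.4% slower
```

With the 80-second upper bound, the drop works out to at least 96.4%; a copy that finished in about 67 seconds or less would put it past 97%. Either way, the same file that took under a minute and a half in one direction needs more than half an hour in the other.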

This gave me an opportunity to try most of the obvious (and some not-so-obvious) troubleshooting solutions. I’ll write about the details of that process later, but suffice it to say that upgrading to the most recent drivers, forcing the link speed into Full Duplex Gigabit mode, tweaking Windows TCP auto-tuning settings, enabling jumbo frames, and removing or disabling various Windows networking services did no good whatsoever.

Eventually, I zeroed in on some esoteric settings for the Ethernet adapter, available from the properties dialog box in Device Manager.

intel_tweaking

Through trial and error, I found that adjusting three settings “unblocked” the connection and allowed receive speeds to zoom to the levels I was seeing in the other direction:

  • Adaptive Inter-Frame Spacing: This setting is disabled by default; enabling it, according to the help text, “compensates for excessive Ethernet packet collisions by dynamically controlling back-to-back timing.”
  • Flow Control: The default setting is RX and TX Enabled, which means that the adapter responds to and generates flow control frames that tell the other end of the connection to wait. I set it to Tx Only.
  • Interrupt Moderation Rate: This setting “moderates or delays the generation of interrupts … to optimize network throughput and network utilization.” Given that this system has a kick-ass i7 with four cores and eight processing threads, I figured I could spare some CPU cycles, so I changed this setting from the default (Adaptive) to Off.
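To put the inter-frame spacing and interrupt moderation settings in perspective, here is a back-of-the-envelope sketch in Python. It uses the standard Ethernet framing overheads and assumes full-size 1500-byte payloads (no jumbo frames) and, as a worst case, one interrupt per received frame when moderation is off. Real drivers may behave differently, so treat the numbers as an illustration rather than a measurement of this adapter.

```python
# Per-frame overhead on the wire for standard Ethernet, in bytes.
PREAMBLE_AND_SFD = 8   # preamble plus start-of-frame delimiter
HEADER = 14            # destination MAC, source MAC, EtherType
FCS = 4                # frame check sequence
INTERFRAME_GAP = 12    # minimum idle gap between frames, in byte times


def frames_per_second(link_bits_per_s: float, payload_bytes: int = 1500) -> float:
    """Maximum frame rate on a link when every frame carries payload_bytes."""
    wire_bytes = PREAMBLE_AND_SFD + HEADER + payload_bytes + FCS + INTERFRAME_GAP
    return link_bits_per_s / (wire_bytes * 8)


def interframe_gap_ns(link_bits_per_s: float) -> float:
    """Duration of the minimum inter-frame gap in nanoseconds."""
    return INTERFRAME_GAP * 8 * 1e9 / link_bits_per_s


GIGABIT = 1_000_000_000

print(round(frames_per_second(GIGABIT)))  # 81274
print(interframe_gap_ns(GIGABIT))         # 96.0
```

Under those assumptions, a saturated Gigabit link delivers roughly 81,000 full-size frames per second, which is the upper limit on interrupts per second with moderation switched off. The minimum gap between frames at that speed is 96 nanoseconds.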

With these settings in place, receive speeds shot up dramatically, to rates that were exactly what I expected from a Gigabit Ethernet connection on a system with fast disks and controllers on either end.

But when the system resumed from sleep or restarted after being shut down, performance was back at those depressingly low levels again, which led to another round of troubleshooting. The settings I had made to the adapter still appeared to be in place when I checked its properties in Device Manager, but it was behaving as though the default settings were in force. After going down more dead ends and through more experimentation, I discovered a remarkable fix: If I restore the default performance settings using the Advanced Adapter Settings dialog box (clicking OK to reset the adapter) and then manually change the adapter settings back to my tweaked setup, performance returns to the speedy levels I expect.

This is completely reproducible. I’m assuming that somehow, when the network adapter wakes up after sleeping or a shutdown, it is loading its default performance settings rather than the ones I saved previously. As a workaround, I can do this Advanced Settings fandango every time the machine restarts or resumes from sleep, but that is going to get very old, very fast. I’m also considering disabling the onboard network adapter and installing a separate, non-Intel adapter in my one remaining PCI-Express slot. That’s $25 I’d rather not spend, but it’s the logical solution if I can’t find and fix the real cause.

So, what about it, networking experts? Have you ever seen anything like this? I’ll send an autographed copy of Windows 7 Inside Out to the first person who comes up with a successful solution (or at least a detailed explanation of why this is happening).

Update: Thanks to commenter BFT for insisting that I look more carefully at the network switch. When I tested connectivity using a straight-through Ethernet cable to connect two PCs directly, I was unable to replicate the throughput problems. That suggests that the problem is somewhere in the networking hardware itself. Switching to a different cable and using a different port on the switch solved the problem completely. The system now resumes from sleep with full network speeds. In addition, I restored the default settings to the network adapter and found that throughput increased by about 10%.
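A direct test like this can also be run at the TCP level, which takes disks and file sharing out of the picture. The Python sketch below is a hypothetical helper, not the method used for the test above. It runs both ends on one machine over loopback purely to show the mechanics; `measure_throughput`, the 64 KB chunk size, and the 8 MB default transfer are all illustrative choices. To test a real link, the receiving half would have to run on the second PC.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call; an arbitrary illustrative value


def _receive_all(server, result):
    """Accept one connection and count bytes until the sender closes it."""
    conn, _addr = server.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:  # an empty read means the sender closed the connection
                break
            total += len(data)
    result["bytes"] = total


def measure_throughput(total_bytes=8 * 1024 * 1024, host="127.0.0.1"):
    """Send total_bytes over one TCP connection and return the rate in MB/s."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((host, 0))  # port 0 lets the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]

        result = {}
        receiver = threading.Thread(target=_receive_all, args=(server, result))
        receiver.start()

        payload = bytes(CHUNK)  # a buffer of zero bytes
        start = time.perf_counter()
        with socket.create_connection((host, port)) as client:
            sent = 0
            while sent < total_bytes:
                n = min(CHUNK, total_bytes - sent)
                client.sendall(payload[:n])
                sent += n
        receiver.join()  # wait until the receiver has drained everything
        elapsed = time.perf_counter() - start

    if result.get("bytes") != total_bytes:
        raise RuntimeError("receiver did not get every byte")
    return total_bytes / (1024 * 1024) / elapsed


if __name__ == "__main__":
    print(f"{measure_throughput():.0f} MB/s over loopback")
```

Running the sender on each PC in turn would show whether the slowdown follows one direction of the link, as it did here, independent of Windows file copying.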

BFT, use the contact form in the sidebar to send me your contact information so I can get your signed copy of the book to you!

Another update: In response to some questions via Twitter and in the comments, here’s my theory of what happened. I never swapped cables as part of the troubleshooting. Intel’s network adapter control panel has a cable test that told me this cable was good. I assumed (incorrectly) that the fact I could get decent transfer speeds in both directions with the right settings was evidence there was no problem with the cable.

My theory is that the defective cable was causing the switch to get an improper signal at power-on, so the switch was defaulting to slow Ethernet mode and not auto-sensing the Gigabit Ethernet connection. Adjusting the software settings and forcing the adapter to reset also forced the switch to reset.

Bottom line, I think the culprit was mostly the cable, which in turn was causing the switch to behave incorrectly.

And one more PS: This is yet another example of a problem that appeared to be Windows-related but eventually was traced to the simplest of hardware connections. For previous examples, see here and here. This is why I am always reluctant to point a finger at any hardware or software maker until I have all the facts.

34 thoughts on “Who wants to help me troubleshoot a slow network connection?”

  1. You can write a script to automate the process for the time being, but I don’t know the script-fu that would do that.

    Have you tried using a generic driver over the newest, most specific one? Doubt this would work, but it is a suggestion.

  2. I’ve tried four separate drivers, including the most generic one from 2008. No good.

    I’m looking into scripts but the problem is that they have to force the adapter to accept the settings and restart not once but twice. I can’t see an easy way to do that.

  3. These setting changes could be masking the real issue. Just to have the full picture, what type of switch is it plugged into?

    BT

  4. While you’re at it, also validate if there are any mainboard BIOS updates from HP. I’ve seen some grumbling on the Ubuntu forums related to this specific chipset causing issues with flow control on some BIOS out there.

  5. No, those netsh tweaks don’t make any difference.

    No, turning off IPv6 doesn’t make any difference.

    It’s plugged into a generic Belkin gigabit switch that works perfectly with five other systems.

    BIOS is the most recent, version 5.24.

  6. If I had a dime for every user who said “it works perfectly with all my other systems…” 😉

    To clarify, Belkin switch model/rev?

  7. Ed, I know this won’t help, but I have two PCs that I built with exactly the same motherboard in them. One was running SBS 2008 and the other was running the RC of Windows 7 Ultimate. One day, Windows Update wanted me to download a new driver on the Win7 Beta box. I did so and had the same issue you have experienced. I rolled back to the old driver that shipped with the RC version and network speeds returned to normal. When I “nuked” the Win7 Beta box and installed the RTM version of Win7 Ultimate, the problem returned and I could not find a driver to fix it. The motherboards were ASUS and were not Intel-integrated NICs. I looked all over the Internet for suggestions and finally found evidence others were having the same problem and the solution was a new NIC. I went out and bought a Netgear Gigabit NIC and all has been well. In my case, the issue had to be a bad driver from MS. I never had the problem on my SBS 2008 box.

  8. I know this isn’t ideal, but I’d get a $15 PCI or USB NIC and call it a day. I had to do this with my XPS 420 to fix dropouts in my recordings from my HDHR. At least you’d know for sure if it was the NIC driver or hardware.

  9. Speaking of things that won’t help, I’m still on the path that it could be the flow control settings on the NIC interacting poorly with the switch. To test that theory, have you tried a straight PC NIC-to-NIC connection using static IPs to validate performance? I know it won’t fix the problem, but it could isolate the culprit? (probably still need a new NIC though, until next month’s latest driver update from Intel…)

    Sorry can’t be more help…back to work.

  10. not sure if this would help, but on the power management tab of the nic in device manager, i always uncheck the box that says allow the computer to turn this device off to save power. not really any need on a desktop and it was causing me issues all through the beta in win7.

    i’ve heard flow control wasn’t needed on the newer switches, but i don’t know for sure.

    do you have all pc’s in the same switch or router?

  11. Ed – perhaps your power management settings are resetting the performance of the NIC? I’m not the hardware guru I used to be… but for a free copy of your book, I thought I’d throw my hat in the ring. -EC

  12. Had the same problem in Windows Vista with this adapter. It appeared that the router powers up the line from its side only when the adapter on the other side does so. But this adapter wanted to do the same thing! In other words, they both waited for each other to start initialization 🙂

    So, after I told the router to keep the line always powered on, all the problems were gone. I found this by playing with different power settings and link speed settings for this Intel adapter.

  13. The entire PC is obviously defective and should be sent to me. I will write you an autographed check to cover the postage.

  14. Just getting more info here. When you set the Flow Control to Tx only, did you still get the same speeds when copying files “to” the WHS box?

  15. Ed: This may seem like a crazy suggestion. Wipe clean again and install XP and run the same test and I’ll bet you don’t have the same problem. I do a lot of file work and Windows 7 frustrates me all the time with erratic file performance.

  16. Thanks for all the suggestions.

    I’ve tried all combinations of power settings with no luck, and this is going through a simple switch, not a router. All PCs are on the same switch, which is not a Belkin but an EnGenius Gigabit Switch that has performed well for more than a year. I will be testing with a straight-through cable today.

    Installing XP is not an option, thanks. I do a lot of file work as well and have found Windows 7 to be an ideal network citizen, which is what makes this so odd.

    Mike, when I set flow control to Tx only, file speeds remain excellent on the transfer side.

    Rick, nice try. 😉

    Bill, I have a new NIC on order.

  17. I’d replace the NIC on that system too. Read the comments from the purchasers of this add-in NIC card: http://www.newegg.com/Product/Product.aspx?Item=N82E16833106121 (I recommend this NIC card BTW…) Seems a lot of the time, on-board NICs are really step-child like devices. Sure it’s a driver problem, and one that will never get fixed due to the motherboard company not caring about such a small part of their product.

    What always impressed me about ethernet is that the whole concept is purely based on theory. The fact it works at all seems more like luck than true engineering 😛

  18. Narg, that is a nice card. I even have one here. Trouble is, it’s PCI and this system has only PCI Express slots.

    Anyway, after swapping cables and moving to a different port on the switch, all is well and this onboard adapter is working at top speed, so I’m happy.

  19. “Switching to a different cable and using a different port on the switch solved the problem completely.”

    But which was the problem? One or both? You talked about swapping a cable in the first paragraph, so it shouldn’t be just that. If it’s the port and it’s not outright defective, why don’t the “flow control settings on the NIC” interact equally poorly with all ports on the switch? They should all be the same.

  20. Rick, I never swapped cables as part of the troubleshooting (my bad). Intel’s network adapter control panel has a cable test that told me this cable was good. I assumed (again, my bad) that the fact I could get decent transfer speeds in both directions with the right settings was evidence there was no problem with the cable.

    My theory is that the switch was not getting a proper signal at power-on, so the switch was defaulting to slow Ethernet mode and not auto-sensing the Gigabit connection. Adjusting the software settings and forcing the adapter to reset also forced the switch to reset.

    Bottom line, I think it was mostly the cable, which in turn was causing the switch to behave incorrectly.

  21. as i told 🙂

    I think it’s a problem with Intel drivers. They have to fix them to detect the line correctly. Actually, it is the adapter’s job to say ‘hello, I’m here!’, not vice versa.

  22. Alex, I think your issue was similar, but in this case the culprit was physical. The adapter was trying to announce its presence and suggest a speed, but the cable was not cooperating. No amount of playing with power settings and link speeds on the PC side would fix it, and with a dumb switch there are no configurable options on that side of the connection.

  23. Having once solved a network connection problem by switching the Ethernet cable end for end (after reinserting each end separately), I’m prepared to believe almost anything.

    Why did I try end for end? (a) I’m a (former) sailor. (b) It beat two floors down and two back up to get a replacement cable.

  24. I was thinking the same thing as BFT but never thought to post it. I’ve had interesting situations with bad cables before too. Networking is just one of those things that adds more complexity as it scales.

  25. Ed, yours is a most interesting event. So was reading about switching a cable from end to end. I am never surprised, anymore. It must be a BIG relief to have your problem resolved.

  26. Goes back to my networking classes from 20 years ago… 90% of all networking problems are the cables. Still holds true today 😛

  27. Hi Ed,

    May seem like nitpicking, but I wanted to point out that your i7-920 has only 4 cores.

    i7’s max out at 6 cores in the 9x0X Extreme Editions.

    Intel does not currently have a publicly available 8 core processor to my knowledge.

    1. Thanks, Dave. I knew that but got it wrong when I wrote it up. Because of hyper-threading, Windows Task Manager actually reports eight CPUs on an i7. I’ve corrected the reference.

  28. Ed,
    I’m posting this reply to highlight another network issue – with Windows 7.
    My ADSL Modem was configured as 192.168.1.1 and to supply DHCP addresses in the range 192.168.1.32 to 192.168.1.255 and all was sweetness and light. Then the modem went on the fritz. My ISP sent me a new modem, but this model had a default DHCP range of 192.168.1.2 to 255. Unfortunately I have some devices with fixed IP addresses using 1.2 to 1.10. My laptop obtained 1.2. This generated IP conflicts, of course.
    I quickly reconfigured the DHCP range of the ADSL Modem to 1.32 to 1.255. I powered everything down and then powered back up. My Win 7 laptop still kept the 1.2 address. I used ipconfig to release and renew leases, but the laptop refuses to release 1.2. I have searched various Internet forums, and have found many posts which suggest that Win 7 holds the IP address “somewhere”, but could find no solution to forcing the Win 7 laptop to obtain a DHCP address from the new DHCP range. Having bought the pdf advance edition of Windows 7 Inside-out I searched through the book to see if you cover this issue, but it looks like you do not. Could you and your co-authors comment on this issue and perhaps even find a solution?
    Regards,
    Nick.
