When Linux does well: the e1000e Ethernet bug fixed

One reason I love Linux is that when there's a problem, it gets fixed. Usually it gets fixed in a hurry, and that's exactly what happened with the e1000e Ethernet bug.

To bring you up to speed, a pre-release version of the 2.6.27 Linux kernel, which was being used in several beta Linux distributions, was sometimes frying the Ethernet firmware in systems equipped with Intel ICH8 and ICH9 chipsets and their 82566 and 82567 Ethernet controllers. The major distributions to worry about were the Mandriva Linux 2009 pre-releases; Novell's SUSE Linux Enterprise 11 Beta 1 and openSUSE 11 Beta 1; Fedora 10 release candidates 1 and 2; Gentoo Linux; and Ubuntu Intrepid Ibex.

OK, so most people were unlikely to ever see this bug, but, on the other hand, a lot of people play with beta Linux distributions. In particular, Fedora was very close to shipping so it's reasonable to assume that quite a few Linux users were putting it through its paces.

Now, thanks to Intel, and a nudge from Linus Torvalds, there's code that will fix the problem. This fix will be in the next pre-release version of the 2.6.27 kernel, Linux 2.6.27-rc9, on October 5th.

Torvalds, in the gentle way he guides the Linux development team, pointed out on the LKML (Linux Kernel Mailing List) that "Btw, the _real_ bug is clearly in the hardware design that allows you to brick those things without apparently even having a lock bit."

Torvalds continued, "I'm hoping Intel doesn't treat this as just a software bug. Some hw designer should be thinking hard about which orifice they put their head up in. It used to be that you could fry some monitors by feeding them out-of-range signals. The _monitors_ got fixed."

The next day, Bruce Allen, a Linux kernel developer and Intel engineer, announced a "patch [which] is meant to prevent all future corruptions of the e1000e NVM (non volatile memory) after the driver is loaded." Torvalds immediately applied it to the next test version of the kernel.

This is not the end of the story. The fix prevents the problem, but it doesn't explain how the problem happened in the first place. Still, as Allen wrote on the LKML, "This should allow us to move forward with debugging without allowing any other bad element or the e1000e driver, to write to the NVM area unexpectedly."

"Currently we (Intel Ethernet) are reproducing the issue on multiple machines in house, we are working on the issue with the other core Linux teams here at Intel and within the community. No resolution yet but we are much closer now."

Once the problem is nailed down, "we will post patches to help users who have had this problem restore their eeprom from either a saved image from ethtool -e or from another identical system."
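For readers who want a saved image on hand before anything goes wrong, here is a minimal sketch of wrapping the "ethtool -e" dump mentioned above. It assumes ethtool is installed and that the driver supports EEPROM dumps; the function names and the injectable "run" parameter are made up for illustration, and nothing here writes back to the NVM.

```python
import subprocess
from pathlib import Path


def eeprom_dump_command(interface):
    """Build the ethtool command that dumps an interface's EEPROM as raw bytes."""
    # "raw on" asks ethtool for binary output instead of a hex listing.
    return ["ethtool", "-e", interface, "raw", "on"]


def save_eeprom_image(interface, destination, run=subprocess.run):
    """Dump the EEPROM of `interface` into `destination`; return the byte count.

    `run` defaults to subprocess.run and is only a parameter so the function
    can be exercised without real hardware (an assumption of this sketch).
    """
    result = run(eeprom_dump_command(interface), capture_output=True, check=True)
    Path(destination).write_bytes(result.stdout)
    return len(result.stdout)
```

Calling save_eeprom_image("eth0", "eth0-eeprom.bin") would typically need root privileges; "eth0" and the file name are placeholders, so substitute your own interface.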

By the time the next production version of the Linux kernel, 2.6.28, comes out later this year, this problem will be just obscure developer history rather than a current concern.

What People Are Saying

So where are the Intel patches promised by Bruce Allen to repair a bricked eeprom?

Surprisingly fast ... I had gone looking for another network card to get around the driver being blacklisted in the kernel and fix my network problem, and they already have a patch for it?!? Sweet :D

Wow

That cool little penguin dude really outdid himself this time. Amazing!

my comments quoted in this article

I'd like to clear up a factual mistake: Bruce generated the patch, while I communicated and drafted the update that you so copiously quoted, and did all the work to post and follow up with the patches. Please check the original data posted at http://marc.info/?l=linux-netdev&m=122290676202190&w=2

Thanks for posting the positive article! It's nice to hear when people are actually pleased as well as when they are unhappy.

fix tipo !

s/kerne/kernel/g !!! !!!

s/\<kerne\>/kernel/g

Fedora 10 Release Candidates?

The post states that Fedora 10 RC 1 and 2 were affected, but Fedora 10 Beta was just released. The preview release won't be out for another month or so, and they don't really have "release candidates".

just gotta love linux :)

Sarcasm!

Got to love Linus. I saw this yesterday.

Sarcasm does work!

"I'm hoping Intel doesn't treat this as just a software bug. Some hw designer should be thinking hard about which orifice they put their head up in."

Flurry of commits:
e1000e: write protect ICHx NVM to prevent malicious
e1000e: reset swflag after resetting hardware
e1000e: do not ever sleep in interrupt context
e1000e: remove phy read from inside spinlock
e1000e: drop stats lock
e1000e: debug contention on NVM SWFLAG
e1000e: update version from k4 to k6
e1000e: Fix incorrect debug warning master