SIOCGMIIREG errors on e1000e interface

Bug #763467 reported by Stefan Lapers
This bug affects 3 people
Affects: keepalived (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Binary package hint: keepalived

Running keepalived on a fully updated 10.04 LTS system throws SIOCGMIIREG errors; this seems to be related to the e1000e driver.
I compiled the latest keepalived version (1.2.2) from source, and that resolves the problem.

# dpkg -l | grep keepalived
ii keepalived 1.1.17-2ubuntu1 Failover and monitoring daemon for LVS clust

Errors:
---------
Keepalived_vrrp: SIOCGMIIREG on eth3.4005 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth4.4010 failed: Input/output error
Keepalived_vrrp: SIOCGMIIREG on eth3.4005 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth4.4010 failed: Input/output error
Keepalived_vrrp: SIOCGMIIREG on eth3.4005 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth4.4010 failed: Input/output error
Keepalived_vrrp: SIOCGMIIREG on eth3.4005 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth4.4010 failed: Input/output error
Keepalived_vrrp: SIOCGMIIREG on eth3.4005 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth4.4010 failed: Input/output error
Keepalived_vrrp: SIOCGMIIREG on eth3.4005 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth4.4010 failed: Input/output error
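
For readers unfamiliar with the message: SIOCGMIIREG is the Linux ioctl that reads one register from a NIC's PHY over the MII interface, and "Input/output error" is the ioctl failing with EIO. The sketch below shows what such a read looks like from user space. It is written in Python for illustration and is not keepalived's code; the helper names are hypothetical, and the 40-byte struct ifreq size is an assumption that holds on 64-bit Linux.

```python
import fcntl
import socket
import struct

# Constants from <linux/sockios.h> and <linux/mii.h>.
SIOCGMIIPHY = 0x8947   # get the address of the PHY in use
SIOCGMIIREG = 0x8948   # read one PHY register
MII_BMSR = 0x01        # basic mode status register
BMSR_LSTATUS = 0x0004  # link-status bit in BMSR

# Assumption: struct ifreq is 40 bytes, as on 64-bit Linux.
IFREQ_SIZE = 40
# ifr_name followed by struct mii_ioctl_data: phy_id, reg_num, val_in, val_out
MII_FORMAT = "16sHHHH"


def pack_mii_request(ifname, phy_id=0, reg_num=0):
    """Build the ifreq buffer passed to the MII ioctls (hypothetical helper)."""
    name = ifname.encode()
    if len(name) > 15:
        raise ValueError("interface name too long")
    request = struct.pack(MII_FORMAT, name, phy_id, reg_num, 0, 0)
    return request.ljust(IFREQ_SIZE, b"\0")


def unpack_mii_reply(reply):
    """Return (phy_id, reg_num, val_out) from an ifreq buffer."""
    _, phy_id, reg_num, _, val_out = struct.unpack_from(MII_FORMAT, reply)
    return phy_id, reg_num, val_out


def mii_link_up(ifname):
    """Read BMSR via SIOCGMIIREG and report the link bit.

    Raises OSError (for example EIO) if the driver rejects the request.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        reply = fcntl.ioctl(sock, SIOCGMIIPHY, pack_mii_request(ifname))
        phy_id, _, _ = unpack_mii_reply(reply)
        request = pack_mii_request(ifname, phy_id, MII_BMSR)
        reply = fcntl.ioctl(sock, SIOCGMIIREG, request)
        _, _, bmsr = unpack_mii_reply(reply)
    return bool(bmsr & BMSR_LSTATUS)
```

Calling mii_link_up("eth3.4005") on an affected machine would be one way to check whether the ioctl fails outside keepalived as well; whether it does is not established in this report.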

lspci output for the NIC (Intel quad-port gigabit copper NIC):
---------------------
06:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
 Subsystem: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper)
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 64 bytes
 Interrupt: pin A routed to IRQ 33
 Region 0: Memory at fe3a0000 (32-bit, non-prefetchable) [size=128K]
 Region 1: Memory at fe380000 (32-bit, non-prefetchable) [size=128K]
 Region 2: I/O ports at ccc0 [size=32]
 Expansion ROM at e0100000 [disabled] [size=128K]
 Capabilities: [c8] Power Management version 2
  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=1 PME-
 Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
  Address: 00000000fee0300c Data: 41d1
 Capabilities: [e0] Express (v1) Endpoint, MSI 00
  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
   ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
   MaxPayload 128 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
  LnkCap: Port #4, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 <4us, L1 <64us
   ClockPM- Suprise- LLActRep- BwNot-
  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 Capabilities: [100] Advanced Error Reporting <?>
 Capabilities: [140] Device Serial Number 1e-bb-f6-ff-ff-17-15-00
 Kernel driver in use: e1000e
 Kernel modules: e1000e

Andres Rodriguez (andreserl) wrote :

Hi there,

Thank you for taking the time to report bugs and trying to make Ubuntu better.

Unfortunately, I've been unable to reproduce this error. However, after some online research, the problem appears to affect version 1.1.17 and maybe 1.1.18; starting from 1.1.19 it seems to be fixed. From what I've read, there doesn't seem to be a patch that fixes this issue specifically; the only known remedy is to use a newer upstream version.

For now, I'll mark this bug as Incomplete until a bit more research is done on the issue and, if possible, the appropriate patch is found upstream.
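
The version ranges mentioned in this comment can be turned into a quick check against an installed package version. This is only a sketch: the boundaries come from the comment above and are not confirmed upstream, and the function names are hypothetical.

```python
import re

# Assumption taken from the comment above, not confirmed upstream:
# 1.1.17 is affected, 1.1.18 may be, 1.1.19 and later are reported fixed.
REPORTED_AFFECTED = (1, 1, 17)
POSSIBLY_AFFECTED = (1, 1, 18)
REPORTED_FIXED_FROM = (1, 1, 19)


def upstream_version(package_version):
    """Parse the upstream part of a Debian-style version like '1.1.17-2ubuntu1'."""
    # Drop an optional epoch ("1:") and the Debian revision ("-2ubuntu1").
    upstream = package_version.split(":", 1)[-1].rsplit("-", 1)[0]
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", upstream)
    if match is None:
        raise ValueError("unrecognised version: %r" % package_version)
    return tuple(int(part) for part in match.groups())


def reported_status(package_version):
    """Classify a keepalived version according to the reports in this bug."""
    version = upstream_version(package_version)
    if version >= REPORTED_FIXED_FROM:
        return "reported fixed"
    if version == POSSIBLY_AFFECTED:
        return "possibly affected"
    if version == REPORTED_AFFECTED:
        return "reported affected"
    return "unknown"
```

For the package in this report, reported_status("1.1.17-2ubuntu1") returns "reported affected", while the self-compiled 1.2.2 falls in the "reported fixed" range.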

Changed in keepalived (Ubuntu):
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for keepalived (Ubuntu) because there has been no activity for 60 days.]

Changed in keepalived (Ubuntu):
status: Incomplete → Expired
Anders Bruun Olsen (abo-dsl.dk) wrote :

This problem also affects servers in my farm.

From syslog:
Aug 8 11:08:22 webcache1 Keepalived_vrrp: SIOCGMIIREG on eth1 failed: Input/output error
Aug 8 11:08:22 webcache1 Keepalived_vrrp: last message repeated 21 times
Aug 8 11:08:22 webcache1 Keepalived_healthcheckers: SIOCGMIIREG on eth1 failed: Input/output error

# dmesg | grep eth1
[ 1.833254] 0000:05:00.0: eth1: (PCI Express:2.5GB/s:Width x1) 00:23:7d:fd:81:12
[ 1.833258] 0000:05:00.0: eth1: Intel(R) PRO/1000 Network Connection
[ 1.833343] 0000:05:00.0: eth1: MAC: 1, PHY: 4, PBA No: d70413-004
[ 8.543925] ADDRCONF(NETDEV_UP): eth1: link is not ready
[ 11.463533] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[ 11.464415] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
[ 21.962572] eth1: no IPv6 routers present

# dpkg -l keepalived
ii keepalived 1.1.17-2ubuntu1 Failover and monitoring daemon for LVS clusters

# lspci -vvv -s 05:00.0
05:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)
 Subsystem: Hewlett-Packard Company Device 704a
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 0, Cache Line Size: 32 bytes
 Interrupt: pin A routed to IRQ 28
 Region 0: Memory at ec120000 (32-bit, non-prefetchable) [size=128K]
 Region 1: Memory at ec100000 (32-bit, non-prefetchable) [size=128K]
 Region 2: I/O ports at 2000 [size=32]
 [virtual] Expansion ROM at e8000000 [disabled] [size=128K]
 Capabilities: [c8] Power Management version 2
  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
  Status: D0 PME-Enable- DSel=0 DScale=1 PME-
 Capabilities: [d0] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable+
  Address: 00000000fee0100c Data: 4191
 Capabilities: [e0] Express (v1) Endpoint, MSI 00
  DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
   ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
  DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
   RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
   MaxPayload 128 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
  LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Latency L0 <4us, L1 <64us
   ClockPM- Suprise- LLActRep- BwNot-
  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
   ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
 Capabilities: [100] Advanced Error Reporting <?>
 Capabilities: [140] Device Serial Number 12-81-fd-ff-ff-7d-23-00
 Kernel driver in use: e1000e
 Kernel modules: e1000e

Can we please have a fix for this?

Changed in keepalived (Ubuntu):
status: Expired → New
Dave Walker (davewalker) wrote :

A minimal patch that resolves this needs to be identified for it to be fixed in lucid.

Thanks.

Changed in keepalived (Ubuntu):
importance: Undecided → Medium
Dave Walker (davewalker) wrote :

Someone who wants to fix this and has suitable hardware could bisect:
http://git.formilux.org/?p=people/alex/keepalived.git;a=shortlog
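
Since the goal here is to find the commit that fixed the problem rather than the one that broke it, the search runs "in reverse". The idea is sketched below as a plain binary search; `git bisect` automates the same procedure on a real repository. The commit labels and the is_fixed callback are hypothetical: in practice, is_fixed would mean building that commit, running it on affected e1000e hardware, and checking syslog for SIOCGMIIREG errors.

```python
def first_fixed(commits, is_fixed):
    """Return the index of the first commit where is_fixed() is true.

    Assumes commits are ordered oldest to newest and that the fix, once
    present, stays present. Returns None if no commit is fixed.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid        # the fix landed at mid or earlier
        else:
            lo = mid + 1    # the fix landed after mid
    return lo if lo < len(commits) else None
```

With N commits between the affected and fixed releases, this needs roughly log2(N) build-and-test rounds, which matters when each round requires a manual test on real hardware.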

jure (jure-koren+launchpad) wrote :

I also got the same messages:
Keepalived_vrrp: SIOCGMIIREG on eth2 failed: Input/output error
Keepalived_healthcheckers: SIOCGMIIREG on eth2 failed: Input/output error

on this NIC, even though I'm not using it for keepalived:
1c:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)

The problem is that I got about 300,000 error messages per day in syslog, which would fill up the syslog database.

Upgrading to 1.1.20 from upstream fixes the problem.
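
To gauge how much of a log these messages account for, something like the following sketch can tally them per daemon and interface. It assumes the syslog line formats quoted in this report, and it assumes that a "last message repeated N times" line refers to the preceding SIOCGMIIREG error from the same daemon; the function name is hypothetical.

```python
import re
from collections import Counter

ERROR_RE = re.compile(r"(Keepalived_\w+): SIOCGMIIREG on (\S+) failed: ")
REPEAT_RE = re.compile(r"(Keepalived_\w+): last message repeated (\d+) times")


def count_miireg_errors(lines):
    """Count SIOCGMIIREG errors per (daemon, interface) pair."""
    counts = Counter()
    last_key = {}  # most recent error key seen for each daemon
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            key = (match.group(1), match.group(2))
            counts[key] += 1
            last_key[match.group(1)] = key
            continue
        match = REPEAT_RE.search(line)
        if match and match.group(1) in last_key:
            # Assumption: the repeat refers to that daemon's last error.
            counts[last_key[match.group(1)]] += int(match.group(2))
    return counts
```

Passing an open log file, for example count_miireg_errors(open("/var/log/syslog")), yields a Counter that can be sorted with most_common() to see which interfaces generate the bulk of the messages.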

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in keepalived (Ubuntu):
status: New → Confirmed
Eddy Ribeyrol (a-eddv-n) wrote :

Hi,
On:
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.4 LTS"

with keepalived 1.1.17-2ubuntu Failover and monitoring daemon for LVS clust

I have the same problem:
May 11 06:30:55 x Keepalived_healthcheckers: last message repeated 21 times
May 11 06:30:55 x Keepalived_vrrp: SIOCGMIIREG on eth0 failed: Input/output error
May 11 06:30:55 x Keepalived_vrrp: last message repeated 21 times
May 11 06:30:55 x Keepalived_vrrp: SIOCGMIIREG on eth1 failed: Input/output error

Thanks.

Eddy Ribeyrol (a-eddv-n) wrote :

The two network cards, in case it helps:

           *-network
                description: Ethernet interface
                product: 82573V Gigabit Ethernet Controller (Copper)
                vendor: Intel Corporation
                physical id: 0
                bus info: pci@0000:05:00.0
                logical name: eth0
                version: 03
                serial: 00:e0:81:4a:db:7e
                size: 1GB/s
                capacity: 1GB/s
                width: 32 bits
                clock: 33MHz
                capabilities: pm msi pciexpress bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=1.0.2-k2 duplex=full firmware=1.0-2 latency=0 link=yes multicast=yes port=twisted pair speed=1GB/s
                resources: irq:29 memory:d8080000-d809ffff memory:d8000000-d807ffff ioport:4000(size=32)

           *-network
                description: Ethernet interface
                product: 82573V Gigabit Ethernet Controller (Copper)
                vendor: Intel Corporation
                physical id: 0
                bus info: pci@0000:06:00.0
                logical name: eth1
                version: 03
                serial: 00:e0:81:4a:db:7f
                size: 100MB/s
                capacity: 1GB/s
                width: 32 bits
                clock: 33MHz
                capabilities: pm msi pciexpress bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
                configuration: autonegotiation=on broadcast=yes driver=e1000e driverversion=1.0.2-k2 duplex=full firmware=1.0-2 latency=0 link=yes multicast=yes port=twisted pair speed=100MB/s
                resources: irq:30 memory:d8180000-d819ffff memory:d8100000-d817ffff ioport:5000(size=32)
