Ubuntu 14.04.03 did not detect Link down on Houston CU network adapter ports when cable pulled

Bug #1513980 reported by bugproxy
This bug affects 1 person
Affects          Status         Importance   Assigned to             Milestone
linux (Ubuntu)   Fix Released   Medium       Canonical Kernel Team
Vivid            Fix Released   Undecided    Tim Gardner
Wily             Fix Released   Undecided    Unassigned
Xenial           Fix Released   Medium       Canonical Kernel Team

Bug Description

Problem Description:
====================

I installed a Houston CU (feature code 2CC1) PCIe2 10GbE SFP+ CU 4-port Converged Network Adapter in slot P1-C2 of the tul117fp1 system. The system is configured as PowerNV and is running Ubuntu 14.04.03.

- I configured 2 copper ports of the adapter and ran network stress on them for several hours. I then pulled the cable and waited for some time.

- The OS did not detect the link down, even though the TX/RX traffic stopped on the ports.

- The kernel log and dmesg did not have any entry indicating link down after the cable was pulled.

0007:00:00.0 PCI bridge: IBM Device 03dc
0007:01:00.0 Ethernet controller: Emulex Corporation OneConnect NIC (Lancer) (rev 10)
0007:01:00.1 Ethernet controller: Emulex Corporation OneConnect NIC (Lancer) (rev 10)
0007:01:00.2 Ethernet controller: Emulex Corporation OneConnect NIC (Lancer) (rev 10)
0007:01:00.3 Ethernet controller: Emulex Corporation OneConnect NIC (Lancer) (rev 10)
0007:01:00.4 Fibre Channel: Emulex Corporation OneConnect FCoE Initiator (Lancer) (rev 10)
0007:01:00.5 Fibre Channel: Emulex Corporation OneConnect FCoE Initiator (Lancer) (rev 10)

- The ifconfig -a command still shows the link as up:

eth43 Link encap:Ethernet HWaddr 00:90:fa:6e:e0:f2
          inet addr:103.1.1.15 Bcast:103.1.1.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:7543074880 errors:1 dropped:9 overruns:0 frame:1
          TX packets:5926170683 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:8997647070924 (8.9 TB) TX bytes:7177651764733 (7.1 TB)

eth44 Link encap:Ethernet HWaddr 00:90:fa:6e:e0:f3
          inet addr:103.1.2.15 Bcast:103.1.2.255 Mask:255.255.255.0
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:5926170613 errors:1 dropped:0 overruns:0 frame:1
          TX packets:7543089239 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7488707854629 (7.4 TB) TX bytes:8619256135370 (8.6 TB)
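
A note on reading the listings above: in net-tools ifconfig output, UP is the administrative flag (IFF_UP), while RUNNING (IFF_RUNNING) is printed only when the driver reports carrier. The following is a minimal sketch, assuming the old-style flags line format shown above; the function names are hypothetical and not part of any tool.

```python
def parse_ifconfig_flags(flags_line):
    """Return the flag words that precede the first KEY:VALUE token."""
    flags = set()
    for token in flags_line.split():
        if ":" in token:  # MTU:1500, Metric:1 mark the end of the flag words
            break
        flags.add(token)
    return flags


def link_state(flags_line):
    """Split administrative state (UP) from carrier state (RUNNING)."""
    flags = parse_ifconfig_flags(flags_line)
    return {"admin_up": "UP" in flags, "carrier": "RUNNING" in flags}


# Flags line copied from the eth43 listing in this report.
print(link_state("UP BROADCAST MULTICAST MTU:1500 Metric:1"))
```

The flags line quoted in this report contains UP but not RUNNING, so the sketch reports admin_up as True and carrier as False for it. Either way, the missing kernel log entry is a separate issue from what ifconfig prints.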

Paul, looking at the source code for the driver in Ubuntu 14.04.3, the driver doesn't log any messages after an update to the link status.

The following commit upstream adds this functionality:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=18824894dbec3eb2202fc92d52a0c8bd27c8a63f

I'll create a test kernel with this patch applied and attach it here so you can test this commit.

- I installed the patched Ubuntu kernel.

- I set up and re-ran the test on the Houston adapter. I verified that with the patch, link Up/Down events are now logged properly when the cable is pulled.

I verified the patch and it fixed the problem.
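
Independently of the kernel log, the carrier state can be cross-checked through sysfs while repeating the cable-pull test. This is a minimal sketch, assuming the standard /sys/class/net/<iface>/carrier attribute; the helper name read_carrier is hypothetical, and the interface names are the ones from this report.

```python
import os


def read_carrier(iface, sysfs_root="/sys/class/net"):
    """Return True/False for carrier up/down, or None if it cannot be read."""
    path = os.path.join(sysfs_root, iface, "carrier")
    try:
        with open(path) as f:
            return f.read().strip() == "1"
    except OSError:
        # The interface does not exist, or it is administratively down
        # (reading the attribute fails in that case).
        return None


# Interface names taken from the report; prints None where they are absent.
for name in ("eth43", "eth44"):
    print(name, read_carrier(name))
```

Running this before and after pulling the cable should show the value change from True to False if the driver tracks carrier correctly.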

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-127514 severity-high targetmilestone-inin14043
Changed in ubuntu:
assignee: nobody → Taco Screen team (taco-screen-team)
affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
assignee: Taco Screen team (taco-screen-team) → Canonical Kernel Team (canonical-kernel-team)
importance: Undecided → Medium
status: New → Triaged
Tim Gardner (timg-tpi) wrote :

git describe --contains 18824894dbec3eb2202fc92d52a0c8bd27c8a63f
v4.2-rc1~130^2~396
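
The output above names the first tag that contains the commit (v4.2-rc1), so kernels based on v4.2 or later already carry it, while an older series such as the 3.19-based Vivid kernel needs a backport. A rough sketch of that comparison follows; the helper names are hypothetical, and only the major.minor part of the version is compared.

```python
import re


def first_containing_tag(describe_output):
    """Strip the ~N / ^N ancestry suffix from `git describe --contains` output."""
    return re.split(r"[~^]", describe_output.strip(), maxsplit=1)[0]


def tag_version(tag):
    """Turn a tag such as 'v4.2-rc1' into the tuple (4, 2)."""
    match = re.match(r"v(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError("unexpected tag format: %r" % tag)
    return int(match.group(1)), int(match.group(2))


def needs_backport(kernel_version, describe_output):
    """True if a released kernel (major, minor) predates the containing tag."""
    return kernel_version < tag_version(first_containing_tag(describe_output))


describe = "v4.2-rc1~130^2~396"
print(first_containing_tag(describe))
print(needs_backport((3, 19), describe))
```

This treats a released v4.2 kernel as containing everything merged by v4.2-rc1; it says nothing about distribution kernels that already carry the patch as a backport.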

Changed in linux (Ubuntu Wily):
status: New → Fix Released
Changed in linux (Ubuntu Xenial):
status: Triaged → Fix Released
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2015-11-16 18:50 EDT-------
The test team wants to know whether this small change can be added to a 14.04.3 SRU.

tags: removed: bugnameltc-127514 severity-high
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-11-19 13:07 EDT-------
==== State: Verify by: bellivea on 19 November 2015 06:58:32 ====

Paul - What is the outlook to verify this?

tags: added: bugnameltc-127514 severity-high
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-11-19 23:59 EDT-------
==== State: Verify by: nguyenp on 19 November 2015 17:36:27 ====

Hi Mike,
I verified the private fix a while ago.

I saw Luciano's comment requesting that the change be added to the Ubuntu 14.04.03 SRU release. Once the fix makes it into the Ubuntu SRU release, I'll re-test and verify it again.

Breno Leitão (breno-leitao) wrote :

Just a heads up for this bug: we are waiting for it to show up in 14.04.3.

Luis Henriques (henrix)
Changed in linux (Ubuntu Vivid):
status: In Progress → Fix Committed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-vivid' to 'verification-done-vivid'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-vivid
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-12-10 16:47 EDT-------
Tested with the vivid-proposed kernel; the problem is fixed in it.

tags: added: verification-done-vivid
removed: verification-needed-vivid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.19.0-41.46

---------------
linux (3.19.0-41.46) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1522918

  [ Upstream Kernel Changes ]

  * Revert "dm: fix AB-BA deadlock in __dm_destroy()"
    - LP: #1522766
  * dm: fix AB-BA deadlock in __dm_destroy()
    - LP: #1522766

linux (3.19.0-40.45) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1522786

  [ Andy Whitcroft ]

  * [Packaging] control -- prepare for new kernel-wedge semantics
    - LP: #1516686
  * [Debian] rebuild should only trigger for non-linux packages
    - LP: #1498862, #1516686
  * [Tests] gcc-multilib does not exist on ppc64el
    - LP: #1515541

  [ Joseph Salisbury ]

  * SAUCE: scsi_sysfs: protect against double execution of
    __scsi_remove_device()
    - LP: #1509029

  [ Luis Henriques ]

  * [Config] updateconfigs after 3.19.8-ckt10 stable update

  [ Upstream Kernel Changes ]

  * Revert "ARM64: unwind: Fix PC calculation"
    - LP: #1520309
  * Revert "md: allow a partially recovered device to be hot-added to an
    array."
    - LP: #1520309
  * tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c
    - LP: #1512815
  * HID: rmi: Print the firmware id of the touchpad
    - LP: #1515503
  * HID: rmi: Add functions for writing to registers
    - LP: #1515503
  * HID: rmi: Disable scanning if the device is not a wake source
    - LP: #1515503
  * HID: rmi: Set F01 interrupt enable register when not set
    - LP: #1515503
  * be2net: log link status
    - LP: #1513980
  * xhci: Workaround to get Intel xHCI reset working more reliably
  * Drivers: hv: hv_balloon: refuse to balloon below the floor
    - LP: #1294283
  * Drivers: hv: hv_balloon: survive ballooning request with num_pages=0
    - LP: #1294283
  * Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case
    - LP: #1294283
  * Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case
    - LP: #1294283
  * Drivers: hv: balloon: check if ha_region_mutex was acquired in
    MEM_CANCEL_ONLINE case
    - LP: #1294283
  * mm: meminit: make __early_pfn_to_nid SMP-safe and introduce
    meminit_pfn_in_nid
    - LP: #1294283
  * mm: meminit: inline some helper functions
    - LP: #1294283
  * mm, meminit: allow early_pfn_to_nid to be used during runtime
    - LP: #1294283
  * mm: initialize hotplugged pages as reserved
    - LP: #1294283
  * gut proc_register() a bit
    - LP: #1519106
  * arm: factor out mmap ASLR into mmap_rnd
    - LP: #1518483
  * x86: standardize mmap_rnd() usage
    - LP: #1518483
  * arm64: standardize mmap_rnd() usage
    - LP: #1518483
  * mips: extract logic for mmap_rnd()
    - LP: #1518483
  * powerpc: standardize mmap_rnd() usage
    - LP: #1518483
  * s390: standardize mmap_rnd() usage
    - LP: #1518483
  * mm: expose arch_mmap_rnd when available
    - LP: #1518483
  * s390: redefine randomize_et_dyn for ELF_ET_DYN_BASE
    - LP: #1518483
  * mm: split ET_DYN ASLR from mmap ASLR
    - LP: #1518483
  * mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE
    - LP: #1518483
  * isdn_ppp: Add checks for allocation failure in isdn_ppp_open()
   ...
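
To locate the change for this bug in a changelog like the one above, the bullet titles can be matched against their "LP:" lines. This is a small sketch, assuming the layout shown here (a "* title" bullet, optional continuation lines, then a "- LP: #N" line); the function name entries_for_bug is hypothetical.

```python
import re


def entries_for_bug(changelog_text, bug_number):
    """Return changelog bullet titles whose LP: line lists bug_number."""
    results = []
    title = None
    for line in changelog_text.splitlines():
        stripped = line.strip()
        if not stripped:
            title = None  # a blank line ends the current group of bullets
        elif stripped.startswith("* "):
            title = stripped[2:]
        elif stripped.startswith("- LP:"):
            bugs = re.findall(r"#(\d+)", stripped)
            if title is not None and str(bug_number) in bugs:
                results.append(title)
        elif title is not None:
            title += " " + stripped  # continuation of a wrapped bullet title
    return results


# Excerpt copied from the changelog in this report.
excerpt = """\
  * HID: rmi: Set F01 interrupt enable register when not set
    - LP: #1515503
  * be2net: log link status
    - LP: #1513980
  * xhci: Workaround to get Intel xHCI reset working more reliably
"""
print(entries_for_bug(excerpt, 1513980))
```

For this report's bug number, the matching entry is "be2net: log link status", which appears under the 3.19.0-40.45 upload.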

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2015-12-17 13:36 EDT-------
Tested fix in the released kernel and it is working properly. Thank you for the help. Closing this bug on our side.
