ISST-KTE:PowerNV:UBUNTU14.10: Shiner Adapter ethernet port does not come up

Bug #1356948 reported by bugproxy
This bug affects 1 person
Affects          Status        Importance  Assigned to  Milestone
linux (Ubuntu)   Fix Released  Medium      Tim Gardner
Utopic           Fix Released  Medium      Tim Gardner

Bug Description

---Problem Description---
When trying to bring up an ethernet port on a Shiner Network adapter, the terminal outputs the following:

root@podkvm:~# ifconfig eth7 up
[ 3126.678507] bnx2x: [bnx2x_attn_int_deasserted2:4099(eth7)]CFC hw attention 0x2
[ 3126.678586] bnx2x: [bnx2x_attn_int_deasserted2:4102(eth7)]FATAL error from CFC
[ 3136.698592] bnx2x: [bnx2x_state_wait:308(eth7)]timeout waiting for state 1
[ 3136.698678] bnx2x: [bnx2x_setup_queue:8625(eth7)]Queue(0) SETUP failed
[ 3136.698688] bnx2x: [bnx2x_nic_load:2721(eth7)]Setup leading failed!
SIOCSIFFLAGS: Device or resource busy
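
For triage, the bnx2x messages above can be split into timestamp, function, source line, interface, and message text. The following is a minimal sketch, assuming the `[function:line(iface)]message` layout seen in this log; the regular expression and the helper name `parse_bnx2x_log` are illustrative and not part of any driver tooling.

```python
import re

# Matches lines shaped like the ones in this report:
# [ 3126.678507] bnx2x: [function:line(iface)]message
BNX2X_LINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+bnx2x:\s+"
    r"\[(?P<func>\w+):(?P<line>\d+)\((?P<iface>\w+)\)\](?P<msg>.*)$"
)


def parse_bnx2x_log(text):
    """Return one dict per bnx2x message found in a dmesg-style string."""
    events = []
    for raw in text.splitlines():
        m = BNX2X_LINE.match(raw.strip())
        if m:
            events.append({
                "ts": float(m.group("ts")),
                "func": m.group("func"),
                "line": int(m.group("line")),
                "iface": m.group("iface"),
                "msg": m.group("msg").strip(),
            })
    return events


# The console output quoted in this report.
SAMPLE = """\
[ 3126.678507] bnx2x: [bnx2x_attn_int_deasserted2:4099(eth7)]CFC hw attention 0x2
[ 3126.678586] bnx2x: [bnx2x_attn_int_deasserted2:4102(eth7)]FATAL error from CFC
[ 3136.698592] bnx2x: [bnx2x_state_wait:308(eth7)]timeout waiting for state 1
[ 3136.698678] bnx2x: [bnx2x_setup_queue:8625(eth7)]Queue(0) SETUP failed
[ 3136.698688] bnx2x: [bnx2x_nic_load:2721(eth7)]Setup leading failed!
SIOCSIFFLAGS: Device or resource busy
"""

events = parse_bnx2x_log(SAMPLE)
fatal = next(e for e in events if "FATAL" in e["msg"])
timeout = next(e for e in events if "timeout" in e["msg"])
gap = timeout["ts"] - fatal["ts"]
print(f"{len(events)} bnx2x messages, {gap:.2f} s between FATAL and timeout")
```

On the log above, this shows that the state-wait timeout is reported roughly 10 seconds after the fatal CFC attention, and that the queue setup and NIC load failures follow immediately.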

modules loaded:
root@podkvm:~# lsmod
Module Size Used by
rtc_generic 2711 0
powernv_rng 3244 0
ses 10118 0
enclosure 12767 1 ses
mlx4_en 118002 0
bnx2x 920334 0
lpfc 836357 0
mlx4_core 311074 1 mlx4_en
mdio 6270 1 bnx2x
libcrc32c 1995 1 bnx2x
ipr 142194 2
be2net 144413 0
scsi_transport_fc 80636 1 lpfc
scsi_tgt 18399 1 scsi_transport_fc
vxlan 48609 2 be2net,mlx4_en

---uname output---
Linux podkvm 3.15.0-6-generic #11-Ubuntu SMP Thu Jun 12 00:40:49 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

---Additional Hardware Info---
System firmware: 1427A

Machine Type = 8247-22L

I can get the interface up under petitboot and with the 3.10 kernel used by the PowerKVM installer. I cannot get it up with the Ubuntu 14.04 kernel, which is based on 3.13. I can reproduce the same failure with a 3.13 kernel on another PowerNV system.

However, it works on a PCI passthrough environment on PowerKVM, with both 14.04 kernel and 14.10. So, this is specific to PowerNV. I will try upstream kernel versions and see if it's possible to find a culprit.

I collected this with msglevel options set on the driver: the driver-specific option SP, plus timer, interrupt, link, ifup, and probe.
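
The msglevel options named above correspond to bits in a message-level mask. The sketch below builds and decodes such a mask using the generic NETIF_MSG_* bit values from the kernel's include/linux/netdevice.h, keyed by the names ethtool uses for msglvl. The driver-specific SP bit is deliberately not modeled, since its value is defined by bnx2x itself; the helper names `build_mask` and `decode_mask` are hypothetical.

```python
# Generic netif message-level bits (NETIF_MSG_* in include/linux/netdevice.h),
# keyed by the flag names ethtool accepts for msglvl.
NETIF_MSG = {
    "drv": 0x0001, "probe": 0x0002, "link": 0x0004, "timer": 0x0008,
    "ifdown": 0x0010, "ifup": 0x0020, "rx_err": 0x0040, "tx_err": 0x0080,
    "tx_queued": 0x0100, "intr": 0x0200, "tx_done": 0x0400,
    "rx_status": 0x0800, "pktdata": 0x1000, "hw": 0x2000, "wol": 0x4000,
}


def build_mask(names):
    """OR together the bits for the given generic flag names."""
    mask = 0
    for name in names:
        mask |= NETIF_MSG[name]
    return mask


def decode_mask(mask):
    """Return (generic flag names set in mask, leftover non-generic bits)."""
    names = [name for name, bit in NETIF_MSG.items() if mask & bit]
    leftover = mask & ~build_mask(NETIF_MSG)
    return names, leftover


# The generic options listed in this comment (SP is driver-specific, omitted).
generic = build_mask(["timer", "intr", "link", "ifup", "probe"])
print(hex(generic))

# Decode a module-parameter style value such as debug=0xfff.
names, leftover = decode_mask(0xFFF)
print(names, hex(leftover))
```

Any bits returned in `leftover` fall outside the generic range and would have to be looked up in the driver's own headers.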

Cascardo.

>>However, it works on a PCI passthrough environment on PowerKVM, with both
>>14.04 kernel and 14.10. So, this is specific to PowerNV. I will try upstream
>>kernel versions and see if it's possible to find a culprit.

I couldn't find Thadeu's kernel. I built a newer kernel and the bnx2x driver from upstream and still saw the same failure. I will look into it further.

Thanks,
Wendy

Shiner info:

root@podkvm:~# ethtool -i eth7
driver: bnx2x
version: 1.78.19-0
firmware-version: bc 7.10.4

bugproxy (bugproxy) wrote : dmesg output

Default Comment by Bridge

tags: added: architecture-ppc64le bugnameltc-112905 severity-high targetmilestone-inin1410
bugproxy (bugproxy) wrote : dmesg with msglvl set to a more verbose option

Default Comment by Bridge

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2014-08-14 16:00 EDT-------
Please add the QLogic developer, Gary (<email address hidden>), to the cc list of the Launchpad bug.

Thanks,
Wendy

Luciano Chavez (lnx1138)
affects: ubuntu → linux (Ubuntu)
Changed in linux (Ubuntu):
status: New → Confirmed
bugproxy (bugproxy) wrote : insmod ./bnx2x.ko debug=0xfff with the latest bnx2x driver I got from Qlogic

------- Comment (attachment only) From <email address hidden> 2014-08-14 21:18 EDT-------

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2014-08-15 20:01 EDT-------
Since yesterday afternoon, I have been working with the QLogic developer and have sent him several debug logs. After looking at the debug logs, Gary needs to take a hardware grcdump for the issue. In the meantime, I helped him convert his P8 to OPAL mode. I hope he can re-create the issue in his lab.

Siraj, I gave your email address to Gary (the QLogic developer). If Gary needs to take a grcdump on your system next Monday, he will contact you then. Medha from the hardware team can help you take the grcdump on your system.

I am going to send you a note on how to load the bnx2x debug driver on your system in case Gary needs to take a grcdump.

Thanks,
Wendy

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-08-15 21:17 EDT-------
How to load the bnx2x debug driver on sp3p01:

(1) After the system boots up, run "rmmod bnx2x"
(2) cd /root/gary/bnx2x/src
(3) Load the debug bnx2x driver:
# insmod ./bnx2x.ko debug=0xffffffff

Mehda is in building 45 and she can take a grcdump next Monday if needed.

Thanks,
Wendy

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
status: Incomplete → Confirmed
tags: added: ppc64el
Changed in linux (Ubuntu):
status: Confirmed → Triaged
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-08-20 18:22 EDT-------
FYI, there will be a power shutdown in building 45 from Friday, August 26 until Tuesday, September 2 due to the holiday weekend.

bugproxy (bugproxy)
tags: added: severity-critical
removed: severity-high
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-08-25 19:30 EDT-------
I am back from leave and I am looking at this. I believe this is not a driver issue, but a problem with PowerNV, LE and firmware. Not sure about the nature of the problem, but since the adapter works on petitboot, and Ubuntu 14.10 works fine as a PowerKVM guest, I don't think the driver or adapter firmware is the problem here.

Cascardo.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-08-27 05:57 EDT-------
Since Thadeu didn't call in to the daily meeting with QLogic and the IBM hardware team, I am still following up on all action items.

There are a lot of changes in the bnx2x driver between the 3.10 kernel and the upstream kernel. I only backported bnx2x_shutdown(), bnx2x_remove_one() and __bnx2x_remove() from upstream to the 3.10 petitboot kernel. Even with bnx2x_shutdown() added, I still saw the issue.

After that, I built the 3.10 petitboot kernel without the bnx2x driver. With that petitboot kernel, bnx2x works fine in native Ubuntu 14.10. It looks like the adapter is not cleaned up before the LE driver is loaded. We or the QLogic developers probably need to look at the shutdown path.

Thanks,
Wendy

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-08-28 14:44 EDT-------
Our team has installed the patch on a test system and is able to bring the interface up.

Please verify the patch and let us know.

Thanks,
Wendy

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-08-28 20:00 EDT-------
OK. The patches fixed the "Port doesn't come up" issue!

For the HTX issue, please ask the HTX developers how to drop the system into xmon if a data miscompare happens. Also ask them to debug the xmon session. Thanks!

bugproxy (bugproxy) wrote : this patch by qlogic fixes the problem

------- Comment on attachment From <email address hidden> 2014-08-29 15:24 EDT-------

This patch, by QLogic, fixes the bug. It sets the card back to LE mode after the system has booted from a BE kernel, which leaves the card in BE mode.
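
To illustrate why a device left in BE mode misbehaves under an LE kernel, here is a toy model in Python. It is not the driver code and does not reproduce what the QLogic patch does to the hardware; the functions `host_write_u32` and `device_read_u32` are hypothetical and only model how the same four bytes are interpreted under each byte order.

```python
import struct


def host_write_u32(value, host_mode):
    """Serialize a 32-bit value the way an 'LE' or 'BE' host would."""
    fmt = "<I" if host_mode == "LE" else ">I"
    return struct.pack(fmt, value)


def device_read_u32(raw_bytes, device_mode):
    """Interpret 4 bytes the way a device set to 'LE' or 'BE' mode would."""
    fmt = "<I" if device_mode == "LE" else ">I"
    return struct.unpack(fmt, raw_bytes)[0]


value = 0x00000002
raw = host_write_u32(value, "LE")      # the LE kernel writes the value

stale = device_read_u32(raw, "BE")     # device still in BE mode: bytes swapped
fixed = device_read_u32(raw, "LE")     # device set back to LE mode: value intact

print(hex(stale), hex(fixed))
```

In this model the mismatched device sees 0x2000000 instead of 0x2, which is the kind of corruption that matching the device mode to the running kernel avoids.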

bugproxy (bugproxy) wrote : set device back to LE mode during shutdown, in case of kexec

------- Comment on attachment From <email address hidden> 2014-08-29 15:27 EDT-------

This second patch is going upstream soon, and fixes the problem from the point of the BE kernel that kexecs the second, LE, kernel. Also by QLogic.

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2014-09-02 14:40 EDT-------
petitboot patch is zImage.epapr in http://ausgsa.ibm.com/~wenxiong/public/.

Thanks,
Wendy

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-09-02 17:39 EDT-------
This is probably what is going upstream.

http://marc.info/?l=linux-netdev&m=140964868503853&w=2

Cascardo.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2014-09-02 23:35 EDT-------
This is on David Miller's net tree with commit 04860eb7d911bbd958463416cc045b69ffdf73b3. I already sent a cherry-picked patch to kernel-team mailing list.

I have built a kernel with the patch as submitted upstream and tested it.

Cascardo.

bugproxy (bugproxy) wrote : Patch as submitted upstream with cherry-pick, my sign-off and bug link

------- Comment (attachment only) From <email address hidden> 2014-09-02 23:36 EDT-------

Breno Leitão (breno-leitao) wrote :

This bug is a ship issue, although the bug is marked as Medium, and I am not able to change it to ship issue.

Changed in linux (Ubuntu):
assignee: nobody → Taco Screen team (taco-screen-team)
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Utopic):
assignee: Taco Screen team (taco-screen-team) → Tim Gardner (timg-tpi)
milestone: none → ubuntu-14.10
status: Triaged → In Progress
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2014-09-04 21:30 EDT-------
I am now testing the patch on our PowerNV system running Ubuntu 14.10. So far there are no issues.

Diane Brent (drbrent) wrote :

If there are no issues in test, is Canonical clear to pick up the change identified in comment #17?
We need this ASAP!

Tim Gardner (timg-tpi) wrote :

Cherry-picked from linux-next

UBUNTU: SAUCE: bnx2x: Configure device endianity on driver load and reset endianity on removal.

Changed in linux (Ubuntu Utopic):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.16.0-14.20

---------------
linux (3.16.0-14.20) utopic; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1366431

  [ dann frazier ]

  * [Config] CONFIG_HW_RANDOM_XGENE=m on arm64

  [ Manish Chopra ]

  * SAUCE: bnx2x: Configure device endianity on driver load and reset
    endianity on removal.
    - LP: #1356948

  [ Tim Gardner ]

  * [Config] CONFIG_XMON=y
    - LP: #1365655
  * [Config] CONFIG_KVM_BOOK3S_64=m for ppc64el
    - LP: #1362514
  * [Config] CONFIG_KVM_BOOK3S_64_HV=m
    - LP: #1362514

  [ Upstream Kernel Changes ]

  * hwrng: xgene - add support for APM X-Gene SoC RNG support
    - LP: #1365593
  * Documentation: rng: Add X-Gene SoC RNG driver documentation
    - LP: #1365593
  * arm64: dts: add random number generator dts node to APM X-Gene
    platform.
  * KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct
    - LP: #1362514
  * KVM: Move more code under CONFIG_HAVE_KVM_IRQFD
    - LP: #1362514

  [ Upstream Kernel Changes ]

  * rebase to v3.16.2
    - LP: #1358116
    - LP: #1334950
    - LP: #1350148
 -- Tim Gardner <email address hidden> Sat, 06 Sep 2014 07:52:15 -0700

Changed in linux (Ubuntu Utopic):
status: Fix Committed → Fix Released