rpi3bp+: ethernet leds don't blink

Bug #1802320 reported by Paolo Pisati
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux-raspi2 (Ubuntu)
Fix Committed
Undecided
Unassigned
Xenial
Fix Released
Undecided
Unassigned

Bug Description

Impact:

Using a xenial/raspi2 kernel on a RaspberryPi3B+ board, ethernet leds don't blink (though ethernet port works fine).

Leds not working are due to a missing (actualy reverted) upstream stable patch in xenial/raspi2 (1111a9 "UBUNTU: SAUCE: Revert "lan78xx: Correctly indicate invalid OTP""), but simply reverting that patch uncovers another more severe bug: the ethernet port completely stops to work since two hardware registers are not updated when, during bootup, the mac address is set (i explained this part of the bug here in detail: https://lkml.org/lkml/2018/11/7/860).

When doing the initial enablement of the rpi3bp+ board in xenial, i found that the ethernet port was not working and after a bisection i found an upstream stable commit (the above "lan78xx: Correctly indicate invalid OTP") that was causing it, reverted it.

But it turned out that this patch was legit, and the buggy behaviour of the lan78xx_read_otp() function (that was being fixed by the above "lan78xx: Correctly indicate invalid OTP" patch) was actually hiding another bug in the lan78xx ethernet driver:

https://lkml.org/lkml/2018/11/7/860

"
Upon invocation, lan78xx_init_mac_address() checks that the mac address present in the RX_ADDRL & RX_ADDRH registers is a valid address, if not, it first tries to read a new address from an external eeprom or the otp area, and in case both read fail (or the address read back is invalid), it randomly generates a new one.

Unfortunately, due to the way the above logic is laid out, if both read_eeprom() and read_otp() fail, a new mac address is correctly generated but is never written back to RX_ADDRL & RX_ADDRH, leaving the chip in an incosistent state and with an invalid mac address (e.g. the nic appears to be completely dead, and doesn't receive any packet, etc):

lan78xx_init_mac_address()
...
if (lan78xx_read_eeprom(addr ...) || lan78xx_read_otp(addr ...)) {
 if (is_valid_ether_addr(addr) {
  // nop...
 } else {
  random_ether_addr(addr);
 }

 // correctly writes back the new address
 lan78xx_write_reg(RX_ADDRL, addr ...);
 lan78xx_write_reg(RX_ADDRH, addr ...);
} else {
 // XXX if both eeprom and otp read fail, we land here and skip
 // XXX the RX_ADDRL & RX_ADDRH update completely
 random_ether_addr(addr);
}

This bug went unnoticed because lan78xx_read_otp() was buggy itself and would never fail, up until 4bfc338 "lan78xx: Correctly indicate invalid OTP" fixed it and as a side effect uncovered this bug.
"

Upstream later decided to take an entire patch from 4.18.y instead of the fix i proposed, but that is fine, and is one of the patch i'm presenting here (patch 003 "lan78xx: Read MAC address from DT if present").

Fix:

The whole fix consist in importing a new upstream patch from 4.18.y for the mac address changing logic (patch 003), reverting two Raspberry BSP patches that clash (and are obsoleted) by this new patch (patch 001 and 002), and then reverting a SAUCE patch, or in other words, reapply the lan78xx_read_otp() upstream fix (patch 004), that is the actual fix for the ethernet leds.

Chronologically, first the two reverts:

commit 3f25fbb82f00a80e9eb3be0ce60abebfc263c84a
Author: Paolo Pisati <email address hidden>
Date: Thu Nov 8 10:35:09 2018 +0000

    Revert "lan78xx: Ignore DT MAC address if already valid"

    This reverts commit 17f23a96597810ddd56b0c10584fce77d7c3707f.

    Signed-off-by: Paolo Pisati <email address hidden>

and

commit 6c8bdff882656296adc20b6ae0fb727483a73c7c
Author: Paolo Pisati <email address hidden>
Date: Thu Nov 8 10:35:11 2018 +0000

    Revert "lan78xx: Read MAC address from DT if present"

    This reverts commit a23d928781936b51a61a67a0799b77b2a6becfa2.

    Signed-off-by: Paolo Pisati <email address hidden>

then the upstream cherry pick for the mac changing logic:

commit 5d9c81e3aa1dd39dafd9a6ea30da05b26b655eca
Author: Phil Elwell <email address hidden>
Date: Thu Apr 19 17:59:38 2018 +0100

    lan78xx: Read MAC address from DT if present

    There is a standard mechanism for locating and using a MAC address from
    the Device Tree. Use this facility in the lan78xx driver to support
    applications without programmed EEPROM or OTP. At the same time,
    regularise the handling of the different address sources.

    Signed-off-by: Phil Elwell <email address hidden>
    Signed-off-by: David S. Miller <email address hidden>
    (cherry picked from commit 760db29bdc97b73ff60b091315ad787b1deb5cf5)
    Signed-off-by: Paolo Pisati <email address hidden>

and finally revert the SAUCE patch that fixes lan78xx_read_otp(), and makes the led code work again:

commit 24dc8f22b15cc983e13a0ae88b372dd1bfad09f9
Author: Paolo Pisati <email address hidden>
Date: Thu Nov 8 10:37:17 2018 +0000

    Revert "UBUNTU: SAUCE: Revert "lan78xx: Correctly indicate invalid OTP""

    This reverts commit 1111a9e0bd2ebab88e736d8a9773df2ec76dc52c.

    Signed-off-by: Paolo Pisati <email address hidden>

See also the attached patches.

How to test:

Boot a patched kernel, connect an ethernet cable to the board and check that the ethernet leds are blinking when there's traffic

Regression potential:

None, we are dropping two raspberry BSP patches in favor of an upstream patch and re-applying another upstream fix: all patches have been upstream for awhile, and are clean cherry picks.

Tags: patch
Revision history for this message
Paolo Pisati (p-pisati) wrote :
description: updated
Revision history for this message
Paolo Pisati (p-pisati) wrote :
Revision history for this message
Paolo Pisati (p-pisati) wrote :
Revision history for this message
Paolo Pisati (p-pisati) wrote :
description: updated
description: updated
description: updated
Paolo Pisati (p-pisati)
description: updated
description: updated
Paolo Pisati (p-pisati)
description: updated
description: updated
Paolo Pisati (p-pisati)
description: updated
description: updated
description: updated
description: updated
description: updated
Paolo Pisati (p-pisati)
description: updated
description: updated
description: updated
Paolo Pisati (p-pisati)
description: updated
description: updated
tags: added: patch
Changed in linux-raspi2 (Ubuntu):
status: New → Fix Committed
Changed in linux-raspi2 (Ubuntu Xenial):
status: New → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (8.8 KiB)

This bug was fixed in the package linux-raspi2 - 4.4.0-1101.109

---------------
linux-raspi2 (4.4.0-1101.109) xenial; urgency=medium

  * linux-raspi2: 4.4.0-1101.109 -proposed tracker (LP: #1802780)

  * rpi3bp+: ethernet leds don't blink (LP: #1802320)
    - Revert "lan78xx: Ignore DT MAC address if already valid"
    - Revert "lan78xx: Read MAC address from DT if present"
    - lan78xx: Read MAC address from DT if present
    - Revert "UBUNTU: SAUCE: Revert "lan78xx: Correctly indicate invalid OTP""

  [ Ubuntu: 4.4.0-140.166 ]

  * linux: 4.4.0-140.166 -proposed tracker (LP: #1802776)
  * Bypass of mount visibility through userns + mount propagation (LP: #1789161)
    - mount: Retest MNT_LOCKED in do_umount
    - mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts
  * kdump fail due to an IRQ storm (LP: #1797990)
    - SAUCE: x86/PCI: Export find_cap() to be used in early PCI code
    - SAUCE: x86/quirks: Add parameter to clear MSIs early on boot
    - SAUCE: x86/quirks: Scan all busses for early PCI quirks
  * crash in ENA driver on removing an interface (LP: #1802341)
    - SAUCE: net: ena: fix crash during ena_remove()
  * xenial guest on arm64 drops to busybox under openstack bionic-rocky
    (LP: #1797092)
    - [Config] CONFIG_PCI_ECAM=y
    - PCI: Provide common functions for ECAM mapping
    - PCI: generic, thunder: Use generic ECAM API
    - PCI, of: Move PCI I/O space management to PCI core code
    - PCI: Move ecam.h to linux/include/pci-ecam.h
    - PCI: Add parent device field to ECAM struct pci_config_window
    - PCI: Add pci_unmap_iospace() to unmap I/O resources
    - PCI/ACPI: Support I/O resources when parsing host bridge resources
    - [Config] CONFIG_ACPI_MCFG=y
    - PCI/ACPI: Add generic MCFG table handling
    - PCI: Refactor pci_bus_assign_domain_nr() for CONFIG_PCI_DOMAINS_GENERIC
    - PCI: Factor DT-specific pci_bus_find_domain_nr() code out
    - ARM64: PCI: Add acpi_pci_bus_find_domain_nr()
    - ARM64: PCI: ACPI support for legacy IRQs parsing and consolidation with DT
      code
    - ARM64: PCI: Support ACPI-based PCI host controller
  * [GLK/CLX] Enhanced IBRS (LP: #1786139)
    - x86/speculation: Remove SPECTRE_V2_IBRS in enum spectre_v2_mitigation
    - x86/speculation: Support Enhanced IBRS on future CPUs
  * Update ENA driver to version 2.0.1K (LP: #1798182)
    - net: ena: remove ndo_poll_controller
    - net: ena: fix warning in rmmod caused by double iounmap
    - net: ena: fix rare bug when failed restart/resume is followed by driver
      removal
    - net: ena: fix NULL dereference due to untimely napi initialization
    - net: ena: fix auto casting to boolean
    - net: ena: minor performance improvement
    - net: ena: complete host info to match latest ENA spec
    - net: ena: introduce Low Latency Queues data structures according to ENA spec
    - net: ena: add functions for handling Low Latency Queues in ena_com
    - net: ena: add functions for handling Low Latency Queues in ena_netdev
    - net: ena: use CSUM_CHECKED device indication to report skb's checksum status
    - net: ena: explicit casting and initialization, and clearer error handling
    - net: ena: limit refill Rx th...

Read more...

Changed in linux-raspi2 (Ubuntu Xenial):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.