Comment 22 for bug 1903048

Revision history for this message
Dave Jones (waveform) wrote : Re: [Bug 1903048] Re: [SRU] Bluetooth won't activate on the pi 400

On Thu, Nov 12, 2020 at 11:11:23AM -0000, Sebastien Bacher wrote:
>Sorry but I'm reverting that upload for now until the patches are
>properly upstreamed. We have been bitten too often by unforwarded
>changes that create issues or create maintainance burden over the years
>and we currently don't have the team capacity to deal with extra cost.
>If foundations would like to step up and take over bluez though that's a
>discussion we could have...

My apologies; I omitted to note in the patch that the Pi 400 is a
certified platform, i.e. we're effectively committed to making it work
as best we can, which makes the status of whether upstream accept the
patches or not rather moot:

If upstream say "yes" to the patches, without further discussion: that's
great, but there'd presumably still be some delay before another version
makes its way down to us, so we'd be applying *some* patch to make it
work in the interim.

If upstream say "with these changes" to the patches: again great, but in
the interim, we'd again be applying *some* patch to make things work on
a certified platform while working through changes upstream.

If upstream say "not in a month of Sundays" to these patches: we'd be
left carrying *some* patch, and we'd do it because it's a certified
platform and we're committed to making it work.

 From our end user's perspective therefore, the application of this
patch-set, whether via upstream or not, and whether in this form or not,
is not a matter of "if", but "when". Given we have in our possession a
patch-set that fixes the problem, I at least don't have a good answer to
the hypothetical user's obvious follow-up question: "why not now?"

Anyway, onto technical matters:

>I'm not asking for the patches to be merged upstream but at least to be
>sent there for review and have the appropriate tagging, it's easy to
>unblock from your side. Also I would like to see if we can get for a
>better way to probe for the system to be ready rather than relying on a
>random sleep

The following involves a certain amount of educated guessing. I haven't
dug into this extensively, but I can offer some reasons for why sleep(1)
is being used (and how this could be refined a bit, but not much):

Consider the bcm43xx_load_firmware routine in hciattach_bcm43xx.c [1]
which is being called prior to the apparently arbitrarily inserted
sleep(1) line in the patch. It's a fairly typical firmware load routine
by all accounts:

1. Calls write() with a command (0x2e 0xfc) to place the device into a
    state where it's read to receive new firmware

2. Calls read_hci_event() to check the device responded "okay!"

3. Calls nanosleep() to wait a short while (50ms) for the device to be
    ready

4. Calls read() and write() repeatedly to write the firmware blob from a
    file to the UART

5. Calls nanosleep() again to wait another short while (200ms) after the
    firmware load

So the existing (pre-patch) firmware load routine already has a
seemingly arbitrary post-firmware-load delay of 200ms. If we have a look
at the BCM4334 datasheet [2], p.159 we can see "wait at least 150ms
after [power on] before initiating SDIO accesses" (SDIO being one of
several interfaces for this particular module). However, that's just for
the BCM4334. There's also the BCM4330, BCM4329, and the Cypress 305 (on
the Pi 400, which uses the BCM43xx interface; Cypress, incidentally,
acquired Broadcom's wireless business which explains why their chips
suddenly have Broadcom's interfaces).

The bluez interface *could* presumably check precisely which device it
was dealing with and adjust its post-firmware-load delay accordingly. Or
it could simply go with a delay that's "long enough" to cover all the
supported chipsets, and avoid all that logic (which appears to be what's
favoured in the original code). Given it's a one-time setup delay, and
it's likely to occur during boot-up (or system wake-up) when half a
dozen other things are happening in parallel, it's unlikely the user
will notice a difference between a 150ms, 200ms, or even 1.2s delay to
their Bluetooth module becoming available.

Hence one possibility for the sleep(1) delay is that the Cypress 305,
being a later and more feature-rich device, requires a longer delay. I
can't confirm that as I can't find the datasheet online; sadly, this
isn't remotely surprising, but I could reasonably assume that the Pi
Foundation *do* have access to it and thus the 1 second delay isn't
*entirely* arbitrary.

In some of the BCM43xx modules it does appear that the UART CTS line is
asserted to indicate the device is "ready" (e.g. the BCM4330 datasheet
mentions this under the Bluetooth PMU section). However, one of the
configurations (not the default, but a supported configuration
nonetheless) for the Bluetooth module on all Pi models is to be operated
via the mini-UART instead of the PL011. The mini-UART, as noted
previously, explicitly lacks any hardware flow-control ([3], p.10) so in
such a configuration there'll be no way to tell the module is ready
post-firmware-load beyond "wait some reasonable length of time and hope
all the buffers are flushed".

Again, we could add plenty of logic to determine "are we on a Raspberry
Pi?", "which UART precisely are we communicating over?", and "is that
UART currently mapped to the mini-UART hardware?". But is all that logic
really worth it to avoid a delay the user doesn't notice?

To summarize: while the sleep(1) line may look rather arbitrary, it's
not that unusual for a firmware load routine (though ideally, looking at
bcm43xx_load_firmware now, I would be tempted to tweak the second
nanosleep in preference) and there are probably good technical reasons
for it, but they're likely buried in an NDA'd data-sheet which we don't
have.

Still, this doesn't change the simple fact that we're committed to
making this work, and we have a patch-set that makes it work. I'm happy
to continue refining that, and to shove as much, or as little, of this
commentary into that patch-set as is deemed necessary to justify its
logic, but I don't think we should keep our users waiting on our (or
upstream's) pedantry :)

[1]
https://git.launchpad.net/ubuntu/+source/bluez/tree/tools/hciattach_bcm43xx.c#n229

[2] http://www.cypress.com/file/298676/download

[3]
http://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf