Comment 4 for bug 1371703

Zygmunt Krynicki (zyga) wrote :

Nearly ten years ago I was working on an upgrade process for an STB (set-top box) running Linux on 32MB of NOR flash. We did consider the A/B scheme that we had used earlier on models with more memory. Here, that was not an option. What we implemented instead was an A/x model. A is the working (current) image. The little x is a side image, flashed at the factory, that can only do one thing: unattended upgrades.

If we get enough boot failures, we boot the x image. That image shows a basic static picture on screen telling the user to plug in the Ethernet cable (this is an STB we're talking about). The picture was displayed on all possible outputs. Once the cable is plugged in, the STB downloads the new image block by block and flashes each block live to the A partition after verifying its checksum. Once that is done, we reset the failed-boot counter and reboot (into A).
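
A rough sketch of that flow, in Python for readability. This is not the original code: the failure threshold, the use of SHA-256, and the names `select_image` and `recover_a` are all assumptions made for illustration, and a `bytearray` stands in for the A partition.

```python
import hashlib

MAX_BOOT_FAILURES = 3  # assumed threshold; the real value is not stated


def select_image(failed_boots):
    """Boot A normally, fall back to x after too many failed boots."""
    return "x" if failed_boots >= MAX_BOOT_FAILURES else "A"


def recover_a(blocks, checksums, flash):
    """Verify each downloaded block, then write it into the A partition.

    blocks:    iterable of bytes, in order (stands in for the download)
    checksums: expected SHA-256 hex digests, one per block (assumed format)
    flash:     bytearray standing in for the A partition
    Returns the new failed-boot counter, which is 0 on success.
    """
    offset = 0
    for index, block in enumerate(blocks):
        # Never write a block that fails verification.
        if hashlib.sha256(block).hexdigest() != checksums[index]:
            raise ValueError(f"checksum mismatch in block {index}")
        flash[offset:offset + len(block)] = block
        offset += len(block)
    return 0  # reset the failed-boot counter before rebooting into A
```

If a block fails verification, A is left half-written, but x is untouched, so the next boot lands in x again and the recovery simply restarts.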

No matter what happens, you can reboot and install A.

So how do we upgrade x? Ha. We had a few options.

We can choose never to upgrade x. Surprisingly, this is a good option by many standards. It is pretty reliable in practice, it means lower support costs, and it means that users can really always reboot and recover. The downsides? Obviously x never gets security updates, and those are important (though this is an STB, so it's not a security-critical product, at least not the way we think of security *g*). The other downside is that the x image must be extremely reliable, with defensive coding, reviews and, best of all, a proven track record of working on other projects. Some customers used this option.

We can choose to update x after booting A successfully. This is also tricky. What if the upgrade fails? Assuming that A is still working, it can try again, and again, and again. Even if it fails, you still have a working A image (and the only reason it can fail is if the power cable gets yanked midway or if the user has worn out the flash memory, which is pretty much a dead-end scenario anyway). Since this was a networked STB, we could always try to update x from A. If you have little storage available, this is a pretty good option. Also, the user doesn't need to see anything, as the x update can be done without any UI. From the user's POV the product works.
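
The "try again, and again" part is just a retry loop running from a healthy A. A minimal sketch, where `write_x_image` is a hypothetical callable that flashes x and raises `OSError` on failure, and the attempt limit is an arbitrary assumption:

```python
def update_x(write_x_image, max_attempts=5):
    """Try to flash a new x image from A; return True once it succeeds.

    write_x_image is a hypothetical callable supplied by the caller.
    A stays bootable whether or not any single attempt succeeds.
    """
    for _ in range(max_attempts):
        try:
            write_x_image()
            return True
        except OSError:
            continue  # A still works, so simply try again
    return False  # give up for now; A can retry after the next boot
```

Nothing here needs a UI: a failed attempt only means x is stale or broken until a later attempt succeeds.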

Lastly, for extra paranoid safety you can do A/x1/x2. Then you can really recover from anything. If you have enough non-volatile memory for it, this option is the safest bet. I don't think we ever used this in production though.

So back to the phone. We can do A/x here IMHO, iff the recovery partition has a special, unattended recovery process. This is harder to do on a mobile device. Perhaps we could have an image that shows "plug in the phone". On Ubuntu we could have appropriate udev rules that would auto-recover bricked Ubuntu phones. We could also have a QR code with a link to a page that has Windows / OS X software (once we get some) or an explanation of how to do this from an Ubuntu machine. The recovery image would need only a minimal kernel and a minimal userspace (no Unity, no anything, just Mir to display a single static image; this could even be part of the bootloader if we look hard enough). The image doesn't need any i18n in it. On the desktop software side we'd have to have a way to do recovery flashes, and I see this option was disabled. Perhaps that's not the best way forward though.

Anyway, my 0.02 euro cents.