On Thu, May 29, 2014 at 09:40:34AM -0400, Phillip Susi wrote:
> Indeed, part of the problem is that everyone piled into the same bug
> with several different issues rather than troubleshooting it on a case
> by case basis.

This certainly happens, and I realise that's annoying; any bug like this is likely to be only partially fixed, and at some point people who still have problems will need to be directed to file new bugs rather than continuing to comment on the closed bug. However, closing the bug without making any technical changes is likely to be read as blowing *everyone* off, no matter the good intentions, and just compounds the problem.

> It wasn't a starting point for a conversation; I had tried dozens of
> times for weeks to get more information, identify the cause(es), and
> explain why it was a result of incorrect action on the user's part.
> That statement was made in direct response to someone saying that as a
> user they felt they needed to reopen it ( yet again ) without
> understanding why I had closed it, or offering any real
> counter-argument. By that point I was throwing my arms in the air.

When people repeatedly reopen a bug, it's often worth considering whether it was actually the right thing to do to close it in the first place. The sheer number of people affected by this class of bugs is an indication that we shouldn't be closing it out of hand, even if you don't immediately see what we can do about it. Given that we have extensive maintainer script code for dealing with situations like this, there's clearly scope for further improvement.

> It would be helpful if you would comment if you think there actually
> is something that might be done. Since this had gone on for some time
> without any comment from you, I assumed you were ignoring it as just
> another kvetch fest. I certainly would be interested in any ideas you
> might have.

I'm afraid I don't have time to read more than a tiny fraction of the bug mail I get, although this had been escalated to me by several folks in my management chain and I'd put it on my to-do list for 14.04.1; I'd just been heads-down in the image build infrastructure changes I'm currently doing, so hadn't emerged for long enough to dig through the bug.

I don't yet have specific fixes in mind, but there is certainly plenty of fodder for investigation here. For example, skimming through the bug log, I see an instance (https://bugs.launchpad.net/ubuntu/+source/grub2/+bug/1289977/comments/207) where somebody swapped disks and then the maintainer scripts didn't realise that they needed to install GRUB to the new disk. This situation is *specifically* intended to be handled by the maintainer script code I wrote some time ago (and wrote up in http://www.chiark.greenend.org.uk/ucgi/~cjwatson/blosxom/debian/2010-06-21-grub2-boot-problems.html), so if it's failing then I need to investigate that, not discard it as a situation we can't fix.

This is a long-standing class of bug, although the precise details have varied over time. The reason it's so difficult to address is that the root causes are often far removed in time: if you get your configuration wrong then you often don't find out about it until the next upgrade. That makes this very challenging to deal with, although not impossible. In many cases this is user error, narrowly defined (that is, the user did not do the "right thing", but perhaps we didn't do much to help them know what the right thing would be). Even so, it's sometimes possible to detect it heuristically and offer to correct the situation on upgrade: given that the result of failure is a failed boot, it's worth going beyond what we would ordinarily do to handle user error.

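To make "detect it heuristically" a little more concrete, here is a minimal sketch of one possible check. It is not the existing grub-pc maintainer script code. It assumes that the recorded value of grub-pc/install_devices is a comma-separated list of device paths, and that a GRUB boot sector can be recognised by the usual 0x55AA boot signature plus the ASCII string "GRUB" somewhere in the first 512 bytes. Both are assumptions made for illustration, and every function name below is hypothetical.

```python
from typing import Callable, List, Optional

SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a PC boot sector
GRUB_MARKER = b"GRUB"         # assumption: marker string found in GRUB boot sectors


def looks_like_grub_boot_sector(sector: bytes) -> bool:
    """Heuristic only: boot signature present and the marker string found."""
    if len(sector) < SECTOR_SIZE:
        return False
    first = sector[:SECTOR_SIZE]
    return first[510:512] == BOOT_SIGNATURE and GRUB_MARKER in first


def read_first_sector(device: str) -> Optional[bytes]:
    """Return the first sector of a device, or None if it cannot be read."""
    try:
        with open(device, "rb") as handle:
            return handle.read(SECTOR_SIZE)
    except OSError:
        return None


def parse_install_devices(value: str) -> List[str]:
    """Split a comma-separated device list (assumed format) into paths."""
    return [item.strip() for item in value.split(",") if item.strip()]


def devices_needing_attention(
    value: str,
    reader: Callable[[str], Optional[bytes]] = read_first_sector,
) -> List[str]:
    """List recorded devices that are missing or do not look like GRUB."""
    suspicious = []
    for device in parse_install_devices(value):
        sector = reader(device)
        if sector is None or not looks_like_grub_boot_sector(sector):
            suspicious.append(device)
    return suspicious
```

A non-empty result would only be a hint that the recorded configuration no longer matches the disks, and so a reason to ask the user a question on upgrade rather than proof that anything is broken. The `reader` parameter exists so that the logic can be exercised on canned sector data without touching real devices.
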
For example, I'm considering approaches such as looking for binary signatures which would serve to identify GRUB across a wide range of versions, or patching grub-install to leave a note for future grub-pc.postinst runs, or going through my existing detection code again to try to find paths where it's supposed to ask questions but fails to do so.

The other strand of investigation is to try to track down reasons why this happened in the first place. For example, I suspect that there may be some paths where installing Ubuntu leaves the wrong thing in grub-pc/install_devices.

I'd also like to go through some of our user-facing documentation such as https://help.ubuntu.com/community/Grub2, and try to cut it down a bit and review closely for any inaccuracies. If I find time I'd also like to review tools such as boot-repair and see if I can make sure that they don't fix immediate problems while leaving future timebombs around (which might relate to patching grub-install).

That's a rough idea of what I plan to look at here. As you can see it's extensive and will require a good deal of continuous concentration; I expect to have to carve out at least three solid days to work on this.

-- 
Colin Watson [