tl;dr: A workaround for this problem is to update plainbox so that the XML "summary" block has the "architecture" element before the "distroseries" element. The story is long but hopefully interesting; see below.

Yay! I found a difference between the plainbox and checkbox XML files that triggers this problem. It is still worth continuing to look at the parser until we understand the problem; I'm uneasy about just magically changing something and having things work without understanding at least why the parser behaves like this.

In the summary section (at the very end of the XML file), checkbox produces this summary block:

Whereas plainbox has the same fields but in a different order:

As it turns out, if the architecture field comes *after* distroseries, the parser starts acting strangely. For instance, I was able to reproduce the weird "dmi devices first, then a block of packages that breaks things, then a second block of udev devices" behavior using a checkbox submission, simply by moving the architecture field after distroseries.

In my investigation of the parser I found that the separate blocks are there "by design": when the parser finds devices, it only adds them to an existing "devices" block if the last block (or message, in checkbox parlance) is itself a "devices" message. So what happens is:

- Parser starts finding devices from the dmi attachment, creates a devices message, and appends the devices to it.
- Parser starts processing packages (WHY? This is still a mystery, triggered by the architecture/distroseries problem in the summary).
- Parser finds another batch of devices from the udev attachment, but since the packages "got in the way", it creates a new devices message and puts the devices there.

The code for this is in hexr, in apps/uploads/checkbox_parser.py, in TestRun.addDeviceState (at the beginning of the method).

A quick solution would be to change plainbox's XML exporter so that it produces the summary fields in the same order as checkbox (or at least ensure that architecture comes before distroseries; in my experiments that seems to be enough).

However, it may be worth taking a closer look at this on the parser and c4 side. For instance, checkbox/lib/parsers/submission.xml clearly says:

# Iterate over the root children, "summary" first

So even if the summary is at the end, it is the first thing that's parsed, and I need to look at whether one of those fields triggers this weird behavior.
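To make the "by design" block splitting described above concrete, here is a minimal sketch of the rule as I understand it. It is not the actual TestRun.addDeviceState code: the function name `add_item`, the dict layout, and the device and package names are all made up for illustration.

```python
# Hypothetical model of the grouping rule, NOT the real hexr parser code.
# A new item joins the last message only if that message has the same type;
# otherwise a fresh message is started.

def add_item(messages, kind, item):
    """Append item to the last message if its type matches, else open a new one."""
    if not messages or messages[-1]["type"] != kind:
        messages.append({"type": kind, "items": []})
    messages[-1]["items"].append(item)


messages = []
add_item(messages, "devices", "dmi-bios")     # devices from the dmi attachment
add_item(messages, "devices", "dmi-chassis")
add_item(messages, "packages", "bash")        # packages "get in the way"
add_item(messages, "devices", "udev-sda")     # udev devices land in a new block

block_types = [m["type"] for m in messages]
print(block_types)
```

With this rule the udev devices cannot rejoin the first devices message once a packages message sits in between, which matches the three-block layout seen in the broken submissions.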
Next, the "messages" format/protocol is meant to be additive, so perhaps c4 should read the messages one by one and do something like:

- Get and process the first block of devices.
- Get and process the packages.
- Get and process the second block of devices, adding them to the existing ones.

I'd have to look at the rest of the code that slurps JSON from submissions into actual c4 reports to figure out how to do this and whether it's a feasible fix on the c4 side.

Ideally the parser should not depend on XML element ordering, so even if we have a quick workaround, it's worth understanding and fixing this behavior in the parser, or possibly on the c4 side (though I'd prefer not to have a parser with mysterious behavior).
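A rough sketch of what additive handling on the consuming side might look like, assuming messages arrive as a list of typed blocks. The function `merge_messages`, the message layout, and the sample data are assumptions for illustration; I have not checked how c4 actually ingests submissions.

```python
# Hypothetical sketch of additive message processing, NOT actual c4 code.
# Messages of the same type are accumulated in arrival order, so a second
# "devices" block extends the first instead of replacing it.

def merge_messages(messages):
    """Fold a list of typed message blocks into one list of items per type."""
    merged = {}
    for message in messages:
        merged.setdefault(message["type"], []).extend(message["items"])
    return merged


example = [
    {"type": "devices", "items": ["dmi-bios"]},
    {"type": "packages", "items": ["bash"]},
    {"type": "devices", "items": ["udev-sda"]},
]
report = merge_messages(example)
print(report)
```

Handled this way, the number and order of blocks the parser emits would no longer matter to the final report, though it would only paper over the parser's ordering dependence rather than explain it.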