inconsistent error message when uploading results

Bug #1025381 reported by Jeff Lane 
20
This bug affects 2 people
Affects Status Importance Assigned to Milestone
checkbox-certification
Fix Released
High
Marc Tardif

Bug Description

Ignore what I said earlier. the submission.xml file IS being written out, something else is causing us to get an "Inconsistend Message" error when submitting. When I looked at the submission file yesterday, I was expecting some additional closing tag, but after looking at it again, and digging through it more thoroughly, the file appears to be all there.

So something else is going on here...

Related branches

Revision history for this message
Juergen Chiu (juergen-chiu) wrote :

I did the test and list the procedure below.
1. run checkbex-certification-server
2. ctrl-c to stop the testing during CPU or Memory stressing (forget the point)
3. rerun the test and use "continue" to start the checkbox-certification-server tool
4. final the test and no error in the end
5. use checkbox-certification-submit tool to upload the report, but failed.

Revision history for this message
Marc Tardif (cr3) wrote :

The steps of the user interface are defined in the user_interface plugin in this order:

             "prompt-begin",
             "prompt-gather",
             "prompt-jobs",
             "prompt-report",
             "prompt-exchange",
             "prompt-finish"]

The submission.xml file is generated in the prompt-report step and the messages are exchanged in the prompt-exchange step. So, considering these steps are in order, the former should happen before the latter.

Just in case, I tried running checkbox-certification from the trunk of the project. These are the steps I followed:

1. Remove all traces of files like the submission.xml that might be there already:

  $ rm -fr store messages checkbox.xsl checkbox-certification.log certification_plugins.bpickle* sub*

2. Ran checkbox-certification-server with a whitelist only containing resources, __info__ and __graphics__:

  $ CHECKBOX_DATA=. ./bin/checkbox-certification-server -W graphics.whitelist

3. Pressed ctrl-c at the report step:

Test case results report

a: Graphics tests
 : Space when finished

Please choose (a/ ): ./bin/checkbox-certification-server: line 18: 13181 I/O possible python3 $CHECKBOX_SHARE/run "$@" $CHECKBOX_CERTIFICATION_SHARE/configs/$(basename $0).ini

4. Checked that the submission.xml file exists and has a size:

  $ wc -c submission.xml
  228160 submission.xml

Everything looks good, I don't understand how to reproduce this problem.

Jeff Lane  (bladernr)
summary: - Submission.xml is not written before checkbox attempts to submit results
+ inconsistent error message when uploading results
description: updated
affects: checkbox → checkbox-certification
Revision history for this message
Jeff Lane  (bladernr) wrote :

Marc, yeah, I'm an idiot :/ see above, I went through the same steps this morning more thoroughly than yesterday and submission.xml is fine. at least, it appears to be complete. Something else is causing an "Inconsistent Message" error.

Here's the basic info I get when trying to submit:

bladernr@clarence:~/juergen$ checkbox-certification-submit --hwid a00D000000LU9Jd -v -l inconsistent-message.log submission.xml
2012-07-17 11:09:05,676 INFO Setting log level to DEBUG for verbosity
2012-07-17 11:09:05,677 INFO The debug log can be found at /home/bladernr/juergen/inconsistent-message.log
2012-07-17 11:09:05,677 INFO NOTE: The debug log is very large. You can skip large sections of text by searching the log for the keyword: ADDING
2012-07-17 11:09:05,677 INFO Preparing to submit results using Hardware ID: a00D000000LU9Jd
2012-07-17 11:09:05,677 INFO Getting messages
2012-07-17 11:09:21,258 CRITICAL Error: Inconsistent message

Revision history for this message
Jeff Lane  (bladernr) wrote :

Here is juergen's submission.xml

Revision history for this message
Jeff Lane  (bladernr) wrote :

After poking around a bit more, I think this is the problem:

checkbox-certitification-submit:

def get_request(messages):
    return bpickle.dumps({
        "messages": messages,
        "sequence": 0,
        "total-messages": len(messages),
        })

In python 2.x, pickle.dumps returns a string object. In python3, this returns a bytes object. Looking through the debug logs for checkbox-certification-submit helped me catch this.

To confirm, I reverted checkbox-certification-submit to work in Python 2.7 and tried again and this time it was successful:

bladernr@klaatu:~/development/checkbox-certification-0.12$ PYTHONPATH=.,/home/bladernr/development/trunks/checkbox-0.12,$PYTHONPATH bin/checkbox-certification-submit --hwid a00D000000LU9Jd /home/bladernr/Downloads/x3550/checbox/submission.xml
2012-07-17 11:54:04,220 INFO Preparing to submit results using Hardware ID: a00D000000LU9Jd
2012-07-17 11:54:04,221 INFO Getting messages
2012-07-17 11:54:25,032 INFO Exchanged 7 of 7 messages
2012-07-17 11:54:25,033 INFO Results have been successfully submitted. To review your test results now, please go to the following URL: https://certification.canonical.com/submission/PG39ZJOrWejkAw4

So that confirms that the underlying issue is the change in behaviour for pickle.dumps() for Python3

Revision history for this message
Jeff Lane  (bladernr) wrote :

I should add that I also tried a very simple modification to the submit script by adding .decode() to the return line in get_request(), and that generated an entirely different set of errors:

2012-07-17 11:57:16,893 INFO Setting log level to DEBUG for verbosity
2012-07-17 11:57:16,894 INFO The debug log can be found at /home/bladernr/Downloads/x3550/checbox/jeff1.log
2012-07-17 11:57:16,894 INFO NOTE: The debug log is very large. You can skip large sections of text by searching the log for the keyword: ADDING
2012-07-17 11:57:16,895 INFO Preparing to submit results using Hardware ID: a00D000000LU9Jd
2012-07-17 11:57:16,895 INFO Getting messages
Traceback (most recent call last):
  File "./checkbox-certification-submit", line 266, in <module>
    sys.exit(main())
  File "./checkbox-certification-submit", line 236, in main
    response = transport.exchange(request, headers)
  File "/usr/lib/python2.7/dist-packages/checkbox/lib/transport.py", line 306, in exchange
    (content_type, body) = self._encode_body(body)
  File "/usr/lib/python2.7/dist-packages/checkbox/lib/transport.py", line 278, in _encode_body
    raise TypeError
TypeError: Invalid non-string sequence or mapping

Revision history for this message
Jeff Lane  (bladernr) wrote :

Forget comment 6's traceback. I forgot to reset the shebang to python3. Here is the correct error after adding .decode() to get_results():

2012-07-17 12:05:04,303 INFO Setting log level to DEBUG for verbosity
2012-07-17 12:05:04,304 INFO The debug log can be found at /home/bladernr/Downloads/x3550/checbox/jeff1.log
2012-07-17 12:05:04,306 INFO NOTE: The debug log is very large. You can skip large sections of text by searching the log for the keyword: ADDING
2012-07-17 12:05:04,306 INFO Preparing to submit results using Hardware ID: a00D000000LU9Jd
2012-07-17 12:05:04,307 INFO Getting messages
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/checkbox/lib/transport.py", line 247, in _encode_body
    raise TypeError
TypeError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./checkbox-certification-submit", line 266, in <module>
    sys.exit(main())
  File "./checkbox-certification-submit", line 236, in main
    response = transport.exchange(request, headers)
  File "/usr/lib/python3/dist-packages/checkbox/lib/transport.py", line 276, in exchange
    (content_type, body) = self._encode_body(body)
  File "/usr/lib/python3/dist-packages/checkbox/lib/transport.py", line 250, in _encode_body
    raise TypeError("Invalid non-string sequence or mapping").with_traceback(tb)
  File "/usr/lib/python3/dist-packages/checkbox/lib/transport.py", line 247, in _encode_body
    raise TypeError
TypeError: Invalid non-string sequence or mapping

Revision history for this message
Jeff Lane  (bladernr) wrote :

Hrmmm... more indepth, debug from submit script ends with :

2012-07-17 11:11:04,862 DEBUG Server Response: 200
2012-07-17 11:11:04,867 DEBUG Preparing to upload payload
2012-07-17 11:11:04,868 DEBUG {'type': 'report-submission', 'secure_id': 'skfsPU1QuWlyRGQ'}
2012-07-17 11:11:04,868 DEBUG Report Submission detected
2012-07-17 11:11:04,868 DEBUG {'message': 'Inconsistent message', 'type': 'error'}
2012-07-17 11:11:04,868 CRITICAL Error: Inconsistent message

So it seems to be getting past the first part of submission and gets a response from c3.

Then it goes to actually upload the payload.

So... hrmmm... server side? sigh, this is giving me a headache...

Revision history for this message
Daniel Manrique (roadmr) wrote :

OK, so this was apparently observed on system https://certification.canonical.com/hardware/201112-10215, I found the relevant traceback server-side:

Traceback (most recent call last):
  File "./src/canonical/certify/message/messageapis.py", line 230, in _run
    self.handle(message["type"], message)
  File "./src/canonical/message/api.py", line 57, in handle
    return handler(type, body)
  File "./src/canonical/checkbox/arguments.py", line 82, in replacement
    return original(*new_args, **new_kwargs)
  File "./src/canonical/certify/message/handlers/device.py", line 49, in __call__
    device_values, self._submission.info.creation_time)
  File "./src/canonical/certify/model/device.py", line 644, in get_device_states
    value.category_name)
  File "./src/canonical/certify/model/device.py", line 599, in get_device
    product_id, vendor_id, subproduct_id, subvendor_id)
  File "./src/canonical/certify/model/device.py", line 580, in get_product
    vendor = bus.get_vendor(vendor_name, vendor_id)
  File "./src/canonical/certify/model/device.py", line 139, in get_vendor
    DeviceVendor.name == vendor_name).one()
  File "/srv/certification.canonical.com/certify-storm/storm/store.py", line 1141, in one
    result = self._store._connection.execute(select)
  File "/srv/certification.canonical.com/certify-storm/storm/databases/postgres.py", line 249, in execute
    return Connection.execute(self, statement, params, noresult)
  File "/srv/certification.canonical.com/certify-storm/storm/database.py", line 237, in execute
    raw_cursor = self.raw_execute(statement, params)
  File "/srv/certification.canonical.com/certify-storm/storm/databases/postgres.py", line 259, in raw_execute
    return Connection.raw_execute(self, statement, params)
  File "/srv/certification.canonical.com/certify-storm/storm/database.py", line 321, in raw_execute
    self._check_disconnect(raw_cursor.execute, *args)
  File "/srv/certification.canonical.com/certify-storm/storm/database.py", line 366, in _check_disconnect
    return function(*args, **kwargs)
DataError: invalid input syntax for type bytea
LINE 1: ...vendor.vendor_id > 65535 AND device_vendor.name = E'USB\\x20...

To reproduce, use the submission.xml from comment #4 and do something like:

checkbox-certification-submit -v --hwid a00D000000LU9Jd submission.xml

I tried this from a checkbox trunk with the latest checkbox-certification-submit from checkbox-certification, and I got the same local Inconsistent Message, plus matching server-side output :)

I'll set this to Triaged since we have steps to reproduce and a nice error log on the server.

Changed in checkbox-certification:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Jeff Lane  (bladernr) wrote :

Any chance this can get looked at soon? It's impacting Juergen's work.

Revision history for this message
Daniel Manrique (roadmr) wrote :

This device's information in the submission.xml file seems to be confusing c3:

P: /devices/pci0000:00/0000:00:1a.7/usb1/1-1
  N: bus/usb/001/002
  E: BUSNUM=001
  E: DEVNAME=/dev/bus/usb/001/002
  E: DEVNUM=002
  E: DEVPATH=/devices/pci0000:00/0000:00:1a.7/usb1/1-1
  E: DEVTYPE=usb_device
  E: DRIVER=usb
  E: ID_BUS=usb
  E: ID_MODEL=USB_Flash_Drive
  E: ID_MODEL_ENC=USB\x20Flash\x20Drive
  E: ID_MODEL_ID=1167
  E: ID_REVISION=0100
  E: ID_SERIAL=USB_2.0_USB_Flash_Drive_000000005538679C
  E: ID_SERIAL_SHORT=000000005538679C
  E: ID_USB_INTERFACES=:080650:
  E: ID_VENDOR=USB_2.0
  E: ID_VENDOR_ENC=USB\x202.0
  E: ID_VENDOR_ID=048d
  E: MAJOR=189
  E: MINOR=1
  E: PRODUCT=48d/1167/100
  E: SUBSYSTEM=usb
  E: TYPE=0/0/0
  E: UDEV_LOG=3
  E: USEC_INITIALIZED=203699593

Particularly, the ID_VENDOR_ENC value is what's making c3 choke. If this is changed to USB\\x202.0 (note the double backslash from USB\x202.0), the submission goes through.

This is just an observation though, the actual fix will involve ensuring data is correctly-formatted when passing it to Postgres, which is the one saying this:

DataError: invalid input syntax for type bytea

this would be much easier to figure out if I could see the entire query as sent to the DBMS. Still, I'll go and read up on postgres bytea syntax, see what I can learn.

Revision history for this message
Marc Tardif (cr3) wrote :

So, I tried running the command suggested by Jeff with the submission.xml provided by Juergen:

$ PYTHONPATH=../../checkbox/trunk ./bin/checkbox-certification-submit --hwid a00D000000LU9Jd /tmp/submission.xml
2012-07-20 14:48:00,794 INFO Preparing to submit results using Hardware ID: a00D000000LU9Jd
2012-07-20 14:48:00,794 INFO Getting messages
2012-07-20 14:49:01,858 INFO Exchanged 7 of 7 messages
2012-07-20 14:49:01,858 INFO Results have been successfully submitted. To review your test results now, please go to the following URL: https://certification.canonical.com/submission/h8K7M1AQAVb3D2e
$ md5sum /tmp/submission.xml
b555c770159fc351b32804e9a8fde5f9 /tmp/submission.xml

I'm running on Precise with both checkbox and checkbox-certification from trunk. Please confirm whether the md5sum matches, the release of Ubuntu, and the versions of both checkbox and checkbox-certification. Thanks!

Revision history for this message
Daniel Manrique (roadmr) wrote :

The code pointed to by the trace is the get_vendor method in canonical/certify/model/device.py. It has this:

if vendor_id is not None:
            vendor = store.get(DeviceVendor, (self.id, vendor_id,))
            if not vendor:
                if vendor_name:
                    vendor = DeviceVendor(self, vendor_name, vendor_id)
                    vendor = store.add(vendor)
                    store.flush()
                else:
                    vendor = None
elif vendor_name:
            vendor = store.find(DeviceVendor,
                DeviceVendor.bus == self,
                DeviceVendor.vendor_id > DEVICE_MAX_ID,
                DeviceVendor.name == vendor_name).one() #This is where the bug hits
            if not vendor:
                vendor = DeviceVendor(self, vendor_name)
                vendor = store.add(vendor)
                store.flush()

The reason why Marc was unable to reproduce is that, on my attempts with Juergen's submission, I managed to make c3 create the vendor in question (apparently a new, never-seen-before one: "USB 2.0" - lame vendor if you ask me). So when he ran it, he hit instead the "if" clause which works correctly, rather than the elif one that fails.

To test this theory, I edited the submission.xml file to invent a new vendor with the same "faily" characteristics as the one Juergen had. I modified line 7451 to change:

E: ID_VENDOR_ENC=USB\x202.0

for:

E: ID_VENDOR_ENC=TESTUSB\x202.0

I then ran checkbox-certification-submit and got the failure again.

I then changed it to:

E: ID_VENDOR_ENC=TESTUSB\\x202.0

This worked correctly and created the TESTUSB\x202.0 vendor. I then moved the ID_VENDOR_ENC back to the failing version:

E: ID_VENDOR_ENC=TESTUSB\x202.0

And *now* I was able to re-submit the result because the vendor exists and is found by id rather than by name.

Anyway, convoluted, but this seems to be what happens and to reproduce it, you can change that particular ID_VENDOR_ENC to something that's not already in the database.

Marc Tardif (cr3)
Changed in checkbox-certification:
assignee: nobody → Marc Tardif (cr3)
Marc Tardif (cr3)
Changed in checkbox-certification:
status: Triaged → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.