UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 386: ordinal not in range(128) in nova.virt.libvirt.vif:unplug with unicode instance.display_name

Bug #1580728 reported by Matt Riedemann
18
This bug affects 3 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
melanie witt
Ocata
Fix Committed
Medium
Lee Yarwood
devstack
Invalid
Undecided
Unassigned
oslo.log
Fix Released
Undecided
melanie witt
Declined for Newton by Ben Nemec
Ocata
Incomplete
Medium
Unassigned
oslo.versionedobjects
Invalid
Undecided
Unassigned

Bug Description

I saw this in the n-cpu logs for a xenproject CI run:

http://logs.openstack.xenproject.org/00/315100/1/check/dsvm-tempest-xen/9649dc5/logs/screen-n-cpu.txt.gz

2016-05-11 16:19:09.457 27252 INFO nova.virt.libvirt.driver [-] [instance: 76c4ad96-87dd-4300-acdc-cbe65d3aa0a6] Instance destroyed successfully.
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 851, in emit
    msg = self.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/handlers.py", line 73, in format
    return logging.StreamHandler.format(self, record)
  File "/usr/lib/python2.7/logging/__init__.py", line 724, in format
    return fmt.format(record)
  File "/usr/local/lib/python2.7/dist-packages/oslo_log/formatters.py", line 265, in format
    return logging.Formatter.format(self, record)
  File "/usr/lib/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/usr/lib/python2.7/logging/__init__.py", line 328, in getMessage
    msg = msg % self.args
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 386: ordinal not in range(128)
Logged from file vif.py, line 966

That would be logging the vif object in unplug:

https://github.com/openstack/nova/blob/15abb39ef20ae76d602d50e67e43c3500a00cd3e/nova/virt/libvirt/vif.py#L966

Revision history for this message
Matt Riedemann (mriedem) wrote :

Looks like this is the VIF object when we plug it:

vif=VIF({'profile': None, 'ovs_interfaceid': None, 'preserve_on_delete': False, 'network': Network({'bridge': u'br100', 'subnets': [Subnet({'ips': [FixedIP({'meta': {}, 'version': 4, 'type': u'fixed', 'floating_ips': [], 'address': u'10.1.13.3'})], 'version': 4, 'meta': {u'dhcp_server': u'10.1.0.1'}, 'dns': [IP({'meta': {}, 'version': 4, 'type': u'dns', 'address': u'8.8.4.4'})], 'routes': [], 'cidr': u'10.1.0.0/20', 'gateway': IP({'meta': {}, 'version': 4, 'type': u'gateway', 'address': u'10.1.0.1'})}), Subnet({'ips': [], 'version': None, 'meta': {u'dhcp_server': None}, 'dns': [], 'routes': [], 'cidr': None, 'gateway': IP({'meta': {}, 'version': None, 'type': u'gateway', 'address': None})})], 'meta': {u'tenant_id': None, u'should_create_bridge': True, u'bridge_interface': u'br_flat'}, 'id': u'9a5f172f-de10-4cd0-af3c-d2f1568b206e', 'label': u'private'}), 'devname': None, 'vnic_type': u'normal', 'qbh_params': None, 'meta': {}, 'details': {}, 'address': u'fa:16:3e:33:4b:c0', 'active': False, 'type': u'bridge', 'id': u'7d1d85f4-086a-4400-88dd-f90e64af7756', 'qbg_params': None})

tags: added: unicode
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Matt Riedemann (mriedem) wrote :

From comment 1 the vif looks fine when it's plugged, something must be changing that before we unplug it.

Revision history for this message
Matt Riedemann (mriedem) wrote :

I guess it could be the instance that's got something that's failing to decode.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is the instance:

instance=Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone=None,cell_name=None,cleaned=False,config_drive='',created_at=2016-05-11T16:18:57Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description='tempest.common.compute-instance-1871204054',display_name='tempest.common.compute-instance-1871204054',ec2_ids=EC2Ids,ephemeral_gb=0,ephemeral_key_uuid=None,fault=<?>,flavor=Flavor(11),host='ds-xen-1462964400',hostname='tempest.common.compute-instance-1871204054',id=80,image_ref='b388e343-1c9e-4169-b4c7-5880bda41689',info_cache=InstanceInfoCache,instance_type_id=11,kernel_id='d71b24f6-c0b5-4c6d-ab16-bbfbc79a2846',key_data=None,key_name=None,launch_index=0,launched_at=None,launched_on='ds-xen-1462964400',locked=False,locked_by=None,memory_mb=64,metadata={},migration_context=<?>,new_flavor=None,node='ds-xen-1462964400',numa_topology=None,old_flavor=None,os_type=None,pci_devices=PciDeviceList,pci_requests=InstancePCIRequests,power_state=0,progress=0,project_id='fc814a2e49064d999de62dbe32351bd5',ramdisk_id='c5ebf415-08ef-4c6d-a872-e707569c4fa7',reservation_id='r-lqqepx6k',root_device_name='/dev/xvda',root_gb=0,security_groups=SecurityGroupList,services=<?>,shutdown_terminate=False,system_metadata={image_base_image_ref='b388e343-1c9e-4169-b4c7-5880bda41689',image_container_format='ami',image_disk_format='ami',image_kernel_id='d71b24f6-c0b5-4c6d-ab16-bbfbc79a2846',image_min_disk='0',image_min_ram='0',image_ramdisk_id='c5ebf415-08ef-4c6d-a872-e707569c4fa7',network_allocated='True'},tags=<?>,task_state='spawning',terminated_at=None,updated_at=2016-05-11T16:18:58Z,user_data=None,user_id='f9c313947c0a4792bb13dde84e536cec',uuid=76c4ad96-87dd-4300-acdc-cbe65d3aa0a6,vcpu_model=None,vcpus=1,vm_mode=None,vm_state='building')

Revision history for this message
Matt Riedemann (mriedem) wrote :

OK, the problem is the instance in this test has a unicode name:

http://git.openstack.org/cgit/openstack/tempest/tree/tempest/api/compute/servers/test_servers.py#n102

    @test.idempotent_id('5e6ccff8-349d-4852-a8b3-055df7988dd2')
    def test_update_server_name(self):
        # The server name should be changed to the provided value
        server = self.create_test_server(wait_until='ACTIVE')
        # Update instance name with non-ASCII characters
        prefix_name = u'\u00CD\u00F1st\u00E1\u00F1c\u00E9'
        self._update_server_name(server['id'], 'ACTIVE', prefix_name)

Revision history for this message
Matt Riedemann (mriedem) wrote :

This is the PUT:

http://logs.openstack.org/98/282398/7/check/gate-tempest-dsvm-cells-ubuntu-xenial/55f5ceb/logs/tempest.txt.gz#_2016-09-13_17_48_58_799

2016-09-13 17:48:58.799 17352 DEBUG tempest.lib.common.rest_client [req-e4190840-33d8-4a73-af7c-7ecfbd5b28db ] Request - Headers: {'X-Auth-Token': '<omitted>', 'Accept': 'application/json', 'Content-Type': 'application/json'}
        Body: {"server": {"name": "tempest-\u00cd\u00f1st\u00e1\u00f1c\u00e9-1977980857"}}
    Response - Headers: {'openstack-api-version': 'compute 2.1', 'content-type': 'application/json', 'content-length': '1017', 'date': 'Tue, 13 Sep 2016 17:48:58 GMT', 'x-compute-request-id': 'req-e4190840-33d8-4a73-af7c-7ecfbd5b28db', 'status': '200', 'vary': 'X-OpenStack-Nova-API-Version', 'connection': 'close', 'x-openstack-nova-api-version': '2.1', 'content-location': 'http://127.0.0.1:8774/v2.1/servers/c1d0db5c-464b-4d98-b82d-f843a3f2b887'}
        Body: {"server": {"status": "ACTIVE", "updated": "2016-09-13T17:48:59Z", "hostId": "e9e1b80210cd3d668057493fef9cf9778bcea3a895ac720a0fadef31", "user_id": "8cbfc3b6125c43c69cebb7a2ebfc9c10", "name": "tempest-\u00cd\u00f1st\u00e1\u00f1c\u00e9-1977980857", "links": [{"href": "http://127.0.0.1:8774/v2.1/servers/c1d0db5c-464b-4d98-b82d-f843a3f2b887", "rel": "self"}, {"href": "http://127.0.0.1:8774/servers/c1d0db5c-464b-4d98-b82d-f843a3f2b887", "rel": "bookmark"}], "created": "2016-09-13T17:48:53Z", "tenant_id": "869f26d125b14598928ab4a657065a44", "image": {"id": "4ebf1585-f4f0-442b-b941-894cefff648b", "links": [{"href": "http://127.0.0.1:8774/images/4ebf1585-f4f0-442b-b941-894cefff648b", "rel": "bookmark"}]}, "OS-DCF:diskConfig": "MANUAL", "addresses": {"private": [{"version": 4, "addr": "10.1.10.2"}]}, "accessIPv4": "", "accessIPv6": "", "progress": 0, "flavor": {"id": "42", "links": [{"href": "http://127.0.0.1:8774/flavors/42", "rel": "bookmark"}]}, "id": "c1d0db5c-464b-4d98-b82d-f843a3f2b887", "metadata": {}}} _log_request_full tempest/lib/common/rest_client.py:431

Changed in nova:
status: Confirmed → Triaged
summary: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
- 386: ordinal not in range(128) in nova.virt.libvirt.vif:unplug
+ 386: ordinal not in range(128) in nova.virt.libvirt.vif:unplug with
+ unicode instance.display_name
Revision history for this message
Matt Riedemann (mriedem) wrote :

I thought about fixing this with using six.text_type here:

https://github.com/openstack/oslo.versionedobjects/blob/7f0edcdb6430f5ccfdeed81806ecc780753da4d1/oslo_versionedobjects/base.py#L314

But I couldn't get a unit test to work that would recreate the failure, which makes me wonder if it's somehow related to how the logging module handles the formatting.

Revision history for this message
Sean Dague (sdague) wrote :

I dug on this a bit further, and I don't think it's actually an issue with ovo, I think it's an issue with our environment (which I still have an issue replicating locally).

The unicode for the utf8 name turns into: Íñstáñcé

However... in a devstack run we're setting LC_ALL=C because we grep the output of commands, and things get weird if people do those commands in other languages. This effectively squashes a lot of working utf8 support.

The following on a Linux desktop demonstrates what this issue would be:

os4:~> python -c "a = u'\u00CD\u00F1st\u00E1\u00F1c\u00E9'; print a"
Íñstáñcé
os4:~> LC_ALL=C python -c "a = u'\u00CD\u00F1st\u00E1\u00F1c\u00E9'; print a"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

Revision history for this message
Michael Dovgal (mdovgal) wrote :

Am I right or this bug duplicates this one?
https://bugs.launchpad.net/oslo.db/+bug/1430399

Actually the reasons, that triggered this bug, were different,but we have the same place where the code crashes.

Matt Riedemann (mriedem)
Changed in devstack:
status: New → Confirmed
Revision history for this message
Matt Riedemann (mriedem) wrote :

This started spiking in failures in the last 24 hours once oslo.log 3.26.0 was used in upper-constraints on master, and it's failing in the journal handler:

https://github.com/openstack/oslo.log/blob/3.26.0/oslo_log/handlers.py#L120

The switch in devstack to use systemd by default happened yesterday too:

https://github.com/openstack-dev/devstack/commit/7f8df450dbdfe754eb3f7311de56327878a3d1c3

So everything is running with that handler now and we're seeing the spike in failures:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Setting%20instance%20vm_state%20to%20ERROR%5C%22%20AND%20message%3A%5C%22_unplug_vifs%5C%22%20AND%20message%3A%5C%22UnicodeDecodeError%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22&from=7d

It's failing specifically hard in the cells v1 job, I'm not really sure why.

Revision history for this message
Matt Riedemann (mriedem) wrote :

melwitt tried this with devstack but it didn't help:

https://review.openstack.org/#/c/461840/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/461975

Changed in nova:
assignee: nobody → melanie witt (melwitt)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Matt Riedemann (<email address hidden>) on branch: master
Review: https://review.openstack.org/461975

Revision history for this message
ChangBo Guo(gcb) (glongwave) wrote :

Similar error like UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 386: ordinal not in range(128) usually occurred when sys.getdefaultencoding != 'utf-8".
defaultencoding is 'ascii' on Python 2, but 'uft-8' on Python 3 and there is now way to set on Python 3. So we prefer to use 'utf-8' on Python 2.

[changboguo@localhost oslo.log]$ python
Python 2.7.10 (default, Sep 24 2015, 17:50:09)
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>>
[changboguo@localhost oslo.log]$ python3
Python 3.4.2 (default, Jul 9 2015, 17:24:30)
[GCC 5.1.1 20150618 (Red Hat 5.1.1-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
>>>

The following code may fix the issue.

import sys
default_encoding = 'utf-8'
if sys.getdefaultencoding() != default_encoding:
   reload(sys)
   sys.setdefaultencoding(default_encoding)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on devstack (master)

Change abandoned by melanie witt (<email address hidden>) on branch: master
Review: https://review.openstack.org/461840

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/461912
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=564958dba6c280eb8e11ae9bbc819c7da6e204bd
Submitter: Jenkins
Branch: master

commit 564958dba6c280eb8e11ae9bbc819c7da6e204bd
Author: melanie witt <email address hidden>
Date: Tue May 2 21:47:12 2017 +0000

    Use six.text_type() when logging Instance object

    We're seeing a trace in gate jobs, for example:

      UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
      402: ordinal not in range(128)

    when attempting to log an Instance object with a unicode display name.

    This resurfaced relatively recently because of the change in devstack
    to use the new OSJournalHandler with use_journal=True which is
    suspected of causing some deadlock issues [1] unrelated to this bug.

    The problem occurs in code that logs an entire Instance object when
    the object has a field with unicode characters in it (display_name).
    When the object is sent to logging, the UnicodeDecodeError is raised
    while formatting the log record here [2]. This implies an implicit
    conversion attempt to unicode at this point.

    I found that with the Instance object, the conversion to unicode fails
    with the UnicodeDecodeError unless the encoding 'utf-8' is explicitly
    specified to six.text_type(). And when specifying an encoding to
    six.text_type(), the argument to convert must be a string, not an
    Instance object, so this does the conversion in two steps as a utility
    function:

      1. Get the string representation of the Instance with repr()
      2. Call six.text_type(instance_repr, 'utf-8') passing the encoding
         if not six.PY3

    Closes-Bug: #1580728

    [1] https://review.openstack.org/#/c/462163
    [2] https://github.com/python/cpython/blob/2e576f5/Lib/logging/__init__.py#L338

    Change-Id: I0fc3ae02cb2e401b3240faf0d8b6aa5dc52b91fc

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo.log (master)

Fix proposed to branch: master
Review: https://review.openstack.org/462735

Changed in oslo.log:
assignee: nobody → melanie witt (melwitt)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo.log (master)

Reviewed: https://review.openstack.org/462735
Committed: https://git.openstack.org/cgit/openstack/oslo.log/commit/?id=72e5c3c1e258589957011506c36dd8af0a481513
Submitter: Jenkins
Branch: master

commit 72e5c3c1e258589957011506c36dd8af0a481513
Author: melanie witt <email address hidden>
Date: Thu May 4 20:22:47 2017 +0000

    Use dict arg values for unicode checks in ContextFormatter

    In ContextFormatter.format(), for python2 it checks each arg to
    determine whether unicode should be used for the format message.
    The problem is the code assumes the args are a list, when they can
    also be a dict, for example:

      LOG.info('%(thing)s', {'thing': '...'})

    and in that case, the dict keys were implicitly being used for the
    checks. The checks will always pass on string dict keys, so the
    format message gets converted to unicode even though the corresponding
    args will ultimately fail decoding to unicode. Then, the logging fails
    with:

      UnicodeDecodeError: 'ascii' codec can't decode byte 0xc6 in
      position 0: ordinal not in range(128)

    when the unicode format message causes an implicit conversion attempt
    of the args to unicode [1].

    This adds a check for the args type and uses the dict values for the
    unicode checks so that dict args with values that fail decoding will
    have: should_use_unicode = False.

    Closes-Bug: #1580728

    [1] https://github.com/python/cpython/blob/2e576f5/Lib/logging/__init__.py#L338

    Change-Id: I8c479e507efcf8acd3e3faa4a702fa6e1f18772f

Changed in oslo.log:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/oslo.log 3.26.1

This issue was fixed in the openstack/oslo.log 3.26.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/466677

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/466677
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=62a5b4bd206d6feb509cd397cea51dcd3620a21e
Submitter: Jenkins
Branch: stable/ocata

commit 62a5b4bd206d6feb509cd397cea51dcd3620a21e
Author: melanie witt <email address hidden>
Date: Tue May 2 21:47:12 2017 +0000

    Use six.text_type() when logging Instance object

    We're seeing a trace in gate jobs, for example:

      UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position
      402: ordinal not in range(128)

    when attempting to log an Instance object with a unicode display name.

    This resurfaced relatively recently because of the change in devstack
    to use the new OSJournalHandler with use_journal=True which is
    suspected of causing some deadlock issues [1] unrelated to this bug.

    The problem occurs in code that logs an entire Instance object when
    the object has a field with unicode characters in it (display_name).
    When the object is sent to logging, the UnicodeDecodeError is raised
    while formatting the log record here [2]. This implies an implicit
    conversion attempt to unicode at this point.

    I found that with the Instance object, the conversion to unicode fails
    with the UnicodeDecodeError unless the encoding 'utf-8' is explicitly
    specified to six.text_type(). And when specifying an encoding to
    six.text_type(), the argument to convert must be a string, not an
    Instance object, so this does the conversion in two steps as a utility
    function:

      1. Get the string representation of the Instance with repr()
      2. Call six.text_type(instance_repr, 'utf-8') passing the encoding
         if not six.PY3

    Closes-Bug: #1580728

    [1] https://review.openstack.org/#/c/462163
    [2] https://github.com/python/cpython/blob/2e576f5/Lib/logging/__init__.py#L338

    Change-Id: I0fc3ae02cb2e401b3240faf0d8b6aa5dc52b91fc
    (cherry picked from commit 564958dba6c280eb8e11ae9bbc819c7da6e204bd)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.0.5

This issue was fixed in the openstack/nova 15.0.5 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0b2

This issue was fixed in the openstack/nova 16.0.0.0b2 development milestone.

Revision history for this message
Ben Nemec (bnemec) wrote :

Do we still need the oslo.log change backported to ocata?

Changed in oslo.versionedobjects:
status: New → Invalid
Revision history for this message
Matt Riedemann (mriedem) wrote :

@bnemec, at this point probably not.

Revision history for this message
melanie witt (melwitt) wrote :

Yeah, things were fixed in-project for nova (and presumably for any other projects that were affected) and the oslo.log change was intended to help handle future issues where log arguments containing unicode characters are passed as a dict. It doesn't have to be backported.

Revision history for this message
Dr. Jens Harbott (j-harbott) wrote :

Change for devstack has been abandoned, assuming this has been fixed in other projects.

Changed in devstack:
status: Confirmed → Invalid
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.