"type error" for Nikon images with chr(128) values in EXIF tags

Bug #1016066 reported by Hobson Lane
Affects   Status   Importance   Assigned to   Milestone
pyexiv2   New      Undecided    Unassigned

Bug Description

After upgrading to 12.04 Precise, the pyexiv2 package is brittle: it fails on EXIF tags that contain Unicode data in Nikon images. Still looking into it. The error could be my fault (bad upgrade or bad pyexiv2 package download), but...

Attempting to apply str() to a Nikon UserComment EXIF tag gives:

Traceback (most recent call last):
  File "/home/hobs/bin/tagim", line 345, in <module>
    tagim.display_meta(im)
  File "/home/hobs/src/tagim/tg/tagim.py", line 194, in display_meta
    print "{0}: {1}".format(k,str(im[k].value))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-217: ordinal not in range(128)
hobs@hobs-laptop:~$ gedit src/tagim/tagim.py
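
For context, in Python 2 calling str() on a unicode object encodes it with the default ASCII codec, which is the step that raises the UnicodeEncodeError above. A minimal sketch of that step, written with an explicit encode() so it behaves the same on Python 2 and 3. The sample value is made up for illustration and is not read from the image:

```python
# Stand-in for a tag value holding non-ASCII text (hypothetical sample).
value = u"\u4500\u6d00\u7000"

try:
    # Python 2's str(value) performs this ASCII encode implicitly.
    value.encode("ascii")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("encode failed: {0}".format(exc.reason))

# Encoding explicitly to UTF-8 succeeds and yields bytes that can be printed or stored.
encoded = value.encode("utf-8")
```

So any str() call on a tag value that is already unicode and contains non-ASCII characters would be expected to fail this way, independent of the camera make.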

The problematic Nikon tag contains Japanese characters in Unicode:

Exif.Photo.UserComment: 䔀洀瀀椀爀攀 栀漀琀攀氀 戀甀椀氀琀 戀礀 琀栀攀 匀甀氀琀愀渀✀猀 戀爀漀琀栀攀爀 椀渀 䈀爀甀渀攀椀⸀ 伀爀椀最椀渀愀氀氀礀 戀甀椀氀琀 昀漀爀 栀椀猀 瀀爀椀瘀愀琀攀 攀渀琀攀爀琀愀椀渀洀攀渀琀 愀猀 瀀愀爀琀 漀昀 戀椀氀氀椀漀渀猀 漀昀 猀焀甀愀渀搀攀爀攀搀 搀漀氀氀愀爀猀 猀瀀攀渀琀 漀渀 攀砀琀爀愀瘀愀最攀渀琀 瀀爀漀樀攀挀琀猀⸀ 匀琀愀渀搀椀渀最 漀渀 琀栀攀 攀搀最攀 漀昀 琀栀攀 匀漀甀琀栀 䌀栀椀渀愀 匀攀愀
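
One observation that may help triage: the characters shown above look consistent with English UTF-16 text read with the wrong byte order, rather than with genuine Japanese text. A quick sketch using only the first word of the value as displayed above. Whether this is what actually happens inside the library is an assumption, not something confirmed here:

```python
# First word of the UserComment value as it was displayed: 䔀洀瀀椀爀攀
displayed = u"\u4500\u6d00\u7000\u6900\u7200\u6500"

# Re-encode with one byte order and decode with the other, which swaps each byte pair.
swapped = displayed.encode("utf-16-be").decode("utf-16-le")
print(swapped)
```

If the swapped text reads as plain English, the comment was probably stored as UTF-16 and the byte order is being misdetected somewhere between the file and the displayed value.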

Applying unicode() to all tags instead gets past the Japanese Unicode text, but fails on the XMP tags in the same Nikon image:

Traceback (most recent call last):
  File "/home/hobs/bin/tagim", line 345, in <module>
    tagim.display_meta(im)
  File "/home/hobs/src/tagim/tg/tagim.py", line 202, in display_meta
    print "{0}: {1}".format(k,unicode(im[k].value))
  File "/usr/lib/python2.7/dist-packages/pyexiv2/xmp.py", line 225, in _get_value
  File "/usr/lib/python2.7/dist-packages/pyexiv2/xmp.py", line 208, in _compute_value
  File "/usr/lib/python2.7/dist-packages/pyexiv2/xmp.py", line 208, in <lambda>
  File "/usr/lib/python2.7/dist-packages/pyexiv2/xmp.py", line 431, in _convert_to_python
  File "/usr/lib/python2.7/dist-packages/pyexiv2/xmp.py", line 143, in _choice_type
TypeError: expected string or buffer
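
Since this TypeError is raised while the value is being read, it aborts the whole listing. To find out which tag triggers it, one option is to read each tag inside a try/except and report the failure instead of stopping. A rough sketch follows; `describe_tags`, `_FakeTag` and the key `Xmp.example.broken` are hypothetical names used only for illustration, and the fake objects merely imitate an object with a `.value` attribute:

```python
def describe_tags(metadata, keys):
    """Return (key, text) pairs, reporting unreadable tags instead of raising."""
    rows = []
    for key in keys:
        try:
            text = u"{0}".format(metadata[key].value)
        except Exception as exc:  # deliberately broad: this is a diagnostic aid
            text = u"<unreadable: {0}>".format(type(exc).__name__)
        rows.append((key, text))
    return rows


class _FakeTag(object):
    """Hypothetical stand-in for a tag object; not part of pyexiv2."""

    def __init__(self, value=None, error=None):
        self._value, self._error = value, error

    @property
    def value(self):
        # Raise on access, the way the failing XMP tag does in the traceback.
        if self._error is not None:
            raise self._error
        return self._value


fake = {
    "Exif.Image.Make": _FakeTag(value=u"NIKON"),
    "Xmp.example.broken": _FakeTag(error=TypeError("expected string or buffer")),
}
rows = describe_tags(fake, sorted(fake))
```

With real metadata, the keys that come back marked as unreadable would be the ones worth attaching to the report.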

Olivier Tilloy (osomon) wrote :

Thanks for the report Hobson.
Could you please attach or send me an image that contains such a tag that reproduces the issue?

Hobson Lane (hobs) wrote :

I think this is the same old bug that you fixed before, and it should rear its head on any of those previously submitted Nikon images that have binary or Unicode data in the UserComment string. I've attached one that I used to duplicate it this morning on the latest pyexiv2 for Precise Pangolin ('0.3.2').

To duplicate, run the following function on an image instance that has been loaded and read by pyexiv2 (or you can clone tagim from <email address hidden>:hobsonlane/tagim.git and run `tagim -i 'DSCN2162.JPG' --debug`):

def display_meta_str(im):
    keysets = {'EXIF':im.exif_keys, 'IPTC':im.iptc_keys, ' XMP':im.xmp_keys}
    for name,keys in keysets.items():
        title = ' %s Data '%name
        print '-'*30 + title + '-'*30
        for k in keys:
            print u'{0}: {1}'.format(str(k),str(im[k].value))
        print '-'*(60+len(title))
    print '-'*30 + ' Comment '+'-'*30
    print im.comment
    print '-'*(60+len(title))
    return keysets.values()

Here's the output for this particular image (with error message at end):

Image file name: 'DSCN2161.JPG'
------------------------------ IPTC Data ------------------------------
-----------------------------------------------------------------------
------------------------------ XMP Data ------------------------------
-----------------------------------------------------------------------
------------------------------ EXIF Data ------------------------------
Exif.Image.ImageDescription:
Exif.Image.Make: NIKON
Exif.Image.Model: COOLPIX L18
Exif.Image.Orientation: 1
Exif.Image.XResolution: 300
Exif.Image.YResolution: 300
Exif.Image.ResolutionUnit: 2
Exif.Image.Software: COOLPIX L18 V1.1
Exif.Image.DateTime: 2009-07-23 22:04:56
Exif.Image.YCbCrPositioning: 2
Exif.Image.ExifTag: 230
Exif.Photo.ExposureTime: 1/60
Exif.Photo.FNumber: 14/5
Exif.Photo.ExposureProgram: 2
Exif.Photo.ISOSpeedRatings: 565
Exif.Photo.ExifVersion: 0220
Exif.Photo.DateTimeOriginal: 2009-07-23 22:04:56
Exif.Photo.DateTimeDigitized: 2009-07-23 22:04:56
Exif.Photo.ComponentsConfiguration: 
Exif.Photo.CompressedBitsPerPixel: 2
Exif.Photo.ExposureBiasValue: 0
Exif.Photo.MaxApertureValue: 3
Exif.Photo.MeteringMode: 5
Exif.Photo.LightSource: 0
Exif.Photo.Flash: 25
Exif.Photo.FocalLength: 57/10
Traceback (most recent call last):
  File "/home/hobs/bin/tagim", line 346, in <module>
    tagim.display_meta_str(im)
  File "/home/hobs/src/tagim/tg/tagim.py", line 212, in display_meta_str
    print u'{0}: {1}'.format(str(k),str(im[k].value))
UnicodeDecodeError: 'ascii' codec can't decode byte 0x8e in position 52: ordinal not in range(128)
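
This UnicodeDecodeError is the mirror image of the earlier encode failure: in Python 2, formatting a byte string into a u'...' template decodes it with the ASCII codec first. A small sketch with a made-up byte string containing 0x8e (not the actual tag contents), written with an explicit decode() so it runs on Python 2 and 3:

```python
raw = b"abc\x8edef"  # hypothetical bytes; 0x8e is outside the ASCII range

try:
    # Python 2 does this decode implicitly when a str is mixed into a unicode format string.
    raw.decode("ascii")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print("decode failed at position {0}".format(exc.start))

# Decoding with errors="replace" does not raise; undecodable bytes become U+FFFD.
safe = raw.decode("utf-8", "replace")
```

The replacement character marks where binary data was found, which keeps the listing readable while still showing that the tag is not plain text.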

And here's the workaround: use unicode() instead of str(), through a helper that replaces undecodable bytes:

def unicode_noerr(s, errors='replace'):
    """
    Coerce input into a unicode (multibyte) string regardless of the type of input, without raising exceptions.

    Assumes any single-byte str is UTF-8 or ASCII.
    """
    if isinstance(s, unicode):
        return s
    elif isinstance(s, str):
        # undecodable bytes are substituted according to `errors`
        return s.decode('UTF-8', errors)
    else:
        return unicode(s)

# then, inside the loop in display_meta_str:
print u'{0}: {1}'.format(unicode_noerr(k), unicode_noerr(im[k].value, errors='replace'))

Or run tagim without the `--debug` flag.
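
The helper above is Python 2 only, since it relies on the unicode builtin. For anyone trying the same idea on both Python 2 and 3, a rough equivalent might look like the sketch below. The names `to_text` and `text_type` are mine, not from tagim or pyexiv2, and like the original it assumes byte strings are UTF-8 or ASCII:

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


def to_text(value, encoding="utf-8", errors="replace"):
    """Coerce value to text; bytes are decoded leniently, other objects are formatted."""
    if isinstance(value, text_type):
        return value
    if isinstance(value, (bytes, bytearray)):
        # Undecodable bytes are substituted according to `errors`.
        return bytes(value).decode(encoding, errors)
    return u"{0}".format(value)


examples = [to_text(b"caf\xc3\xa9"), to_text(b"\x8e"), to_text(14)]
```

Valid UTF-8 bytes come back decoded, the stray 0x8e byte becomes the replacement character, and non-string values such as integers are simply formatted.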

Hobson Lane (hobs) wrote :

Correction: it's the MakerNote tag that is the problem in the particular image that I tested and uploaded.
