2017-03-29 13:25:43 |
Jamie Strandboge |
description |
The following script works fine on 16.04 LTS:
#!/usr/bin/python3
import magic
import os

dir = "/usr/share/ca-certificates/mozilla"
mime = magic.open(magic.MAGIC_MIME)
mime.load()
for root, dirnames, filenames in os.walk(dir):
    for f in filenames:
        fn = os.path.join(root, f)
        print("%s: %s" % (fn, mime.file(fn)))
E.g.:
$ python3 /tmp/test.py
/usr/share/ca-certificates/mozilla/TWCA_Root_Certification_Authority.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Baltimore_CyberTrust_Root.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Comodo_AAA_Services_root.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Hellenic_Academic_and_Research_Institutions_RootCA_2011.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/TC_TrustCenter_Class_3_CA_II.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Security_Communication_RootCA2.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/EBG_Elektronik_Sertifika_Hizmet_Sağlayıcısı.crt: text/plain; charset=us-ascii
...
(Note the last filename before the ellipsis: it contains non-ASCII characters.)
But on 17.04, this happens:
$ python3 /tmp/test.py
/usr/share/ca-certificates/mozilla/TWCA_Root_Certification_Authority.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Baltimore_CyberTrust_Root.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Comodo_AAA_Services_root.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Hellenic_Academic_and_Research_Institutions_RootCA_2011.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/TC_TrustCenter_Class_3_CA_II.crt: text/plain; charset=us-ascii
/usr/share/ca-certificates/mozilla/Security_Communication_RootCA2.crt: text/plain; charset=us-ascii
Traceback (most recent call last):
  File "/home/ubuntu/test.py", line 15, in <module>
    print("%s: %s" % (fn, mime.file(fn)))
  File "/usr/lib/python3/dist-packages/magic.py", line 130, in file
    bi = bytes(filename, 'utf-8')
UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc4' in position 69: surrogates not allowed
I'm guessing this is a change in python3 that python3-magic hasn't accounted for, but I'm not sure. Adding a python3 task just in case. |
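One possible explanation, not confirmed by anything in this report: '\udcc4' is the form Python 3 gives to the raw byte 0xC4 when it decodes a file name with the "surrogateescape" error handler, and 0xC4 is the first byte of the UTF-8 encoding of "ğ". That would happen if the script ran with an ASCII filesystem encoding (for example a C/POSIX locale), which is an assumption here. The sketch below reproduces the same error without python3-magic; the helper names encode_strict and encode_roundtrip are hypothetical and only illustrate the two behaviours.

```python
# Bytes of a non-ASCII file name as stored on disk (UTF-8).
raw = "Sağlayıcısı.crt".encode("utf-8")

# Assumption: the filesystem encoding is ASCII (e.g. C/POSIX locale), so
# undecodable bytes become lone surrogates via "surrogateescape".
name = raw.decode("ascii", "surrogateescape")


def encode_strict(filename):
    """Same call as the failing line in the traceback: strict UTF-8."""
    return bytes(filename, "utf-8")


def encode_roundtrip(filename):
    """Hypothetical alternative: reverse the surrogateescape decoding."""
    return filename.encode("ascii", "surrogateescape")


try:
    encode_strict(name)
    strict_failed = False
except UnicodeEncodeError:
    # Lone surrogates cannot be encoded as strict UTF-8.
    strict_failed = True

print(strict_failed)
print(repr(name[2]))
print(encode_roundtrip(name) == raw)
```

In real code, os.fsencode() performs this round trip using the actual filesystem encoding, so it may be a more robust way to turn a name from os.walk() back into bytes than a strict bytes(filename, 'utf-8').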
|