json config files get corrupted when using Python 2.7.6 on Linux
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
calibre | Fix Released | Undecided | Unassigned | |
Bug Description
Hi,
I noticed when running calibre 1.28.0 on Fedora-21 that I got the following messages when starting calibre from a terminal:
Traceback (most recent call last):
File "/usr/lib64/
d = self.raw_
File "/usr/lib64/
return json.loads(
File "/usr/lib64/
return cls(encoding=
File "/usr/lib64/
raise ValueError(
ValueError: Extra data: line 19 column 2 - line 37 column 2 (char 349 - 698)
loaded the Generic plugin
And that my preferences were no longer honored / remembered.
I've done some debugging, and the problem is that files such as gui.json get any new config settings appended, rather than having their contents overwritten. For example, gui.json looks like this after one run (abbreviated):
{
"quick_
"library_
"/home/
},
}{
"quick_
"library_
"/home/
},
}
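For reference, "Extra data" is the error the json module reports when a second JSON document follows the first in the same input, which is what an appended config file looks like to the parser. A minimal sketch, using made-up keys rather than the real gui.json contents:

```python
import json

# Two complete JSON objects back to back, as in a config file that was
# appended to instead of overwritten. The keys are hypothetical.
doubled = '{"a": 1}{"a": 2}'


def try_load(text):
    """Return the parsed object, or the error message if parsing fails."""
    try:
        return json.loads(text)
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return str(err)


result = try_load(doubled)
print(result)
```

The first object on its own parses fine; only the trailing second object triggers the error.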
I've tracked this down to calibre/
doing (abbreviated again):
self.file = open(self.path, 'a+b')
when running under anything but Windows. Note that a+ stands for append mode. Now
calibre/
f.seek(0)
f.truncate()
on the file, but this no longer seems to lead to the desired result, given what the Python docs say:
http://
"file.seek(offset[, whence]) ... Note that if the file is opened for appending (mode 'a' or 'a+'), any seek() operations will be undone at the next write."
This is not really surprising as a truncate is a write. As the calibre code used to work fine until
recently, I can only assume that truncate used to be a loophole around this limitation, and
that this has now been fixed.
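The documented append-mode behaviour quoted above can be seen in isolation with a small sketch. It assumes a POSIX system and uses a hypothetical helper name and a temporary file, not calibre's code. It deliberately leaves out truncate() to show only that a seek does not move where the next write lands:

```python
import os
import tempfile


def append_after_seek(initial, extra):
    """Write `initial`, reopen in 'a+b', seek to 0, write `extra`.

    Returns the final file contents.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wb') as f:
            f.write(initial)
        with open(path, 'a+b') as f:
            f.seek(0)       # only affects where reads would start
            f.write(extra)  # append mode: the write still goes to the end
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)


contents = append_after_seek(b'abcd', b'XY')
print(contents)
```

The new bytes end up after the old ones rather than overwriting them, which is the same shape as the doubled gui.json.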
So I've audited all the users of ExclusiveFile in calibre, and whenever they write, they always
do a truncate first. So there really is no need to use "a" as the mode when opening the file,
and replacing the open in ExclusiveFile with:
self.file = open(self.path, 'r+b')
fixes this.
Thanks & Regards,
Hans
I cannot replicate this with python 2.7.6 on my gentoo system. See the test below:
$ python -c "f = open('/tmp/test', 'wb'); f.write('abcd'); f.close(); f = open('/tmp/test', 'a+b'); f.seek(0); f.truncate(); f.write('abcd'); f.close(); print (open('/tmp/test').read()); import sys; print sys.version"
abcd
2.7.6 (default, Dec 3 2013, 21:15:46)
[GCC 4.8.2]
As you can see, seek(0) followed by truncate() works as expected on python 2.7.6.
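For readability, the same check can be spelled out as a short script. This is a sketch with a hypothetical helper name, written so that it also runs on Python 3. Note the assumption: Python 3's io layer works on file descriptors rather than a libc FILE*, so it only illustrates the expected result and cannot reproduce a libc-dependent difference.

```python
import os
import tempfile


def seek_truncate_write(initial, replacement):
    """Write `initial`, reopen in 'a+b', then seek(0), truncate(), write."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, 'wb') as f:
            f.write(initial)
        with open(path, 'a+b') as f:
            f.seek(0)
            f.truncate()          # expected to cut the file to zero length
            f.write(replacement)
        with open(path, 'rb') as f:
            return f.read()
    finally:
        os.remove(path)


contents = seek_truncate_write(b'abcd', b'abcd')
print(contents)
```

A file showing the reported corruption would hold the old and new data back to back instead of a single copy.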
IIRC, the python file object is just a thin wrapper around the libc FILE* pointer. So I suspect that this has something to do with a change in how truncate works in whatever C library your system is using. It seems rather odd that something as fundamental as the C library would change the behavior of something as widely used as truncate(), but...
You cannot change a+b to r+b. That would break when the file being used does not exist. That was the entire reason for using a+b in the first place. Using r+b will break, for example, the set() method of the Config class in config_base.py.
The only way to accommodate your broken libc would be to first try to open the file in r+b and then fall back to w+b. There is a race condition in that. It annoys me to be forced to use code that is vulnerable to a race condition just to accommodate one distro's broken libc.
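The r+b then w+b fallback described above might look roughly like this. It is a sketch only: the function name and the example file are hypothetical, and this is not calibre's actual code. The comment marks where the race sits.

```python
import errno
import os
import tempfile


def open_for_update(path):
    """Open `path` read/write without append mode, creating it if missing."""
    try:
        return open(path, 'r+b')
    except IOError as err:
        if err.errno != errno.ENOENT:
            raise
        # Race window: another process could create and fill the file
        # between the failed open above and the truncating open below.
        return open(path, 'w+b')


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'example.json')

with open_for_update(path) as f:   # file does not exist yet: w+b branch
    f.write(b'{}')
with open_for_update(path) as f:   # file exists now: r+b branch
    existing = f.read()
print(existing)
```

The second open must not destroy what the first one wrote, which is why w+b is only used when the file is missing.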
It may be possible to open in r+b and promote to w+b using fcntl, I will have to look into that.
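A different technique from the fcntl idea above, shown only as a possible alternative: os.open() with O_RDWR | O_CREAT creates the file if it is missing in a single call, without O_APPEND and without truncating existing data. The helper name and the 0o644 permission bits are assumptions for illustration.

```python
import os
import tempfile


def open_rw_create(path):
    """Open read/write, creating the file if needed, with no append mode."""
    # O_CREAT without O_TRUNC: existing contents are left alone.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    return os.fdopen(fd, 'r+b')


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'example.json')

with open_rw_create(path) as f:    # creates the file
    f.write(b'old contents')
with open_rw_create(path) as f:    # reopens it and overwrites in place
    f.seek(0)
    f.truncate()
    f.write(b'new')
with open(path, 'rb') as f:
    final = f.read()
print(final)
```

Because the file is not in append mode, the seek(0) and truncate() before the write behave as ordinary positioning, so only the new data remains.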
What version/implementation of libc is your system using?