re-republisher complains about corrupt jp2s when md5 matches

Bug #821446 reported by paul.n
Affects: Internet Archive - Tech Support
Status: Confirmed
Importance: Undecided
Assigned to: Jude Coelho

Bug Description

Hi Jude,

We're trying to re-republish a book and keep getting a "JP2s corrupt" error when attempting to process it. I checked file size and md5 against files.xml and everything looks in order, so I'm not sure why it's not happy w/ them.

Can you take a look at novelsfirsttimei00kock on:
http://scribe1.chapelhill.archive.org/scribe/admin/rerepublisher.php

Thanks,

Paul

Hank Bromley (hank-archive) wrote :

Have you tried viewing the image through archive_view.php?

http://scribe1.chapelhill.archive.org/scribe/archive_view.php

(You didn't mention which image it was, so I couldn't check myself.)

Elizabeth MacLeod (scanner-elizabeth-macleod) wrote :

novelsfirsttimei00kock_orig_jp2.tar is the book I'm attempting to re-republish, but it throws "JP2s corrupt" after it is downloaded, directly after I select "Process". It is a 4600-page book.

Hank Bromley (hank-archive) wrote :

Yes, I could tell from Paul's report above which book it was. What I don't know is which image in that book is supposedly corrupted.

Jude Coelho (judec) wrote :

Looking into this matter, it does in fact appear to be a bug. Rerepublisher compares the file size listed in files.xml with the actual size of the file on disk to determine whether the files downloaded properly. It uses the PHP function filesize() to do this.

Looking at the manual for filesize() ( http://php.net/manual/en/function.filesize.php ), it appears the function can return unexpected results for files larger than 2GB, because PHP's integer type is signed and many platforms use 32-bit integers. While there is a fix using sprintf("%u", ...), it is only effective up to 4GB.
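
To illustrate where the 2GB and 4GB limits come from, here is a minimal Python sketch of the arithmetic only. It is not rerepublisher's code, and it assumes the size is simply truncated to 32 bits; what PHP actually returns for a given file depends on the platform. The helper names `as_signed_32` and `as_unsigned_32` are hypothetical.

```python
import struct

def as_signed_32(n):
    """Reinterpret the low 32 bits of n as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", n & 0xFFFFFFFF))[0]

def as_unsigned_32(n):
    """Reinterpret a signed 32-bit value as unsigned, as sprintf("%u") would."""
    return n & 0xFFFFFFFF

GIB = 1024 ** 3

for true_size in (1 * GIB, 3 * GIB, 5 * GIB):
    reported = as_signed_32(true_size)    # what a signed 32-bit integer can hold
    recovered = as_unsigned_32(reported)  # the unsigned reinterpretation
    print(true_size, reported, recovered, recovered == true_size)
```

Under that assumption, a 3GB size comes back negative and is recovered by the unsigned reinterpretation, while a 5GB size has already lost its high bits and cannot be recovered, so a comparison against the size in files.xml fails even though the file is intact.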

There are some suggestions on the manual page to get proper filesizes, or perhaps I could switch to checking the MD5 instead, but I don't know that I'll be able to get to this in the next couple of days. Can this book wait a bit?
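
As a rough idea of what an MD5-based check could look like, here is a Python sketch rather than the PHP that rerepublisher is written in. It assumes files.xml contains `<file name="...">` elements with an `<md5>` child; the function names `md5_of_file`, `expected_md5s`, and `is_intact` are hypothetical and do not come from rerepublisher.

```python
import hashlib
import xml.etree.ElementTree as ET

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so multi-gigabyte tars never sit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def expected_md5s(files_xml_text):
    """Map file name -> expected MD5, assuming <file name=...><md5>...</md5></file>."""
    root = ET.fromstring(files_xml_text)
    result = {}
    for entry in root.iter("file"):
        md5 = entry.findtext("md5")
        if md5:
            result[entry.get("name")] = md5.strip().lower()
    return result

def is_intact(path, name, manifest):
    """True only if the manifest lists the file and the MD5 on disk matches."""
    expected = manifest.get(name)
    return expected is not None and md5_of_file(path) == expected
```

Because the digest is compared as a hex string, this check does not depend on integer width and behaves the same for files above 2GB or 4GB, at the cost of reading the whole file once.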

Changed in ia-techsupport:
assignee: nobody → Jude Coelho (judec)
Elizabeth MacLeod (scanner-elizabeth-macleod) wrote :

Since most of this book is online, I think it can indeed wait a week. I'll post here if I hear otherwise from UNC. Thanks!

Jude Coelho (judec)
Changed in ia-techsupport:
status: New → Confirmed