ayrpostofficegen189697uns has 2 pdf pages that are too small

Bug #704850 reported by Zarya Rathe
This bug affects 1 person
Affects: Scribe2
Status: Fix Committed
Importance: Undecided
Assigned to: danh

Bug Description

When our library partner was QA-ing ayrpostofficegen189697uns, they found 2 pdf pages that are too small. I've attached screenshots below.

When we downloaded this book in re-republisher, the originals were the correct size and we didn't find anything that should have caused this. Help please!

Thanks,

Zarya

Revision history for this message
Zarya Rathe (zarya) wrote :
Revision history for this message
Jude Coelho (judec) wrote :

For some reason, though this book was shot at 500 PPI, that leaf has an individual ppi assigned of 400:

<page leafNum="102">
  <handSide>LEFT</handSide>
  <pageType>Delete</pageType>
  <addToAccessFormats>false</addToAccessFormats>
  <rotateDegree>-90</rotateDegree>
  <skewAngle>-0.94</skewAngle>
  <skewAngleDetect>-0.94</skewAngleDetect>
  <skewScore>6.76</skewScore>
  <skewActive>true</skewActive>
  <origWidth>2912</origWidth>
  <origHeight>4368</origHeight>
  <cropBox>
    <x>1122</x>
    <y>580</y>
    <w>1693</w>
    <h>2940</h>
  </cropBox>
  <ppi>400</ppi>
</page>

What's special about that leaf? Was it shot at foldout?
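
A check like this one can be scripted. The following is a minimal sketch, assuming the scandata can be parsed as XML and that a per-leaf override appears as a <ppi> child of <page>, as in the snippet above. The <pageData> wrapper, the sample leaves, and the find_ppi_overrides helper are hypothetical; the book-level ppi is passed in by hand because its location in the file is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical scandata fragment. Only the <page>, <ppi> and
# <addToAccessFormats> element names come from the thread; the
# <pageData> wrapper and the sample values are assumptions.
SCANDATA = """
<pageData>
  <page leafNum="100">
    <addToAccessFormats>true</addToAccessFormats>
  </page>
  <page leafNum="102">
    <addToAccessFormats>false</addToAccessFormats>
    <ppi>400</ppi>
  </page>
  <page leafNum="103">
    <addToAccessFormats>true</addToAccessFormats>
    <ppi>400</ppi>
  </page>
</pageData>
"""


def find_ppi_overrides(xml_text, book_ppi):
    """Return (leafNum, ppi, appears_in_pdf) for leaves whose ppi differs from the book's."""
    root = ET.fromstring(xml_text.strip())
    overrides = []
    for page in root.iter("page"):
        ppi_text = page.findtext("ppi")
        if ppi_text is None:
            continue  # no per-leaf override, so the book-level ppi applies
        ppi = int(ppi_text)
        if ppi != book_ppi:
            flag = page.findtext("addToAccessFormats", default="true")
            appears_in_pdf = flag.strip().lower() == "true"
            overrides.append((int(page.get("leafNum")), ppi, appears_in_pdf))
    return overrides


overrides = find_ppi_overrides(SCANDATA, book_ppi=500)
for leaf, ppi, appears_in_pdf in overrides:
    print(f"leaf {leaf}: ppi={ppi}, in pdf: {appears_in_pdf}")
```

Reporting addToAccessFormats alongside the ppi helps separate leaves that can actually affect the pdf from deleted leaves that merely carry a stray override.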

Revision history for this message
Zarya Rathe (zarya) wrote :

No, we don't have a foldout machine...(??)

Revision history for this message
Zarya Rathe (zarya) wrote :

Ah - I think I found it - I've just seen that we have corrections logged for those 2 pages. It looks like they were reshot at the wrong ppi in republisher. So, presumably if we reshoot those pages at the correct ppi, that'll fix it (right?).

Revision history for this message
Hank Bromley (hank-archive) wrote :

The page Jude pulled from scandata appears to be a different one. I believe it's image 0100, not 0102, that's appearing too small. Moreover, Jude's page doesn't appear in the pdf at all, as it has <addToAccessFormats>false</addToAccessFormats>.

Which brings up another problem - that spread is skipped in the pdf, causing the document to skip from p. 96 to p. 99.

Still looking into why 0100 came out smaller in the pdf.

Revision history for this message
Hank Bromley (hank-archive) wrote :

Okay, if you set your pdf-viewer to show each page at a fixed zoom level, say 100%, rather than letting it zoom to whatever size will fit in your window, you can see that the pages Zarya selected are actually not smaller than those on the other 2-page spreads - the problem is the opposite page on those spreads is bigger than the rest! The page shown opposite 0100 is 0103 (because, again, 0101 and 0102 have addToAccessFormats=false, although that seems to be a mistake), and 0103 has ppi=400 set in the scandata for that page. Most of the book uses the ppi=500 set at the beginning of the scandata for the whole book.

You can also track this down by asking Adobe Reader what size (in inches) each page is. It will tell you most of the pages are 3.38" x 5.88", but the two that appear bigger, and have ppi=400 in the scandata, are 4.22" x 7.35". Zarya, which size is correct for the physical book? That will tell us how the ppi should be set (400 or 500).
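
The arithmetic behind those sizes is just cropped pixels divided by ppi. Here is a small sketch, assuming the pdf page size is derived from the crop box that way; page_size_inches is a hypothetical helper, and the crop box used is the one quoted earlier for leaf 102, not the crop of the pages actually in the pdf.

```python
def page_size_inches(width_px, height_px, ppi):
    """Convert a cropped image size in pixels to inches at the given ppi."""
    return (width_px / ppi, height_px / ppi)


# Crop box quoted earlier in the thread for leaf 102 (w=1693, h=2940).
CROP_W, CROP_H = 1693, 2940

for ppi in (500, 400):
    w_in, h_in = page_size_inches(CROP_W, CROP_H, ppi)
    print(f'{ppi} ppi: {w_in:.2f}" x {h_in:.2f}"')
```

With this crop box the heights come out at 5.88" for 500 ppi and 7.35" for 400 ppi, matching the figures above. The widths differ slightly from the quoted ones, presumably because each leaf has its own crop box.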

Revision history for this message
Jude Coelho (judec) wrote :

Hank is correct - I was just looking for the ppi anomaly.

So Hank, if those two pages were shot through rerepublisher at 400 PPI, that would mean both are technically correct, but the pages should be reshot at the correct PPI of 500. I think.

Revision history for this message
Zarya Rathe (zarya) wrote :

Hi all,

Ok - the correct ppi level should be that of the majority of the book (ppi=500).

Also - the book itself skips from printed page 32 to p.37, and from p.96 to p.99 (0100 to 0103). There are tabs which cover text between 32/37 and 96/99, and so each spread was shot twice, and then the middle 2 leaves deleted. From what I can tell from our notes, we had scanned the book straight through, and noticed that the tabs were covering text during republishing. We then inserted duplicate spreads in order to include the missing text, but mistakenly did this at the wrong ppi setting.

So, my understanding is that if we reshoot the problem spreads at 500 ppi, we should be ok?

Revision history for this message
Hank Bromley (hank-archive) wrote :

As I said, we can tell the correct ppi from the physical size of the pages. Are they 3.38" x 5.88", or 4.22" x 7.35" (or something else entirely)?

Revision history for this message
Zarya Rathe (zarya) wrote :

The size of the physical page is around 3.5" x 6.31"

Revision history for this message
Hank Bromley (hank-archive) wrote :

Then yes, 500 ppi appears to be correct.
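
The comparison can be made explicit. The sketch below assumes the same pixels-divided-by-ppi relationship and picks whichever candidate ppi yields a size closest to the measured page. closest_ppi is a hypothetical helper, and the crop box is again the one quoted for leaf 102, so the numbers are illustrative only.

```python
def closest_ppi(crop_px, measured_in, candidates=(400, 500)):
    """Pick the candidate ppi whose implied page size best matches a physical measurement."""
    crop_w, crop_h = crop_px
    meas_w, meas_h = measured_in

    def error(ppi):
        # Sum of absolute differences, in inches, across width and height.
        return abs(crop_w / ppi - meas_w) + abs(crop_h / ppi - meas_h)

    return min(candidates, key=error)


# Crop box from the leaf quoted earlier; measurement as reported (3.5" x 6.31").
best = closest_ppi((1693, 2940), (3.5, 6.31))
print(best)
```

The crop is usually a little smaller than the physical page, so an exact match is not expected; the point is only that 500 ppi lands much closer to the measured size than 400 ppi does.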

Revision history for this message
Zarya Rathe (zarya) wrote :

Great. Thanks everyone!!

Revision history for this message
Zarya Rathe (zarya) wrote :

Hi all,

I just wanted to raise a question following on from this bug. In our scanning center, we typically have a time gap between scanning & republishing for any given identifier, so IDs need to be re-opened to insert missing spreads, etc. Whenever we do this, we currently have to reset the ppi, and this increases the likelihood of re-setting by mistake to a different ppi from that used in scanning (which is what happened above). In future, we can just be extra careful to make sure the re-set ppi matches the original ppi, but this adds extra time.

So, my question: is there any way to make the ppi 'stick' when we set it the first time before scanning, so that we don't have to re-set it for republishing?

Thanks,

Zarya

Revision history for this message
Jude Coelho (judec) wrote :

Hey Zarya, I've subscribed Dan to this bug for your latest request. I'm not sure if he would prefer that you submit that request as a new bug in scribe2, or if he'd rather have the history here for explanation, so I'll leave that up to him. Feature requests should also go through Paul, I think.

Changed in ia-techsupport:
status: New → Fix Committed
Revision history for this message
Zarya Rathe (zarya) wrote :

Thanks, Jude!

Revision history for this message
Jude Coelho (judec) wrote :

Giving this to Dan to do with as he wishes.

Changed in ia-techsupport:
assignee: nobody → danh (danh-archive)
Revision history for this message
Jude Coelho (judec) wrote :

I.E. move to scribe2 or close and create a new bug for the request.

Revision history for this message
danh (danh-archive) wrote :

We should keep the history, I think.

affects: ia-techsupport → scribe2
Revision history for this message
danh (danh-archive) wrote : Re: [Bug 704850] Re: ayrpostofficegen189697uns has 2 pdf pages that are too small

Jude Coelho wrote:
> I.E. move to scribe2 or close and create a new bug for the request.
>

Hi Jude,

I moved it to scribe2 so that we could keep the history.

I think we need a clear understanding of when the ppi should be
hinted at and under what circumstances.

Right now, for a newly opened book we try to provide no hints
(that was some activity last year to try to make sure that
when we start shooting we really have to choose a ppi).

For foldouts, we provide a hint for each additional shot
after the operator has set the ppi for the first shot.

For reshooting, then, i guess we want to look at the
page before and page after, or the ppi for the book,
to get some idea of how to hint?

That would be fine with me. Well, anything would
be fine with me, just need to get a sharp idea of
just what we want to do.
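
One possible shape for the reshoot hint, purely as a sketch to make the question concrete: it is not how scribe2 behaves, and suggest_ppi_hint and its priority order (agreeing neighbours first, then the book ppi, then a lone known neighbour) are assumptions.

```python
from typing import Optional


def suggest_ppi_hint(prev_ppi: Optional[int],
                     next_ppi: Optional[int],
                     book_ppi: Optional[int]) -> Optional[int]:
    """Suggest a ppi hint for a reshot or inserted leaf, or None to force a choice."""
    # If the pages before and after agree, that is the strongest evidence.
    if prev_ppi is not None and prev_ppi == next_ppi:
        return prev_ppi
    # Otherwise fall back to the ppi recorded for the whole book.
    if book_ppi is not None:
        return book_ppi
    # With no book ppi, accept a single known neighbour.
    known = [p for p in (prev_ppi, next_ppi) if p is not None]
    if len(known) == 1:
        return known[0]
    # Neighbours disagree or nothing is known: give no hint.
    return None


print(suggest_ppi_hint(500, 500, 500))
print(suggest_ppi_hint(400, 500, None))
```

Returning None when the evidence conflicts keeps the current behaviour for newly opened books, where the operator has to choose a ppi explicitly.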

[And if some or all of this is wrong, please correct me.]

dan
