Scandata, Scandate and Operator not showing for sheetfed items

Bug #1049106 reported by Stacy
This bug affects 1 person
Affects: Internet Archive - Tech Support
Status: Confirmed
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Scandata, Scandate and Sheetfed Operator are not showing up for sheetfed items in metamanager. If you pull up the items individually, you can see all the info in the item history, but for some reason metamanager is unable to collect this data by itself. Example: http://archive.org/details/nationalidentity00elli

We are unable to search for items scanned at the sheetfed and/or see how many pages our operator is putting through. Furthermore, this information is not being collected and placed in our gross margin, since there is no scandate attached to each scan either. This has been an ongoing issue (over 6 months) that has never been resolved, and the previous bug regarding this issue seems to have disappeared.

Hank Bromley (hank-archive) wrote :

In the ordinary case, we extract scandate, scanner, operator (and, when present, missingpages, foldout-operator, and republisher) from the scandata file at the beginning of the derive. Sheetfed items have no scandata file initially - although a bare-bones one is created for them during the derive - so there's no place to extract those metadata from. It's true that some of that info is visible in the task history, but most is not, and in any case we have no mechanism for finding such metadata in the task history and using it.
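To illustrate why sheetfed items come up empty, here is a rough Python sketch of the extraction step. It is not the actual derive code: the function name, the flat element layout, and the sample values are assumptions made for illustration, and the real scandata schema may well differ.

```python
import xml.etree.ElementTree as ET

# Field names are taken from the comment above; treating each one as a
# plain XML element in the scandata file is an assumption for this sketch.
FIELDS = ("scandate", "scanner", "operator",
          "missingpages", "foldout-operator", "republisher")


def extract_scan_metadata(scandata_xml):
    """Return {field: value} for every field found in the scandata text.

    Pass None when the item has no scandata file yet (the sheetfed case);
    there is then nothing to extract and the result is empty.
    """
    if scandata_xml is None:
        return {}
    root = ET.fromstring(scandata_xml)
    found = {}
    for name in FIELDS:
        # First element with this tag anywhere in the tree, if any.
        node = next(root.iter(name), None)
        if node is not None and node.text and node.text.strip():
            found[name] = node.text.strip()
    return found
```

With a scandata file present, the fields it contains come back as a dictionary; with `None`, the result is `{}`, which is roughly the situation metamanager ends up reflecting for sheetfed items.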

I don't remember whether we discussed these metadata when the sheetfed process was set up, but if we do need to have scandate, etc., we'll need to establish some mechanism for uploading and extracting that info.

Two ideas occur to me right off: (1) the upload script could add a next_cmd arg for a modify_xml.php task that would add the desired info to meta.xml, and (2) we could make use of the metadata derivation pathway - you'd upload an additional simple file containing the element names and values, and we'd add those to meta.xml during the derive.
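Idea (2) might look something like the following Python sketch. Everything here is hypothetical: the `name=value` line format for the extra file, the function names, and the assumption that each value becomes a direct child of the meta.xml root. It is meant only to show the shape of the merge, not how the derive actually works.

```python
import xml.etree.ElementTree as ET


def parse_extra_metadata(text):
    """Parse 'name=value' lines; blank lines and '#' comments are skipped."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected name=value, got: " + line)
        pairs[name.strip()] = value.strip()
    return pairs


def merge_into_meta(meta_xml, pairs):
    """Return meta.xml text with each pair set as a child of the root.

    An existing child with the same name is overwritten, not duplicated.
    """
    root = ET.fromstring(meta_xml)
    for name, value in pairs.items():
        node = next((child for child in root if child.tag == name), None)
        if node is None:
            node = ET.SubElement(root, name)
        node.text = value
    return ET.tostring(root, encoding="unicode")
```

Overwriting rather than appending is a design choice in this sketch; whether repeated elements should be allowed would depend on how meta.xml is consumed downstream.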

danh (danh-archive) wrote : Re: [Bug 1049106] Re: Scandata, Scandate and Operator not showing for sheetfed items

Thanks for the analysis, Hank.

In fact Jude had put in your option (1) [a next_cmd/modify_xml],
but somehow or other the code was not up-to-date.

Stacy is testing now, as I understand it.

Jude Coelho (judec) wrote :

I checked in and pushed out a change to scribe2-upload.php to add it via
next_cmd earlier this morning.

On 9/11/12 11:32 AM, Hank Bromley wrote:
> In the ordinary case, we extract scandate, scanner, operator (and, when
> present, missingpages, foldout-operator, and republisher) from the
> scandata file at the beginning of the derive. Sheetfed items have no
> scandata file initially - although a bare-bones one is created for them
> during the derive - so there's no place to extract those metadata from.
> It's true that some of that info is visible in the task history, but
> most is not, and in any case we have no mechanism for finding such
> metadata in the task history and using it.
>
> I don't remember whether we discussed these metadata when the sheetfed
> process was set up, but if we do need to have scandate, etc., we'll need
> to establish some mechanism for uploading and extracting that info.
>
> Two ideas occur to me right off: (1) the upload script could add a
> next_cmd arg for a modify_xml.php task that would add the desired info
> to meta.xml, and (2) we could make use of the metadata derivation
> pathway - you'd upload an additional simple file containing the element
> names and values, and we'd add those to meta.xml during the derive.
>
