Scandata, Scandate and Operator not showing for sheetfed items
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Internet Archive - Tech Support | Confirmed | High | Unassigned |
Bug Description
Scandata, Scandate and Sheetfed Operator are not showing up for sheetfed items in metamanager. If you pull up the items manually, you can see all the info in the item history, but for some reason metamanager is unable to collect this data by itself. Example: http://
We are unable to search for items scanned at the sheetfed and/or see how many pages our operator is putting through. Furthermore, this information is not being collected and placed in our gross margin, since there is no scandate attached to each scan either. This has been an ongoing issue (over 6 months) that has never been resolved, and the previous bug report about it seems to have disappeared.
In the ordinary case, we extract scandate, scanner, operator (and, when present, missingpages, foldout-operator, and republisher) from the scandata file at the beginning of the derive. Sheetfed items have no scandata file initially - although a bare-bones one is created for them during the derive - so there's no place to extract those metadata from. It's true that some of that info is visible in the task history, but most is not, and in any case we have no mechanism for finding such metadata in the task history and using it.
I don't remember whether we discussed these metadata when the sheetfed process was set up, but if we do need to have scandate, etc., we'll need to establish some mechanism for uploading and extracting that info.
Two ideas occur to me right off: (1) the upload script could add a next_cmd arg for a modify_xml.php task that would add the desired info to meta.xml, and (2) we could make use of the metadata derivation pathway - you'd upload an additional simple file containing the element names and values, and we'd add those to meta.xml during the derive.
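To make idea (2) a bit more concrete, here is a rough Python sketch of what the merge step might look like. None of this is existing derive code: the "name: value" sidecar format, the function names parse_sidecar and merge_into_meta, the <metadata> root element, and the choice to overwrite an element that is already present are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_sidecar(text):
    """Parse 'name: value' lines into a dict (hypothetical sidecar format).

    Blank lines and lines starting with '#' are skipped.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if not sep or not name.strip():
            raise ValueError(f"malformed sidecar line: {line!r}")
        fields[name.strip()] = value.strip()
    return fields


def merge_into_meta(meta_xml, fields):
    """Return meta.xml text with each field set as a child of the root.

    Assumption: an existing element with the same name is overwritten;
    a missing one is appended.
    """
    root = ET.fromstring(meta_xml)
    for name, value in fields.items():
        elem = root.find(name)
        if elem is None:
            elem = ET.SubElement(root, name)
        elem.text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up example values, not taken from a real item.
    sidecar = "scandate: 20100315\noperator: someone@example.org\n"
    meta = "<metadata><identifier>example-item</identifier></metadata>"
    print(merge_into_meta(meta, parse_sidecar(sidecar)))
```

Whether existing values should be overwritten or left alone is a policy question that would need deciding; the sketch overwrites only to keep the example short.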