Validate WARC contents (warcvalidator) before upload
Bug #661520 reported by siznax
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Archive Widecrawl | Confirmed | Medium | siznax | | |
Bug Description
Consider using the Hanzo tools "warcvalidator" to validate WARC contents before uploading.
If warcvalidator fails, set the WARC aside (e.g. as "warc.gz.invalid") instead of uploading it.
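A minimal sketch of the validate-then-set-aside step, for discussion only. The function name `validate_or_set_aside` is hypothetical, and the validator command line is passed in by the caller because the exact warcvalidator invocation is not given in this report. The sketch assumes the validator signals failure with a non-zero exit status.

```python
import subprocess
import sys
from pathlib import Path


def validate_or_set_aside(warc_path, validator_cmd):
    """Run a validator on a WARC and set the file aside if validation fails.

    validator_cmd is the command as a list of strings; the WARC path is
    appended as the last argument. ASSUMPTION: a non-zero exit status
    means the WARC is invalid.

    Returns the original path if the WARC is valid (safe to upload),
    or None if it was renamed with an ".invalid" suffix.
    """
    warc_path = Path(warc_path)
    result = subprocess.run(
        [*validator_cmd, str(warc_path)],
        capture_output=True,
    )
    if result.returncode == 0:
        return warc_path

    # e.g. foo.warc.gz -> foo.warc.gz.invalid
    invalid_path = warc_path.with_name(warc_path.name + ".invalid")
    warc_path.rename(invalid_path)
    return None
```

The uploader would then skip any file for which this returns None, leaving the ".invalid" files in place for later inspection.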
Changed in archivewidecrawl:
status: New → Confirmed
importance: Undecided → Low
Changed in archivewidecrawl:
importance: Low → Medium
assignee: nobody → siznax (siznax)
On second thought, it may be better to use Heritrix's WARCReader instead.