Comment 3 for bug 158435

danh (danh-archive) wrote :

Here's a tentative plan for invocation of the program.

Please let me know of ways it could be more convenient (e.g., by environment variables, files, or any other mechanism) or just better (e.g., more consistent with other programs we use or have written, or more logical, or just easier to invoke for whatever reason).

The basic idea is that you give the program one input jpg file, and specify leptonica operations (zero, one, or more) as well as one or more output jp2 files. For every output file we can gather statistics on the distribution of the pixels which go into it (because we have to handle all the samples in the leptonica-to-back-end interface, so can count them while we handle them).

The overall command would be:
program infile.jpg {leptonica args} {outfile and backend args} {more leptonica args} {another outfile and backend args} ...

(Here the curly braces are just for visibly grouping the args; they can't be used in practice since the shell would eat them.)

The leptonica args would be zero or more of
  --force_gray
  --rotate degrees
  --crop x y w h
  --contrast_enhance b w
  --stat outfile

The outfile and backend args would be of the form
  --start_jp2 outfile.jp2 {backend arguments} --end_jp2

The backend arguments would be passed directly to it.

So a typical call might be something like
program infile.jpg --start_jp2 outfile_orig.jp2 -rate 0.4 --end_jp2 \
  --rotate 1.2 --crop 10 12 4500 5000 --contrast_enhance 30 240 --stat statfile --start_jp2 processed.jp2 -rate 0.4 --end_jp2

Here, for reference, -rate 0.4 is a backend argument which the backend interprets as a target compression rate: 0.4 bits per pixel. This would produce two outputs, one of which was just a compressed form of the input, and one of which was deskewed, cropped, and enhanced (and then compressed).

For the format of the statfile, I'm tentatively planning to make it consist of key=value lines, where the keys are [rgb]{min,max,mean,std} (an optional channel letter followed by the name of the statistic) and the values are normalized to lie in the interval [0,1]. So typical keys would be rmin, gstd, bmean, etc. For grayscale, we could just leave off [rgb] in the key name, giving min, max, mean, and std.
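To make the proposed statfile format concrete, here is a rough Python sketch of writing and reading it. Everything in it is an assumption for illustration only: the function names are made up, samples are assumed to be 8-bit (so they are divided by 255 to normalize), the standard deviation is the population form, and values are printed with six decimal places. None of that is decided yet.

```python
import math


def channel_stats(samples, max_value=255):
    """Return min, max, mean and std of the samples, normalized to [0, 1].

    Assumes 8-bit samples by default and a population standard deviation.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return {
        "min": min(samples) / max_value,
        "max": max(samples) / max_value,
        "mean": mean / max_value,
        "std": math.sqrt(var) / max_value,
    }


def format_statfile(channels):
    """Build the statfile text.

    `channels` maps a key prefix ("r", "g", "b", or "" for grayscale)
    to the list of samples for that channel.
    """
    lines = []
    for prefix, samples in channels.items():
        for name, value in channel_stats(samples).items():
            lines.append(f"{prefix}{name}={value:.6f}")
    return "\n".join(lines) + "\n"


def parse_statfile(text):
    """Read key=value lines back into a dict of floats."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        stats[key] = float(value)
    return stats


# Hypothetical sample data, just to show the round trip.
text = format_statfile({"r": [0, 255], "g": [51, 51], "b": [0, 0, 255, 255]})
print(text, end="")
print(parse_statfile(text)["rmean"])
```

The point of the sketch is mostly that a consumer needs only a line split and one split on "=" to read the file, which is the main argument for key=value over something heavier.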

But we could instead emit XML, or any other kind of output: basically, whatever would be easiest for another program to parse. (I plan to do the stats last, so there's time to think about just what's easiest to read.)

(One more item for reference here: we can't quite use getopt() because we really want to pass the backend arguments to the backend just as they arrive to us [except for the name of the outfile, which will always be present, so we can just pack that up ourselves]. If we use getopt, it will consume arguments intended for the backend, which has its own scheme for arguments.)
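As an illustration of the hand-rolled scan that would replace getopt, here is a minimal Python sketch (the real program need not be in Python; this only shows the logic). The names parse_args and OP_ARITY are hypothetical. It assumes the input file comes first, that each option takes the number of values listed above, and that everything between the output file name and --end_jp2 is handed to the backend untouched. It returns the steps in command-line order and deliberately doesn't decide how operations carry over from one output to the next.

```python
# Number of values each leptonica-side option takes, per the list in this comment.
OP_ARITY = {
    "--force_gray": 0,
    "--rotate": 1,
    "--crop": 4,
    "--contrast_enhance": 2,
    "--stat": 1,
}


def parse_args(argv):
    """Split argv (without the program name) into an input file and ordered steps.

    Each step is either ("op", name, values) or ("jp2", outfile, backend_args).
    """
    if not argv:
        raise ValueError("missing input file")
    infile, rest = argv[0], argv[1:]
    steps = []
    i = 0
    while i < len(rest):
        arg = rest[i]
        if arg == "--start_jp2":
            try:
                end = rest.index("--end_jp2", i + 1)
            except ValueError:
                raise ValueError("--start_jp2 without matching --end_jp2")
            if end == i + 1:
                raise ValueError("--start_jp2 needs an output file name")
            outfile = rest[i + 1]
            backend_args = rest[i + 2:end]  # passed through untouched
            steps.append(("jp2", outfile, backend_args))
            i = end + 1
        elif arg in OP_ARITY:
            n = OP_ARITY[arg]
            values = rest[i + 1:i + 1 + n]
            if len(values) < n:
                raise ValueError(f"{arg} expects {n} value(s)")
            steps.append(("op", arg, values))
            i += 1 + n
        else:
            raise ValueError(f"unrecognized argument: {arg}")
    if not any(kind == "jp2" for kind, _, _ in steps):
        raise ValueError("at least one --start_jp2 ... --end_jp2 group is required")
    return infile, steps


# The typical call from this comment, already split into words by the shell.
example = (
    "infile.jpg --start_jp2 outfile_orig.jp2 -rate 0.4 --end_jp2 "
    "--rotate 1.2 --crop 10 12 4500 5000 --contrast_enhance 30 240 "
    "--stat statfile --start_jp2 processed.jp2 -rate 0.4 --end_jp2"
).split()
infile, steps = parse_args(example)
for step in steps:
    print(step)
```

Note that the sketch only counts values; it doesn't check them. It can't reject a value just because it starts with a dash, since something like a negative rotation angle has to be allowed, so a forgotten value would silently swallow the next option.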

Thanks in advance for any suggestions.