duplicates in models.dat and hpaio.desc

Bug #217642 reported by Johannes Meixner
Affects: HPLIP
Status: Triaged
Importance: Undecided
Assigned to: dwelch91

Bug Description

On my workstation with HPLIP 2.8.4:

models.dat contains several duplicates:

grep '^model[0-9]*=' /usr/share/hplip/data/models/models.dat \
 | grep -o '=.*' \
 | tr '[:upper:]' '[:lower:]' \
 | tr -c -d '[:alnum:]\n' \
 | wc -l
1308

grep '^model[0-9]*=' /usr/share/hplip/data/models/models.dat \
 | grep -o '=.*' \
 | tr '[:upper:]' '[:lower:]' \
 | tr -c -d '[:alnum:]\n' \
 | sort -u \
 | wc -l
1283

grep '^model[0-9]*=' /usr/share/hplip/data/models/models.dat \
 | grep -o '=.*' \
 | tr '[:upper:]' '[:lower:]' \
 | tr -c -d '[:alnum:]\n' \
 | sort \
 | uniq -c -d
      2 businessinkjet1000
      2 deskjet3920
      2 deskjet3940
      2 deskjet690c
      2 deskjet690cplus
      2 hp910
      2 hp915
      2 hplaserjetm1120mfp
      2 laserjerp2014
      2 laserjet9040dn
      2 officejet5105
      2 officejet5110
      2 officejet5110v
      2 photosmart7760v
      2 psc1110
      2 psc1110v
      2 psc1118
      2 psc1300
      2 psc1340allinone
      2 psc1350allinone
      2 psc1350vallinone
      2 psc1350xiallinone
      2 psc1355allinone
      3 psc760
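
For readers who prefer Python over the shell pipeline, the same duplicate check can be approximated as follows. This is only a minimal sketch, not code from HPLIP or from any attachment to this report; the names `normalize` and `duplicate_models` are made up for illustration, the sample lines stand in for the real models.dat, and ASCII-only normalization is assumed (the `tr` character classes depend on the locale).

```python
import re
from collections import Counter


def normalize(name):
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def duplicate_models(lines):
    """Count normalized model values and return those seen more than once."""
    counts = Counter()
    for line in lines:
        # Same selection as grep '^model[0-9]*=' followed by grep -o '=.*'
        match = re.match(r"model[0-9]*=(.*)", line)
        if match:
            counts[normalize(match.group(1))] += 1
    return {name: n for name, n in counts.items() if n > 1}


# Illustrative input; with the real file, pass the open file object instead.
sample = [
    "[psc_760]",
    "model1=PSC 760",
    "[psc_780]",
    "model1=PSC 760",
    "model2=PSC 780",
]
print(duplicate_models(sample))
```

Like `uniq -c -d`, the function reports only names that occur at least twice, together with their counts.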

hpaio.desc contains tons of duplicates:

grep '^:model' /usr/share/hplip/hpaio.desc \
 | grep -o '".*"' \
 | tr '[:upper:]' '[:lower:]' \
 | tr -c -d '[:alnum:]\n' \
 | wc -l
1292

grep '^:model' /usr/share/hplip/hpaio.desc \
 | grep -o '".*"' \
 | tr '[:upper:]' '[:lower:]' \
 | tr -c -d '[:alnum:]\n' \
 | sort -u \
 | wc -l
509

grep '^:model' /usr/share/hplip/hpaio.desc \
 | grep -o '".*"' \
 | tr '[:upper:]' '[:lower:]' \
 | tr -c -d '[:alnum:]\n' \
 | sort \
 | uniq -c -d \
 >/tmp/hpaio.desc.duplicates

Johannes Meixner (jsmeix) wrote :

Furthermore, hpaio.desc seems to contain
many (all?) plain printers too, i.e. devices
with "scan-type=0" in models.dat, e.g. my
"Deskjet 3320", for which no scanner unit
is available (in contrast to e.g. a LaserJet 1200,
for which a scanner unit is available that
makes it a LaserJet 1220).

Because hpaio.desc is currently totally broken,
I use for openSUSE 11.0 the attached script
which creates hpaio.desc from models.dat
and tests for scan-type != 0 in models.dat.

Currently I inherit the duplicates from models.dat
and I don't have e.g. a LaserJet 1200 in hpaio.desc
because it has "scan-type=0" in models.dat.

Nevertheless what the script creates is much better
than the upstream hpaio.desc.

The script is under the same license as
the HPLIP package itself (i.e. feel free to use it and
adapt it to your needs).

Johannes Meixner (jsmeix) wrote :

As you can see, my script adds the USB IDs to hpaio.desc
which are required by scanner setup tools (the YaST scanner
setup tool "yast2-scanner" in my case) so that those tools
could automatically assign the right driver.

Unfortunately "hp-probe -busb" does not show the USB IDs
(and "sane-find-scanner" does not detect HP all-in-one devices)
so that currently I use some weird "0x03f0 0x0000" magic
to make yast2-scanner happy.

Johannes Meixner (jsmeix) wrote :

By the way:
How can I specify the MIME type for attachments?
Currently the "create_hpaio.desc_from_models.dat" attachment
has MIME type "chemical/x-mopac-input" which is nonsense
and must be just "text/plain" like my other attachments.

dwelch91 (dwelch91) wrote :
  • hpaio.desc (16.6 KiB, application/octet-stream; name=hpaio.desc)

Johannes-

Thanks for this defect report.

First, on the issue of duplicates, I believe that you are misinterpreting
the models.dat file format. Unique device types (that are differentiable
by device ID MDL: name) each have a separate .ini [section]. The
model1=, model2=, ... key/values do not represent unique devices, but rather
case part SKUs. For example, with the Deskjet F300:

[deskjet_f300_series] # <- "normalization" of device ID MDL: field (spaces->_, lowercase, etc)
... snip...
model1=Deskjet F310 # <- List of "case part" names (only viewable on the front of the device)
model10=Deskjet F388
model11=Deskjet F390
model12=Deskjet F394
model13=Deskjet F375
model2=Deskjet F325
model3=Deskjet F335
model4=Deskjet F340
model5=Deskjet F350
model6=Deskjet F370
model7=Deskjet F378
model8=Deskjet F380
model9=Deskjet F385
...snip...

The [deskjet_f300_series] section represents a unique firmware device ID MDL: name.
The model1= through model13= keys only represent the various case part names that
appear on the various printer SKUs. They do not represent unique devices from the
standpoint of software detection, software features, etc. If you were to
purchase each of these 13 printers, they would all identify as
"deskjet_f300_series" (normalized) - you would not be able to differentiate
between them by reading their device IDs. In HPLIP, the modelx=
key/values are used for documentation ONLY (that is, they allow someone with
a "Deskjet F350" to get to the information for the actual device
"deskjet_f300_series"). In some cases, the device ID and case part model
name are very close, so it will appear that the [device ID MDL:] and model1=
are duplicates.
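
The relationship described above (one [section] per device ID, many case part names pointing at it) can be illustrated with a small lookup. This is a hedged sketch and not how HPLIP itself reads the file; `parse_models` and `sections_for` are hypothetical helper names, and the input is a shortened copy of the Deskjet F300 example.

```python
import re


def parse_models(lines):
    """Map each [section] name to the list of its modelN= values."""
    sections = {}
    current = None
    for raw in lines:
        line = raw.strip()
        header = re.fullmatch(r"\[(.+)\]", line)
        if header:
            current = header.group(1)
            sections.setdefault(current, [])
            continue
        entry = re.match(r"model[0-9]*=(.*)", line)
        if entry and current is not None:
            sections[current].append(entry.group(1).strip())
    return sections


def sections_for(case_name, sections):
    """Return the device ID sections that list the given case part name."""
    return [sec for sec, names in sections.items() if case_name in names]


sample = [
    "[deskjet_f300_series]",
    "model1=Deskjet F310",
    "model5=Deskjet F350",
    "model8=Deskjet F380",
]
parsed = parse_models(sample)
print(sections_for("Deskjet F350", parsed))
```

The lookup returns a list rather than a single name because nothing in the file format prevents the same case part name from appearing under more than one section.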

On the subject of the .desc file, there indeed was a bug introduced into the
script that creates the file during a re-org of the documentation system in
2.8.4. A fixed version is attached.

Regards,

Don

(In reply to Johannes Meixner's message of Tue, Apr 15, 2008, 3:03 AM.)

Johannes Meixner (jsmeix) wrote :

Hello Don,

I believe that you are misinterpreting my report regarding
the models.dat file ;-)

I think I know the models.dat file format and I used the
model* keys intentionally to find the duplicates because
what my report actually means is that same model values
can be found under different device types (and the same
model value should also not appear more than once under
one device type).

Taking the first entry of my "uniq -c -d" output,
you get for example

[business_inkjet_1000]
model1=Business Inkjet 1000

but also

[hp_business_inkjet_1000]
model1=Business Inkjet 1000

And for the last entry of the "uniq -c -d" output

[psc_760]
model1=PSC 760

but also

[psc_780]
model1=PSC 760

and also

[psc_780xi]
model1=PSC 760

At the very least this looks buggy.
Perhaps technically this is all correct (i.e. when HP sells devices
under the same name which are actually different device types),
but I do not understand how a user who has such a device,
labeled e.g. "PSC 760", should know which of the possible
device types is the right one when all he can select is the
same name "PSC 760" three times.
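
The kind of collision meant here can be made explicit by inverting the section-to-model mapping. The sketch below is illustrative only: `shared_model_names` is a made-up helper, and the dictionary contains just the five sections quoted in this comment rather than the full models.dat.

```python
from collections import defaultdict


def shared_model_names(sections):
    """Return {model name: [section, ...]} for names listed in 2+ sections."""
    owners = defaultdict(list)
    for section, names in sections.items():
        for name in set(names):  # ignore repeats within one section
            owners[name].append(section)
    return {name: secs for name, secs in owners.items() if len(secs) > 1}


# Only the sections quoted in this comment, not the whole file.
sections = {
    "business_inkjet_1000": ["Business Inkjet 1000"],
    "hp_business_inkjet_1000": ["Business Inkjet 1000"],
    "psc_760": ["PSC 760"],
    "psc_780": ["PSC 760"],
    "psc_780xi": ["PSC 760"],
}
for name, secs in shared_model_names(sections).items():
    print(name, "->", ", ".join(secs))
```

Repeats of a name inside a single section are deliberately ignored here; they would need a separate check.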

Johannes Meixner (jsmeix) wrote :

There are still differences between my created hpaio.desc file
and the new one from you.

1.
A typo which is fixed by my script (see my script)
comes from the "Aii-in-One" in your models.dat:

model2=Photosmart C5240 Aii-in-One
model3=Photosmart C5250 Aii-in-One
model4=Photosmart C5280 Aii-in-One

2.
My created hpaio.desc file contains more models
than yours, but yours contains some models which
mine does not contain.
Attached are the "comm" differences, shown as "unified"
model names (only [0-9a-z] characters).
First column: models unique to your new hpaio.desc
Second column: models unique to my created hpaio.desc
Third column: models in both hpaio.desc files
(see "man comm").
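
For readers unfamiliar with "comm", its three columns correspond to two set differences and one intersection. The following is a minimal sketch with invented sample data: `comm_columns` is a hypothetical helper, and the model lists below are placeholders, not the content of the attached comparison.

```python
def comm_columns(theirs, mine):
    """Return (only in theirs, only in mine, in both), each sorted."""
    theirs, mine = set(theirs), set(mine)
    return sorted(theirs - mine), sorted(mine - theirs), sorted(theirs & mine)


# Placeholder names in "unified" form; not taken from the real files.
upstream = ["laserjet1220", "psc950", "officejet5110"]
generated = ["psc950", "officejet5110", "psc900series"]

only_upstream, only_generated, both = comm_columns(upstream, generated)
print("only upstream: ", only_upstream)
print("only generated:", only_generated)
print("in both:       ", both)
```

Unlike "comm", this does not require sorted input, because the comparison is done on sets.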

Johannes Meixner (jsmeix) wrote :

I think I know the reason why my created hpaio.desc
contains more models than your new hpaio.desc:

My script also uses the name of the [...] ini section
from models.dat as one possible model name
because I found out that there are some entries
in models.dat that have no model* value,
so that the [...] name is the only value which
represents a model name (is this perhaps another
bug in models.dat?).

These device entries in models.dat do not
have a model value:

egrep '^\[|^model[0-9]*' /usr/share/hplip/data/models/models.dat \
 | tr -d '\n' \
 | tr '[' '\n' \
 | grep -v 'model[0-9]*' \
 | tr -d ']'

deskjet_950c
hp_color_inkjet_cp1700
hp_color_laserjet_2605
hp_color_laserjet_2605dn
hp_color_laserjet_2605dtn
photosmart_pro_b9100_series
psc_900_series
psc_920
psc_950
psc_950vr
psc_950xi
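
The same check can be written without joining and re-splitting lines. This is a rough Python sketch of the idea, not the attached script; `sections_without_model` is a hypothetical name and the sample input is invented for illustration.

```python
import re


def sections_without_model(lines):
    """Return the [section] names that contain no modelN= key at all."""
    missing = []
    current = None
    has_model = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new section starts: report the previous one if it had no model.
            if current is not None and not has_model:
                missing.append(current)
            current = line.strip("[]")
            has_model = False
        elif re.match(r"model[0-9]*=", line):
            has_model = True
    if current is not None and not has_model:
        missing.append(current)
    return missing


# Invented sample: two sections lack a model key, one has it.
sample = [
    "[psc_920]",
    "scan-type=1",
    "[psc_760]",
    "model1=PSC 760",
    "[psc_950]",
]
print(sections_without_model(sample))
```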

dwelch91 (dwelch91) wrote :

"I believe that you are misinterperting my report regarding
the models.dat file ;-)"

I suppose so! :-)

The PSC 780/760/780xi does indeed look buggy.

The BIJ 1000 is probably because the device ID either changed during
development (and we wanted all the devices to continue to work), or a user
reported a new device ID from the field that was different than what we had
in-house (so, they must have changed the device ID between pre-production
and production). In either case, we tend to leave both of them in the file
just in case, so that devices are more apt to work than not (but, as you
point out, this can cause some confusion for documentation).

I will look into the other duplicates and see if I can make sense of any of
them.

Thanks,

Don

(In reply to Johannes Meixner's message of Wed, Apr 16, 2008, 2:59 AM.)

Johannes Meixner (jsmeix) wrote :

It seems you have to deal with multiple device IDs
and multiple model names (and perhaps multiple USB IDs
and whatever else multiple stuff).

What do you think about a change in the models.dat format
for a future HPLIP version so that multiple device IDs are allowed?

For example let the [...] entry be the default/fallback device ID
and have additional optional entries for the device IDs like

[business_inkjet_1000]
id1=hp_business_inkjet_1000
model1=Business Inkjet 1000

Perhaps the same could be done if there are several USB IDs
for what is technically the same model, like:

[funprinter_1000]
id1=hp_funprinter_1000
id2=hewlett-packard_funprinter_1000_series
model1=Fun Printer 1000
model2=Fun Printer 1000 XL
usb-pid1=1a2b
usb-pid2=3c4f

Perhaps the numbers after the keywords would have to match,
and the [...] entry would only be an abstract unique value which
describes the device class, i.e. the above would mean that
there is a funprinter_1000 device class which contains
exactly two models,
one with
  id1=hp_funprinter_1000
  model1=Fun Printer 1000
  usb-pid1=1a2b
and the other one with
  id2=hewlett-packard_funprinter_1000_series
  model2=Fun Printer 1000 XL
  usb-pid2=3c4f

If the "hewlett-packard_funprinter_1000_series" consists
of several model names it could be
  id2=hewlett-packard_funprinter_1000_series
  id3=hewlett-packard_funprinter_1000_series
  id4=hewlett-packard_funprinter_1000_series
  model2=Fun Printer 1000 XL
  model3=Fun Printer 1000 XXL
  model4=Fun Printer 1000 MFP
  usb-pid2=3c4f
  usb-pid3=3c4f
  usb-pid4=3c4f

If the "Fun Printer 1000 MFP" has a scanner unit
out of the box, and a scanner unit is also available as an
option, but only for the "Fun Printer 1000 XXL" (think of
the "LaserJet 1200" versus "LaserJet 1220"), it could be
  id2=hewlett-packard_funprinter_1000_series
  id3=hewlett-packard_funprinter_1000_series
  id4=hewlett-packard_funprinter_1000_series
  id5=hewlett-packard_funprinter_1000_series
  model2=Fun Printer 1000 XL
  model3=Fun Printer 1000 XXL
  model4=Fun Printer 1000 MFP
  model5=Fun Printer 1000 XXL with scanner unit
  scan-type2=0
  scan-type3=0
  scan-type4=1
  scan-type5=1
  usb-pid2=3c4f
  usb-pid3=3c4f
  usb-pid4=3c4f
  usb-pid5=3c4f

The advantage would be that it is much clearer
what actual hardware is out there in the world;
in particular you would not have to guess why there is
[business_inkjet_1000] and [hp_business_inkjet_1000] ;-)

It would even help (automated) setup tools,
because a setup tool which detects
the USB ID 1a2b would know that this ID is only
a "Fun Printer 1000", which never has a scanner unit,
but if the USB ID "3c4f" is autodetected, the tool
would know that several models match it
and it could do whatever seems appropriate:
- either a conservative setup (only the printer)
- or a blind full-feature setup (printer + scanner)
- or no automated setup, leaving it up to the user
  to select his actual model name manually.
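
To make the proposal concrete, here is a hedged sketch of how a setup tool might read such a section. Everything in it is hypothetical: the indexed keys exist only in this proposal, not in any released models.dat, the helper names are invented, `scan-type1=0` is added as an assumption for the first model, and of the three options listed above the sketch arbitrarily picks "ask the user" for the ambiguous case.

```python
import re
from collections import defaultdict

KEY = re.compile(r"(id|model|scan-type|usb-pid)([0-9]+)=(.*)")


def parse_device_class(lines):
    """Group the proposed indexed keys into one record per numeric suffix."""
    records = defaultdict(dict)
    for line in lines:
        match = KEY.fullmatch(line.strip())
        if match:
            key, index, value = match.groups()
            records[int(index)][key] = value
    return dict(records)


def models_for_pid(records, pid):
    """Return the model names whose usb-pid equals the detected one."""
    return [records[i]["model"] for i in sorted(records)
            if records[i].get("usb-pid") == pid and "model" in records[i]]


def setup_choice(records, pid):
    """Decide what an automated setup could do for a detected USB product ID."""
    scanners = {rec.get("scan-type", "0") != "0"
                for rec in records.values() if rec.get("usb-pid") == pid}
    if not scanners:
        return "unknown device"
    if scanners == {False}:
        return "printer only"
    if scanners == {True}:
        return "printer + scanner"
    return "ask user"  # models with and without scanner share this ID


sample = [
    "[funprinter_1000]",
    "id1=hp_funprinter_1000", "model1=Fun Printer 1000",
    "scan-type1=0", "usb-pid1=1a2b",  # scan-type1 is an assumption
    "model2=Fun Printer 1000 XL", "scan-type2=0", "usb-pid2=3c4f",
    "model3=Fun Printer 1000 XXL", "scan-type3=0", "usb-pid3=3c4f",
    "model4=Fun Printer 1000 MFP", "scan-type4=1", "usb-pid4=3c4f",
    "model5=Fun Printer 1000 XXL with scanner unit",
    "scan-type5=1", "usb-pid5=3c4f",
]
records = parse_device_class(sample)
print(setup_choice(records, "1a2b"), "/", setup_choice(records, "3c4f"))
```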

dwelch91 (dwelch91) wrote :

>
> 1.
> A typo which is fixed by my script (see my script)
> comes from the "Aii-in-One" in your models.dat:
>
> model2=Photosmart C5240 Aii-in-One
> model3=Photosmart C5250 Aii-in-One
> model4=Photosmart C5280 Aii-in-One
>

Fixed.

(In reply to Johannes Meixner's message of Wed, Apr 16, 2008, 3:27 AM,
which added the attachment "hpaio.desc.comm-output":
http://launchpadlibrarian.net/13503109/hpaio.desc.comm)

dwelch91 (dwelch91) wrote :

>
> Those unique device entries in models.dat do not
> have a model value:
>
> egrep '^\[|^model[0-9]*' /usr/share/hplip/data/models/models.dat \
> | tr -d '\n' \
> | tr '[' '\n' \
> | grep -v 'model[0-9]*' \
> | tr -d ']'
>
> deskjet_950c
> hp_color_inkjet_cp1700
> hp_color_laserjet_2605
> hp_color_laserjet_2605dn
> hp_color_laserjet_2605dtn
> photosmart_pro_b9100_series
> psc_900_series
> psc_920
> psc_950
> psc_950vr
> psc_950xi
>

Fixed. ... although, for ones like "psc_900_series", that seems
rather duplicative of the other psc_9xx entries, so I'm inclined to leave it
(so it won't break any devices), but just put "PSC 900 Series" into the
model1 field, with the understanding that there really isn't any device that
has a case model number (SKU) with that name. This is probably another
example of varying device IDs.

-Don

(In reply to Johannes Meixner's message of Wed, Apr 16, 2008, 3:39 AM.)

dwelch91 (dwelch91) wrote :
  • hpaio.desc (16.6 KiB, application/octet-stream; name=hpaio.desc)

New, corrected hpaio.desc file attached.

-Don

(In reply to dwelch91's message of Wed, Apr 16, 2008, 11:07 AM.)

Johannes Meixner (jsmeix) wrote :

In the HPLIP 2.8.5 sources scan/sane/hpaio.desc
looks correct.

Nevertheless I will still use my script to generate
/usr/share/sane/descriptions-external/hpaio.desc
in our sane-backends RPM from models.dat because
I like to have the USB IDs in the hpaio.desc file.

I.e. I would appreciate it if your hpaio.desc source file
also had the USB IDs - the syntax would be:
-------------------------------
:model "LaserJet 1220"
:usbid "0x03f0" "0x0417"
:status :good
-------------------------------
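
A generator for entries in this form could look roughly like the sketch below. It is not the script attached earlier in this report: `desc_entry` and `generate_desc` are invented names, the input records (including the scan type and product ID given for each device) are assumptions made for the example, and a real description file carries further keywords that are omitted here.

```python
def desc_entry(model, product_id, vendor_id="0x03f0"):
    """Format one entry using the three lines shown above."""
    return "\n".join([
        f':model "{model}"',
        f':usbid "{vendor_id}" "{product_id}"',
        ":status :good",
    ])


def generate_desc(devices):
    """Emit entries only for devices whose scan type is not 0."""
    return "\n\n".join(
        desc_entry(dev["model"], dev["usb_pid"])
        for dev in devices
        if dev["scan_type"] != 0
    )


# Assumed input records; the Deskjet product ID is a placeholder.
devices = [
    {"model": "LaserJet 1220", "scan_type": 1, "usb_pid": "0x0417"},
    {"model": "Deskjet 3320", "scan_type": 0, "usb_pid": "0x0000"},
]
print(generate_desc(devices))
```

Filtering on the scan type keeps plain printers such as the Deskjet 3320 out of the generated file.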

Changed in hplip:
assignee: nobody → kalosaurusrex
status: New → Fix Released
Aaron Albright (albrigha-deactivatedaccount) wrote :

Going to verify tomorrow. Sorry about that.

A

Changed in hplip:
status: Fix Released → Triaged
Changed in hplip:
assignee: kalosaurusrex → dwelch91