Running catman makes man display junk data
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
man-db (Debian) | Fix Released | Unknown | |
man-db (Ubuntu) | Fix Released | High | Colin Watson |
Lucid | Fix Released | High | Colin Watson |
Bug Description
After running "sudo catman" to update the man page cache, the display of some man pages will be completely corrupted. Deleting the corresponding entries in /var/cache/
I can reproduce this problem with man-db 2.5.7-3, but not with the previous 2.5.6-2. I suspect it is related to the following upstream change in 2.5.7-1:
> - Always save cat pages in UTF-8 (closes: #446741).
I see the following pipe being run:
/usr/bin/zsoelim | /usr/lib/
This is very wrong because it tries to convert gzip-compressed data from ASCII to UTF-8! Oh noes!
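To illustrate why the order of the two steps matters, here is a minimal Python sketch. It is not man-db's code: the page text is made up, the `transcode` helper is hypothetical, and ISO-8859-1 is used as a stand-in source charset (an assumption, chosen because every byte value decodes in it). It compares converting then compressing against compressing then converting.

```python
import gzip
import zlib


def transcode(data: bytes, src: str = "iso-8859-1", dst: str = "utf-8") -> bytes:
    """Re-encode bytes from src to dst, roughly what an iconv stage in a pipe does."""
    return data.decode(src).encode(dst)


def is_readable(blob: bytes) -> bool:
    """Return True if blob is a gzip stream that decompresses cleanly."""
    try:
        gzip.decompress(blob)
        return True
    except (OSError, EOFError, zlib.error):
        return False


# Illustrative stand-in for a formatted cat page (not real man output).
page = "NAME\n       caf\u00e9 - illustrative page text\n".encode("iso-8859-1")

# Intended order: convert the text first, then compress it.
good = gzip.compress(transcode(page))

# Broken order: compress first, then "convert" the compressed bytes.
# The gzip magic byte 0x8b is re-encoded as two bytes, so the header is destroyed.
bad = transcode(gzip.compress(page))

print("convert then compress readable:", is_readable(good))
print("compress then convert readable:", is_readable(bad))
```

In the broken order every byte at or above 0x80 in the compressed stream is expanded to a two-byte UTF-8 sequence, which is consistent with man showing binary garbage instead of text.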
== Regression details ==
Discovered in version: 2.5.7-2 (lucid), 2.5.7-3 (maverick)
Last known good version: 2.5.6-2 (karmic)
== SRU details ==
Impact: Preformatted manual pages ("cat pages") may be corrupted by running iconv after compression rather than before. This is a regression introduced upstream in man-db 2.5.7.
Patch: Fixed upstream in http://
TEST CASE: Run 'sudo catman 1', then make sure you're using an 80-column terminal window (so that cat pages are used) and run 'LC_ALL=C man a2p'. The broken version will show binary garbage. To clear out cat pages to test the working version, run 'sudo rm /var/cache/
Regression potential: None seems likely, and the test case should be sufficient to catch misbuilds and the like.
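As a complement to the manual test case above, a small script can scan a cache directory for compressed files that no longer decompress. This is only a sketch under assumptions: it assumes cat pages are stored as gzip files with a `.gz` suffix, the function name `corrupt_gzip_files` is hypothetical, and the directory to scan must be supplied by the caller.

```python
import gzip
import os
import zlib


def corrupt_gzip_files(root):
    """Return paths under root ending in .gz that do not decompress cleanly."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            try:
                # Read the whole stream so truncation and CRC errors surface too.
                with gzip.open(path, "rb") as fh:
                    while fh.read(65536):
                        pass
            except (OSError, EOFError, zlib.error):
                bad.append(path)
    return bad
```

Calling `corrupt_gzip_files(cache_dir)` after running catman should return an empty list on a fixed version; any paths it returns are candidates for deletion and regeneration.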
---
Original question by Thinboy00:
Some man pages are appearing corrupt (i.e. if I type man foo at the terminal, I get a bunch of caret-escaped characters and a few ASCII characters). It looks as if man is reading the compressed data in /usr/share/man instead of uncompressing it first. I will soon attach a screenshot of the problem. The really confusing part, though, is that some man pages consistently appear corrupt while the others consistently display correctly. I've tried the following, to no avail:
mandb
mandb -t
mandb -c
catman
And yes, I did remember to use sudo on those. mandb -t said "whatis parse for /usr/share/
zcat /usr/man/
and it gave unformatted but readable output, even though man nethack doesn't work. I tried the same trick with slashem, which does work correctly for man, and the zcat trick worked too. I'm surprised their man pages behave differently since the pages themselves are practically identical.
Is my copy of man broken?
affects: | ubuntu → man-db (Ubuntu) |
Changed in man-db (Ubuntu): | |
status: | New → Confirmed |
tags: | added: regression-release |
summary: |
- Cached (compressed) man pages are corrupted by conversion to UTF-8
+ Running catman makes man display junk data
description: | updated |
Changed in man-db (Ubuntu Lucid): | |
status: | New → Triaged |
importance: | Undecided → High |
assignee: | nobody → Colin Watson (cjwatson) |
description: | updated |
Changed in man-db (Ubuntu Lucid): | |
status: | Triaged → In Progress |
Changed in man-db (Debian): | |
status: | Unknown → Fix Released |
tags: | added: testcase |
If I run `sudo man -L fr_CA.utf8 -caM /usr/share/man 6 nethack' from the command line, no cache file is created and so the bug does not happen. If I create a small script that runs the same command but clears the environment, I get a junk man page. Here is such a script:
#!/usr/bin/python
import os
os.execve("/usr/bin/man", ["man", "-L", "fr_CA.utf8", "-caM", "/usr/share/man", "6", "nethack"], {})
If I restore the environment variable about the locale, a cat file is created, but I do not get the bug:
#!/usr/bin/python
import os
os.execve("/usr/bin/man", ["man", "-L", "fr_CA.utf8", "-caM", "/usr/share/man", "6", "nethack"], {'LANG': 'fr_CA.utf8'})
So a possible solution would be for catman to add a LANG environment variable to the execve call to man. I don't know enough about the use cases of catman; is the locale supposed to be valid? Is it good enough to be used for system-wide purposes? Otherwise, should catman fake a UTF-8 locale?
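The proposal above could be sketched roughly as follows. This only illustrates the commenter's idea and is not how catman or the upstream fix works: the helper name `locale_env` is hypothetical, and the `C.UTF-8` fallback is an assumption that such a locale exists on the system.

```python
import os

# Locale variables to carry over to the child process if the caller has them set.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")


def locale_env(environ, fallback="C.UTF-8"):
    """Build a minimal environment that always carries some locale setting."""
    env = {k: v for k, v in environ.items() if k in LOCALE_VARS and v}
    if not env:
        # Nothing inherited: fake a UTF-8 locale (assumed to be available).
        env["LANG"] = fallback
    return env


# Example use, mirroring the scripts above (left commented out because execve
# replaces the current process):
# os.execve("/usr/bin/man",
#           ["man", "-L", "fr_CA.utf8", "-caM", "/usr/share/man", "6", "nethack"],
#           locale_env(os.environ))
```

Passing through the caller's settings when present and faking a locale only when the environment is empty keeps the behaviour predictable for both interactive and system-wide runs.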