shar fails on some UTF-8 text files

Bug #1952491 reported by Paulo Ney de Souza
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
sharutils (Ubuntu) | Confirmed | Low | Unassigned |

Bug Description

When I do "shar -T test.tex > file.shar" on the one-liner file, encoded in UTF-8:

    \chapter{Séries}

everything works fine, and when executing sh file.shar the file is extracted fine and it is the same as before.

Adding one line to it and turning the file into:

    \chapter{Séries}
    Séries de potências são essenciais no estudo das funções.

the same command produces a shar file that contains the text. But when we run it with sh (or unshar, or make it executable and run it directly), it fails the MD5 check upon extraction:

    $ sh file.shar
    x - created lock directory _sh06764.
    x - extracting test.tex (text)
    test.tex: MD5 check failed
    x - removed lock directory _sh06764.

and it produces a file that is recognized as ISO-8859 text and it has all accents messed up, as in:

    \chapter{Séries}
    S©ries de potências são essenciais no estudo das funções.
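
The "©" in the damaged line is consistent with a lost byte: "é" is the two bytes 0xC3 0xA9 in UTF-8, and a lone 0xA9 is "©" in ISO-8859-1. The sketch below only simulates that effect by deleting the 0xC3 byte by hand. That this is what shar actually does is an assumption, not something confirmed in this report. The helper names `hex` and `is_utf8` are made up for illustration.

```shell
#!/bin/sh
# "Séries" in UTF-8: S, then é as the byte pair 0xC3 0xA9, then "ries".
good=$(printf 'S\303\251ries')

# Simulate the damage by deleting the 0xC3 lead byte.
# ASSUMPTION: this mimics the observed output; it is not shar's confirmed behaviour.
bad=$(printf '%s' "$good" | LC_ALL=C tr -d '\303')

# Hypothetical helper: print a string as one run of hex bytes.
hex() { printf '%s' "$1" | od -An -tx1 | tr -d ' \n'; }

# Hypothetical helper: succeed only if the string is valid UTF-8.
is_utf8() { printf '%s' "$1" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; }

echo "original bytes: $(hex "$good")"   # 53c3a972696573
echo "damaged bytes:  $(hex "$bad")"    # 53a972696573

if is_utf8 "$bad"; then
    echo "damaged text is still valid UTF-8"
else
    echo "damaged text is not valid UTF-8"
fi

# Read as ISO-8859-1, the leftover 0xA9 byte shows up as the copyright sign.
printf '%s' "$bad" | iconv -f ISO-8859-1 -t UTF-8; echo
```

Because the damaged bytes are no longer valid UTF-8, a tool such as file would plausibly fall back to guessing ISO-8859, which matches the report.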

I know the manual says that you may have problems e-mailing files produced with the -T flag, but this goes beyond mailing -- the file itself is corrupted.
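
For anyone who wants to retry this, here is a minimal reproduction sketch of the steps above. It assumes sharutils is installed and that the script itself is saved as UTF-8. The function names `write_sample` and `roundtrip` and the directory layout are made up for illustration. It extracts into a separate directory so the original file is not overwritten.

```shell
#!/bin/sh
# Reproduction sketch; assumes GNU sharutils ("shar") is installed and that
# this script is saved as UTF-8.

# Write the two-line sample file from the report to the given path.
write_sample() {
    printf '%s\n' \
        '\chapter{Séries}' \
        'Séries de potências são essenciais no estudo das funções.' > "$1"
}

# Archive the sample with "shar -T", extract it in a separate directory,
# and compare the result with the original.
# Prints "identical", "differs", or "skipped" (when shar is not available).
roundtrip() {
    command -v shar >/dev/null 2>&1 || { echo "skipped"; return 0; }
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        write_sample test.tex
        shar -T test.tex > file.shar 2>/dev/null
        mkdir out
        # Extraction may report "MD5 check failed"; carry on to the comparison.
        (cd out && sh ../file.shar >/dev/null 2>&1) || true
        if cmp -s test.tex out/test.tex; then
            echo "identical"
        else
            echo "differs"
        fi
    )
    rm -rf "$work"
}

roundtrip
```

On an affected version the expectation from this report is "differs". Dropping the second printf argument gives the one-line file that reportedly round-trips correctly.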

Tags: utf8
Paulo Ney de Souza (pauloney) wrote :
Christian Ehrhardt (paelzer) wrote :

Hi,
I can confirm your report and have tested it in Focal (1:4.15.2-5) and Jammy (1:4.15.2-5build1) which both behave that way.

I want to thank you for taking the time to report this bug and helping to make Ubuntu better. I appreciate the quality of this bug report and I'm sure it'll be helpful to others experiencing the same issue.

But this sounds like an upstream bug to me. Even the latest version behaves that way, and I didn't see any Ubuntu/Debian delta that seems related.

You might consider verifying this by building directly from the latest upstream source. If this can be confirmed as an upstream bug, the best route to getting it fixed in Ubuntu would be to file an upstream bug, if you're able to do that.
It seems the project wants bug reports by mail (https://www.gnu.org/software/sharutils/manual/sharutils.html#shar-Bugs), so the result will likely be a mailing list entry.

Otherwise, I'm not sure what we can do directly in Ubuntu to fix the problem.

If you do end up filing an upstream bug, please link to it from here. Thanks in advance!

BTW, if I omit -T (which forces text mode), everything works just fine, so I'm not 100% sure how valid the bug is.

Changed in sharutils (Ubuntu):
status: New → Confirmed
importance: Undecided → Low
Paulo Ney de Souza (pauloney) wrote : Re: [Bug 1952491] Re: shar fails on some UTF-8 text files

Dear Christian,

Thanks for the reply. Indeed, this is an upstream bug. I have tried to file
it with them, but it has not been easy. I'll keep you posted if things change.

Paulo Ney

On Mon, Nov 29, 2021 at 7:50 AM Christian Ehrhardt <email address hidden> wrote:

> Hi,
> I can confirm your report and have tested it in Focal (1:4.15.2-5) and
> Jammy (1:4.15.2-5build1) which both behave that way.
