Problem with unknown CSS class

Bug #1598277 reported by leastcommonancestor
This bug affects 1 person
Affects: calibre
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

When converting HTML obtained from an external source with possibly incomplete or buggy CSS information, the conversion to EPUB introduces problematic class assignments. Consider the following HTML:

<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">

<head>
</head>

<body class="calibre">

  <p class="foobar">Some text</p>

  <p>More text</p>

</body>

</html>

The class 'foobar' has no CSS rule here: the stylesheet may lack one, or the link to the stylesheet may be broken. After adding this HTML as a book and converting it to EPUB, the converted file contains:

<?xml version='1.0' encoding='utf-8'?>
<html xmlns="http://www.w3.org/1999/xhtml" lang="de" xml:lang="de">
  <head>
    <title>Unknown</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
  <link href="stylesheet.css" rel="stylesheet" type="text/css"/>
<link href="page_styles.css" rel="stylesheet" type="text/css"/>
</head>
  <body class="calibre">

  <p class="foobar">Some text</p>

  <p class="foobar">More text</p>

</body>

</html>

The definition of foobar in the generated stylesheet is:

.foobar {
    display: block;
    margin: 1em 0
    }
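
One plausible reading of this output is that the converter hands out classes per distinct computed style, so two paragraphs that end up with the same style share a class. The sketch below is a toy model of that idea only. It is not calibre's code, and the names flatten, DEFAULT_P_STYLE and the calibreN naming rule are assumptions made for illustration.

```python
# Toy model only, NOT calibre's implementation. Assumption: one class per
# distinct computed style, named after the first element seen with that style.

# Hypothetical defaults for <p>, chosen to match the generated rule above.
DEFAULT_P_STYLE = (("display", "block"), ("margin", "1em 0"))


def flatten(elements, stylesheet):
    """elements: list of (tag, class or None); stylesheet: class -> style tuple."""
    class_for_style = {}  # computed style -> class name already handed out
    counter = 0
    result = []
    for tag, cls in elements:
        # A class without any rule contributes nothing, so only defaults apply.
        style = DEFAULT_P_STYLE + stylesheet.get(cls, ())
        if style not in class_for_style:
            if cls is None:
                counter += 1
                name = f"calibre{counter}"
            else:
                name = cls
            class_for_style[style] = name
        result.append((tag, class_for_style[style]))
    return result


elements = [("p", "foobar"), ("p", None)]

# No rule for foobar: both paragraphs compute to the same style.
print(flatten(elements, {}))
# -> [('p', 'foobar'), ('p', 'foobar')]

# With a rule for foobar the styles differ and the distinction survives.
print(flatten(elements, {"foobar": (("text-align", "center"),)}))
# -> [('p', 'foobar'), ('p', 'calibre1')]
```

Under this model the class name is only a label for a style, which would explain why an unknown class and "no class" collapse into one.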

This does not affect how the HTML is rendered as it stands. However, a potentially relevant distinction has been erased. For instance, when editing the EPUB and looking through the files, one may notice that the foobar class is used to center the stage directions in a drama. If one then changes the rendering based on these semantics, the changes will also apply to other paragraphs, for which they are not appropriate. Consider something like:

.foobar {
  text-align: left;
  font-style: italic;
}
.foobar:before {
  content:"Stage direction: ";
}

The resulting problems can easily go unnoticed if there are only a few instances of this class attribute in a large ebook.
Therefore the distinction should be preserved, and the result should be:

  <p class="foobar">Some text</p>

  <p class="calibre1">More text</p>

or even better

  <p class="foobar">Some text</p>

  <p>More text</p>
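
Whether a conversion has erased such a distinction can be checked mechanically by comparing the class attributes before and after. The following is a minimal sketch, assuming that the input and output contain the same body elements in the same order; ClassCollector, collect and leaked_classes are hypothetical helper names, not part of calibre.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Record (tag, class) for every element inside <body>, in document order."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif self.in_body:
            self.elements.append((tag, dict(attrs).get("class")))


def collect(html):
    parser = ClassCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


def leaked_classes(input_html, output_html):
    """Return (index, tag, class) for elements that had no class in the input
    but carry a class name from the input document in the output."""
    before, after = collect(input_html), collect(output_html)
    input_names = {name for _, cls in before if cls for name in cls.split()}
    leaks = []
    # Assumption: elements pair up one-to-one by position.
    for i, ((tag, old), (_, new)) in enumerate(zip(before, after)):
        if old is None and new and set(new.split()) & input_names:
            leaks.append((i, tag, new))
    return leaks


source = '<body class="calibre"><p class="foobar">Some text</p><p>More text</p></body>'
converted = '<body class="calibre"><p class="foobar">Some text</p><p class="foobar">More text</p></body>'
print(leaked_classes(source, converted))
# -> [(1, 'p', 'foobar')]
```

An empty list means that no unclassed element picked up a class name from the input, which is the case for both of the desired results shown above.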

I am aware that this is not a Calibre bug in the strict sense, since the input is defective. However, since the result appears to be OK and the flaws can easily go unnoticed, these defects should be handled gracefully.

Tested with Calibre 2.60 / Ubuntu 15.10.
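
Defective input of this kind can also be detected before conversion by listing the classes that the HTML uses but the stylesheet never mentions. This is a rough sketch using only the Python standard library; UsedClasses, defined_classes and undefined_classes are hypothetical names, and the CSS handling is a crude regular-expression scan rather than a real CSS parser.

```python
import re
from html.parser import HTMLParser


class UsedClasses(HTMLParser):
    """Collect every class name that appears in a class attribute."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        value = dict(attrs).get("class")
        if value:
            self.classes.update(value.split())


def defined_classes(css):
    """Return class names that occur in selectors.

    Crude on purpose: comments and declaration blocks are stripped, then
    ".name" tokens are collected. Nested blocks such as @media are not
    handled, so classes defined only there are reported as undefined.
    """
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
    selectors = re.sub(r"\{[^}]*\}", " ", css)
    return set(re.findall(r"\.(-?[A-Za-z_][\w-]*)", selectors))


def undefined_classes(html, css):
    parser = UsedClasses()
    parser.feed(html)
    parser.close()
    return sorted(parser.classes - defined_classes(css))


html = '<body class="calibre"><p class="foobar">Some text</p><p>More text</p></body>'
print(undefined_classes(html, ".calibre { display: block }"))
# -> ['foobar']
```

If the stylesheet link is broken, passing an empty string as the CSS reports every class in use.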

Kovid Goyal (kovid) wrote : Re: calibre bug 1598277

Conversion flattens CSS; that is by design. The goal of conversion is to
preserve visual appearance and produce output likely to work on the
largest array of devices, while giving users automated control over CSS
transformations. Preserving input CSS is incompatible with those goals.
If you want to preserve input CSS, import your HTML file into the editor
directly via File->Import an HTML or DOCX file as new book.

 status wontfix

Changed in calibre:
status: New → Won't Fix
leastcommonancestor (leastcommonancestor) wrote : Re: [Bug 1598277] Re: calibre bug 1598277

Hello Mr. Goyal!

Thank you for your quick reply.
It would IMHO be OK if the CSS were not preserved at all. However, I do
not see why some unknown class names should be preserved. But I can
live with that.

But let me take the opportunity to commend you for your outstanding work
with Calibre.
It is the best software I have ever worked with. While being quite easy to
use for the beginner, it still provides immense flexibility and
elaborate, well-conceived tools for sophisticated use cases.

I'm all the more pleased with Calibre when comparing it with commercial
software, e.g. crappy readers as provided by Adobe. You prove by
example that a single, thinking person can do on their own orders of
magnitude better than hordes of programming serfs micro-managed by
babbling idiots.

Thank you very, very much.

Greetings and best wishes for the future development of Calibre

Wolfgang
