convert does not preserve spaces in <pre>

Bug #1349536 reported by Mauro on 2014-07-28
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

When converting a recipe, ebook-convert does not preserve spaces inside nested tags. For example, it produces:

<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.conf.urls</span> <span class="kn">import</span> <span class="n">patterns</span>
<span class="n">urlpatterns</span> <span class="o">=</span> <span class="n">patterns</span><span class="p">(</span><span class="s">''</span><span class="p">,</span>
<span class="p">(</span><span class="s">r'^articles/(\d{4})/$'</span><span class="p">,</span> <span class="s">'news.views.year_archive'</span><span class="p">),</span>
<span class="p">(</span><span class="s">r'^articles/(\d{4})/(\d{2})/$'</span><span class="p">,</span> <span class="s">'news.views.month_archive'</span><span class="p">),</span>
<span class="p">(</span><span class="s">r'^articles/(\d{4})/(\d{2})/(\d+)/$'</span><span class="p">,</span> <span class="s">'news.views.article_detail'</span><span class="p">),</span>
<span class="p">)</span>
</pre></div>

instead of

<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.conf.urls</span> <span class="kn">import</span> <span class="n">patterns</span>

<span class="n">urlpatterns</span> <span class="o">=</span> <span class="n">patterns</span><span class="p">(</span><span class="s">''</span><span class="p">,</span>
    <span class="p">(</span><span class="s">r'^articles/(\d{4})/$'</span><span class="p">,</span> <span class="s">'news.views.year_archive'</span><span class="p">),</span>
    <span class="p">(</span><span class="s">r'^articles/(\d{4})/(\d{2})/$'</span><span class="p">,</span> <span class="s">'news.views.month_archive'</span><span class="p">),</span>
    <span class="p">(</span><span class="s">r'^articles/(\d{4})/(\d{2})/(\d+)/$'</span><span class="p">,</span> <span class="s">'news.views.article_detail'</span><span class="p">),</span>
<span class="p">)</span>
</pre></div>

This breaks Python indentation.

As a workaround, you can replace the embedded BeautifulSoup.py ("3.0.5") with a newer version ("3.2.1").

I do not know whether it needs some tweaking.

Calibre 1.46 on GNU/Linux Ubuntu 14.04

Mauro (gaionim) wrote :

The version of BS that ships with calibre is heavily modified and cannot
simply be replaced since literally thousands of recipes depend on it.

You can either use preprocess_raw_html() in your recipe to fix up the
HTML however you like before it is parsed by BeautifulSoup, or you can
use the JavascriptRecipe class instead of BasicNewsRecipe, which uses
html5lib for parsing.
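
A minimal sketch of the first option, assuming the raw-HTML hook on BasicNewsRecipe has the signature preprocess_raw_html(self, raw_html, url). The helper name protect_pre_indentation is hypothetical, and replacing leading spaces with &#160; is only one possible fix-up; whether it survives the embedded BeautifulSoup has not been verified here.

```python
import re

# Matches a whole <pre>...</pre> block, including any attributes on <pre>.
PRE_BLOCK = re.compile(r'(<pre\b[^>]*>)(.*?)(</pre>)', re.DOTALL | re.IGNORECASE)
# Matches the run of spaces at the start of each line.
LEADING_SPACES = re.compile(r'^[ ]+', re.MULTILINE)


def protect_pre_indentation(raw_html):
    """Replace leading spaces on each line inside <pre> with &#160; entities.

    Hypothetical helper: the goal is that indentation is no longer made of
    plain spaces that a parser might drop between tags.
    """
    def fix_block(match):
        open_tag, body, close_tag = match.groups()
        body = LEADING_SPACES.sub(lambda m: '&#160;' * len(m.group(0)), body)
        return open_tag + body + close_tag

    return PRE_BLOCK.sub(fix_block, raw_html)


# Assumed usage inside a recipe (not runnable outside calibre):
#
# class MyRecipe(BasicNewsRecipe):
#     def preprocess_raw_html(self, raw_html, url):
#         return protect_pre_indentation(raw_html)
```

Text outside <pre> is left untouched, so the rest of the page is parsed as before.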

If you do want to fix BS, then a limited patch to fix only this issue
against the existing embedded version of BS is welcome.

 status wontfix

Changed in calibre:
status: New → Won't Fix
Mauro (gaionim) wrote :

Please double-check, but this patch works for me.

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: Won't Fix → Fix Released