Change justification sometimes does not work

Bug #1255541 reported by TomasHnyk
This bug affects 1 person
Affects: calibre
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I try out my recipe with

ebook-convert my_recipe.recipe output.mobi --password pass --username username --test 10 --output-profile kindle --change-justification justify -vv --debug-pipeline debug

and the justification is changed to justify (even though the original is not justified).

When I load the recipe into the GUI via Fetch News>Add a custom news source, set "Text justification" to "Justify text" in Preferences>Conversion>Common Options>Look and Feel, and run the recipe, the justification is not changed.

If I set no_stylesheets = True in my recipe, justification is changed. When I run conversion from mobi to mobi, the justification is changed to justify as well.
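
For reference, the no_stylesheets workaround mentioned above amounts to a single class attribute. This is a minimal sketch, not the reporter's actual recipe: the class name is hypothetical, and the fallback base class is only an assumption so that the snippet can be read and run outside calibre.

```python
try:
    from calibre.web.feeds.recipes import BasicNewsRecipe
except ImportError:
    # Stand-in used only when calibre is not importable (assumption for this sketch).
    class BasicNewsRecipe(object):
        no_stylesheets = False


class NoStylesheetsSketch(BasicNewsRecipe):  # hypothetical recipe name
    title = u'Respekt (no stylesheets)'
    # Do not download or process the site's own stylesheets.
    no_stylesheets = True
```

With the site's CSS out of the way, the reporter observed that the justification setting took effect; the likely reason is that no site rule is left to set text-align, but the report does not confirm this.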

To me, this seems to be a bug, unless --change-justification justify is intended to mean something other than the corresponding option in the GUI.

This was also discussed here:
http://www.mobileread.com/forums/showthread.php?p=2672301

(The recipe is here. It normally needs an account but this is visible even without one.)

#!/usr/bin/python
# -*- coding: utf-8 -*-
# License: GNU General Public License v3 - http://www.gnu.org/copyleft/gpl.html
# Copyright: <email address hidden>

__license__ = 'GNU General Public License v3 - http://www.gnu.org/copyleft/gpl.html'
__copyright__ = '<email address hidden>'

import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup,Tag
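# This imports the lxml version bundled with calibre. The submodules are
# imported explicitly because "import lxml" alone does not load lxml.html.
import lxml.etree
import lxml.html
from lxml.builder import E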

class respektRecipe(BasicNewsRecipe):
    __author__ = 'Tomáš Hnyk'
    title = u'Respekt'
    publisher = u'Respekt Publishing a. s.'
    description = u'Articles from the printed edition without translations from The Economist that are not available online'
    encoding = 'cp1250'
    language = 'cs'
    remove_javascript = True
    extra_css = 'ul {color:black} .image_caption {font-size:50%;font-style:italic;} .author {text-align:left;}'
    remove_tags_before = dict(name='div',attrs={'class':['l']})
    remove_tags_after = dict(id='text')
    remove_tags = [dict(name='ul', attrs={'class':['tabs-d'],'id':['comm']}), \
    dict(name='div',attrs={'class':['slot','reklama','date']}), \
    dict(name='span', attrs={'class':['detail-vykrik']}), \
    dict(name='p', attrs={'class':['detail-vykrik']}), \
    dict(name='div', attrs={'id':['col123d-video','col123d-infographic','col123d-gallery','col12d-discussion']}), # the soup > lxml > soup round trip in preprocess_html requires this
    dict(name='strong', attrs={'class':['detail-vykrik']})]
    # this makes authors left-aligned by not using the author class
    preprocess_regexps = [(re.compile(r'<div class="author">', re.DOTALL|re.IGNORECASE), lambda match: '<div class="">')]
    # remove empty tags
    preprocess_regexps.append((re.compile(r'<strong> </strong>', re.DOTALL|re.IGNORECASE), lambda match: ' '))
    preprocess_regexps.append((re.compile(r'<strong>&nbsp;</strong>', re.DOTALL|re.IGNORECASE), lambda match: '&nbsp;'))
    preprocess_regexps.append((re.compile(r'<p></p>', re.DOTALL|re.IGNORECASE), lambda match: ''))

    def get_cover_url(self):
        soup = self.index_to_soup('http://respekt.ihned.cz/')
        cover = soup.findAll('div', attrs={'class':'cover'})[0].find('img')['src']
        return cover

    #needs_subscription = True

    """
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://muj-ucet.ihned.cz/')
            br.select_form(name='login')
            br['login[nick]'] = self.username
            br['login[pass]'] = self.password
            br.submit()
        return br
    """

    def parse_index(self):
        raw = self.index_to_soup('http://respekt.ihned.cz/aktualni-cislo/', raw=True)
        root = lxml.html.fromstring(raw)
        ans = []
        for article in root.xpath("//div[@class='ow-enclose']/div[@class='ow']"):
            section_title = article.xpath(".//span[text()='(rubrika: ']")[0].find("a").text
            date = article.xpath("//span[@class='date-author']")[0].text[:-3]
            author = article.xpath("//span[@class='date-author']")[0].find("a").text
            title = article.find("h2").find("a").text
            url = article.find('h2').find('a').get('href')
            link = {'title':title,'url':url,'date':date,'author':author}
            for section in ans:
                if section[0] == section_title:
                    section[1].append(link)
                    break
            else:
                ans.append((section_title,[link]))
        return ans

    def cleanup(self):
        self.browser.open('http://muj-ucet.ihned.cz/?login[logout]=1')

    def preprocess_html(self,soup):
        raw = u''.join(unicode(a) for a in soup.contents)
        root = lxml.html.fromstring(raw)

        # Make image captions visible
        body = root.xpath("//div[@id='text']")[0]
        add = 0
        for index, element in enumerate(body):
            try:
                if element.tag == 'img':
                    body.insert(index+add+1,E.p(element.get('title'),{"class":"image_caption"}))
                    add += 1
            except Exception:
                pass

        # Add length in words after author
        article_length = str(len(body.text_content().split(' '))) + ' slov'
        root.xpath("//div[@class='author-image']/div[@class='']/ul")[0].append(E.li(article_length))

        # Make perex (subheading) start on a new line
        root.xpath("//h1")[0].append(E.br(''))

        return(BeautifulSoup(lxml.etree.tostring(root,encoding=unicode)))

TomasHnyk (sup) wrote :

Eh, the line is this:
ebook-convert my_recipe.recipe output.mobi --test 1 --output-profile kindle --change-justification justify -vv --debug-pipeline debug

And this is on Calibre 1.10, Ubuntu 13.10.

Kovid Goyal (kovid) wrote : Re: calibre bug 1255541

The GUI does not apply conversion options (with a few specific
exceptions) to news downloads. If you want to apply a conversion option
unconditionally to the recipe, use conversion_options in the recipe
itself.
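
The suggestion above could look roughly like the sketch below. It assumes that the option key is change_justification (the underscore form of the --change-justification flag used earlier in this report) and that 'justify' is the value wanted. The class name is hypothetical, and the fallback base class exists only so that the snippet runs outside calibre.

```python
try:
    from calibre.web.feeds.recipes import BasicNewsRecipe
except ImportError:
    # Stand-in used only when calibre is not importable (assumption for this sketch).
    class BasicNewsRecipe(object):
        conversion_options = {}


class JustifiedRecipeSketch(BasicNewsRecipe):  # hypothetical recipe name
    title = u'Respekt'
    # Conversion options set here travel with the recipe, so they apply
    # whether it is run from ebook-convert or from Fetch News in the GUI.
    # Assumed key: the underscore form of --change-justification.
    conversion_options = {
        'change_justification': 'justify',
    }
```

Other conversion settings can be added to the same dictionary, keyed by the underscore form of their command-line names.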

 status invalid

Changed in calibre:
status: New → Invalid