calibre

Change justification sometimes does not work

Bug #1255541 reported by TomasHnyk on 2013-11-27

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	calibre	Invalid	Undecided	Unassigned

Bug Description

I try out my recipe with

ebook-convert my_recipe.recipe output.mobi --password pass --username username --test 10 --output-profile kindle --change-justification justify -vv --debug-pipeline debug

and the justification is changed to justify (even though the original is not justified).

When I load the recipe into the GUI via Fetch News>Add a custom news source, set "Text justification" to "Justify text" in Preferences>Conversion>Common Options>Look and Feel, and run the recipe, the justification is not changed.

If I set no_stylesheets = True in my recipe, justification is changed. When I run conversion from mobi to mobi, the justification is changed to justify as well.

To me, this seems to be a bug, unless --change-justification justify is intended to mean something other than the corresponding option in the GUI.

This was also discussed here:
http://www.mobileread.com/forums/showthread.php?p=2672301

(The recipe is below. The site normally requires an account, but this content is visible even without one.)

#!/usr/bin/python
# -*- coding: utf-8 -*-
# License: GNU General Public License v3 - http://www.gnu.org/copyleft/gpl.html
# Copyright: <email address hidden>

__license__ = 'GNU General Public License v3 - http://www.gnu.org/copyleft/gpl.html'
__copyright__ = '<email address hidden>'

import re
from calibre.web.feeds.recipes import BasicNewsRecipe
from calibre.ebooks.BeautifulSoup import BeautifulSoup,Tag
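# This imports the lxml version bundled with calibre; the submodules are
# imported explicitly because a bare "import lxml" does not load lxml.html
import lxml.etree
import lxml.html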
from lxml.builder import E

class respektRecipe(BasicNewsRecipe):
    __author__ = 'Tomáš Hnyk'
    title = u'Respekt'
    publisher = u'Respekt Publishing a. s.'
    description = u'Articles from the printed edition without translations from The Economist that are not available online'
    encoding = 'cp1250'
    language = 'cs'
    remove_javascript = True
    extra_css = 'ul {color:black} .image_caption {font-size:50%;font-style:italic;} .author {text-align:left;}'
    remove_tags_before = dict(name='div',attrs={'class':['l']})
    remove_tags_after = dict(id='text')
    remove_tags = [dict(name='ul', attrs={'class':['tabs-d'],'id':['comm']}), \
    dict(name='div',attrs={'class':['slot','reklama','date']}), \
    dict(name='span', attrs={'class':['detail-vykrik']}), \
    dict(name='p', attrs={'class':['detail-vykrik']}), \
    dict(name='div', attrs={'id':['col123d-video','col123d-infographic','col123d-gallery','col12d-discussion']}), # the soup>lxml>soup round trip in preprocess_html requires this
    dict(name='strong', attrs={'class':['detail-vykrik']})]
    # this makes authors left-aligned by not using the author class
    preprocess_regexps = [(re.compile(r'<div class="author">', re.DOTALL|re.IGNORECASE), lambda match: '<div class="">')]
    # remove empty tags
    preprocess_regexps.append((re.compile(r'<strong> </strong>', re.DOTALL|re.IGNORECASE), lambda match: ' '))
    preprocess_regexps.append((re.compile(r'<strong> </strong>', re.DOTALL|re.IGNORECASE), lambda match: ' '))
    preprocess_regexps.append((re.compile(r'<p></p>', re.DOTALL|re.IGNORECASE), lambda match: ''))

    def get_cover_url(self):
        soup = self.index_to_soup('http://respekt.ihned.cz/')
        cover = soup.findAll('div', attrs={'class':'cover'})[0].find('img')['src']
        return cover

    # needs_subscription = True

    """
    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://muj-ucet.ihned.cz/')
            br.select_form(name='login')
            br['login[nick]'] = self.username
            br['login[pass]'] = self.password
            br.submit()
        return br
    """

    def parse_index(self):
        raw = self.index_to_soup('http://respekt.ihned.cz/aktualni-cislo/', raw=True)
        root = lxml.html.fromstring(raw)
        ans = []
        for article in root.xpath("//div[@class='ow-enclose']/div[@class='ow']"):
            section_title = article.xpath(".//span[text()='(rubrika: ']")[0].find("a").text
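            # The leading "." keeps the search inside this article; a bare "//"
            # would match from the document root and always return the first article's span
            date = article.xpath(".//span[@class='date-author']")[0].text[:-3]
            author = article.xpath(".//span[@class='date-author']")[0].find("a").text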
            title = article.find("h2").find("a").text
            url = article.find('h2').find('a').get('href')
            link = {'title':title,'url':url,'date':date,'author':author}
            for section in ans:
                if section[0] == section_title:
                    section[1].append(link)
                    break
            else:
                ans.append((section_title,[link]))
        return ans

    def cleanup(self):
        self.browser.open('http://muj-ucet.ihned.cz/?login[logout]=1')

    def preprocess_html(self,soup):
        raw = u''.join(unicode(a) for a in soup.contents)
        root = lxml.html.fromstring(raw)

        # Make image captions visible
        body = root.xpath("//div[@id='text']")[0]
        add = 0
        for index, element in enumerate(body):
            try:
                if element.tag == 'img':
                    body.insert(index+add+1,E.p(element.get('title'),{"class":"image_caption"}))
                    add += 1
            except:
                pass

        # Add length in words after author
        article_length = str(len(body.text_content().split(' '))) + ' slov'
        root.xpath("//div[@class='author-image']/div[@class='']/ul")[0].append(E.li(article_length))

        # Make perex (subheading) start on a new line
        root.xpath("//h1")[0].append(E.br(''))

        return(BeautifulSoup(lxml.etree.tostring(root,encoding=unicode)))

TomasHnyk (sup) wrote on 2013-11-27:

Correction, the command line is this:
ebook-convert my_recipe.recipe output.mobi --test 1 --output-profile kindle --change-justification justify -vv --debug-pipeline debug

And this is on Calibre 1.10, Ubuntu 13.10.

Kovid Goyal (kovid) wrote on 2013-11-27: Re: calibre bug 1255541

The GUI does not apply conversion options (with a few specific
exceptions) to news downloads. If you want to apply a conversion option
unconditionally to the recipe, use conversion_options in the recipe
itself.
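
For illustration, the suggestion above might look roughly like the sketch below. It assumes that conversion_options is a dict keyed by the command-line option name with the leading dashes dropped and hyphens turned into underscores, so --change-justification justify becomes 'change_justification': 'justify'. The class name RespektJustified and the stand-in base class are hypothetical and exist only so the snippet runs outside calibre; in the reporter's recipe the attribute would simply be added to respektRecipe.

```python
# Sketch only: use calibre's real base class when available, otherwise fall
# back to a hypothetical stand-in so the snippet can be run on its own.
try:
    from calibre.web.feeds.recipes import BasicNewsRecipe
except ImportError:
    class BasicNewsRecipe(object):
        """Hypothetical stand-in for calibre's BasicNewsRecipe."""
        conversion_options = {}


class RespektJustified(BasicNewsRecipe):  # hypothetical name for illustration
    title = u'Respekt'
    language = 'cs'

    # Assumed mapping: same option as "--change-justification justify" on the
    # ebook-convert command line, applied whenever this recipe is converted,
    # including downloads started from the GUI.
    conversion_options = {
        'change_justification': 'justify',
    }
```

With this in place, the justification setting travels with the recipe instead of depending on the GUI preferences or on command-line flags.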

status invalid

Changed in calibre:
status:	New → Invalid
