Garbage in Date/Time

Bug #775269 reported by Edward Teo
This bug affects 1 person
Affects: RedNotebook
Status: Fix Released
Importance: Medium
Assigned to: Jendrik Seipp

Bug Description

I am new to RedNotebook. I think it's going to be a very useful app, if only the date/time would show properly.
I just cannot get the Date/Time to display properly.
Here is a screenshot of what I mean:
http://screencast.com/t/gLc4fauQ
(I think %A is not showing up properly... unicode problem?)

My OS is Windows XP SP3, Japanese version. Locale is Tokyo.

Revision history for this message
Edward Teo (vanquish-box) wrote :

Sorry, I forgot to give version info.

Version 1.1.4

Problem occurs in both installer and portable version.

Revision history for this message
Edward Teo (vanquish-box) wrote :

Bug #762942: Automatic input of time(Ctrl + D) is always 00:00

Just to add: I happened to see the above bug report. I am suffering from this as well.

Changed in rednotebook:
status: New → Confirmed
importance: Undecided → Medium
Revision history for this message
Jendrik Seipp (jendrikseipp) wrote :

This code fragment should reproduce the problem:

def format_date(format_string, date=None):
    if date is None:
        date = datetime.datetime.now()
    date_string = date.strftime(format_string)
    # Turn date into unicode string
    date_string = date_string.decode('utf-8', 'replace')
    return date_string

import locale
locale.setlocale(locale.LC_ALL, '')
print format_date("%A %B", date=datetime.date(2011, 3, 1))

Does anyone have an idea about how to fix this unicode issue?

Revision history for this message
Jendrik Seipp (jendrikseipp) wrote :

For the above code to work you have to import "datetime" first.

Revision history for this message
TLE (k-nielsen81) wrote :

This applies to a few Danish letters in weekend days as well (Lørdag, Søndag). They are not rendered correctly in the preferences dialog or in the automatically inserted weekdays in the exports, but they are rendered correctly in the preview and in the export content.

Since, as you know, I use RedNotebook as a log book at work, the obvious solution is of course for me to simply not work on weekends ;)

I can reproduce the problem with the code above and will have a look to see if I can figure it out.

PS: I'm assuming the export weekdays problem is related; if not, please let me know and I will file a separate report.

Revision history for this message
Jendrik Seipp (jendrikseipp) wrote : Re: [Bug 775269] Re: Garbage in Date/Time

On 05.05.2011 14:02, TLE wrote:
> I'll can reproduce the problem with the code above and will have a look
> at to see if I can figure it out.
That would be great!
> PS: I'm assuming the export weekdays problem is related, else please
> inform me and I will make a separate report
Yes, this is the same issue, no separate report is needed.

Revision history for this message
TLE (k-nielsen81) wrote :

Hmm, this seems to be a Windows-only problem; at least the code works fine for me on Linux. The locales get set differently, but I think they are supposed to, so that is probably not the problem.

on Windows:
import locale
locale.setlocale(locale.LC_ALL, '')
locale.getlocale()
('Danish_Denmark', '1252')

on Linux:
import locale
locale.setlocale(locale.LC_ALL, '')
locale.getlocale()
('da_DK', 'UTF8')
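
The effect of the two encodings can be illustrated without touching the system locale. The following is a minimal sketch, written so that it also runs on Python 3 (the code in this report is Python 2). It encodes the Danish weekday "Lørdag" by hand in both encodings and then decodes each result as UTF-8, the way the original format_date does. The two byte strings are only stand-ins for what strftime would return under each locale; that is an assumption made for illustration.

```python
# Stand-ins for what strftime might return under each locale (assumption).
weekday = u"L\u00f8rdag"  # "Lørdag"

windows_bytes = weekday.encode("cp1252")  # ø becomes the single byte 0xF8
linux_bytes = weekday.encode("utf-8")     # ø becomes the two bytes 0xC3 0xB8

# The original format_date always decodes as UTF-8 with errors='replace'.
from_windows = windows_bytes.decode("utf-8", "replace")
from_linux = linux_bytes.decode("utf-8", "replace")

# 0xF8 is not valid UTF-8, so it is replaced by U+FFFD (the "garbage").
print(repr(from_windows))
# The UTF-8 bytes round-trip cleanly, which is why Linux looked fine.
print(repr(from_linux))
```

Under these assumptions only the cp1252 bytes are damaged, which matches the observation that the problem shows up on Windows but not on Linux.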

Revision history for this message
TLE (k-nielsen81) wrote :

I believe this solves it:

def format_date(format_string, date=None):
    if date is None:
        date = datetime.datetime.now()
    date_string = date.strftime(format_string)
    # Turn date into unicode object
    date_string = date_string.decode(locale.getlocale()[1], 'replace')
    return date_string

import locale
import datetime
locale.setlocale(locale.LC_ALL, '')
print format_date("%A %B", date=datetime.date(2011, 5, 1))

All right, so I had a thought about what might be the problem and went to my local character set guru to talk it over. date.strftime returns a byte string in which the special characters are represented by one or more bytes according to the encoding set in the locale, but the string itself carries no information about that encoding. It is simply a byte string. What decode does when called is return that byte string as a unicode object, which is Python's internal, unambiguous string representation. Therefore the argument that decode needs is not the encoding you want out (the result is always a unicode object) but the encoding of the byte string, which decode cannot know on its own. The reason it worked before on Linux is that there the byte string returned from strftime just happened to be in UTF-8, since that is what the locale specifies (in most cases I guess, not sure about Chinese).

In any case, if what you want returned from the function is a unicode object, then the above will do the trick. If what you want is a UTF-8 encoded byte string, then you need to add .encode('UTF8') to the relevant line.
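
A slightly more defensive variant of the function above might look like the following sketch. It is not RedNotebook's actual code: the helper name to_unicode is hypothetical, and the fallback to locale.getpreferredencoding() is an assumption added here because locale.getlocale()[1] can be None (for example in the "C" locale). The isinstance check only exists so the sketch also runs on Python 3, where strftime already returns text.

```python
import datetime
import locale


def to_unicode(raw, encoding=None):
    """Decode a byte string using the encoding it was produced in.

    Hypothetical helper for illustration. If no encoding is given,
    fall back to the locale's encoding, then to UTF-8 (assumption).
    """
    if not isinstance(raw, bytes):
        return raw  # already text, e.g. strftime on Python 3
    if encoding is None:
        encoding = (locale.getlocale()[1]
                    or locale.getpreferredencoding()
                    or "utf-8")
    return raw.decode(encoding, "replace")


def format_date(format_string, date=None):
    if date is None:
        date = datetime.datetime.now()
    return to_unicode(date.strftime(format_string))


# Decoding with the *source* encoding recovers the character ...
text = to_unicode(b"L\xf8rdag", "cp1252")
# ... and encoding afterwards yields a UTF-8 byte string if one is needed.
utf8_bytes = text.encode("utf-8")
```

The last two lines mirror the two cases described above: decode to get a unicode object, then encode only if a UTF-8 byte string is what the caller wants.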

Regards Kenneth

PS: BTW, I tested on both Linux and Windows, and it works great.
PPS: I have not checked what the 'replace' argument does (but I can guess), so you will have to make sure yourself that it is still right.

Revision history for this message
Ask Hjorth Larsen (askhl) wrote :

What is the purpose of converting the date_string to unicode? strftime returns something locale-dependent, so that should be in the correct encoding already. By converting to unicode you are discarding that information, and when you print the unicode object, Python (or rather: the file object you are printing to) does not know anymore which encoding you want when it attempts to turn the unicode object back into bytes in order to print it. You either have to explicitly reconvert it back to the locale's encoding, *or* never convert to a unicode object in the first place.

Disclaimer: I think the above is true, but I didn't specifically test any of it. Some tinkering may be required.

Revision history for this message
Jendrik Seipp (jendrikseipp) wrote :

Awesome Kenneth, thank you very much for your help! I'll incorporate your fix as soon as time permits. This will make many Windows people happy I guess :)

@Ask: All internal text processing is done in unicode in RedNotebook, so the dates have to be in that format as well.

Changed in rednotebook:
assignee: nobody → Jendrik Seipp (jendrikseipp)
status: Confirmed → In Progress
Revision history for this message
Ask Hjorth Larsen (askhl) wrote :

Depending on what you do with the printed string, there can still be problems. Try running the example script when the output is piped into e.g. less or cat. It will complain, because stdout doesn't use the locale encoding unless you're printing to a tty (which can be very annoying).

For correct results, everything that you print should be explicitly encoded:

  print some_unicode_object.encode(locale_encoding)

or you can use the codecs module to create a special file object to print to.

(I don't know whether you necessarily need to print to a tty, or if you are in reality passing the strings along to e.g. gtk; in the latter case it may not be necessary to do explicit encoding)
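
The codecs approach mentioned above could be sketched as follows. This is only an illustration: make_encoding_writer is a hypothetical name, and an in-memory io.BytesIO stands in for the real output stream so that the result does not depend on the terminal. With a real stream one would presumably pass sys.stdout on Python 2 or sys.stdout.buffer on Python 3, and take the encoding from the locale instead of hard-coding it.

```python
import codecs
import io


def make_encoding_writer(byte_stream, encoding):
    """Wrap a byte stream so that unicode text written to it is
    encoded explicitly (hypothetical helper, for illustration only)."""
    return codecs.getwriter(encoding)(byte_stream, errors="replace")


# BytesIO stands in for stdout here (assumption).
buffer = io.BytesIO()
writer = make_encoding_writer(buffer, "cp1252")
writer.write(u"L\u00f8rdag\n")

# The bytes that would reach the pipe or file.
encoded = buffer.getvalue()
```

Because the writer encodes explicitly, the output no longer depends on whether the destination is a tty.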

Revision history for this message
TLE (k-nielsen81) wrote :

@Jendrik: No problem

@Ask: As far as I understand, the strings are used by a WebKit engine for the preview and the exports, and by some "save-simple-types-in-a-text-format" kind of module (I don't remember the name) for saving. So it probably isn't a problem.

Changed in rednotebook:
status: In Progress → Fix Committed
Revision history for this message
TLE (k-nielsen81) wrote :

Wuhu. I have helped with my first ever free software bug.

Changed in rednotebook:
status: Fix Committed → Fix Released