Convert newlines \r\n to \n

Bug #709 reported by Brian Sutherland
Affects: Launchpad itself
Status: Fix Released
Importance: Medium
Assigned to: Dafydd Harries

Bug Description

Translations added through the web end up with newlines like \r\n rather than \n. Could rosetta please convert these newlines.

An excerpt from our IRC logs:

jinty: "Segundo grupo, alguna descripción, este texto será descartado\r\n"
jinty: for the above translation
mgedmin: where do those \r's come from?
mgedmin: rosetta?
jinty: the file was directly downloaded from rosetta
mgedmin: browsers do send CR LF characters when you add a newline to a web form
srichter: maybe someone on windows edited the file
mgedmin: I think rosetta should convert newlines to just \n

Revision history for this message
Brian Sutherland (jinty) wrote :

This causes some translations to break msgfmt. The following translation was downloaded from rosetta:

#: src/schoolbell/app/main.py:470
msgid ""
"\n"
"Perhaps another %s instance is using it?"
msgstr ""
"^M\n"
"Może inna instacja %s używja jej?"

and failed with:

src/schoolbell/app/locales/pl/LC_MESSAGES/schoolbell.po:1766: `msgid' and `msgstr' entries do not both begin with '\n'
src/schoolbell/app/locales/pl/LC_MESSAGES/schoolbell.po:1774: `msgid' and `msgstr' entries do not both begin with '\n'
src/schoolbell/app/locales/pl/LC_MESSAGES/schoolbell.po:1790: `msgid' and `msgstr' entries do not both begin with '\n'
msgfmt: found 3 fatal errors
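
To illustrate why the entry above is rejected, here is a minimal sketch of the kind of consistency check the error message describes. It is not msgfmt's actual code; the function name newline_edges_match is hypothetical, and the assumption is that the check compares whether msgid and msgstr both begin (and both end) with '\n'. The "^M" in the msgstr is a carriage return, so the msgstr begins with '\r' rather than '\n'.

```python
def newline_edges_match(msgid: str, msgstr: str) -> bool:
    """Return True when msgid and msgstr agree on leading and trailing '\\n'.

    Hypothetical helper that mimics the check named in the error message;
    it is not taken from gettext.
    """
    same_start = msgid.startswith("\n") == msgstr.startswith("\n")
    same_end = msgid.endswith("\n") == msgstr.endswith("\n")
    return same_start and same_end


# The strings from the PO excerpt above; "^M" is written as "\r".
msgid = "\nPerhaps another %s instance is using it?"
broken_msgstr = "\r\nMoże inna instacja %s używja jej?"

# The msgstr starts with "\r", not "\n", so the edges do not match.
broken_ok = newline_edges_match(msgid, broken_msgstr)

# Dropping the carriage return makes the entry consistent again.
fixed_ok = newline_edges_match(msgid, broken_msgstr.replace("\r", ""))
```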

Carlos Perelló Marín (carlos) wrote :

Now, every time we see '\r\n' or '\r' we change it to '\n'.

Current data will be migrated (when possible). It will be applied to launchpad.ubuntu.com next Monday (perhaps earlier, it depends on our QA process).
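
The conversion described in this comment can be sketched roughly as follows. This is only an illustration under the stated rule ('\r\n' or '\r' becomes '\n'), not the code that went into Rosetta; the name normalize_newlines is an assumption.

```python
import re

# Matches a CR optionally followed by LF, so a CRLF pair is consumed as
# one unit and does not turn into two newlines.
_CR_NEWLINE = re.compile(r"\r\n?")


def normalize_newlines(text: str) -> str:
    """Replace every '\\r\\n' pair and every lone '\\r' with '\\n'."""
    return _CR_NEWLINE.sub("\n", text)


# A translation as a browser on Windows might submit it.
submitted = "Segundo grupo\r\nalguna descripción\r\n"
cleaned = normalize_newlines(submitted)
```

Matching the pair before the lone '\r' matters: replacing '\r' with '\n' first would turn each '\r\n' into two newlines.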

Changed in rosetta:
assignee: nobody → daf
status: New → Fixed
Christian Reis (kiko) wrote :

Carlos, are we going to schedule this migration anytime soon?

Carlos Perelló Marín (carlos) wrote :

It was done at the same time we applied the patch.

I said 'when possible' because there could be situations where the migration is not possible but, from what daf told me, the migration was possible for all the data we have on our server.

Sorry if I was not clear enough.

cheers

Kamil Páral (kamil.paral) wrote :

I am reopening this bug because the fix is no longer working. Yesterday one translator of my project made some translations, and after I downloaded them, they contained many '\r' characters. I asked him and he said he used only the web interface, with Internet Explorer on Windows Vista.

I think all the received forms should be automatically stripped of all '\r' characters. This is clearly not the case now.

Changed in rosetta:
status: Fix Released → New
Данило Шеган (danilo) wrote :

The bug you are looking for is most likely #88831 or #61096 (which describe the new problem; we should keep the old one closed).

Changed in rosetta:
status: New → Fix Released
Kamil Páral (kamil.paral) wrote :

No, Данило, the bugs you mentioned concern a different thing. I reopened this bug because this specific problem appeared again. But I can report a new bug if you prefer.

The aforementioned bugs concern the case where '\r' is contained in the msgid. But my POT file does not contain *any* '\r' characters in msgid, and despite that Rosetta puts '\r' into msgstr when translators use Internet Explorer on Windows (it sends CRLF instead of just LF and Rosetta does not have the necessary checks to fix it). That is a different issue from the other bugs. It is this bug, fixed in 2005 and now broken again.

Changed in rosetta:
status: Fix Released → New
Данило Шеган (danilo) wrote :

Ok Kamil, sorry for misunderstanding you. It would help if you provided more details: which project it is, which translation, which translator and, if possible, exactly which messages are affected. That will help us a lot in tracking down the problem.

Kamil Páral (kamil.paral) wrote :

Ok, it's in my project Esmska:
https://launchpad.net/esmska

And here are the exact translations containing CR characters:
https://translations.launchpad.net/esmska/trunk/+pots/esmska/sk/26/+translate
https://translations.launchpad.net/esmska/trunk/+pots/esmska/he/250/+translate

I don't know about Hebrew, but the Slovak translation was created in Internet Explorer 7 in Windows Vista.

Both translations have linefeeds where there shouldn't be any (the translators' fault), but that's another kind of problem. More important are the CR characters.

TomasKovacik (nail-nodomain) wrote : Re: [Bug 709] Re: Convert newlines \r\n to \n

Heh, good thing I left you a sample :) Obviously this one little bugger
slipped past me when I was fixing those "unwanted line breaks ..... ".
Send me a list of where the line breaks are wrong, or a way to find them,
and I'll fix them.

t.

TomasKovacik (nail-nodomain) wrote :

oops, sorry, this comment was meant for kamil only :)
delete it. thx
sorry

t.

Данило Шеган (danilo) wrote :

Kamil, we've made some changes to how we export carriage returns. Can you check to see if that helped with your problem?

Changed in rosetta:
status: New → Incomplete
Kamil Páral (kamil.paral) wrote :

Yes, it seems to have helped. Downloaded translations do not contain any CR characters. That is of course not proof, but it seems to be solved.
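
For anyone who wants to repeat this check on a downloaded PO file, a minimal sketch follows. The helper name find_cr_lines is hypothetical. It flags both raw CR bytes and the escaped two-character sequence \r inside strings, and it is deliberately naive: an escaped backslash followed by the letter r would also be flagged.

```python
from typing import List


def find_cr_lines(po_text: str) -> List[int]:
    """Return 1-based numbers of lines that contain a carriage return.

    Hypothetical helper: looks for a raw CR character or the escaped
    sequence backslash + 'r' as it would appear inside a PO string.
    """
    hits = []
    # Split on LF only, so a raw CR stays attached to its line.
    for number, line in enumerate(po_text.split("\n"), start=1):
        if "\r" in line or "\\r" in line:
            hits.append(number)
    return hits


# Small in-memory sample; in practice po_text would be read from the file
# opened with newline="" so that Python does not translate line endings.
sample = 'msgid ""\n"\\n"\nmsgstr ""\n"\r\\n"\n"ok\\r\\n"\n'
cr_lines = find_cr_lines(sample)
```

An empty result means no carriage returns were found in that file, which, as noted above, is evidence rather than proof that the problem is gone.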

Changed in rosetta:
status: Incomplete → Fix Released