No unique directory for exporting multiple pages written in Japanese

Bug #996172 reported by Takahiro Sumiya
This bug affects 2 people
Affects: Mahara
Status: Fix Released
Importance: High
Assigned to: Son Nguyen

Bug Description

When exporting multiple pages with the "Standalone HTML web site" option, Mahara tries to write all pages whose titles are written in Japanese to the same directory, named "-". As a result, only one page is exported successfully and all the others are lost.
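
A minimal sketch of how such a collision can arise, assuming the export derives directory names by replacing every run of non-ASCII-alphanumeric characters with a single "-". The function `ascii_slug` and the sample titles are hypothetical illustrations, not Mahara's actual code:

```python
import re

def ascii_slug(title: str) -> str:
    """Hypothetical sanitizer: collapse each run of characters outside
    A-Z, a-z, 0-9 into a single '-'."""
    return re.sub(r"[^A-Za-z0-9]+", "-", title)

# Sample page titles (made up for illustration).
titles = ["自己紹介", "研究計画", "My profile"]
slugs = [ascii_slug(t) for t in titles]

# Both Japanese titles contain no ASCII alphanumerics, so each collapses
# to the same name and the second page overwrites the first.
for title, slug in zip(titles, slugs):
    print(f"{title!r} -> {slug!r}")
```

Under this assumption, any title written entirely in a non-Latin script maps to the same directory name.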

tags: added: translations
Changed in mahara:
status: New → Triaged
importance: Undecided → Medium
Richard Mansfield (richard-mansfield) wrote :

I think this should really be high priority; it sounds like it makes HTML export useless in Japanese and probably lots of other languages.

Changed in mahara:
importance: Medium → High
milestone: none → 1.6.0
Son Nguyen (ngson2000) wrote :

Richard, you are right. The same problem occurs with Vietnamese.

I suggest 2 solutions:
1. Translate Unicode string to ASCII valid filename; or
2. Enable Unicode filename for Mahara export function

Son Nguyen (ngson2000)
Changed in mahara:
assignee: nobody → Son Nguyen (ngson2000)
status: Triaged → In Progress
Takahiro Sumiya (sumi-2) wrote :

 > I suggest 2 solutions:
 > 1. Translate Unicode string to ASCII valid filename; or

URL encoding (http://www.w3schools.com/tags/ref_urlencode.asp) is commonly used to translate Unicode to ASCII.
But the resulting strings are sometimes too long to handle as file/directory names.

So, in my opinion, a simple naming scheme that uses the page ID is a better solution.

For example, translate

  http://maharahost/view/view.php?id=503

into

  503.html
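
As a rough sketch of the trade-off described here: percent-encoding a UTF-8 title turns each byte into three ASCII characters, while an ID-based name stays short. The sample title and the helper `id_filename` are assumptions made for illustration only:

```python
from urllib.parse import quote

def id_filename(view_id: int) -> str:
    """Hypothetical ID-based naming scheme: 503 -> '503.html'."""
    return f"{view_id}.html"

title = "自己紹介"               # sample title, 4 characters
encoded = quote(title)           # percent-encodes the UTF-8 bytes

# Each of these characters is 3 bytes in UTF-8, and each byte becomes
# a 3-character "%XX" escape, so the name grows ninefold.
print(len(title), len(title.encode("utf-8")), len(encoded))
print(id_filename(503))
```

A long Japanese title can therefore exceed common file name limits once encoded, whereas the ID-based name does not depend on the title at all.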

 > 2. Enable Unicode filename for Mahara export function

This sounds good, but the Japanese version of Windows cannot handle Unicode file names in ZIP files.
(It's OK in Japanese version of Mac OS X ;-)
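
One plausible explanation, offered as an assumption rather than a confirmed diagnosis: a ZIP entry with a non-ASCII name is stored as UTF-8 with a flag bit set, and an extraction tool that ignores that flag and assumes the local code page (CP932 on Japanese Windows) would show a garbled name. A sketch using Python's standard `zipfile` module, with a made-up entry name:

```python
import io
import zipfile

name = "自己紹介/index.html"  # sample entry name for illustration

# Write a ZIP archive in memory with one entry that has a Japanese name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr(name, "<html></html>")

# Read it back and inspect the general purpose flags of the entry.
with zipfile.ZipFile(buf) as zf:
    info = zf.infolist()[0]

# Bit 11 (0x800) marks the file name as UTF-8 encoded.
utf8_flag_set = bool(info.flag_bits & 0x800)

# What a tool would display if it ignored the flag and decoded the
# UTF-8 bytes as CP932 instead (assumed behaviour of a legacy tool).
garbled = name.encode("utf-8").decode("cp932", errors="replace")

print(utf8_flag_set)
print(garbled)
```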

Son Nguyen (ngson2000) wrote :

Hi Takahiro;

Using the view ID is a good solution. However, we may lose the meaning of the file and directory names. This may cause difficulties for users when navigating and editing these pages.

I suggest we add one more editable field called "short name" for each page/view and use it for exporting and for search engines.
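
A minimal sketch of how such a field might be used, assuming the short name is optional and restricted to ASCII letters, digits, "_" and "-". The function `directory_name` and the choice to prefix the page ID (so that two pages with the same short name cannot collide) are assumptions, not part of the proposal:

```python
import re
from typing import Optional

# Assumed rule for a valid short name: ASCII only, starts with a letter or digit.
SHORT_NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")

def directory_name(view_id: int, short_name: Optional[str]) -> str:
    """Prefer a readable '<id>-<short name>'; fall back to the bare page ID."""
    if short_name and SHORT_NAME_PATTERN.fullmatch(short_name):
        return f"{view_id}-{short_name}"
    return str(view_id)

print(directory_name(503, "research-plan"))
print(directory_name(503, None))
print(directory_name(503, "研究計画"))
```

With a fallback like this, pages without a short name still export to distinct directories.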

Son Nguyen (ngson2000)
Changed in mahara:
status: In Progress → Opinion
Richard Mansfield (richard-mansfield) wrote :

Son, this should really be left on Triaged or Confirmed; see https://wiki.mahara.org/index.php/Developer_Area/Bug_Status

"Opinion" should only be used when opinion is divided on whether the bug should be fixed, but in this case we all agree it needs to be fixed, we're just not quite sure how to fix it yet.

Changed in mahara:
status: Opinion → Triaged
Son Nguyen (ngson2000) wrote :

I added a patch that uses the page ID as the directory name:

https://reviews.mahara.org/#/c/1232/

Takahiro Sumiya (sumi-2) wrote :

Thank you Son,

I have tried your patches on our Mahara, and it works fine. Thank you very much!

But I found another (related) problem. If the exported page has attached files with Japanese filenames, they are exported with their original (UTF-8) file names. As a result, on Windows, we cannot open the attached files from index.html. (On Mac OS X, we can open them.) ...Sigh...

http://sumi.riise.hiroshima-u.ac.jp/skitch/mahara_export-20120526-200204.png

How about in Vietnamese?

Kristina Hoeppner (kris-hoeppner) wrote :

Hi Takahiro,

The patch is currently in code review and not yet part of Mahara. ;-) If you want to help, you can review it at the URL that Son provided, either by testing it or even by performing a code review. In order to be merged into Mahara, it will need to be approved by a reviewer who has approval status.

Cheers
Kristina

Takahiro Sumiya (sumi-2) wrote :

Hello Kristina, I know it's not just for me :-)

Do you mean that we can register at reviews.mahara.org just as testers? I'll try to register later. Thank you.

Kristina Hoeppner (kris-hoeppner) wrote :

Hello Takahiro,

Yes, you can register and become a tester. That would be awesome. I am not a developer and thus don't do code reviews, but I "verify" those patches that can be checked from the front end. It helps because then we also know whether the functionality actually works as a user might expect.

Cheers
Kristina

Son Nguyen (ngson2000) wrote :

It seems to work for Vietnamese filenames.
However, we still need to solve this problem by adding a new field, "ASCII short name", for the page title as mentioned above.

Any other ideas?

Changed in mahara:
status: Triaged → In Progress
Son Nguyen (ngson2000) wrote :

Enable Unicode filename for exporting HTML - https://reviews.mahara.org/#/c/1418/

Tested with Vietnamese, Japanese, Chinese, Thai, and Arabic. Worked well with Firefox 3 and Chrome.

Mahara Bot (dev-mahara) wrote : A change has been merged

Reviewed: https://reviews.mahara.org/1418
Committed: http://gitorious.org/mahara/mahara/commit/bc66d5c11e37c89dfd33c24e0009907e02139c5d
Submitter: Hugh Davenport (<email address hidden>)
Branch: master

commit bc66d5c11e37c89dfd33c24e0009907e02139c5d
Author: Son Nguyen <email address hidden>
Date: Tue Jul 31 11:47:11 2012 +1200

    Enable Unicode filename for exporting HTML
    (Bug #996172)

    + Using utf-8 string as file/folder name:
    + Setting the maximum of file/folder name length to 80. (Max filename
    length on many OS is limited to 256 bytes)
    + Percent encoding URL. It is a non human-readable string. However, the
    modern web browsers (Firefox3, Chrome, >IE7) will automatically display
    it into human readable URL.

    Change-Id: Ibb0d4e5d5fbe01bc49b5ad9ba68e4bc483938016
    Signed-off-by: Son Nguyen <email address hidden>
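
The approach described in this commit message could be sketched roughly as follows. This is not the committed code: the helper names, the set of characters treated as unsafe, and counting the limit of 80 in characters rather than bytes are all assumptions:

```python
import re
from urllib.parse import quote

MAX_NAME_LENGTH = 80  # limit from the commit message; counted in characters here (assumption)

def export_name(title: str, fallback: str) -> str:
    """Keep the Unicode title as the file/folder name, minus unsafe characters."""
    # Replace path separators, reserved punctuation and control characters (assumed set).
    name = re.sub(r'[\\/:*?"<>|\x00-\x1f]+', "-", title).strip(" .-")
    # Truncate, and fall back (for example to the page ID) if nothing is left.
    return name[:MAX_NAME_LENGTH] or fallback

def href_for(name: str) -> str:
    """Percent-encode the folder name for use in a link inside the exported HTML."""
    return quote(name) + "/index.html"

folder = export_name("自己紹介", fallback="503")
print(folder)            # name written to disk stays human-readable
print(href_for(folder))  # link is ASCII-only; browsers display it decoded
```

With 3-byte UTF-8 characters, 80 characters occupy at most 240 bytes, which stays under the limit the commit message cites.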

Changed in mahara:
status: In Progress → Fix Committed
Hugh Davenport (hugh-davenport) wrote :

https://reviews.mahara.org/#/c/1418/ <- 1.6 version (will get pushed when 1.6 branch is pushed)

Mahara Bot (dev-mahara) wrote :

Reviewed: https://reviews.mahara.org/1495
Committed: http://gitorious.org/mahara/mahara/commit/8ee84e2a09f28b5cbd0acbba6ff6f65568ffdae2
Submitter: Hugh Davenport (<email address hidden>)
Branch: 1.6_STABLE

commit 8ee84e2a09f28b5cbd0acbba6ff6f65568ffdae2
Author: Son Nguyen <email address hidden>
Date: Tue Jul 31 11:47:11 2012 +1200

    Enable Unicode filename for exporting HTML
    (Bug #996172)

    + Using utf-8 string as file/folder name:
    + Setting the maximum of file/folder name length to 80. (Max filename
    length on many OS is limited to 256 bytes)
    + Percent encoding URL. It is a non human-readable string. However, the
    modern web browsers (Firefox3, Chrome, >IE7) will automatically display
    it into human readable URL.

    Change-Id: Ibb0d4e5d5fbe01bc49b5ad9ba68e4bc483938016
    Signed-off-by: Son Nguyen <email address hidden>

Hugh Davenport (hugh-davenport) wrote :

 status fixreleased

Changed in mahara:
status: Fix Committed → Fix Released