Festival doesn't understand UTF-8

Bug #872190 reported by jherazob
This bug affects 6 people
Affects: festival (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

I imagine that this is a job for upstream, but I am reporting it here anyway.

Both Debian and Ubuntu have defaulted to Unicode for years now, yet Festival still cannot process UTF-8 strings. This is a problem for non-English languages. We have to resort to hacks like piping text through iconv to convert it to ISO-8859-1 in order to use accents and the like.

I'm using Natty, but have seen the problem for many years. Current package version is 1:2.0.95~beta-5.1ubuntu2.

Expected: When accents are used, the text is spoken correctly.
Happened: When accents are used, the accented characters are replaced in the spoken text by gibberish and numbers.
Test procedure:
Prerequisite: the standard Spanish voice, package festvox-ellpc11k
Test command: echo "Esta es una prueba con texto en español y tildes, probémosla allá" | festival --language spanish --tts
Workaround: echo "Esta es una prueba con texto en español y tildes, probémosla allá" | iconv -f utf-8 -t iso-8859-1 | festival --language spanish --tts
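
For anyone who needs the workaround regularly, it can be wrapped in a couple of small shell functions. This is only a sketch of the same iconv hack, not a fix: the function names to_latin1 and speak_spanish are made up for illustration, and it assumes iconv and festival (with the Spanish voice) are installed. Characters that have no ISO-8859-1 equivalent will still make iconv fail.

```shell
# Hypothetical helper: convert UTF-8 text on stdin to ISO-8859-1 on stdout.
# iconv exits with an error if a character has no ISO-8859-1 equivalent.
to_latin1() {
    iconv -f UTF-8 -t ISO-8859-1
}

# Hypothetical wrapper: speak UTF-8 text from stdin with the Spanish voice,
# using the same pipeline as the workaround in this report.
speak_spanish() {
    to_latin1 | festival --language spanish --tts
}
```

Usage would then look like: echo "probémosla allá" | speak_spanish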

Launchpad Janitor (janitor) wrote:

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in festival (Ubuntu):
status: New → Confirmed