Translation Nightmare

Posted by Ross Burton on October 1, 2008

I just got a new bug titled "Very weird translation template, need comments in .pot file to clarify", and giggled to myself. I was wondering how long it would take for this bug to be filed. The problem is that whilst most of the translatable strings in Tasks are pretty boring ("Tasks", "today", "Priority" and so on), all of a sudden the template goes a bit mental:

"^(?<task>.+) (?:by|due|on)? (?<month>\\w+) (?<day>\\d{1,2})(?:st|nd|rd|th)?$"

Apparently the average translator doesn't think that learning PCRE-style regular expressions, and reading the source that uses this string to understand how it is to be used, is appropriate. [note: this is sarcasm]

Maybe I should have added some translator comments to clarify exactly what I meant by this. These monster strings (all in koto-date-parser.c) are GRegex regular expressions which are used to parse the user's input to try and extract meaningful date information. To translate these strings you'll need to have a basic understanding of regular expressions: if you don't then skip them and hopefully someone who does will finish the translation. If you know regular expressions then translating these strings is easy, honest.

The golden rule is to never translate the words which look like this: (?<foo>. These are markers which identify portions of the input (such as task or month) and need to remain in English, although they can be moved around if required. The rest of the strings are translatable. I'll give an example using the French translation by Stéphane Raimbault. First, the string in English and a worked example:

"^(?<task>.+) (?:by|due|on)? (?<day>\\d{1,2})(?:st|nd|rd|th)? (?<month>\\w+)$"

First, we have a sequence of any characters identified as task, which magically expands to be as many as possible. This is optionally followed by one of the words "by", "due" or "on". This is followed by one or two digits identified as day, optionally followed by "st", "nd", "rd" or "th". Finally comes a sequence of word characters which is identified as month. If the user had entered "pay bills on 2nd june" then task would be "pay bills", day would be "2", and month would be "june". Tasks can then turn "june" into a month number through other translations, and it now knows what date the user entered. In French, this translates as follows:

"^(?<task>.+) (?:pour|prévu|pour le)? (?<day>\\d{1,2})(?:er|e)? (?<month>\\w+)$"
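If you want to try the two strings above without building Tasks, here is a rough sketch in Python rather than the GRegex C code Tasks actually uses. Two things are assumptions of mine: Python's re module spells named groups (?P<name>...) instead of (?<name>...), so the hypothetical to_python helper converts them, and the French input sentence is made up for illustration. The doubled backslashes in the .pot strings are C escaping, so the raw strings below use single ones.

```python
import re

# The patterns from the post, written as raw strings (no C-style escaping).
ENGLISH = r"^(?<task>.+) (?:by|due|on)? (?<day>\d{1,2})(?:st|nd|rd|th)? (?<month>\w+)$"
FRENCH = r"^(?<task>.+) (?:pour|prévu|pour le)? (?<day>\d{1,2})(?:er|e)? (?<month>\w+)$"


def to_python(pattern):
    """Rewrite PCRE-style (?<name>...) groups as Python's (?P<name>...).

    The negative lookahead leaves lookbehinds, (?<=...) and (?<!...), alone.
    """
    return re.sub(r"\(\?<(?![=!])", "(?P<", pattern)


def parse(pattern, text):
    """Return the named groups as a dict, or None if the text does not match."""
    match = re.match(to_python(pattern), text)
    return match.groupdict() if match else None


print(parse(ENGLISH, "pay bills on 2nd june"))
# {'task': 'pay bills', 'day': '2', 'month': 'june'}

# Hypothetical French input, not taken from the translation itself.
print(parse(FRENCH, "payer les factures pour le 1er juin"))
# {'task': 'payer les factures', 'day': '1', 'month': 'juin'}
```

Note that the group names come out the same in both languages, which is exactly why they must stay in English: the code looks them up by name.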

See, I said it was easy! All I need now is a legion of translators who understand regular expressions well enough to correctly translate the new Tasks... [this, again, is sarcasm] Luckily, plans are afoot to move the Tasks source to the GNOME Subversion server, so the full fury of the GNOME translation team can attack this.
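For translators who would like a safety net, the golden rule can be checked mechanically. The following is only a sketch of the idea in Python, not something Tasks or the GNOME translation tools provide: the names group_names and missing_groups are hypothetical, and the broken translation is invented to show what a failure looks like.

```python
import re

ENGLISH = r"^(?<task>.+) (?:by|due|on)? (?<day>\d{1,2})(?:st|nd|rd|th)? (?<month>\w+)$"
FRENCH = r"^(?<task>.+) (?:pour|prévu|pour le)? (?<day>\d{1,2})(?:er|e)? (?<month>\w+)$"

# An invented mistake: the translator also translated the marker name.
BROKEN = FRENCH.replace("(?<month>", "(?<mois>")


def group_names(pattern):
    """Collect the names used in (?<name>...) markers.

    Lookbehinds such as (?<=...) are skipped because '=' and '!' are not
    word characters.
    """
    return set(re.findall(r"\(\?<(\w+)>", pattern))


def missing_groups(original, translated):
    """Names present in the original pattern but absent from the translation."""
    return sorted(group_names(original) - group_names(translated))


print(missing_groups(ENGLISH, FRENCH))  # []
print(missing_groups(ENGLISH, BROKEN))  # ['month']
```

An empty list means every marker survived translation; it says nothing about whether the translated words themselves are right.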

NP: Trailer Park, Beth Orton

tags: tasks, tech