Signal vs. Noise

Localize This

12 Aug 2004 by Jason Fried

Care to share your experiences localizing and translating a web site or web-based application? What major obstacles did you encounter? What was harder and easier than expected? How did you handle adding new features to the site/app — did you work with the same translator you used the first time? If you worked with a team of translators, how did you ensure that the translation was consistent across the entire site? Did you run into any unanticipated cultural issues or clashes? How much does having a site/app in multiple languages slow down development and impede progress? Finally, any fundamental words of advice?

19 comments so far

12 Aug 2004 | Justin French said...

Whilst I can't speak on behalf of Dean Allen (the Developer), the localisation of Textpattern seems to be going OK. Since the first Gamma release there's been about 10 translations by the community, each one requiring 450+ strings to be translated. I'm sure some of them aren't perfect, but over time I'm sure they'll be refined by the community.

I guess it's easier to get "free" help from the community when the product is open source (like TXP), rather than closed like Basecamp (for example).

A quick read of the TXP forums Localisation section should give you some insight into what problems have been encountered.

12 Aug 2004 | Eric Martin said...

I've done a few site projects in multiple languages. In Montreal, many sites are required to be in both French and English. The latest one, for a private school, was simple. Finish the entire site in English, have it approved, and then the client starts to have everything translated.
The first step is to translate all section names and page names. That way, while we build out the second, French site, the client is gathering all the page content to translate. Then it's a simple copy and paste into all the waiting pages.
Not difficult at all.

12 Aug 2004 | Ben Stiglitz said...

Well, I can't exactly answer the question, but I have worked on localized software. The tough parts have been:


  • Words or concepts that don't translate well; see a discussion about PulpFiction at nslog.com for an example

  • Localizing images; everyone likes to rat out the stop sign, but that's only the very beginning

  • Updating localizations when software changes

  • Keeping text out of source; this is a huge deal, and very difficult to deal with in large projects. Have a system for organizing and storing localized strings from the very beginning.
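As a rough illustration of the last point, here is a minimal sketch of a string catalog kept apart from program logic. The catalog layout, the key names, and the `t` helper are all hypothetical; a real project would probably load each language table from its own file.

```python
# Hypothetical catalog: every user-visible string lives here, keyed by a stable name.
CATALOG = {
    "en": {"inbox.title": "Inbox", "inbox.empty": "You have no messages."},
    "fr": {"inbox.title": "Boîte de réception"},  # "inbox.empty" not translated yet
}
DEFAULT_LANG = "en"


def t(key, lang):
    """Return the string for `key` in `lang`, falling back to the default language."""
    table = CATALOG.get(lang, {})
    if key in table:
        return table[key]
    # A KeyError here means the key is missing in every language.
    return CATALOG[DEFAULT_LANG][key]


print(t("inbox.title", "fr"))  # translated string
print(t("inbox.empty", "fr"))  # falls back to the English string
```

Because the code only ever refers to keys, adding a language means adding a table rather than touching the source.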

12 Aug 2004 | CM Harrington said...

Localisation is really only difficult when you don't plan for it initially. If you think that perhaps one day you will need to be able to show content in multiple languages, it is important to design your application (or website) with that in mind. It is just another instance of abstracting the localisable strings from the rest of the application. For example, Apple does a pretty good job of delegation and abstraction in the Cocoa frameworks, so localisation is really quite easy to do (assuming you have good translators).

One thing to remember is that it isn't just headings and main body content. The rest of the world (read: non-US) uses different conventions for dates and for separators in numbers and currency (commas and periods), so all that stuff needs to be abstracted as well. For the most part, this can be placed into a simple environment variable that can be saved as a preference locally.
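A minimal sketch of that kind of abstraction, assuming a hand-written table of conventions for two locales. The `FORMATS` table and both helper names are made up for illustration; a production system would more likely rely on a maintained locale database.

```python
from datetime import date

# Hypothetical per-locale conventions, selected by a single locale setting.
FORMATS = {
    "en_US": {"thousands": ",", "decimal": ".", "date": "{m:02d}/{d:02d}/{y}"},
    "de_DE": {"thousands": ".", "decimal": ",", "date": "{d:02d}.{m:02d}.{y}"},
}


def format_number(value, locale_code):
    """Format a number with two decimals using the locale's separators."""
    conv = FORMATS[locale_code]
    text = f"{value:,.2f}"  # US-style separators first, e.g. "1,234.50"
    # Swap separators through a placeholder so the two replacements do not collide.
    return (
        text.replace(",", "\0")
        .replace(".", conv["decimal"])
        .replace("\0", conv["thousands"])
    )


def format_date(day, locale_code):
    """Format a date in the locale's day/month order."""
    return FORMATS[locale_code]["date"].format(y=day.year, m=day.month, d=day.day)


print(format_number(1234.5, "de_DE"))
print(format_date(date(2004, 8, 12), "de_DE"))
```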

I've also found that it is important to find a translator who understands the basic technical jargon in your native language as well as the equivalent for the localisation. Very often, the expected words for common things are not a literal translation.

In keeping with the above, note that you aren't really looking for a translation when porting an application; you are looking for a localisation. You want to create an experience that feels "native" regardless of the original application's origin. Think about how dreadful Word 6 for the Mac was (for those of you who don't know, it looked and operated like a Windows application, not like a Mac application). Would you, as someone who wants to help your customers, inflict that kind of pain on them?

12 Aug 2004 | Darrel said...

I've localized some desktop/handheld software products. We found that pro translation services really aren't worth the money (at least on our scale) and you're usually better off just finding a bilingual person who is in the market you are targeting. We found the biggest issue is usually industry-specific terminology. The second biggest issue was cultural differences (e.g., Spain Spanish vs. Mexico Spanish).

12 Aug 2004 | Brian said...

Darrel mentions Spain Spanish and Mexico Spanish ... there are some considerations with US - English, UK - English, AUS - English.

Kind of related - Engrish.com

12 Aug 2004 | John Kopanas said...

I am presently using gettext() in PHP for languages/localization. To be very honest I don't really like it because once you change the default language you have to review and re-compile all the secondary languages. The only problem is that I am not sure if I have any other solutions.

I would consider putting everything in a DB so that the secondary languages don't rely on the primary language. Or I can just create a script to automate the whole process of merging/reviewing and re-compiling the files that have the translations :-).
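One way such a script could decide what needs review is to store, next to each translation, a fingerprint of the source text it was made from. This is only a sketch under that assumption; the data layout and the `stale_keys` helper are hypothetical and not part of gettext.

```python
import hashlib


def fingerprint(text):
    """Short, stable fingerprint of a source string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def stale_keys(source, translations):
    """Return keys whose translation is missing or was made from older source text."""
    stale = []
    for key, text in source.items():
        entry = translations.get(key)
        if entry is None or entry["source_hash"] != fingerprint(text):
            stale.append(key)
    return sorted(stale)


# Primary-language strings as they are today.
source = {"greeting": "Welcome back!", "logout": "Sign out"}

# French was translated when the greeting still read "Welcome!".
french = {
    "greeting": {"text": "Bienvenue !", "source_hash": fingerprint("Welcome!")},
    "logout": {"text": "Se déconnecter", "source_hash": fingerprint("Sign out")},
}

print(stale_keys(source, french))  # only the changed string needs review
```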

12 Aug 2004 | Jes said...

A few additional trouble areas come to mind...

One thing that can be a real pain is dynamically generated text. For example, let's say you want to have the site report the number of unread messages. If the user has one message, you'd say, "you have one unread message"; if there are two or more you'd say something like, "you have two unread messages".

Translating this seemingly innocuous text can be a nightmare, since other languages have very different rules for handling plurals. Word order might change for one vs. many items. Some languages have different constructions for one, a few, and more than a few. That's why you see things like "Number of unread messages: 4" in software that needs to work in a lot of different languages (e.g., Office, WinXP). Although it's a bit inelegant, it can be translated directly. Otherwise, you'd need a little routine for each dynamically generated sentence to account for local language syntax.
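The "little routine" could look something like the sketch below: each language supplies a rule that maps a count to a plural category, plus one full sentence per category. The table layout and function names are hypothetical, the Russian phrase is shortened to "N messages" to keep the example small, and the Russian rule shown is the commonly used one/few/many rule for integers.

```python
def plural_en(n):
    """English: singular for exactly one, plural otherwise."""
    return "one" if n == 1 else "other"


def plural_ru(n):
    """Russian: three categories chosen by the last digits of the count."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if n % 10 in (2, 3, 4) and n % 100 not in (12, 13, 14):
        return "few"
    return "many"


# One complete sentence per plural category, never assembled from fragments.
MESSAGES = {
    "en": (plural_en, {
        "one": "You have {n} unread message",
        "other": "You have {n} unread messages",
    }),
    "ru": (plural_ru, {
        "one": "{n} сообщение",
        "few": "{n} сообщения",
        "many": "{n} сообщений",
    }),
}


def unread(n, lang):
    """Pick the sentence for this count and language, then fill in the number."""
    rule, forms = MESSAGES[lang]
    return forms[rule(n)].format(n=n)


for count in (1, 3, 11, 21):
    print(unread(count, "en"), "|", unread(count, "ru"))
```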

Another thing that can be tough is localizing to a language that doesn't read from left to right. Not only do you have to translate all the text, but your layout may work very poorly in a right-to-left reading language. As designers, we tend to take advantage of the fact that people read from left to right, top to bottom. As a quick test, mock up a mirror image of your layout and see how it feels.

Lastly, if you're trying to deploy to languages like Japanese, you have an additional set of problems. The hardest problem I had when localizing one of my applications for Japan was text sorting. Western languages sort lexically according to (relatively) straightforward rules ('a' comes before 'b', etc.). In contrast, Japanese should be sorted phonetically, and there is not a direct mapping from characters to syllables; characters change their pronunciation based on the characters that appear around them. If you want to display Japanese words in sorted order, the built-in sort routines in your language of choice aren't going to do it properly.
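One common workaround, sketched here as an assumption rather than as what was done in the application described above, is to store a phonetic reading in kana next to each name and sort on that instead of on the displayed characters. The record layout is hypothetical, and plain code-point order on hiragana only approximates dictionary order, so treat this as a starting point.

```python
# Each record carries the display name plus a reading supplied by a person
# (or a dictionary); the program does not try to derive it from the kanji.
places = [
    {"name": "東京", "reading": "とうきょう"},
    {"name": "大阪", "reading": "おおさか"},
    {"name": "京都", "reading": "きょうと"},
    {"name": "名古屋", "reading": "なごや"},
]

# Sorting on the name compares kanji code points, which has no phonetic meaning.
by_name = sorted(places, key=lambda p: p["name"])

# Sorting on the reading follows kana order much more closely.
by_reading = sorted(places, key=lambda p: p["reading"])

print([p["name"] for p in by_name])
print([p["name"] for p in by_reading])
```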

12 Aug 2004 | Jonathan M. Hollin said...

I have to agree with Darrel:

"pro translation services really aren't worth the money"

It's worth adding that the same applies to software translation systems, most of which are farcical.

At work, we have completed several multi-language projects and there are no real technical difficulties involved.

As for the translations themselves, the best thing to do is visit your local university and talk with people in the language departments there. These are the very best people to do translations: they understand the culture, dialect and grammar of the languages they study, and this makes a huge difference to the quality of the finished product.

For our software, we worked closely with the various language departments at the University of Leeds - we have never had any problems with the results.

12 Aug 2004 | JF said...

"If the user has one message, you'd say, 'you have one unread message'; if there are two or more you'd say something like, 'you have two unread messages'. Translating this seemingly innocuous text can be a nightmare, since other languages have very different rules for handling plurals. Word order might change for one vs. many items."

Yes yes! This is the stuff I love to hear about. These are the details that quickly double the scope of a project (assuming you want to do it right).

12 Aug 2004 | Aleksandar said...

I have translated FeedDemon to both Serbian Cyrillic and Serbian Latin, and it was a tough experience.

12 Aug 2004 | Adam M said...

The part we haven't been able to solve so far was finding a content management tool that allows convenient side-by-side language translation; management of localized images versus general-purpose images; convenient addition/editing/removal of pages; reuse of text snippets, and doesn't cost an arm and a leg.

By the way, we only recently became aware of how sensitive some people are to having address blocks written in their native format, e.g., do you place the zip code after the city or before the country; do you place the street address above the city above the country or the other way around, etc. Turns out each country has its own postal standard.
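A minimal sketch of handling this with one address template per country. The `ADDRESS_FORMATS` table, the field names, and the sample addresses are invented for illustration, and the two layouts shown are simplified versions of the US and German conventions.

```python
# Hypothetical per-country templates; the same fields are arranged differently.
ADDRESS_FORMATS = {
    "US": "{name}\n{street}\n{city}, {region} {postal_code}\n{country}",
    "DE": "{name}\n{street}\n{postal_code} {city}\n{country}",
}


def format_address(fields, country_code):
    """Lay out the address fields in the order the destination country expects."""
    # Fields a template does not mention (such as region for DE) are ignored.
    return ADDRESS_FORMATS[country_code].format(**fields)


berlin = {
    "name": "Erika Mustermann",
    "street": "Musterstraße 1",
    "city": "Berlin",
    "region": "",
    "postal_code": "10115",
    "country": "Germany",
}
chicago = {
    "name": "Jane Doe",
    "street": "123 Example St",
    "city": "Chicago",
    "region": "IL",
    "postal_code": "60601",
    "country": "USA",
}

print(format_address(berlin, "DE"))
print(format_address(chicago, "US"))
```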

12 Aug 2004 | justin said...

I happen to own one of those 'pro' companies, so I'll leave off any self-links. The most problematic issues are going to be the ones that are least likely to occur to the average English-speaking programmer. That said, treat web software the same way as desktop software during the process and watch for the following:

1. Use a separate file, library, or database to track and load all the strings. The less hard-coding text there is, the better.
2. Never, ever concatenate strings.
3. Translation always happens on the sentence level or paragraph level if pronouns are used.
4. Watch dates, numbers (many languages use commas where we would use decimal points), and punctuation. For example, German and French use different quotation marks, and French leaves a space before colons and some other punctuation.
5. Watch the differences between a) Latin American Spanish (which overlaps US Spanish) and European Spanish (sometimes referred to as Castilian) and b) Canadian French and European French. The people and cultures are entirely different.
6. If you're serious about doing a good job, use a professional agency that knows what they're doing. There are several good, solid agencies, like ours, that are Mac-based and love software and design. It doesn't have to be expensive but in any case, consider the benefit of a polished look. The term 'localization' derives from the implication that a product or service works for a local market. Mac people don't like unpolished Windows apps ported over to their side and the French feel the same way about English-based software.
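Points 2 and 3 can be illustrated with a small sketch: the translator receives a whole sentence with a named placeholder and is free to move the placeholder wherever the target language needs it. The `TEMPLATES` table and both functions are hypothetical.

```python
# Whole sentences with named placeholders; each language controls its own word order.
TEMPLATES = {
    "en": {"greeting": "Hello, {name}."},
    "ja": {"greeting": "{name}さん、こんにちは。"},
}


def greet(name, lang):
    """Fill the placeholder in the sentence chosen for this language."""
    return TEMPLATES[lang]["greeting"].format(name=name)


def greet_concatenated(name, hello_word):
    """Anti-pattern for contrast: the code, not the translator, fixes the word order."""
    return hello_word + ", " + name + "."


print(greet("Yuki", "en"))
print(greet("Yuki", "ja"))
print(greet_concatenated("Yuki", "こんにちは"))  # greeting first, which reads wrong here
```

With concatenation, only the word for "hello" can be translated; the position of the name stays wherever the programmer put it.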

13 Aug 2004 | mark rushworth said...

I'm currently translating WINEGLASS, a free groupware system I hope to use in place of Basecamp. It's in Italian, and the developers don't understand English (as was apparent from the Italian emails I received in reply to my requests), so I'm using Babel Fish to edit the visible display on a page-by-page basis...

its a pain

anyone care to help

http://www.fermentigrafici.it/wineglass/

mark

13 Aug 2004 | t said...

I did a fairly large localized PHP+MySQL website (English/Spanish/Portuguese, hundreds of database-backed pages, developed on and off for three years) and since I planned from day one to have a highly localized experience, it wasn't terribly difficult. Things that made the process easy:

1. A good template library. I wrote my own. Others should look into Smarty.

2. The templates were structured into one folder per language (templates_en, templates_es, templates_pt etc). A cookie (which fell back on a default) decided which folder was used when the template was called up.

3. Each template file contains all the text and some logic. This made it easy to take language-specific quirks into account. For instance:

[if messages == "0"]
You have no messages
[else]
You have [%messages%] messages.
[/if]

4. I created a pretty simple library of functions to deal with number formatting (and the stripping of formatting, because occasionally we needed to accept a number and turn it into something PHP can understand -- so in other words 1.234,00 became 1234.00), dates, money, accents, etc. These functions were called by the templates whenever data was sent or received.
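The "stripping of formatting" step in item 4 might look roughly like this. It is a sketch in Python rather than the PHP used on that site, and the separator table is an assumption based on the 1.234,00 example above.

```python
from decimal import Decimal

# (thousands separator, decimal mark) per site language; assumed conventions.
SEPARATORS = {
    "en": (",", "."),
    "es": (".", ","),
    "pt": (".", ","),
}


def parse_number(text, lang):
    """Turn a locally formatted number such as '1.234,00' into a Decimal."""
    thousands, decimal_mark = SEPARATORS[lang]
    # Drop grouping separators first, then normalise the decimal mark to '.'.
    normalized = text.strip().replace(thousands, "").replace(decimal_mark, ".")
    return Decimal(normalized)


print(parse_number("1.234,00", "es"))
print(parse_number("1,234.00", "en"))
```

Using `Decimal` instead of a float keeps money amounts exact, which matters if the value goes on to be stored or summed.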

Things that sucked:

1. We had a lot of problems finding good translators without dealing with a large, volume-oriented translating company. We ended up relying on friends. This is a stupid practice. They aren't precise.

2. Translating images was a drag (even if it was fairly easy to refer to the proper image from within the template). This was before IFR and friends, maybe it would be better now.

3. When we added functionality to a page, we would have to edit all three templates to make the corresponding visual changes. I didn't expect this to be a problem at first, but the site ended up constantly evolving and the main language would be way ahead of the other two. A smarter person could have structured the site so that different versions of each script were deployed when the associated template was updated and not before.

4. It was tempting to cheat and save time by embedding text in the actual PHP source and not in the template. This ALWAYS came back to bite me in the end and I eventually learned.

5. MySQL's behavior when inserting multilingual text can be kind of weird. It's good to experiment a bit to at least get an idea of what the results will be when you compare accented words in different languages and stuff like that. Overall, it wasn't as big a problem as I expected.

6. Some people tend to use ASCII codes instead of entity references, but enough scolding ended this practice.

13 Aug 2004 | Dan T. said...

If you know PHP/mySQL or ASP, just put your text blocks into database fields (1 row per language) and dynamically pull these blocks into your page as the visitor chooses a language from your language selection scheme.

You can even use "htmlarea" to let your translators do the job online.
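A minimal sketch of the database approach, using SQLite from Python so it runs without a server. The table layout (one row per block and language), the column names, and the fallback to a default language are assumptions, not details from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE text_blocks ("
    " block TEXT, lang TEXT, body TEXT,"
    " PRIMARY KEY (block, lang))"
)
conn.executemany(
    "INSERT INTO text_blocks VALUES (?, ?, ?)",
    [
        ("welcome", "en", "Welcome to our site."),
        ("welcome", "fr", "Bienvenue sur notre site."),
        ("contact", "en", "Contact us."),  # not translated yet
    ],
)


def text_block(block, lang, default_lang="en"):
    """Fetch a block in the visitor's language, or the default language if missing."""
    row = conn.execute(
        "SELECT body FROM text_blocks"
        " WHERE block = ? AND lang IN (?, ?)"
        " ORDER BY lang = ? DESC LIMIT 1",  # prefer the requested language
        (block, lang, default_lang, lang),
    ).fetchone()
    return row[0] if row else None


print(text_block("welcome", "fr"))
print(text_block("contact", "fr"))  # falls back to English
```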

13 Aug 2004 | Matt said...

In WordPress we used a variation of gettext() and it's been wonderful. Here are some things I learned along the way:

* First make sure you have all your encoding bits in order. UTF-8 is great but not a perfect solution. If you're going to be dealing with Asian characters you need to be more flexible.
* Any place in your code that you're using string functions, like substr, you probably need to rewrite or double-check.
* In PHP to do a good job you really need to use the multibyte functions, http://www.php.net/mb_string
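The string-function pitfall in the second and third points can be shown with a short sketch, here in Python rather than PHP. Cutting UTF-8 text at a byte offset can slice through a character, while cutting by characters does not. The sample title and helper names are made up.

```python
def truncate_bytes(text, limit):
    """Naive: cut the UTF-8 encoding at `limit` bytes, which can split a character."""
    return text.encode("utf-8")[:limit]


def truncate_chars(text, limit):
    """Cut by characters (code points), so no character is split in half."""
    return text[:limit]


title = "日本語のタイトル"  # each of these characters takes three bytes in UTF-8

chopped = truncate_bytes(title, 4)
try:
    chopped.decode("utf-8")
except UnicodeDecodeError:
    print("byte-based cut split a character")

print(truncate_chars(title, 4))
```

Cutting by code points is still not the whole story, since a combining accent can be separated from its base letter, but it avoids producing invalid byte sequences.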

Once you have the groundwork in place the actual work isn't too hard, just tedious. You need to go through every file and wrap every string that outputs in a gettext function. For convenience in WordPress we have __() return and _e() echo. Variable interpolation like described above ("Hello NAME", "NAME hello") is a cinch with gettext. There are tons of very robust tools for working with gettext strings on every platform that make it a nice experience for the translator. (String tips: Do your best to keep markup out of strings. Always enclose full phrases and sections, never break things up.)

16 Aug 2004 | Paul said...

Currently developing a multi-language system myself, one of the things I found I overlooked was "lookup" values (not the correct terminology, I know) stored in a database. For example, in one part of the application, we have a dropdown listing "Property Types" that contains items such as "House", "Flat", "Bungalow" and such. Representing these in different languages via the database was a simple task, but when you couple that with an administration tool that works in your local language it soon adds up to a lot of definitions of the same word in various languages. Example: native country names: if you're German, France is "Frankreich", but if you're Spanish it's "Francia".

Also, we've found it's touches like that that quickly become very important features.

20 Aug 2004 | Tomas Jogin said...

Let me know if you need someone to translate BaseCamp to Swedish.

Comments on this post are closed

 