This short article summarises the main issues involved in managing multilingual websites.
There are seven key issues to consider:
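- Translation
- Global and local content
- Cultural differences
- Feedback and interactivity
- Design and navigation
- Translation workflow
- Non-Latin character sets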
Translation is essential to the running of a multilingual website and requires either qualified personnel or an external translation service. Proofreading of translated copy is also often required.
Machine translation should be considered with extreme caution, but it may be a plausible alternative for infrequently accessed pages containing non-essential content. In this case, short, unambiguously structured sentences and the avoidance of idiomatic phrases are essential, and sub-editing is likely to be necessary.
On any site with significant translation requirements, translation costs are likely to dwarf all other running costs.
A multilingual website is usually a mixture of global and local content. Local content presents no particular content management issues; global content, which has to be translated for every language locale, does.
Deciding where multiple language versions of content are going to be required and where content can be maintained separately for different locales is a critical decision that will affect how a site should be maintained and what it will cost.
Differences in language are only part of what distinguishes different locales. Graphical conventions, matters of taste, sense of humour, socially acceptable forms of address and issues of privacy all vary from place to place.
Also, some important concepts (think of Home Office or maternity pay) may have no useful meaning if translated literally.
Responses to website feedback need to be written in the language of the original message. User feedback should not be solicited in a language unless it can be routed to a suitably qualified person who can answer in that language.
Scripts that handle interactivity, such as discussion forums, search results and feedback forms, will also need to be configured appropriately.
Perhaps the most common, and most easily overlooked, difficulty in developing multilingual websites is maintaining a consistent design across the different language versions of a site, particularly in the layout of navigation: text or graphic labels that fit the design constraints in one language may not work well in translation.
The only sensible way to tackle this issue is to ensure that the initial design brief for a site includes all language variations of site branding and of the major navigational elements.
Also, links between pages should not lead unsuspecting users from one language locale into another.
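A check of this kind can be automated. The following is a minimal sketch in Python, assuming (hypothetically) that each locale lives under its own top-level path such as /en/ or /fr/ and that links are site-relative; the locale codes and function names are illustrative only.

```python
from urllib.parse import urlparse

# Assumption: each locale has its own top-level path segment (/en/, /fr/, /ar/).
LOCALES = {"en", "fr", "ar"}


def locale_of(path):
    """Return the locale prefix of a site-relative path, or None if it has none."""
    segments = urlparse(path).path.strip("/").split("/")
    first = segments[0]
    return first if first in LOCALES else None


def cross_locale_links(page_path, links, allowed=()):
    """List the links on a page that lead into a different locale.

    `allowed` holds links that are meant to cross locales, for example the
    entries of an explicit language switcher.
    """
    page_locale = locale_of(page_path)
    flagged = []
    for link in links:
        target = locale_of(link)
        if target is not None and target != page_locale and link not in allowed:
            flagged.append(link)
    return flagged
```

Calling `cross_locale_links("/fr/index", links)` on the links extracted from a French page returns only those that point into another locale; shared resources with no locale prefix are ignored.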
Simple workflow mechanisms usually offer some kind of notification when some action is performed on a page or when the page moves from one state to another.
Translation workflow, on the other hand, requires that changes to a page trigger appropriate notification of required changes to the other language versions of that page.
In addition, it is usually helpful to have some mechanism for identifying which elements within the page have changed.
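As a rough illustration of both requirements, the sketch below compares two versions of a source page, works out which elements changed, and produces one notification per other locale. It assumes (hypothetically) that a page is stored as a dictionary of named text elements; the function names and the task format are invented for the example and do not describe any particular content management system.

```python
def changed_elements(old, new):
    """Names of elements that were added, removed, or edited between versions."""
    return sorted(
        name
        for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    )


def translation_tasks(source_locale, locales, old, new):
    """Build one notification per target locale listing what needs retranslating."""
    changed = changed_elements(old, new)
    if not changed:
        return []  # nothing changed, so nobody needs to be notified
    return [
        {"locale": locale, "elements": changed}
        for locale in locales
        if locale != source_locale
    ]
```

Each returned task could then be passed to whatever notification mechanism the site already uses, so that translators see only the elements that actually need attention.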
Non-Latin character sets
Creating and rendering text in non-Latin scripts still presents some interesting challenges, although modern browsers support these scripts far better than they once did.
Unicode has now emerged as a recognised (and growing) international standard that includes most non-Latin characters and makes storage and retrieval of non-Latin text in distributed environments (such as the web) much easier.
Unicode support for some character sets (such as Bengali) is still not universal, so the use of legacy character sets may occasionally be necessary (at least in the short term), but, ideally, website content should be stored and edited as Unicode.
In addition, website pages should be published with an appropriate character set declaration (either UTF-8 or a language-specific character set such as windows-1256, a legacy Arabic encoding whose characters all exist in Unicode) and with language declarations (the lang attribute or language META tags); any characters that are not in the publishing character set should be published as HTML character references; and text direction should be declared (for example with the dir attribute) where appropriate.
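The publishing step might look something like the following Python sketch. It relies on the standard `xmlcharrefreplace` error handler, which replaces any character missing from the target character set with a numeric character reference. The `render_page` function and its parameters are assumptions made for this example.

```python
import html


def render_page(title, body_text, lang, direction="ltr", charset="utf-8"):
    """Return an HTML page as bytes in the requested character set.

    The language, direction, and character set are declared in the markup.
    Characters that the character set cannot represent are written as
    numeric character references instead of raising an error.
    """
    page = (
        "<!DOCTYPE html>\n"
        f'<html lang="{lang}" dir="{direction}">\n'
        f'<head><meta charset="{charset}">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body><p>{html.escape(body_text)}</p></body>\n"
        "</html>\n"
    )
    return page.encode(charset, errors="xmlcharrefreplace")
```

With `charset="utf-8"` nothing needs to be replaced. With `charset="windows-1256"`, Arabic text is encoded directly, while a character outside that set, such as a Bengali letter, is written as a character reference.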
It should also be possible to mix languages on a single page, allowing links to other language versions of the page to be handled simply.
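One way to handle such links is to mark each one with its own language and direction, so that a label written in another script renders correctly on the current page. The Python sketch below is illustrative only: the language table, the /locale/page URL pattern, and the function name are all assumptions.

```python
import html

# Hypothetical table: locale code -> (label in that language, text direction).
LANGUAGES = {
    "en": ("English", "ltr"),
    "fr": ("Français", "ltr"),
    "ar": ("العربية", "rtl"),
}


def language_switcher(page_slug, current):
    """Build a list of links to the other language versions of a page.

    Each link carries its own lang and dir attributes, so the labels can be
    mixed on one page regardless of the page's main language.
    """
    items = []
    for code, (label, direction) in LANGUAGES.items():
        if code == current:
            continue  # do not link a page to itself
        items.append(
            f'<li><a href="/{code}/{page_slug}" hreflang="{code}" '
            f'lang="{code}" dir="{direction}">{html.escape(label)}</a></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

For this to work, the page itself has to be published in a character set that covers every label, which is one practical argument for UTF-8.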