EC boost for computer-assisted translation
The EU's Joint Research Centre (JRC) has published a million sentences
translated into 22 of the EU's official languages in a bid to help the
development of computer-assisted translation technologies and software.
By offering free and open access to this collection of sentences,
the EU hopes to foster multilingualism and provide a valuable resource
for system developers to create machine translation software.
As part of its remit, the EU translates all its legal and political
documents into all 23 official languages, meaning translators must work
with 253 possible language pair combinations across 1.5 million pages a
year. The result is a large collection of translated texts that is
valuable as training material for system developers.
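The figure of 253 follows from counting every unordered pair among 23 languages. A minimal sketch of that arithmetic, assuming a pair such as English/French is counted once regardless of translation direction (the variable names are illustrative only):

```python
from math import comb

# Number of official languages quoted in the article.
OFFICIAL_LANGUAGES = 23

# Unordered pairs: English/French and French/English count as one pair.
unordered_pairs = comb(OFFICIAL_LANGUAGES, 2)

# Ordered pairs: each translation direction counted separately
# (an assumption for comparison; the article does not quote this figure).
directed_pairs = OFFICIAL_LANGUAGES * (OFFICIAL_LANGUAGES - 1)

print(unordered_pairs)  # 253
print(directed_pairs)   # 506
```

Counting each direction separately would double the total, which is one way to see why rarely paired languages are so hard to cover with documents found on the web.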
'By this initiative the European Commission intends to boost human
language technologies, support multilingualism and make
computer-assisted translation easier, cheaper and more accessible,'
said Leonard Orban, EU Commissioner for Multilingualism.
While it is relatively easy to find English/French documents on the
web to aid such developments, it is much harder to find Latvian/Romanian
ones, for example. 'Citizens belonging to the smaller linguistic
communities will have an easier access to documents and web pages only
available in the most used languages,' Mr Orban continued.
Because the text is offered in context, it can also help develop
and test grammar and spell checkers, online dictionaries and text
classification systems.
'This unique collection of language data contributes to the
creation of a new generation of software tools for human language
processing and helps foster the competitiveness of the language
industry, which is already one of the fastest growing industries in the
European Union,' said Janez Potočnik, European Commissioner for Science
and Research.
Source: Community R&D Information Service (CORDIS)
