Introduction to Machine Translation & Computer-assisted Translation

Machine Translation
• What is machine translation?
• Machine translation is automated translation. It is the process by which computer software is used to translate a text from one natural language to another.

Machine Translation
• How does machine translation work?
• On a basic level, MT performs simple word-for-word substitution: words in one natural language are replaced by words in another. That alone usually cannot produce a good translation of a text, because whole phrases must be recognized and matched to their closest counterparts in the target language.
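The difference between word-for-word substitution and phrase recognition can be sketched in a few lines of Python. This is only a toy illustration: the English-to-French lexicon entries and the function names `word_for_word` and `phrase_aware` are assumptions made for the example, not part of any real MT system.

```python
# Toy English-to-French lexicons; every entry is an illustrative assumption.
WORD_LEXICON = {
    "it": "il",
    "is": "est",
    "raining": "pleut",
    "cats": "chats",
    "and": "et",
    "dogs": "chiens",
}
PHRASE_LEXICON = {
    ("it", "is", "raining", "cats", "and", "dogs"): "il pleut des cordes",
}


def word_for_word(sentence):
    """Replace each source word independently; unknown words pass through."""
    return " ".join(WORD_LEXICON.get(w, w) for w in sentence.lower().split())


def phrase_aware(sentence):
    """Greedy longest match: try known phrases before falling back to words."""
    words = sentence.lower().split()
    out = []
    i = 0
    while i < len(words):
        # Try the longest remaining span first, then shorter ones.
        for length in range(len(words) - i, 0, -1):
            chunk = tuple(words[i:i + length])
            if chunk in PHRASE_LEXICON:
                out.append(PHRASE_LEXICON[chunk])
                i += length
                break
        else:
            # No phrase starts here, so translate the single word.
            out.append(WORD_LEXICON.get(words[i], words[i]))
            i += 1
    return " ".join(out)


print(word_for_word("It is raining cats and dogs"))  # il est pleut chats et chiens
print(phrase_aware("It is raining cats and dogs"))   # il pleut des cordes
```

The first output is ungrammatical and misses the idiom, while the second finds the stored phrase, which is the point the slide makes about word-for-word substitution.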

Machine Translation
• Solving this problem with corpus-based and statistical techniques is a rapidly growing field that is leading to better translations, including better handling of differences in linguistic typology and of idioms.
• Some approaches to this problem are:
  • Restricting the domain: dedicating each system to a single subject area reduces the range of lexical choices. This technique is particularly effective in domains where formal language is used, such as government and legal documents.
  • Human intervention: some systems translate more accurately if the user identifies which words in the text are names, locations, etc.

Methods of Machine Translation
• There are three methods used in machine translation (MT):
  • Rule-based MT
  • Statistical MT
  • Hybrid MT

Methods of Machine Translation
• Rule-based MT relies on a large set of built-in linguistic rules and on bilingual dictionaries with millions of entries for each language pair.
• Statistical MT generates translations using statistical methods based on bilingual text corpora, but such corpora are still rare for many language pairs.
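To make the statistical idea concrete, here is a deliberately crude sketch that estimates word translations from co-occurrence counts in a parallel corpus. It is not the method of any particular MT system; the three-sentence English-Spanish corpus and the function names are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

# Tiny illustrative English-Spanish parallel corpus (an assumption for this sketch).
CORPUS = [
    ("the house", "la casa"),
    ("the book", "el libro"),
    ("a house", "una casa"),
]


def cooccurrence_table(corpus):
    """Count how often each source word appears alongside each target word."""
    table = defaultdict(Counter)
    for source, target in corpus:
        for s in source.split():
            for t in target.split():
                table[s][t] += 1
    return table


def translation_probs(table, word):
    """Turn the counts for one source word into relative frequencies."""
    counts = table[word]
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def best_translation(table, word):
    """Return the target word that co-occurs most often with the source word."""
    return table[word].most_common(1)[0][0]


TABLE = cooccurrence_table(CORPUS)
print(best_translation(TABLE, "house"))   # casa
print(translation_probs(TABLE, "book"))   # {'el': 0.5, 'libro': 0.5}
```

With two sentence pairs containing "house", the counts single out "casa". With only one pair containing "book", the counts cannot separate "el" from "libro", which mirrors the slide's remark that sparse corpora limit statistical MT.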

Methods of Machine Translation (cont’d)
• Hybrid machine translation (HMT) combines the strengths of the statistical and rule-based methodologies. It takes two forms:
  • Rules post-processed by statistics: translations are produced by a rule-based engine, and statistics are then used to adjust the engine's output.
  • Statistics guided by rules: rules pre-process the input to better guide the statistical engine, and also post-process the statistical output to perform functions such as normalization. This form offers considerably more power, flexibility, and control when translating.
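The "statistics guided by rules" variant can be pictured as a three-stage pipeline. The sketch below is a minimal illustration under stated assumptions: the phrase table, its probabilities, the `<NUM>` placeholder, and all function names are hypothetical, and the "statistical engine" is only a stand-in that picks the most probable stored phrase.

```python
import re

# Hypothetical phrase table: source phrase -> [(translation, probability), ...].
PHRASE_TABLE = {
    "the train leaves at": [("el tren sale a las", 0.7), ("el tren parte a las", 0.3)],
}


def rule_preprocess(text):
    """Rule step: lowercase the input and mask numbers so the engine cannot alter them."""
    lowered = text.lower().strip()
    numbers = re.findall(r"\d+", lowered)
    masked = re.sub(r"\d+", "<NUM>", lowered)
    return masked, numbers


def statistical_engine(text):
    """Stand-in for a statistical engine: use the most probable known phrase."""
    for source, candidates in PHRASE_TABLE.items():
        best, _prob = max(candidates, key=lambda c: c[1])
        text = text.replace(source, best)
    return text


def rule_postprocess(text, numbers):
    """Rule step: restore the masked numbers and normalize capitalization."""
    for number in numbers:
        text = text.replace("<NUM>", number, 1)
    return text[:1].upper() + text[1:]


def hybrid_translate(text):
    masked, numbers = rule_preprocess(text)
    return rule_postprocess(statistical_engine(masked), numbers)


print(hybrid_translate("The train leaves at 9"))  # El tren sale a las 9
```

Keeping the rule steps outside the engine means the number handling and capitalization stay predictable even if the statistical component changes.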

Comparison

Rule-based MT                                 Statistical MT
+ Consistent and predictable quality          - Unpredictable translation quality
+ Good out-of-domain translation quality      - Poor out-of-domain quality
+ Knows grammatical rules                     - Does not know grammar
+ Consistency between versions                - Inconsistency between versions
- Lack of fluency                             + Good fluency
- Hard to handle exceptions to rules          + Good at catching exceptions to rules
- High development and customization costs    + Rapid, cost-effective development, provided the required corpus exists

Applications of MT
• MT programs are used around the world. For example:
  • European Commission: probably the largest institutional user.
  • Google: improved its internal translation capabilities by training its system on approximately 200 billion words from United Nations materials.

Computer-assisted Translation
• What is computer-assisted translation (CAT)?
• Computer-assisted translation is a form of language translation in which a human translator uses computer software to support and facilitate the translation process.

Computer-assisted Translation
• Computer-assisted translation is a broad and imprecise term covering a range of tools, from the fairly simple to the complicated. For example:
  • Spell checkers: either built into word-processing software or available as add-on programs.
  • Grammar checkers: either built into word-processing software or available as add-on programs.
  • Terminology managers: allow translators to manage their own terminology bank in electronic form.
  • Electronic dictionaries: either monolingual or bilingual.
  • Translation memory tools: consist of a database of text segments in a source language and their translations in one or more target languages.
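A translation memory lookup, the last tool in the list above, can be sketched with Python's standard `difflib` module. The stored segments, the 0.75 similarity threshold, and the `lookup` function are assumptions made for this example; commercial CAT tools use their own matching schemes.

```python
from difflib import SequenceMatcher

# Illustrative memory of previously translated segments (English -> Spanish).
MEMORY = {
    "Click the Save button.": "Haga clic en el botón Guardar.",
    "Close the window.": "Cierre la ventana.",
}


def lookup(segment, threshold=0.75):
    """Return (stored_source, stored_translation, score) for the best match, or None."""
    best = None
    for source, target in MEMORY.items():
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[2]:
            best = (source, target, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


match = lookup("Click the Save button")
if match is not None:
    source, target, score = match
    print(f"Suggested translation: {target} (similarity {score:.2f})")
```

The translator stays in control: the tool only proposes a stored translation when a new segment is close enough to one already in the database, and returns nothing for unrelated text.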

Assignment
• Translate the following paragraph, then use Google Translate to do the same task. Compare the two versions.

One of the benefits of reading is that it can be a lot of fun. If reading is fun, then children as well as adults will enjoy the experience and want to read more. One way that reading can be fun is that it helps to stimulate the imagination. When children read fictional stories that take place in make-believe worlds, it pushes their imagination to new heights.