Post-editing: how to future-proof your career in translation
Paulo Camargo, Ph.D., Owner, Terminologist
pcamargo@blc.com.br
BLC - Brazilian Localization Company
Web: www.blc.com.br

Purpose of this presentation
• Promote the adoption of Machine Translation (MT) and Post-editing (PE)
  – How we can work faster and better, and make more money
• Target audience:
  – Novice, experienced, and advanced freelance translators
  – Small LSPs and in-house translators

Perspectives: novice translator
• Introduce PE as a new profession
  – Background information
  – Current adoption of PE
  – PE productivity/compensation
• Explore availability of PE training
  – Why a translator needs PE training
  – What the required skills are
  – PE certifications available: TAUS and SDL

Perspectives: experienced/advanced
• Use MT output as a translation aid
  – Research shows MT increases productivity
  – Translators prefer MT-aided over unaided translation
  – Google Translate (GT), SDL Cloud, MS Hub, Systran
• Advanced: combine MT and terminology management
  – Term extraction/customization of MT
  – Generation/PE of MT output ↑ productivity
  – Replace combined TM/on-line TM servers?

Perspectives: small LSP
• Large/medium LSPs have used MT for over a decade
  – Small LSPs need to catch up
• How to get started on a low budget
  – MT developments: ↓ need for specialized IT
  – Key resource: in-house translator
  – Terminology management
  – Customizable MT
  – Preliminary analysis/PE guidelines

Definition of post-editing (TAUS)
• Post-editing: “the correction of machine-generated translation output to ensure it meets a level of quality negotiated in advance between client and post-editor”.
• “Post-editing seeks the minimum steps required for an acceptable text”

Background information
• PE reality: driven by advances in MT
  – Hybrid MT: rule-based / statistical
  – Rule-based: dictionaries, rules; e.g., Systran
  – Statistical: training data (TM); e.g., GT
• Pre-editing
  – Customization: glossary, training data
  – Preliminary analysis: language rules, client rules, example card

Preliminary analysis (Rico, 2011)
• After engine customization
  – Select MT samples
  – Check term consistency/accuracy
  – Check for recurrent MT errors
• Draw up guidelines (quality acceptance)
  – Quality/errors to expect and how to proceed
  – Language-independent/language-dependent rules
  – Feedback (glossary updates, error reports)
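The term consistency check on MT samples can be partly automated. The following is a minimal sketch, not the method described by Rico (2011): it assumes the samples are available as (source, MT output) pairs and the glossary as a plain dictionary. The function name, the EN-PT sample data, and the naive case-insensitive substring matching are all illustrative assumptions; real projects would need inflection-aware matching.

```python
from typing import Dict, List, Tuple


def find_term_inconsistencies(
    segments: List[Tuple[str, str]],
    glossary: Dict[str, str],
) -> List[Tuple[int, str, str]]:
    """Return (segment index, source term, expected target term) for every
    segment whose source contains a glossary term but whose MT output does
    not contain the approved target term.

    Matching is a naive case-insensitive substring test (an assumption made
    for brevity); it ignores inflection and word boundaries.
    """
    issues = []
    for index, (source, target) in enumerate(segments):
        source_lower = source.lower()
        target_lower = target.lower()
        for source_term, target_term in glossary.items():
            term_in_source = source_term.lower() in source_lower
            term_in_target = target_term.lower() in target_lower
            if term_in_source and not term_in_target:
                issues.append((index, source_term, target_term))
    return issues


# Hypothetical EN-PT glossary and MT samples, for illustration only.
GLOSSARY = {"hard drive": "disco rígido", "file": "arquivo"}
SAMPLES = [
    ("Save the file to the hard drive.", "Salve o arquivo no disco rígido."),
    ("Format the hard drive.", "Formate a unidade de disco."),
]

if __name__ == "__main__":
    for index, source_term, expected in find_term_inconsistencies(SAMPLES, GLOSSARY):
        print(f"Segment {index}: '{source_term}' should be '{expected}'")
```

Terms flagged this way are candidates for the glossary update and error report that feed back into engine customization.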

Guidelines for PE (Rico, 2011)
• Language-independent rules
  – Fix terminology, syntax, and morphology errors
  – Fix misspellings, punctuation, omissions
  – Edit offensive/inappropriate text
• Language-dependent rules
  – Language-specific examples
  – Example card: expected errors and how to fix them

Custom MT: what to expect (O’Brien, 2002)
• Custom MT: high-level MT output
  – Most segments = 85% TM fuzzy match
  – Some better than a 100% TM match (review)
  – Some bad translations: retranslate
• Translator: critical to MT success
• Need human assessment: always!
“Not only will MTPE not replace the translator but it also will not happen without the translator”

Full MT post-editing (Dillinger, 2004)
• Goal: human-quality output
• Most frequent: higher-visibility texts
• Quality expectations: high (TEP)
• Grammatically, syntactically, and semantically correct
• Stylistically appropriate
• Productivity expected: 4K–10K words/day

Current adoption of post-editing
• Common Sense Advisory report (2012)
  – Freelance: 21.7% (15.4% plan)
  – Small LSP: 32.5% (22.6% plan)
  – Large LSP: 72.0% (28.0% plan) **
• ALC report (2015)
  – Small LSP: 20.0% (USA), 25.0% (Europe)
• Lionbridge (Marciano, 2015)
  – Applied to 30% of projects (goal: 50%); 60 M, 2014

Post-editing productivity data
• Post-editing productivity (O’Brien, 2006)
  – Equal to or higher than editing high fuzzy TM matches
  – Typical: 4K to 10K words/day
  – Proficiency: 100K words (1 month of full-time PE)
• Other productivity data
  – Full PE: 5K–8K words/day (DePalma, 2011)

PE compensation: follow TM fuzzy
• TM fuzzy matches (Guerberof, 2013)
  – 60–66% of full TR rate for 75%–94% match
• MT full post-editing
  – 50–70% of rate (Guerberof, 2013)
  – 65–68% of rate (Marciano, 2015)
  – Smaller companies: prefer to pay per hour
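To see how a discounted PE rate interacts with higher PE throughput, the arithmetic can be sketched as below. Only the rate fractions (50–70%) and the PE throughput range (4K–10K words/day) come from the slides. The full per-word rate of 0.10 and the unaided throughput of 2,500 words/day are hypothetical placeholders, and both function names are illustrative.

```python
def daily_earnings(words_per_day: float, full_rate: float, rate_fraction: float = 1.0) -> float:
    """Earnings per day at a given fraction of the full per-word rate."""
    return words_per_day * full_rate * rate_fraction


def break_even_words(unaided_words_per_day: float, rate_fraction: float) -> float:
    """PE words/day needed to earn the same as unaided translation at full rate."""
    return unaided_words_per_day / rate_fraction


# Hypothetical values, NOT taken from the presentation.
FULL_RATE = 0.10               # currency units per word
UNAIDED_WORDS_PER_DAY = 2500   # assumed unaided throughput

if __name__ == "__main__":
    baseline = daily_earnings(UNAIDED_WORDS_PER_DAY, FULL_RATE)
    print(f"Unaided baseline: {baseline:.2f} per day")
    for fraction in (0.50, 0.65, 0.70):      # PE rate fractions quoted on the slide
        for pe_words in (4000, 10000):       # PE throughput range quoted on the slides
            earned = daily_earnings(pe_words, FULL_RATE, fraction)
            print(f"PE at {fraction:.0%} of rate, {pe_words} words/day: {earned:.2f}")
        needed = break_even_words(UNAIDED_WORDS_PER_DAY, fraction)
        print(f"  break-even throughput at {fraction:.0%}: {needed:.0f} words/day")
```

Under these assumptions, whether PE pays more than unaided translation depends entirely on whether the post-editor's actual throughput exceeds the break-even figure, which is why hourly payment can be a safer arrangement while productivity is still unknown.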

Proposal for PE training (O’Brien, 2002)
• PE: what do TRs think about it?
  – Dislike correcting repetitive errors
  – Fear of losing proficiency (poor MT output)
  – Dislike limited freedom of expression
• Why do TRs need PE training?
  – ≠ skills: 2 source texts
  – Quality requirements, different error types
  – Qualified translator ≠ successful post-editor

What skills does a post-editor need?
• Same as the translator (O’Brien, 2002)
  – Expert in subject area and target language
  – Excellent knowledge of source language
  – Word-processing (WP) skills, tolerance
• Skills for post-editor only (Rico, 2011)
  – Advanced WP: RegEx, search and replace (S&R), terminology management
  – Positive attitude towards MT
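As an illustration of how RegEx-based search and replace helps with recurrent MT errors, here is a minimal sketch using Python's standard `re` module. The three rules and the name `apply_rules` are assumptions chosen for the example; they are not rules from Rico (2011) or from any particular MT engine, and each rule set would have to be reviewed per language pair before batch use.

```python
import re

# Illustrative rules for mechanical errors that tend to recur in raw MT output.
# Each entry is (compiled pattern, replacement).
RULES = [
    (re.compile(r"\s+([,.;:!?])"), r"\1"),                # remove space before punctuation
    (re.compile(r" {2,}"), " "),                          # collapse repeated spaces
    (re.compile(r"\b(\w+) \1\b", re.IGNORECASE), r"\1"),  # drop an immediately repeated word
]


def apply_rules(text: str, rules=RULES) -> str:
    """Apply every search-and-replace rule to the text, in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    raw_mt = "Click  the the button , then save ."
    print(apply_rules(raw_mt))
```

The repeated-word rule is deliberately crude: it would also collapse legitimate repetitions, so in practice such rules are better used to flag segments for the post-editor than to change them silently.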

Proposal for PE course (O’Brien, 2002)
• Theoretical component
  – Intro to PE/MT technology/controlled language
  – Advanced terminology management/text linguistics
  – Basic programming skills
• Required background
  – Translation skills; basic linguistics/terminology management
  – IT skills; intro to language technology; source/target language skills

Sources for PE certification
• TAUS (Translation Automation User Society)
  – English > 23 languages (European, Arabic, Asian)
  – Also Spanish > English
  – Cost: 60 euros (member), 80 euros (non-member)
• SDL MT PE Certification
  – Free with SDL Language Cloud MT

Perspectives: experienced/advanced
What possibilities can MT offer other than post-editing?
Is it worth using MT output as an aid to increase TR productivity?
Can MT advantageously replace combined TM/on-line TM servers?

Efficiency of PE for language translation
• Rigorous, controlled analysis (Spence, 2013)
  – Hypothesis 1: PE reduces translation time
  – Hypothesis 2: PE increases quality
  – Hypothesis 3: MT primes the translator
• Compared PE vs. unaided translation
  – Blind experiment: TRs did not know GT was used
  – Pre-interview: TRs showed strong MT dislike
  – 16 professional translators per pair: EN-AR, EN-FR, EN-GE

Results clarify value of post-editing
• Which one is faster? 69% PE
• Useful? Yes 56%, No 29%, Unsure 15%
• Suggestions improved quality (all)
• MT output primes the translator
  – PE text (closer to MT) ≠ Unaided ≠ Raw MT
  – The lower the TR's experience → the closer to MT

Does MT output increase productivity?
• Example 1: Google Translate
  – Now a paid service: $20/M characters
  – Plug-in to SDL Trados/other CAT tools
  – General statistical MT engine
  – Not customizable
  – Confidentiality issues
  – See app for complete setup procedure
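A per-character price is easier to judge once converted to a per-project cost. The sketch below does that conversion; only the $20 per million characters figure comes from the slide. The average of 6 characters per word (including spaces) and the function name are assumptions, so the character count of the real source text should be used when available.

```python
PRICE_PER_MILLION_CHARS = 20.0  # USD, the figure quoted on the slide


def estimate_mt_cost(
    word_count: int,
    avg_chars_per_word: float = 6.0,  # assumed average, including spaces
    price_per_million: float = PRICE_PER_MILLION_CHARS,
) -> float:
    """Estimate the MT cost in USD for a project of the given word count."""
    characters = word_count * avg_chars_per_word
    return characters / 1_000_000 * price_per_million


if __name__ == "__main__":
    for words in (10_000, 50_000, 200_000):
        print(f"{words} words: about ${estimate_mt_cost(words):.2f}")
```

Even with a generous characters-per-word assumption, the engine cost stays small next to the translation fee for the same volume, so pricing alone is unlikely to decide between engines; customization and confidentiality matter more.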

Does MT output increase productivity?
• Example 2: SDL Cloud MT
  – Price range: $5–$75/month (Expert)
  – Plug-in to SDL Trados/other CAT tools
  – Complete confidentiality (nothing is stored)
  – Pre-trained engines: Travel, IT, Life Sciences, Automotive, Consumer Electronics
  – Customizable MT: can add own glossaries
  – Comprehensive analytics (quality analysis)

Does MT output increase productivity?
• Example 3: Microsoft Translator Hub
  – Plug-in to SDL Trados/others, secure
  – First 2 M characters free; 4 M/month for $40
  – Fully customizable MT engine
    • Previous translations (> 20K words)
    • Add glossaries
    • Request training / evaluate results
    • Option to “Use Microsoft Models”

How about confidentiality?
• Consider, e.g., Microsoft and Google
  – Among the largest providers of MT
  – Among the largest buyers of translation
  – Control information flow around the globe
• Confidentiality should not be a problem
  – Google not an option → MS Hub/SDL Cloud
  – Uncomfortable sending data to MS/SDL → use a desktop/server solution: Systran

Changes in MT offer to TRs
• Common scenario for TR (4 years ago)
  – One affordable desktop product (Systran)
  – Macros, RegEx, format conversion
  – No plug-in for CAT tools (high-end)
• Current scenario
  – Software as a service (GT, SDL, MS Hub)
  – Plug-in for CAT tools is standard
  – Much lower IT requirements

Can MT replace combined/online TMs?
• Experienced/advanced translators
  – Use combined/on-line TM for productivity
  – Proud users: ↑ 50% productivity, see TM as an asset
  – TM is error-prone (consistency, mistranslation)
  – Need to check term consistency
• MT has improved a lot in the last 5 years
  – TRs trust TM fuzzy > raw MT (Guerberof, 2008)
  – Mistake MT output for TM output (human?)

MTPE can provide a better result
• Avoid problems in combined/on-line TM:
  – Terminology inconsistencies
  – Mistranslations
  – Wasted time correcting TUs that will never be used
• New approach using MTPE
  – Extract terms (Systran, rule-based)
  – Customizable MT (SDL Cloud or MS Hub)
  – Post-edit fresh MT for ↑ productivity/quality

How small LSPs can get started
• Scenario: 40% of TRs use MT (TAUS)
• Actual post-editing offer (2015)
  – PE of our MT engine output (GT?)
  – Payment: 900 words/hour
  – Instruction: as readable as possible
  – No pre-editing: cal, gloss or guidelines
• Translators were really upset

Need more than just Google Translate
• Pre-editing: customization, preliminary analysis
  – Key: in-house translator (O’Brien, 2002)
• Allocate a translator for MTPE activities
  – Use secure on-line customizable engines
  – Define suitable projects
  – Invest in terminology management
  – Develop PE guidelines
  – No rate discount initially (learning curve)

MT implementation at BLC (4 years)
• Smaller projects: 10–50K words
  – Terminology extraction (Systran, rule-based)
  – Normal TR + ED procedure
  – Semi-customized: SDL Cloud + Multiterm
• Larger projects: > 50K words
  – Extract bigger glossary (higher coverage)
  – Raw MT (Systran, SDL Cloud, MS Hub)
  – PE + ED (no pre-editing, no discount)

MT implementation at BLC
• What does MT do for BLC?
  – Leverage my knowledge: engineering/science
  – Increase productivity (more/larger projects)
  – Increase quality (terminology/TM updates)
• Future developments
  – MS Hub, SDL Cloud; Systran
  – Hire new translator (2016)
  – Develop a PE team/service

Conclusion
The combination of machine translation (MT) and post-editing (PE) is a disruptive innovation that can improve translators’ productivity and translation quality, no matter how you plan to use it. Can you afford to ignore it?

References
• Guerberof, Ana (2008). Productivity and Quality in the Post-editing of Outputs from Translation Memories and Machine Translation. Master's dissertation. Universitat Rovira i Virgili.
• O’Brien, Sharon (2006). Eye-tracking and Translation Memory Matches. Perspectives: Studies in Translatology 14(3), 185-205.
• Spence, Green et al. (2013). The Efficacy of Human Post-Editing for Language Translation. ACM Human Factors in Computing Systems (CHI). Computer Science Department, Stanford University.
• Rico, Celia et al. (2011). EDI-TA: Post-editing Methodology for Machine Translation. MultilingualWeb-LT.
• O’Brien, Sharon (2002). Teaching Post-editing: A Proposal for Course Content. Proceedings of the 6th EAMT Workshop on Teaching Machine Translation. EAMT/BCS, UMIST, Manchester, UK. 99-106.
• DePalma, Donald (2011). Trends in Machine Translation. Common Sense Advisory.
• Dillinger, Mike et al. (2004). Implementing Machine Translation. LISA Best Practice Guides.
• TAUS (2014). MT Post-editing Guidelines.
• Marciano, Jay (2015). Personal communication.