Corpus representativeness in the selection of medical terms

Corpus representativeness in the selection of medical terms to be used in translation memory tools
Paula Tavares Pinto Paiva
São Paulo State University (UNESP/FCLAr), Brazil
UCCTS, Edge Hill University, July 2010

Motivation for the study

The use of updated terms in scientific journals => new discoveries and technologies in the medical field => diffusion of recent studies in Brazil.

Ribeiro (2004: 161): “there is a clear division between the tools intended for scholars and those aimed at professional market (…) the tools reflect the priority of each sector: research and teaching, for universities, and productivity for the market”.

Nogueira & Nogueira (2004: 18): some “translators tend to technological conservatism”, yet these tools (translation memories) are useful, if not indispensable, for all kinds of translators and translation.

Research

Paiva (2006, 2009):
Parallel and comparable “small” corpora (Tognini-Bonelli, 2001; Sinclair, 2001);
Compilation of glossaries in the medical fields of anesthesiology, cardiac surgery and orthopedics (Barros, 2004; Castanho, 2004);
Glossaries, translation memory tools and WordSmith Tools (Ribeiro, 2004; Paiva, 2009);
Translation features: simplification, explicitation (Baker, 1996; Olohan and Baker, 2000; Camargo, 2007).

Corpora in this study

Brazilian Med. Corp (710,322 tokens)

The Translated Brazilian Medical Corpus (TBMC): Anesthesiology, Cardiovascular Surgery, Orthopedics
The Comparable Medical Corpus (CMC): Anesthesiology, Cardiovascular Surgery, Orthopedics

Corpora in this study

The Translated Brazilian Medical Corpus (TBMC)

Cardiology
* 45,788 tokens in Portuguese
* 46,661 tokens in English
Brazilian Journal of Cardiology (different editions during 2003).

Cardiovascular Surgery
* 48,710 tokens in Portuguese
* 47,789 tokens in English
Brazilian Journal of Cardiovascular Surgery (different editions during 2007).

The Comparable Medical Corpus (CMC)

Cardiology
* 113,508 tokens in Portuguese: Arquivos Brasileiros de Cardiologia (2003, 2004).
* 137,905 tokens in English: BMC Cardiovascular Disorders (2004); Current Interventional Cardiology Reports (2001); etc.

Cardiovascular Surgery
* 107,620 tokens in Portuguese: Revista Brasileira de Cirurgia Cardiovascular (2006 and 2007).
* 162,345 tokens in English: Thoracic Surgery (2006 and 2007); etc.

Keywords and the selection of simple and complex terms with the aid of WordSmith Tools

1. Frequency lists from the TBMC and CMC: WordList (Scott, 1999).
2. Keywords for each subcorpus, with Lácio-Ref (4,156,816 tokens) and Folha de São Paulo 97 (235,036 tokens) as reference corpora. Eight lists:
a) two lists of keywords of Cardiology from the TBMC;
b) two lists of keywords from the TBMC of Cardiovascular Surgery;
c) two lists of keywords from the CMC of Cardiology; and
d) two lists of keywords from the CMC of Cardiovascular Surgery.
3. Comparison of the first one hundred words from all lists + selection of the twenty most representative keywords in Portuguese + same process for the equivalents in English.
4. The choice of candidate terms with the help of medical experts from both subareas: “hipertensão” → “hipertensão arterial sustentada” → “protocolo hipertensão arterial sustentada”, which presented, as equivalent terms, “hypertension” → “arterial hypertension” → “sustained arterial hypertension”.
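The frequency-list and keyword steps above can be illustrated with a minimal Python sketch. It is not the WordSmith Tools procedure itself: the tokenizer, the function names and the choice of the log-likelihood statistic are assumptions made here for illustration, and the slides do not say which keyness statistic or settings were used.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of letters (accented letters included).
    return re.findall(r"[^\W\d_]+", text.lower())


def log_likelihood(a, b, c, d):
    # a, b: frequency of the word in the study and reference corpus.
    # c, d: total tokens in the study and reference corpus.
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(study_text, reference_text, top_n=20):
    # Step 1: frequency lists for both corpora.
    study = Counter(tokenize(study_text))
    ref = Counter(tokenize(reference_text))
    c, d = sum(study.values()), sum(ref.values())
    # Step 2: keep words relatively more frequent in the study corpus
    # and rank them by keyness (ties broken alphabetically).
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:
            scored.append((word, log_likelihood(a, b, c, d)))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]
```

Calling `keywords(subcorpus_text, reference_text, top_n=100)` once per subcorpus and language would give lists comparable to the eight described above; the selection of the twenty most representative keywords and of candidate terms still depends on human judgment and on the medical experts.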

Preparation and definition of glossary in this study

Artéria Coronária Esquerda / Left Coronary Artery

OT: No tipo balanceado, a artéria coronária direita irrigava somente o ventrículo direito e parte posterior do septo interventricular, não fornecendo ramos significantes para o ventrículo esquerdo, enquanto este era irrigado pela artéria coronária esquerda.

TT: In the balanced type, the right coronary artery irrigated only the right ventricle and the posterior part of the interventricular septum, and did not provide significant branches for the left ventricle, which was irrigated by the left coronary artery.

OP: A operação foi feita sem o emprego de circulação extracorpórea (4/7/2003) e realizados anastomose da artéria torácica interna esquerda com o ramo interventricular anterior da artéria coronária esquerda e enxerto de veia safena autóloga para o ramo diagonal da artéria coronária esquerda.

OE: Anomalous origin of the left coronary artery from the right sinus of Valsalva.

The insertion of glossaries in the tool Wordfast

[Slide images: steps 1, 2 and 3]
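As a rough illustration of preparing a glossary for insertion, the sketch below writes bilingual term pairs as tab-delimited text, one pair per line. The tab-delimited layout is an assumption about what the tool accepts (the slides do not show the file format), and the function name `glossary_to_tsv` is hypothetical. The two term pairs are taken from the examples in this study.

```python
import csv
import io


def glossary_to_tsv(entries):
    # entries: iterable of (source_term, target_term) pairs.
    # Returns tab-delimited text with one term pair per line.
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for source_term, target_term in entries:
        writer.writerow([source_term.strip(), target_term.strip()])
    return buffer.getvalue()


entries = [
    ("hipertensão", "hypertension"),
    ("artéria coronária esquerda", "left coronary artery"),
]
glossary_text = glossary_to_tsv(entries)
```

The resulting text could then be saved with `open(path, "w", encoding="utf-8")`; the encoding the tool expects should be checked against its documentation.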

Final Remarks

Professional Brazilian translators use the same medical terminology as medical researchers.

Parts of the glossaries compiled in this study were made available for testing to three students, who wrote about their experience (Felice, 2008; Garcia, 2008; Santos, 2008), and to a private translation office.

Azzan (2004) also highlights the importance of managing the memories and glossaries in these programs.

Whether in teaching or in professional contexts, glossaries compiled with the help of Corpus Linguistics and afterwards inserted into translation memory tools (Wordfast) may put students and translators in contact with bilingual terminology in a more dynamic and useful way.

Acknowledgments

I am grateful to Professors Diva Cardoso de Camargo (UNESP/IBILCE) and Maeve Olohan (University of Manchester) for their valuable suggestions on this study. I am also grateful to CAPES and FUNDUNESP for the financial support.

Thank you,
Paula T. P. Paiva
ptppaiva@terra.com.br
paulapaiva@fclar.unesp.br

AZZAN, Fuad. Gerenciamento de memórias de tradução e de glossários. In: Cadernos de Tradução, 2004, v. 2, n. 14, p. 87-119.
BARROS, L. A. Curso básico de terminologia. São Paulo: USP, 2004.
BERBER SARDINHA, T. Lingüística de Corpus. Barueri, SP: Manole, 2004.
BIBER, D.; CONRAD, S.; REPPEN, R. Corpus linguistics: investigating language structure and use. Cambridge: CUP, 1998.
British National Corpus (BNC). Available at <http://www.natcorp.ox.ac.uk/>. Accessed in 2006.
CAMARGO, D. C. de. Metodologia de pesquisa em tradução e lingüística de corpus. São Paulo: Cultura Acadêmica; São José do Rio Preto, SP: Laboratório Editorial do IBILCE, UNESP, 2007. 65 p.: il. (Brochuras; v. 1).
CASTANHO, R. Proposta para a elaboração de um glossário de colocações na área médica – subárea hipertensão arterial. 2004. 92 f. Dissertação (Mestrado em Estudos Lingüísticos e Literários em Inglês) - Departamento de Letras Modernas da Faculdade de Filosofia, Letras e Ciências Humanas, Universidade de São Paulo, 2004.
CHAMPOLLION, Yves. Wordfast. 1999.
FELICE, A. A. A utilização de glossários de medicina no programa de memórias de tradução Wordfast. Relatório de estágio básico, 2008. Universidade Estadual Paulista Júlio de Mesquita Filho, UNESP, 2008.
GARCIA, R. C. P. Tradução e arte: comercial, graffiti e games. Relatório de estágio básico, 2008. Universidade Estadual Paulista Júlio de Mesquita Filho, UNESP, 2008.
LÁCIO-REF. Available at <http://www.nilc.icmc.usp.br/lacioweb/corpora.htm>. Accessed in July 2008.
LAVIOSA, S. Corpus-based translation studies: theory, findings, applications. Amsterdam: Rodopi, 2002.
NOGUEIRA, Danilo; NOGUEIRA, Vera Maria C. Por que usar programas de apoio à tradução? In: Cadernos de Tradução, 2004, v. 2, p. 17-35.
OLOHAN, M. Introducing corpora in translation studies. New York: Routledge, 2004.

PAIVA, P. T. P.; CAMARGO, D. C. Estudos da tradução baseados em corpus e lingüística de corpus: levantamento de termos médicos para a elaboração de um glossário bilíngüe. Estudos Lingüísticos (São Paulo), v. 2, p. 428-436, 2007.
PAIVA, P. T. P.; CAMARGO, D. C.; XATARA, C. M. Uma reflexão sobre a elaboração de um léxico bilíngüe preliminar na subárea de cardiologia a partir do uso de termos encontrados em um corpus paralelo e em dois corpora comparáveis. DELTA. Documentação de Estudos em Lingüística Teórica e Aplicada, v. 24, n. 1, p. 1-22, 2008.
RIBEIRO, Gabriela C. B. Tradução técnica, terminologia e lingüística de corpus: ferramenta WordSmith Tools. In: Cadernos de Tradução, 2004, v. 2, p. 159-174.
SANTOS, S. F. Memórias de tradução como auxílio ao tradutor de textos especializados. 2008. Trabalho de Conclusão de Curso (Graduação em Letras com Habilitação em Tradutor e Intérprete) - União das Faculdades dos Grandes Lagos, 2008.
SCOTT, M. WordSmith Tools: version 3.0. Oxford: Oxford University Press, 1999.
SINCLAIR, J. Corpus, concordance and collocation. Oxford: Oxford University Press, 1991.
SINCLAIR, J. Developing Linguistic Corpora: A Guide to Good Practice. Corpus and Text – Basic Principles. 2004. Available at <http://ahds.ac.uk/creating/guides/linguisticcorpora/chapter1.htm>. Accessed in 2009.
STUBBS, M. Corpus evidence for norms of lexical collocation. In: G. COOK; B. SEIDLHOFER (org.). Principle and practice in applied linguistics: studies in honour of H. Widdowson. Oxford: Oxford University Press, 1995.
TOGNINI-BONELLI, E. Corpus linguistics at work. Amsterdam: John Benjamins, 2001.