Abstracts
Abstract
Simultaneous interpreting studies have long awaited parallel corpora with which to validate existing theories and models. This paper describes the development of the European Parliament Interpreting Corpus (EPIC), an open, parallel, multilingual (English, Italian and Spanish), POS-tagged corpus of European Parliament source speeches and simultaneously interpreted target speeches. The project aims to study recurrent lexical patterns and morphosyntactic structures across all possible language combinations and directions, and to verify empirically whether different strategies can be detected when interpreting from a Germanic language into a Romance one and vice versa, or between two Romance languages. EPIC is freely available online for the research community to use and contribute to.
Keywords/Mots-Clés:
- directionality
- parallel corpora
- simultaneous interpreting
- EPIC
- electronic corpus
Résumé
Les corpus parallèles dans le domaine de la recherche sur l’interprétation simultanée étaient attendus depuis longtemps pour valider des théories et des modèles existants. La présente contribution a pour but de présenter EPIC (European Parliament Interpreting Corpus), un corpus ouvert, parallèle, multilingue (anglais, italien et espagnol) et avec étiquetage des parties du discours, composé de discours source prononcés au Parlement européen et de discours cible interprétés en simultanée. Le but de ce projet est d’examiner les modèles lexicaux et les structures morphosyntaxiques dans toutes les combinaisons linguistiques considérées et quelles que soient la langue de départ et d’arrivée, et de vérifier de manière empirique si des stratégies différentes peuvent être décelées lors d’une interprétation à partir d’une langue germanique vers une langue romane et vice versa, ou entre deux langues romanes. EPIC est librement accessible en ligne pour les chercheurs et est ouvert à leurs contributions.