Ximena Gutierrez-Vasques, from the National Autonomous University of Mexico, is currently visiting the Research Group in Computational Linguistics to collaborate with members of the group. On 25th April, Ximena gave the group a talk on her subject area.
Title: Bilingual lexicon extraction for a low-resource language pair
Bilingual lexicon extraction is the task of obtaining a list of word pairs deemed to be word-level translations. This has been an active area of NLP research for several years, especially with the availability of large amounts of parallel, comparable and monolingual corpora that allow us to model the relations between the lexical units of two languages.
However, the complexity of this task increases when we deal with typologically different languages where little data is available.
We focus on the language pair Spanish-Nahuatl. These two languages are spoken in the same country (Mexico), but they are distant from each other: they belong to different linguistic families, Indo-European and Uto-Aztecan. Nahuatl is an indigenous language with around 1.5M speakers, and monolingual and parallel corpora for it are scarce.
Our work comprises the construction of the first publicly available digital parallel corpus for this language pair. Moreover, we explore the combination of several language features and statistical methods to estimate the bilingual word correspondences.
On Wednesday 6th April, RGCL were very pleased to welcome Prof. Mikel Forcada from the University of Alicante, Spain. Mikel is currently on sabbatical in England, and we were grateful that he could spare the time to visit and give a talk to our Research Group. The talk, about translation technologies, was well attended and very well received!
Title: Towards effort-driven combination of translation technologies in computer-aided translation
The talk puts forward a general framework for the measurement and estimation of professional translation effort in computer-aided translation. It then outlines the application of this framework to optimize and seamlessly combine available translation technologies (machine translation, translation memory, etc.) in a principled manner to reduce professional translation effort. Finally, it shows some results that point to existing challenges, particularly as regards machine translation.
It was a great privilege to welcome Eveline Wandl-Vogt from the Austrian Academy of Sciences to RGCL this week. Eveline is a Research Manager at the Academy's Lexicography Laboratory who came to RGCL to discuss possible future collaborations with members of the Research Group. During her stay, Eveline gave a seminar on her research for members of the group.
Title: Computational Linguistics and Digital Humanities- Designing Joint Discovery on the example of lexicography laboratory @ ACDH @ AAS