Speaker: Ahmet Üstün, University of Groningen, Netherlands
Title: A Single Model for Many Languages with Adapters
25 January 2022
Recent advances in pre-trained language models have made truly multilingual models, covering many languages across different tasks, a realistic prospect. However, cross-language interference and limited model capacity, i.e. the curse of multilinguality, remain the major obstacles, especially for zero- and low-resource languages. Adapters (Houlsby et al., 2019), small bottleneck layers inserted into Transformer models, enable modular and efficient transfer learning. They can also be used as a solution to the curse of multilinguality. In this talk, I will discuss how to use adapters to build a single model for many languages, including zero-shot and unsupervised scenarios in dependency parsing and neural machine translation, respectively.
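For readers unfamiliar with adapters, the bottleneck idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the speaker's implementation: the class name `Adapter`, the sizes, the ReLU nonlinearity and the omission of bias terms are all assumptions made for brevity. The sketch projects a hidden vector down to a small bottleneck, applies the nonlinearity, projects back up, and adds the result to the input through a residual connection. The up-projection starts at zero so that a freshly inserted adapter leaves the pre-trained model's output unchanged.

```python
import random


def relu(vector):
    """Element-wise ReLU nonlinearity."""
    return [max(0.0, x) for x in vector]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Adapter:
    """Illustrative bottleneck adapter (hypothetical, biases omitted)."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden_size -> bottleneck_size, small random init.
        self.down = [
            [rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
            for _ in range(bottleneck_size)
        ]
        # Up-projection: bottleneck_size -> hidden_size, zero init so the
        # adapter starts out as the identity function.
        self.up = [[0.0] * bottleneck_size for _ in range(hidden_size)]

    def __call__(self, hidden):
        bottleneck = relu(matvec(self.down, hidden))
        delta = matvec(self.up, bottleneck)
        # Residual connection: the adapter only learns a correction.
        return [h + d for h, d in zip(hidden, delta)]

    def num_parameters(self):
        return (len(self.down) * len(self.down[0])
                + len(self.up) * len(self.up[0]))


adapter = Adapter(hidden_size=8, bottleneck_size=2)
hidden_state = [1.0] * 8
output = adapter(hidden_state)
```

In a multilingual setting, one such module per language (or per task) could be trained while the shared Transformer weights stay frozen, which is what makes the approach modular. Here the adapter adds 32 parameters, against 64 for a single full 8-by-8 layer.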
Ahmet Üstün is a PhD student in the Center for Language and Cognition (CLCG) at the University of Groningen. He is a member of the Computational Linguistics research group, working under the supervision of Arianna Bisazza, Gosse Bouma and Gertjan van Noord. His research focuses on multilingual natural language processing, with a special interest in cross-lingual transfer learning. In this context, he has worked on cross-lingual word embeddings, multilingual dependency parsing and multilingual unsupervised NMT. His research aims to find efficient multilingual adaptation methods for low-resource languages that do not suffer from the curse of multilinguality.
Speaker: Prof Melanie Mitchell, Santa Fe Institute, USA
Title: Why AI is Harder Than We Think
Abstract: Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment (“AI Spring”) and periods of disappointment, loss of confidence, and reduced funding (“AI Winter”). Even with today’s seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected.
One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this talk I will discuss some fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I will also speculate on what is needed for the grand challenge of making AI systems more robust, general, and adaptable—in short, more intelligent.
Speaker Bio: Melanie Mitchell is the Davis Professor of Complexity at the Santa Fe Institute. Her current research focuses on conceptual abstraction, analogy-making, and visual recognition in artificial intelligence systems. Melanie is the author or editor of six books and numerous scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her book Complexity: A Guided Tour (Oxford University Press) won the 2010 Phi Beta Kappa Science Book Award and was named by Amazon.com as one of the ten best science books of 2009. Her latest book is Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus, and Giroux).
Speaker: Lyke Esselink, University of Amsterdam
Title: Text-to-sign translation: making information accessible
Communication between healthcare professionals and deaf patients is challenging, and the current COVID-19 pandemic makes this issue even more acute. Sign language interpreters often cannot enter hospitals, and face masks make lipreading impossible. To address this urgent problem, SignLab Amsterdam developed a system which allows healthcare professionals to translate sentences that are frequently used in the diagnosis and treatment of COVID-19 into Sign Language of the Netherlands (NGT). Translations are displayed by means of videos and avatar animations. The architecture of the system is such that it could be extended to other applications and other sign languages in a relatively straightforward way.
In the first part of this talk, I will present an overview of the system created by SignLab Amsterdam. I will provide background on the problem at hand, explain the basics of sign languages and sign synthesis, and outline our system and the process behind its implementation. The second part of the talk will focus on an extensive evaluation study that we conducted, the results of which are not yet published. I will cover the methodology of this study, some important lessons that we learned from the process, and unveil some of the results.
Lyke Esselink is a Master’s student in Artificial Intelligence at Radboud University in Nijmegen, and completed her bachelor’s degree in AI at the University of Amsterdam. Since the start of 2020, she has combined her education with her interest in sign language through research at SignLab Amsterdam, where she has investigated the translation of text to Sign Language of the Netherlands. Her research interests include Machine Translation, Natural Language Processing and accessibility technologies.
Speaker: Dr Todor Lazarov, New Bulgarian University
Title: Modern Enterprise Translation Management: Problems, Compliance and Resources
Modern-day LSP companies have extensive translation experience and deep industry know-how. They work with vendors who already use “some” language technology, e.g. certain CAT tools, certain file formats, etc. Companies often own their linguistic assets and resources but, unfortunately, rarely have full control of them. LSPs benefit from TM usage and leverage, but they usually find it difficult to manage their production process and their linguistic resources effectively.
These statements most probably describe the typical situation for most LSPs! In this presentation the author will outline the most common problems that modern-day LSP companies face in managing internal and external linguistic resources effectively. We will elaborate on compliance problems (such as compliance with the industry-specific standard ISO 17100), and we will try to construct and describe a system for effective management of linguistic resources and ROI.
The current trend is to collect linguistic resources with as much meta-information as possible, but this meta-information is rarely useful for practical business purposes. We will try to elaborate on how converting these “artefacts” into useful “instruments” can benefit the production process and, in addition, how the mainstream LSP production process can be used to create resources for different NLP tasks.
Dr. Todor Lazarov holds a PhD in Computational Linguistics and has a diverse background in Linguistics. He also specialized in Artificial Intelligence at the University of Amsterdam. Todor teaches courses in the programmes of the Centre for Computational and Applied Linguistics at New Bulgarian University and also works as Research and Development Manager at Sofita Translation Agency. He has diverse experience with CAT tools and has established successful collaborations with different commercial MT providers. His research interests include machine translation, modern translation technologies, machine translation evaluation and CAT tools. Todor also provides subject matter expertise and consultation to different LSPs in Bulgaria.
19 November 2021
Speaker: Rocío Caro, University of Wolverhampton
Title: Integration of TM and MT
Translation Memories (TM) and Machine Translation (MT) have been used by translators for a long time, but until very recently research has mainly studied them separately. Nowadays, however, not only is academic research focused on the integration of TM and MT, but many CAT tools also include the possibility of working with an MT engine. Some companies claim that the integration of the two technologies is beneficial for translators as it may increase their productivity, but there are no comprehensive studies on the topic. Very little is known about the effort, productivity and opinions of translators who use translation tools that integrate TM and MT, or about the quality of the final texts.
In the first part of the talk, I will present the different ways TM and MT can be integrated, which fall into two main categories: internal and external integration. In the second part, I will present the project we are currently carrying out to study the post-editing effort (technical, temporal, and cognitive) of translators working in an externally integrated environment (i.e., one in which both TM and MT segments are presented to the translator), the preliminary findings, what we found about the opinions of translators, and the next steps of the project.
Rocío Caro is currently doing her PhD in Translation Technology at the Research Group of Computational Linguistics, University of Wolverhampton. She has an MA in Translation for the Publishing World and a BA in Translation and Interpreting from the University of Malaga, Spain.
19 November 2021
Speaker: Dr Antonio Toral, University of Groningen
Title: Machine-Aided Literary Translation: State of Affairs in the Early 2020s
To what extent can machine translation be used to translate literary texts? Could such machine translations be of any use to professional literary translators? Could readers benefit in any way from the resulting machine-aided translations?
Through these and other related questions, I aim to present the current state of affairs concerning the application of machine translation to literary texts, focusing on fiction. Taking into account the shortcomings encountered to date, I will then outline potential lines of research that may occupy us in the first half of the 2020s.
Antonio Toral is a Senior Lecturer in Language Technology at the University of Groningen. He holds a PhD in Computational Linguistics from the University of Alicante and has carried out research in the area of Machine Translation (MT) since 2010. His research interests include the application of MT to literary texts, MT for under-resourced languages and the analysis of translations produced by machines and humans.