Ahmet Üstün, University of Groningen, Netherlands
A Single Model for Many Languages with Adapters
25 January 2022
Recent advances in pre-trained language models have made truly multilingual models, covering many languages and tasks, a realistic goal. However, cross-language interference and limited model capacity, i.e. the curse of multilinguality, remain major obstacles, especially for zero- and low-resource languages. Adapters (Houlsby et al., 2019), small bottleneck layers inserted into Transformer models, enable modular and efficient transfer learning. They can also be used as a solution to the curse of multilinguality. In this talk, I will discuss how to use adapters to build a single model for many languages, including zero-shot and unsupervised scenarios in dependency parsing and neural machine translation, respectively.
Ahmet Üstün is a PhD student in the Center for Language and Cognition (CLCG) at the University of Groningen. He is a member of the Computational Linguistics research group, supervised by Arianna Bisazza, Gosse Bouma and Gertjan van Noord. His research focuses on multilingual natural language processing, with a special interest in cross-lingual transfer learning. In this context, he has worked on cross-lingual word embeddings, multilingual dependency parsing and multilingual unsupervised NMT. His research aim is to find efficient multilingual adaptation methods for low-resource languages that do not suffer from the curse of multilinguality.
Speaker: Dr Ilias Chalkidis, Department of Computer Science at University of Copenhagen (CoAStaL NLP Group)
Title: Let’s Transform Law with Augmented Lawyering: Advances and Challenges in Legal Text Processing
Abstract: Law is one of the cornerstones of our civilisation. The legal industry constantly produces large volumes of legal text in various forms, i.e., legislation, court decisions, and legal agreements (contracts). Hence, consuming and understanding legal information can be overwhelming, as the points of interest for users (legal professionals and laypersons) are hidden in piles of pages and documents. Legal Text Processing (a.k.a. Legal NLP or Legal Intelligence) is a growing research area in which Natural Language Processing (NLP) techniques are applied to the legal domain. In this talk, I will first point out the main challenges of legal NLP and how the field is advancing. In this regard, I will argue for the importance of proper benchmarking and the development of legal-oriented language models. Then I will cover parts of my recent work on large-scale classification under temporal drift, and “baby” steps towards cross-lingual legal NLP, decision explainability, robustness, and fairness.
“What the blockchain will do especially in its NFT form on top of Ethereum is create a world that will eat up all of the contracts. Every contract in the next 15 years will be done on the blockchain.” Gary Vaynerchuk (@garyvee), 2021
Speaker Bio: Ilias Chalkidis is a post-doctoral researcher at the Department of Computer Science at the University of Copenhagen (CoAStaL NLP Group). He received his PhD, on Deep Learning for Legal Text Processing, from the Department of Informatics at Athens University of Economics and Business. He is well known for his work in Legal Natural Language Processing (LegalNLP), also known as Legal Intelligence. He has published in top-tier conferences such as ACL, NAACL, EMNLP, and EACL. He is the lead developer of LEGAL-BERT, which has more than 35,000 downloads from the community, and LEX-GLUE, the benchmark for Legal Language Understanding. Furthermore, he serves on the organizing committee of the Natural Legal Language Processing (NLLP) Workshop. More information at https://iliaschalkidis.github.io.
Speaker: Prof Melanie Mitchell, Santa Fe Institute, USA
Title: Why AI is Harder Than We Think
Abstract: Since its beginning in the 1950s, the field of artificial intelligence has cycled several times between periods of optimistic predictions and massive investment (“AI Spring”) and periods of disappointment, loss of confidence, and reduced funding (“AI Winter”). Even with today’s seemingly fast pace of AI breakthroughs, the development of long-promised technologies such as self-driving cars, housekeeping robots, and conversational companions has turned out to be much harder than many people expected.
One reason for these repeating cycles is our limited understanding of the nature and complexity of intelligence itself. In this talk I will discuss some fallacies in common assumptions made by AI researchers, which can lead to overconfident predictions about the field. I will also speculate on what is needed for the grand challenge of making AI systems more robust, general, and adaptable—in short, more intelligent.
Speaker Bio: Melanie Mitchell is the Davis Professor of Complexity at the Santa Fe Institute. Her current research focuses on conceptual abstraction, analogy-making, and visual recognition in artificial intelligence systems. Melanie is the author or editor of six books and numerous scholarly papers in the fields of artificial intelligence, cognitive science, and complex systems. Her book Complexity: A Guided Tour (Oxford University Press) won the 2010 Phi Beta Kappa Science Book Award and was named by Amazon.com as one of the ten best science books of 2009. Her latest book is Artificial Intelligence: A Guide for Thinking Humans (Farrar, Straus, and Giroux).
Speaker: Lyke Esselink, University of Amsterdam
Title: Text-to-sign translation: making information accessible
Communication between healthcare professionals and deaf patients is challenging, and the current COVID-19 pandemic makes this issue even more acute. Sign language interpreters often cannot enter hospitals, and face masks make lipreading impossible. To address this urgent problem, SignLab Amsterdam developed a system that allows healthcare professionals to translate sentences frequently used in the diagnosis and treatment of COVID-19 into Sign Language of the Netherlands (NGT). Translations are displayed by means of videos and avatar animations. The architecture of the system is such that it could be extended to other applications and other sign languages in a relatively straightforward way.
In the first part of this talk, I will present an overview of the system created by SignLab Amsterdam. I will provide background on the problem at hand, explain the basics of sign languages and sign synthesis, and outline our system and the process behind its implementation. The second part of the talk will focus on an extensive evaluation study that we conducted, the results of which are not yet published. I will cover the methodology of this study, some important lessons that we learned from the process, and unveil some of the results.
Lyke Esselink is a Master’s student in Artificial Intelligence at Radboud University in Nijmegen, and completed her bachelor’s degree in AI at the University of Amsterdam. Since the start of 2020, she has combined her education with her interest in sign language through research at SignLab Amsterdam, where she has investigated the translation of text to Sign Language of the Netherlands. Her research interests include Machine Translation, Natural Language Processing and accessibility technologies.
Title: Modern Enterprise Translation Management: Problems, Compliance and Resources
Speaker: Dr Todor Lazarov, New Bulgarian University
Modern-day language service provider (LSP) companies have extensive translation experience and deep industry know-how. They work with vendors who already use “some” language technology (e.g. certain CAT tools, certain file formats). Companies often own their linguistic assets and resources, but unfortunately rarely have full control of them. LSPs benefit from translation memory (TM) usage and leverage, but they usually find it difficult to manage their production process and their linguistic resources effectively.
These statements most probably describe the typical situation for most LSPs. In this presentation the author will outline the most common problems that modern-day LSP companies face in managing internal and external linguistic resources effectively. We will elaborate on compliance problems (such as compliance with the industry-specific standard ISO 17100) and we will try to construct and describe a system for effective management of linguistic resources and ROI.
The current trend is to collect linguistic resources with as much meta-information as possible, but this meta-information is rarely useful for practical business purposes. We will try to elaborate on how converting these “artefacts” into useful “instruments” can benefit the production process and, in addition, how the mainstream LSP production process can be used to create resources for different NLP tasks.
Dr. Todor Lazarov holds a PhD in Computational Linguistics and has a diverse background in Linguistics. He has also specialized in Artificial Intelligence at the University of Amsterdam. Todor teaches courses in the programmes of the Centre for Computational and Applied Linguistics at New Bulgarian University and also works as Research and Development Manager at Sofita Translation Agency. He has diverse experience with CAT tools and has established successful collaborations with different commercial MT providers. His research interests include machine translation, modern translation technologies, machine translation evaluation and CAT tools. Todor also provides subject matter expertise and consultation to different LSPs in Bulgaria.