RCGL Seminars logo

Natural Language Processing

George Chrysostomou, The University of Sheffield

25 October 2021

Title: Improving Explanations for Model Predictions

Abstract:

Large neural models dominate benchmarks of natural language understanding tasks. Their achievements have led to increasing adoption in critical areas such as health and law. A significant drawback of these models is their highly parameterized architecture, which makes their predictions hard to interpret. Previous work has introduced approaches for generating rationales for model predictions (e.g. using feature attribution). However, how accurately these approaches explain the reasoning behind a model’s prediction has only recently been studied. This seminar will introduce three studies which aim to improve explanations for model predictions: (1) Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification (published at ACL 2021); (2) Towards Better Transformer-based Faithful Explanations with Word Salience (published at EMNLP 2021); (3) Instance-level Rationalization of Model Predictions (under review at AAAI 2021).

Bio:

George Chrysostomou is a PhD student at the University of Sheffield, supervised by Dr. Nikolaos Aletras and Dr. Mauricio Alvarez. His research interests lie in improving explanations for model predictions in Natural Language Processing. Before pursuing his doctoral studies, he completed a master’s in Data Analytics at the University of Sheffield.


Machine Learning/Deep Learning

Dr Yuval Pinter, Ben-Gurion University of the Negev, Israel

Challenging and Adapting NLP Models to Lexical Phenomena

12 October 2021

Abstract:

Over the last few years, deep neural models have taken over the field of natural language processing (NLP), brandishing great improvements on many of its sequence-level tasks. But the end-to-end nature of these models makes it hard to figure out whether the way they represent individual words aligns with how language builds itself from the bottom up, or how lexical changes in register and domain can affect the untested aspects of such representations.

In this talk, I will present NYTWIT, a dataset created to challenge large language models at the lexical level, tasking them with identification of processes leading to the formation of novel English words, as well as with segmentation and recovery of the class of novel blends. I will then present XRayEmb, a method which alleviates the hardships of processing these novelties by fitting a character-level encoder to the existing models’ subword tokenizers; and conclude with a discussion of the drawbacks of current tokenizers’ vocabulary creation schemes.

Bio:

Yuval Pinter is a Senior Lecturer in the Department of Computer Science at Ben-Gurion University of the Negev, focusing on NLP. Yuval got his PhD at the Georgia Institute of Technology School of Interactive Computing as a Bloomberg Data Science PhD Fellow. Before that, he worked as a Research Engineer at Yahoo Labs and as a Computational Linguist at Ginger Software, and obtained an MA in Linguistics and a BSc in CS and Mathematics, both from Tel Aviv University. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.


Website:

www.yuvalpinter.com

Blog:

dagesh.wordpress.com


Technologies for Translation and Interpreting: Challenges and Latest Developments

Aleks Sandor Milovanovic and Dora Murgu, Interprefy

The backstage of a hybrid event – a complex string puppet called RSIBOX

15 October 2021

Abstract:

Hybrid events have been at the core of Interprefy since its creation in 2014. At that time, remote simultaneous interpreting (RSI) was only accepted as a sideline to in-person events, where complex language pairs or space restrictions could require expanding the pool of in-person interpreting teams to include remote participation. The real breakthrough came in 2018, when Interprefy won its first UN tender and the International Seabed Authority became the first UN agency to replace onsite interpreters with remote interpreters for its major meetings, for a whopping cost saving of almost a million dollars. From there Interprefy went from strength to strength, culminating at WHA73, the first fully online World Health Assembly in the history of the World Health Organization, which was watched by a total of 800 million people worldwide.

At Interprefy we have developed our own plug-and-play equipment (RSIBOX), which can be used onsite as a seamless bridge between AV and remote setups. The RSIBOX originated from experimentation in hybrid environments and is a piece of hardware that has been used at most football championships, Euro 2020 being the most prominent example.

During this webinar, Aleks and Dora will speak about what goes on backstage to deliver a seamless hybrid event and discuss the technology behind our RSIBOX. This webinar is aimed at EM TTI students who have a particular interest in interpreting technology, AV systems and hardware.

Bios:

Dora Murgu. Romanian-born and Spanish-bred, Dora started her career as a conference interpreter. She soon transitioned into the backstage of interpretation services after creating a pioneering training program for OPI, which she later taught at universities across Spain for over six years. She has presented several papers at major industry conferences and published articles on interpreting quality management, interpreter training and OPI service provision in Spain. She has worked for major LSPs and RSI providers for the past 13 years and currently holds the position of Interpreter Engagement Manager at Interprefy, one of the leading RSI platforms on the market. When she’s not immersed in the world of interpreters, she paddles the waters of the Arabian Gulf on her SUP board in Dubai, where she lives with her family.

Aleks Sandor Milovanovic. Raised in South Africa, Hungarian citizen Aleks Sandor moved to Switzerland in 2014. As one of the most senior members of Interprefy (the 3rd, to be precise), he built the original Operations Team, for which he was responsible during the first startup phase of the company. Shortly before COVID hit, he created the Special Operations Department to respond more efficiently to high demand from very sensitive clients such as the UN, IMF and UEFA. The innovations that stemmed from his leadership include the Interprefy Gateway solution, which was first used at the Google PES 2018 and, notably, in the UN Hybrid Rooms setup, which enabled the UN to resume its operations after nearly three months of meetings without interpretation. In his spare time, Aleks enjoys kayaking and cycling around Lake Zurich.


Technologies for Translation and Interpreting: Challenges and Latest Developments

Dr Joss Moorkens, Dublin City University

Ethics and NMT

8 October 2021

Abstract:

Neural MT can facilitate communication in a way that surpasses previous MT paradigms, but there are also consequences of its use. As with the development of any technology, MT is not ethically neutral, but rather reflects the values of those behind its development. This talk considers the ethical issues around MT, beginning with data gathering and reuse and looking at how MT fits with the values and codes of the translator. If machines and systems reflect value systems, can they be explicitly ‘good’ and remove bias from their output? What is the contribution of MT to discussions of sustainability and diversity? Rather than promoting an approach that involves following a set of instructions to implement a technology unthinkingly, this talk will highlight the importance of a conscious decision-making process when designing a data-driven MT workflow.

Bio:

Joss Moorkens is an Associate Professor and Chair of postgraduate translation programmes at the School of Applied Language and Intercultural Studies at Dublin City University. He is also a Funded Investigator with the ADAPT Centre and a member of the Centre for Translation and Textual Studies. He has authored over 50 journal articles, book chapters, and conference papers on translation technology, user interaction with and evaluation of machine translation, translator precarity, and translation ethics. He is General Coeditor of the journal Translation Spaces with Prof. Dorothy Kenny, and coedited the book ‘Translation Quality Assessment: From Principles to Practice’, published in 2018 by Springer, and special issues of Machine Translation (2019) and Translation Spaces (2020). He leads the Technology working group (with Prof. Tomas Svoboda of Charles University) as a board member of the European Masters in Translation network and sits on the advisory board of the Journal of Specialised Translation.

RANLP 2021 Conference Report


A number of staff and students recently attended RANLP 2021, which was held online this year due to the ongoing pandemic. As you can see from the reports below, it was still a lively and engaging conference.


Another successful online conference I attended was RANLP. I enjoyed being able to attend many sessions and workshops from the comfort of my home office 😊 RANLP had very interesting keynote speeches, which were quite informative about the ongoing research trends of different NLP groups all over the world. My online presentation went very well; the only thing I missed was seeing the attendees’ reactions while talking. I can concentrate either on my slides or on the participants 😊 But I was happy that the research ideas were of interest to many. The RANLP workshops were also excellent. Researchers from top-notch universities gave very interesting presentations. I am looking forward to repeating this wonderful experience in the future.

Hadeel Saadany – PhD Student


I recently participated in RANLP 2021 (Recent Advances in Natural Language Processing). RANLP has established itself over the years as one of the most influential and competitive NLP conferences. This year, due to the COVID situation in many countries, the organisers decided to hold the conference virtually using Zoom.

RANLP 2021 had excellent keynote speeches from top NLP researchers around the world. The organisers made sure that there were at least three keynote speeches per day: usually one to open the day, another after the lunch break, and a third to conclude the day.

Day 1 began with a keynote speech from Dr Jing Jiang of Singapore Management University, who talked about the latest research on question answering. In the afternoon, Prof Josef van Genabith and Nico Herbig gave a keynote on translation technologies, in which they talked about implementing a multimodal user interface for post-editing.

Day 2 started with a keynote speech from Prof Hwee Tou Ng, who talked about current and future research directions in grammatical error correction. After the lunch break, Prof Constantin Orasan gave a very informative session on preserving sentiment in machine translation. Since he is my first PhD supervisor, he also talked about the research we did on translation quality estimation for my PhD, which made this session special for me. Later in the afternoon, Dr He He of New York University gave a keynote on text generation, covering the latest developments in the area, including neural transformers.

The final day started with a keynote speech from Prof Tim Baldwin about text summarisation and the evaluation of text summarisation methods. In the second keynote of the day, Prof Sebastian Riedel talked about learning from knowledge bases and reasoning in machine learning models. In the final keynote, Prof Alessandro Moschitti presented a very informative session on recent developments in question answering.

Overall, the keynote speeches at RANLP were enlightening and provided useful insights into the state of the art in several NLP topics. It was great to listen to the pioneers of the field and hear about their first-hand experiences.

At RANLP 2021, I was fortunate to chair two parallel sessions. The first, on the 1st of September, contained four exciting long papers on offensive language identification, including work on Spanish and Romanian. The second, on the 2nd of September, contained four fascinating papers on translation technologies. RANLP was my first experience as a session chair, and it was a good opportunity for me. I thank the RANLP organisers for allowing me to be a session chair.

I presented two papers at the conference. I presented my first paper on the 1st of September. It covered the work we did on creating an offensive language identification corpus for a low-resource language, Marathi. I presented the second paper on the 3rd of September. It was on multilingual misinformation identification in COVID-19 tweets, which is timely research. I received very good feedback from the audience on both papers, along with comments on what to improve. I hope to incorporate these in my future work, and I am grateful to the RANLP participants for their valuable ideas.

During the conference, I got the opportunity to get to know several researchers working in the same field at universities worldwide, and the networking was very valuable. However, I did miss the physical presence and all the fun activities at RANLP, such as the cocktail receptions and the Gala dinner. I hope that the next RANLP conference can be held in person in Varna, Bulgaria, and that I can present at the venue. Finally, I would like to thank the organisers of RANLP for holding the conference despite the difficult situation in the world.

Tharindu Ranasinghe – PhD Student

EM TTI Application Portal now open!

Erasmus Mundus European Master’s in Technology for Translation and Interpreting (EM TTI)           

Call for applications for start date September 2022

Scholarship application deadline: 15th January 2022

Self-funded application deadline: 1st July 2022

We invite applications for the new Erasmus Mundus European Master’s programme in Technology for Translation and Interpreting (EM TTI) with a start date of September 2022. The programme, run by the University of Wolverhampton, the University of Malaga, New Bulgarian University and Ghent University, offers students the opportunity to study at two international institutions and to undertake work placements with industry leaders around the world. Competitive Erasmus Mundus scholarships are offered to the highest-ranking applicants. Both European and non-European students can apply.

How to apply: https://em-tti.eu/how-to-apply

Course Fees: https://em-tti.eu/about-masters/course-fees/

Should you have any questions, please do not hesitate to contact the EM TTI team at enquiries@em-tti.eu.