
3 PhD studentships on NLP and DL approaches in Digital Humanities

Research Group in Computational Linguistics,

Research Institute of Information and Language Processing,

University of Wolverhampton

*** Closing date 19 July 2021 ***

The Research Group in Computational Linguistics (http://rgcl.wlv.ac.uk) at the Research Institute of Information and Language Processing of the University of Wolverhampton invites applications for three PhD studentships, with the prospective PhD students working on the following topics: (i) Natural Language Processing (NLP) and Deep Learning (DL) in Computational History studies, (ii) NLP and DL in Computational Literature studies and (iii) NLP and DL in Computational Film Studies.

These are three-year funded bursaries that include a stipend towards living expenses (£15,609 per year) and cover the tuition fees and the research fees.

Applicants should submit a PhD research proposal not exceeding 2,000 words. The proposal should set out an original computational history study, computational literature study or computational film study in which NLP and DL techniques are employed.


Prerequisites

A successful applicant must have a good honours degree or equivalent in Computer Science, Computational Linguistics, Digital Humanities or Linguistics, with good programming skills, and knowledge of Deep Learning and Natural Language Processing.


Application procedure

Applications must include:

  • Research proposal not exceeding 2,000 words (see above)
  • A curriculum vitae listing degrees awarded, courses covered and marks obtained, publications, relevant experience and names of two referees who could be contacted for a reference
  • Cover letter with statement of research interests, outlining why you are interested in this PhD position/topic, how you plan to approach the research task and why you consider your experience relevant

Schedule

The application deadline is 19 July 2021. Short-listed candidates will be notified by email by 20 July 2021 and interviewed via Zoom on 21 or 22 July 2021. The starting date of the PhD position is 1 September 2021 or as soon as possible thereafter.

Established by Prof Mitkov in 1998, the Research Group in Computational Linguistics delivers cutting-edge research in a number of NLP areas. The results from the UK research assessment exercises confirm the Research Group in Computational Linguistics as one of the top performers in UK and international research, with its research assessed as ‘internationally leading, internationally excellent and internationally recognised’.

The PhD students will be members of the newly established Responsible Digital Humanities Research Lab, which is part of the Research Group in Computational Linguistics.


Applications should be sent by email to

Prof Dr Ruslan Mitkov

Director of Research Institute of Information and Language Processing

University of Wolverhampton

Email: R.Mitkov@wlv.ac.uk

and copied to Prof Mitkov’s PAs, Miss Suman Hira (suman.hira@wlv.ac.uk) and Mrs April Harper (a.harper2@wlv.ac.uk).

Papers accepted at ACL-IJCNLP 2021 and NAACL-HLT 2021

Congratulations to Dr Frédéric Blain who has had the following papers accepted at upcoming conferences.


Title: Knowledge Distillation for Quality Estimation

Authors: Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras and Lucia Specia

Abstract: Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations. Recent success in QE stems from the use of multilingual pre-trained representations, where very large models lead to impressive results. However, the inference time, disk and memory requirements of such models do not allow for wide usage in the real world. Attempts have been made at making pre-trained representations less resource-hungry by using knowledge distillation, but the resulting models remain prohibitively large for many usage scenarios. Instead of building upon distilled pre-trained representations, we propose to transfer knowledge from a strong QE teacher model to a much smaller model with a different, shallower architecture. In combination with a confidence-based data augmentation approach, we show that it is possible to create light-weight QE models that achieve comparable results to distilled pre-trained representations with 8x fewer parameters.

This paper should appear in the Findings of ACL-IJCNLP 2021 (https://2021.aclweb.org/).


Title: Backtranslation Feedback Improves User Confidence in MT, Not Quality

Authors: Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya

Abstract: Translating text into a language unknown to the text’s author, dubbed outbound translation, is a modern need for which the user experience has significant room for improvement, beyond the basic machine translation facility. We demonstrate this by showing three ways in which user confidence in the outbound translation, as well as its overall final quality, can be affected: backward translation, quality estimation (with alignment) and source paraphrasing. In this paper, we describe an experiment on outbound translation from English to Czech and Estonian. We examine the effects of each proposed feedback module and further focus on how the quality of machine translation systems influences these findings and the user perception of success. We show that backward translation feedback has a mixed effect on the whole process: it increases user confidence in the produced translation, but not the objective quality.

This paper will appear at NAACL-HLT 2021 (https://2021.naacl.org/).

The paper can also be found here: https://arxiv.org/abs/2104.05688

Paper accepted at ACL-IJCNLP 2021

Congratulations to one of our PhD students, Tharindu Ranasinghe, who has had a paper accepted at ACL-IJCNLP 2021.

Title: An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers

Authors: Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov

Abstract: Most studies on word level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to maintain several language-specific models. To overcome these problems, we explore different approaches to multilingual word level QE. We show that these QE models perform on par with the current language-specific models. In the case of zero-shot QE, we show that it is possible to accurately predict word level quality for any given new language pair from models trained on other language pairs. Our findings indicate that the word level QE models based on powerful pre-trained transformers that we propose in this paper generalise well across languages, making them more useful in real-world scenarios.

Wolverhampton researchers collaborate on BBC’s “Novels That Shaped Our World”

We are working with BBC Arts and the Faculty of Arts on a new engagement project to mark the 300th anniversary of the English-language novel.

The interdisciplinary project unites research by Wolverhampton’s Research Group in Computational Linguistics, including Dr Sara Moze, Richard Evans and Dr Emad Mohamed, with research by English and Creative Writing staff, led by Dr Aidan Byrne. The project is seeking support from the Arts and Humanities Research Council.

More information about the project is available on the University website: https://www.wlv.ac.uk/about-us/news-and-events/latest-news/2019/august-2019/wolverhampton-researchers-collaborate-on-bbcs-novels-that-shaped-our-world.php

Paper accepted at NAACL 2019

Congratulations to Marcos Zampieri, whose paper has been accepted at NAACL 2019.

Reference: Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar (2019) Predicting the Type and Target of Offensive Posts in Social Media.

You may access the NAACL paper here: https://arxiv.org/abs/1902.09666

Paper accepted at NAACL 2019

We are pleased to announce that the paper titled “Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions” from researchers in RGCL has been accepted into the main track of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019). This is joint work by Omid Rohanian, Shiva Taslimipoor, Le An Ha, Samaneh Kouchaki, and Prof. Ruslan Mitkov.

A preprint of this paper will soon be available on arXiv.