We are pleased to announce we are restarting our research seminars and aiming for this to be a monthly series on the first Thursday of every month. Please find details of the first seminar below.
1st October 2020
Platform: Teams (please email A[dot]harper2[at]wlv.ac.uk for a link)
TransQuest: Translation Quality Estimation with Cross-lingual Transformers
High-accuracy translation quality estimation (QE) that can be easily deployed for a number of language pairs is the missing piece in many commercial translation workflows. Even though there are many systems that can do QE, the majority of these methods work only on the language pair they are trained on and need retraining for new language pairs, which is usually computationally expensive and difficult, especially for low-resource language pairs. As a solution, in this presentation, we introduce TransQuest, a simple QE framework based on cross-lingual transformers. TransQuest outperforms current state-of-the-art quality estimation methods like DeepQuest and OpenKiwi. This is also the winning solution in the recently concluded WMT 2020 sentence-level Direct Assessment shared task, winning all the language pairs with the multilingual
We are working with BBC Arts and the Faculty of Arts on a new engagement project to mark the 300th anniversary of the English Language Novel.
The interdisciplinary project unites research by Wolverhampton’s Research Group for Computational Linguistics, including Dr Sara Moze, Richard Evans and Dr Emad Mohamed, with that of English and Creative Writing staff, led by Dr Aidan Byrne. The project is seeking support from the Arts and Humanities Research Council.
More information about the project is available on the University website: https://www.wlv.ac.uk/about-us/news-and-events/latest-news/2019/august-2019/wolverhampton-researchers-collaborate-on-bbcs-novels-that-shaped-our-world.php
Congratulations to Marcos Zampieri, whose paper has been accepted at NAACL 2019.
Reference: Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar (2019) Predicting the Type and Target of Offensive Posts in Social Media.
You may access the NAACL paper here: https://arxiv.org/abs/1902.09666
We are pleased to announce that the paper titled “Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions” from researchers in RGCL has been accepted into the main track of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019). This is joint work by Omid Rohanian, Shiva Taslimipoor, Le An Ha, Samaneh Kouchaki, and Prof. Ruslan Mitkov.
A preprint of this paper will soon be available on arXiv.
This week we have had the pleasure of welcoming Dr Sheila Castilho and Dr Natalia Resende for a one-week research stay at the Research Group in Computational Linguistics. Sheila and Natalia both come from the ADAPT Centre, Dublin, and have come to discuss collaborations with members of our research group. During their stay, both Natalia and Sheila gave the group a talk about their research. Details of the talks can be found below:
Speaker: Dr Sheila Castilho
Date of talk: 19th November 2018
Title: Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation
Abstract: We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context. If we consider only original source text (i.e. not translated from another language, or translationese), then we find evidence showing that human parity has not been achieved. We compare the judgments of professional translators against those of non-experts and discover that those of the experts result in higher inter-annotator agreement and better discrimination between human and machine translations. In addition, we analyse the human translations of the test set and identify important translation issues. Finally, based on these findings, we provide a set of recommendations for future human evaluations of MT.
Speaker: Dr Natalia Resende
Date of talk: 21st November 2018
Title: Classifying nouns in Portuguese into gender categories: a deep learning approach
Abstract: In Portuguese, all nouns are distributed into two gender categories: feminine and masculine. On the one hand, gender can be predicted from the phonological cues present in the endings of the nouns. For example, nouns ending in -a tend to be feminine and nouns ending in -o tend to be masculine. On the other hand, the relationship between word ending and gender is far from being a consistent rule, since nouns ending in other phonemes may be of either gender. In the present study, a connectionist network was trained to classify Portuguese nouns into gender categories considering their phonological structure as a whole. The performance of the network was analysed in detail to check whether the network considers only the endings of the nouns or their whole phonological structure for gender decisions. In addition, we analysed what type of information the network takes into account to decide the gender of nouns whose endings are not predictive of gender. Results show an error-free performance when the network takes into account the phonological information present in the endings of the nouns and frequency effects for nonpredictive endings. The present study has implications for the training of NLP systems when classifying nouns into gender categories.
In August, Professor Alexander Gelbukh began a 12-month sabbatical at RGCL. As part of his visit, Prof. Gelbukh presented a research seminar to the group on ‘Opinion Mining and Sentiment Analysis’. During his time here, he has held many meetings with members of the group to discuss both his research and future opportunities for collaboration.