Paper accepted at ACL-IJCNLP 2021

Congratulations to one of our PhD students, Tharindu Ranasinghe, who has had a paper accepted at ACL-IJCNLP 2021.

An Exploratory Analysis of Multilingual Word Level Quality Estimation with Cross-Lingual Transformers. 

Authors: Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov

Abstract: Most studies on word level Quality Estimation (QE) of machine translation focus on language-specific models. The obvious disadvantages of these approaches are the need for labelled data for each language pair and the high cost required to maintain several language-specific models. To overcome these problems, we explore different approaches to multilingual word level QE. We show that these QE models perform on par with the current language-specific models. In the case of zero-shot QE, we show that it is possible to accurately predict word level quality for any given new language pair from models trained on other language pairs. Our findings indicate that the word level QE models based on powerful pre-trained transformers that we propose in this paper generalise well across languages, making them more useful in real-world scenarios.

Wolverhampton researchers collaborate on BBC’s “Novels That Shaped Our World”

We are working with BBC Arts and the Faculty of Arts on a new engagement project to mark the 300th anniversary of the English-language novel.

The interdisciplinary project unites research by members of Wolverhampton’s Research Group for Computational Linguistics, including Dr Sara Moze, Richard Evans and Dr Emad Mohamed, with research by English and Creative Writing staff, led by Dr Aidan Byrne. The project is seeking support from the Arts and Humanities Research Council.

More information about the project is available on the University website: https://www.wlv.ac.uk/about-us/news-and-events/latest-news/2019/august-2019/wolverhampton-researchers-collaborate-on-bbcs-novels-that-shaped-our-world.php

Paper accepted at NAACL 2019

Congratulations to Marcos Zampieri, whose paper has been accepted at NAACL 2019.

Reference: Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar (2019). Predicting the Type and Target of Offensive Posts in Social Media.

You may access the NAACL paper here: https://arxiv.org/abs/1902.09666

Paper accepted at NAACL 2019

We are pleased to announce that the paper titled “Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions” from researchers in RGCL has been accepted into the main track of the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019). This is joint work by Omid Rohanian, Shiva Taslimipoor, Le An Ha, Samaneh Kouchaki, and Prof. Ruslan Mitkov.

A preprint of this paper will soon be available on arXiv.

Datasets annotated with signs of syntactic complexity

In our paper:

Evans, R., & Orasan, C. (2013). Annotating signs of syntactic complexity to support sentence simplification. In I. Habernal & V. Matousek (Eds.), Text, Speech and Dialogue. Proceedings of the 16th International Conference TSD 2013. Plzen, Czech Republic: Springer. pp. 92–104.

we present the annotation of a dataset that is used by our syntactic simplification method to identify places where rewriting rules have to be applied in order to produce simpler sentences. 

The datasets are available in XML format as three independent files, each representing a different genre.

Each file contains a list of sentences annotated using the following format:

<S ID="2"><SIGN ID="2" CLASS="SSEV">That</SIGN> is 
<SIGN ID="3" CLASS="HELP">,</SIGN> a high-fibre diet, 
fluid <SIGN ID="4" CLASS="CLN">,</SIGN> etc.</S>

Sentences are marked with the S tag and signs with the SIGN tag. The type of each sign is encoded in the CLASS attribute. The sentences were annotated in isolation, so the files do not contain coherent texts, but sequences of sentences extracted from different files.
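As an illustration of the format described above, here is a minimal Python sketch that reads the S and SIGN elements using only the standard library. The enclosing DOC root element, the function name read_sentences and the dictionary layout are assumptions made for this example; the actual files may be wrapped differently, and this is not the code used by the sign tagger.

```python
import xml.etree.ElementTree as ET

# The example sentence from this post, wrapped in a hypothetical <DOC> root
# element (an assumption: the real files may use a different root).
SAMPLE = (
    '<DOC>'
    '<S ID="2"><SIGN ID="2" CLASS="SSEV">That</SIGN> is '
    '<SIGN ID="3" CLASS="HELP">,</SIGN> a high-fibre diet, '
    'fluid <SIGN ID="4" CLASS="CLN">,</SIGN> etc.</S>'
    '</DOC>'
)


def read_sentences(xml_text):
    """Return one dict per S element: its ID, plain text and annotated signs."""
    root = ET.fromstring(xml_text)
    sentences = []
    # iter() finds S elements at any depth, so the name of the root does not matter.
    for s in root.iter("S"):
        # Each sign is stored as (ID, CLASS, surface text).
        signs = [
            (sign.get("ID"), sign.get("CLASS"), sign.text or "")
            for sign in s.iter("SIGN")
        ]
        # itertext() concatenates the text inside and between the SIGN tags.
        text = "".join(s.itertext())
        sentences.append({"id": s.get("ID"), "text": text, "signs": signs})
    return sentences
```

Calling read_sentences(SAMPLE) yields a single sentence with three signs, whose CLASS values are SSEV, HELP and CLN. To read one of the dataset files instead, pass the contents of the file to the same function.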

To understand the differences between the classes and how the annotation process was carried out, please consult the annotation guidelines. Specific questions about the annotation should be sent to Richard Evans. A demo of the sign tagger is available at http://rgcl.wlv.ac.uk/demos/SignTaggerWebDemo/

You can find out more about our approach to syntactic simplification in our recent paper:

Evans, Richard, and Constantin Orăsan. 2018. “Identifying Signs of Syntactic Complexity for Rule-Based Sentence Simplification.” Natural Language Engineering. https://doi.org/10.1017/S1351324918000384.

Mireille Makary completes her Viva!

Congratulations to Mireille Makary for completing her Viva Voce exam on 17th October. Mireille, a part-time distance RGCL student, was defending her thesis ‘Ranking retrieval systems using minimal human assessments’.

After the Viva, Mireille celebrated with the group in the traditional RGCL way!