Author Archives: c.orasan

Jobs in translation technology at the Research Group in Computational Linguistics

The Research Group in Computational Linguistics at the University of Wolverhampton is currently recruiting a Reader in Translation Technology (permanent) and a Research Fellow in Translation Technology (3-year position with the possibility of extension). The purpose of these posts is to strengthen the research group by enhancing its research and publications in the field of translation technology. The appointed candidates will be expected to produce REF-returnable outputs, attract external income, seek industrial collaborations, teach at Masters level and supervise PhD students.

Syntactic complexity sign tagger demo released

The successfully completed FIRST project has developed various components which help users to analyse the complexity of texts and rewrite texts in order to make them more accessible for readers with Autistic Spectrum Disorder (ASD). These components were integrated in the OpenBook tool, but they cannot be used in isolation. In an attempt to make some of this technology available to other researchers, we have started releasing some of the components individually. The first component to be released as a web demo is the syntactic complexity sign tagger. This is a tool that assigns words and punctuation marks from a predefined set to categories indicating their syntactic linking and bounding functions. Some of these categories are used by our sentence rewriting algorithm.

Seminar: An Interpreter’s Wish List

Elena Errico, University of Genoa
An Interpreter’s Wish List
Date and time: Tuesday 7th June, 1.30pm
Room: MC232, City Campus


Interpreting is a very challenging cognitive activity not least because it requires professionals to take translation decisions under very strict time constraints and while performing several Continue reading

The programme for the 2nd Workshop on Natural Language Processing for Translation Memories

The programme and the abstracts of the presentations of the 2nd Workshop on Natural Language Processing for Translation Memories, to be held in conjunction with LREC 2016, are available on the workshop's website. It features three invited speakers, four research papers, a shared task and a round table. We hope to see you in Portorož.


Seminar: Automatic Extraction and Translation of Multiword Expressions


Speaker: Shiva Taslimipoor
Automatic Extraction and Translation of Multiword Expressions
Date and time: Wednesday, March 9th, 2pm
Room: MD083, City Campus

Abstract: Multiword expressions (MWEs) are defined as idiosyncratic interpretations that cross word boundaries or spaces, e.g. frying pan, take a look and take part. They have distinct syntactic and semantic properties that call for special treatment within a computational system.

Seminar: Taxonomies for semantic tagging: how large do they need to be?

Speaker: Dr Paul Rayson, Lancaster University
Title: Taxonomies for semantic tagging: how large do they need to be?
Date and time: Tuesday Feb 9th, 2pm
Room: MI301, City Campus

Abstract: In this presentation, I will describe joint research carried out in the recently completed Samuels project, in which we have applied automatic semantic analysis to two very large corpora of around 1-2 billion words each: Continue reading

Special Issue of the Machine Translation journal: Natural Language Processing for Translation Memories

Guest editors:

  • Constantin Orasan (University of Wolverhampton, UK)
  • Marcello Federico (FBK, Italy)

Submission deadline: May 15, 2016

1. Call For Papers

Translation Memories (TMs) are amongst the tools most widely used by professional translators. The underlying idea of TMs is that a translator should benefit as much as possible from previous translations by being able to retrieve the way in which a similar sentence was translated before. Moreover, the usage of TMs aims to guarantee that new translations follow the client’s specified style and terminology. Despite the fact that the core idea of these systems relies on comparing segments (typically of sentence length) from the document to be translated with segments from previous translations, most existing TM systems hardly use any language processing for this. Instead of addressing this issue, most of the work on translation memories has focused on improving the user experience by supporting a variety of document formats, providing intuitive user interfaces, etc.
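
To make the idea of segment comparison concrete, the following is a minimal sketch of how a TM lookup might work when it relies only on surface string similarity, with no language processing. It is an illustration under stated assumptions, not the method of any particular TM system: the function names, the example memory and the 0.7 threshold are all hypothetical, and the character-level ratio from Python's standard difflib module stands in for whatever fuzzy-match score a real tool computes.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] between two segments."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(segment, memory, threshold=0.7):
    """Return (score, source, target) for the most similar stored segment.

    `memory` is a list of (source, target) pairs from previous translations.
    Returns None when no stored source reaches `threshold` (a hypothetical
    cut-off chosen only for this illustration).
    """
    best = None
    for source, target in memory:
        score = similarity(segment, source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Hypothetical English-French memory, invented for this example.
memory = [
    ("Press the red button to stop the machine.",
     "Appuyez sur le bouton rouge pour arrêter la machine."),
    ("Close the cover before starting.",
     "Fermez le couvercle avant de démarrer."),
]

match = best_match("Press the green button to stop the machine.", memory)
if match is not None:
    score, source, target = match
    print(f"{score:.2f} | {source} -> {target}")
```

Because the comparison is purely character-based, a paraphrase with the same meaning but different wording may score below the threshold and not be retrieved, which is the kind of limitation that language processing could help address.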