The advanced translation memory tool developed by Rohit Gupta is now available on GitHub at https://github.com/rohitguptacs/TMAdvanced.
Current Translation Memory (TM) systems work at the surface level and lack semantic knowledge while matching. This tool implements an approach to incorporating semantic knowledge, in the form of paraphrasing, into matching and retrieval. Most TM systems use Levenshtein edit distance or some variation of it. This tool implements an efficient way of combining paraphrasing with edit distance, based on greedy approximation and dynamic programming. We have obtained significant improvements in both retrieval and the translation of retrieved segments. More details about the approach and the evaluations are given in the following publications:
Approach: Rohit Gupta and Constantin Orasan. 2014. Incorporating Paraphrasing in Translation Memory Matching and Retrieval. In Proceedings of the European Association for Machine Translation (EAMT-2014).
Human Evaluations: Rohit Gupta, Constantin Orasan, Marcos Zampieri, Mihaela Vela and Josef van Genabith. 2015. Can Translation Memories afford not to use paraphrasing? In Proceedings of EAMT-2015, Antalya, Turkey.
The tool was developed as part of the EXPERT project.
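To illustrate the general idea of paraphrase-aware matching, here is a minimal sketch in Python. It is not the code from the TMAdvanced repository and does not reproduce the published algorithm: it only handles single-word paraphrases inside a standard dynamic-programming edit distance, and it leaves out the greedy approximation the tool uses. The function names, the paraphrase dictionary and the example sentences are all hypothetical.

```python
def edit_distance(source, candidate, paraphrases=None):
    """Word-level Levenshtein distance in which two words listed as
    paraphrases of each other count as a match (substitution cost 0).

    `paraphrases` is a hypothetical dict mapping a word to a set of
    single-word paraphrases; it is an assumption made for this sketch.
    """
    paraphrases = paraphrases or {}
    a, b = source.split(), candidate.split()

    # prev[j] holds the distance between the words of `a` processed so far
    # and the first j words of `b`.
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            same = (
                word_a == word_b
                or word_b in paraphrases.get(word_a, ())
                or word_a in paraphrases.get(word_b, ())
            )
            curr.append(min(
                prev[j] + 1,                      # delete word_a
                curr[j - 1] + 1,                  # insert word_b
                prev[j - 1] + (0 if same else 1)  # match or substitute
            ))
        prev = curr
    return prev[-1]


def fuzzy_score(source, candidate, paraphrases=None):
    """Similarity in [0, 1]: 1 minus the distance normalised by the
    length of the longer segment (a common fuzzy-match formulation)."""
    longest = max(len(source.split()), len(candidate.split()))
    if longest == 0:
        return 1.0
    return 1 - edit_distance(source, candidate, paraphrases) / longest


if __name__ == "__main__":
    # Hypothetical paraphrase pairs, for illustration only.
    pairs = {"car": {"automobile"}, "fast": {"quick"}}
    print(fuzzy_score("the car is fast", "the automobile is quick"))         # 0.5
    print(fuzzy_score("the car is fast", "the automobile is quick", pairs))  # 1.0
```

With plain edit distance the two example segments score only 0.5, which would fall below a typical fuzzy-match threshold. Once the paraphrase pairs are taken into account, they are treated as an exact match.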
The 2nd Call for Papers of the Workshop on Natural Language Processing for Translation Memories (NLP4TM) organised at RANLP 2015 by Constantin Orasan and Rohit Gupta has been published. Information about the topics addressed by the workshop and important dates can be found on the workshop’s webpage.
Speaker: Rohit Gupta (University of Wolverhampton)
Date: 3 June 2015
Abstract: Current Translation Memory (TM) systems work at the surface level and lack semantic ...
Speaker: Dr Corina Forascu (Univ. Al.I. Cuza of Iasi, Romania)
Date: 26 May 2015
Main discussion points:
- How to deal with a less-studied language? (language technologies, with emphasis on Romanian)
Research carried out in the EXPERT project by researchers from the University of Wolverhampton and Saarland University, Germany, is being presented at the European Association for Machine Translation 2015 conference. The work shows how paraphrasing can help translators who use translation memories.
By Patrick Hanks and Sara Može
Research Institute of Information and Language Processing
University of Wolverhampton
No doubt every politically conscious person in Britain has a pretty good idea by now of the main issues selected by the various political parties fighting each other for votes in the upcoming General Election. An obvious way of finding out what those issues are is to read the manifestos of each of the parties.
But linguistic analysis can tell us more than the politicians ever intended to reveal. Linguists working on the DVC project at the University of Wolverhampton have been using corpus-analysis tools such as Adam Kilgarriff’s Sketch Engine to explore the language used in the manifestos of four parties.
The PhD thesis of Miguel Rios entitled Methods for Measuring Semantic Similarity of Texts is now available online in the section dedicated to PhD theses.
The Research Group in Computational Linguistics is happy to announce their new website. Not all the content has been migrated yet, so please bear with us.
The 1st Call for Papers of the Workshop on Natural Language Processing for Translation Memories (NLP4TM) organised at RANLP 2015 by Constantin Orasan and Rohit Gupta has been published. Information about the topics addressed by the workshop and important dates can be found on the workshop’s webpage.
Dr Michael Oakes is presenting a poster at the PARSEME 4th general meeting in Malta on the topic of Measures of collocational strength and flexibility for the identification of MWEs. The work was done in collaboration with Prof. Patrick Hanks and Dr Ismail El Maarouf, and uses data created in the DVC project.