Tag Archives: translation memories

The programme for the 2nd Workshop on Natural Language Processing for Translation Memories

The programme and the abstracts of the presentations of the 2nd Workshop on Natural Language Processing for Translation Memories, to be held in conjunction with LREC 2016, are available on the workshop's website. The workshop features three invited speakers, four research papers, a shared task and a round table. We hope to see you in Portorož.

 

Special Issue of the Machine Translation journal: Natural Language Processing for Translation Memories

Guest editors:

  • Constantin Orasan (University of Wolverhampton, UK)
  • Marcello Federico (FBK, Italy)

Submission deadline: May 15, 2016

1. Call For Papers

Translation Memories (TMs) are among the tools most widely used by professional translators. The underlying idea of TMs is that a translator should benefit as much as possible from previous translations by being able to retrieve the way in which a similar sentence was translated before. Moreover, the use of TMs aims to guarantee that new translations follow the client's specified style and terminology. Although the core idea of these systems relies on comparing segments (typically of sentence length) from the document to be translated with segments from previous translations, most existing TM systems hardly use any language processing for this. Instead of addressing this issue, most of the work on translation memories has focused on improving the user experience by supporting a variety of document formats, providing intuitive user interfaces, etc.

TMAdvanced: A tool to retrieve semantically similar matches from a Translation Memory using paraphrases

The advanced translation memory tool developed by Rohit Gupta is now available on GitHub at https://github.com/rohitguptacs/TMAdvanced

Current Translation Memory (TM) systems work at the surface level and lack semantic knowledge when matching segments. Most TM systems use Levenshtein edit distance or some variation of it. This tool implements an efficient approach to incorporating semantic knowledge, in the form of paraphrasing, into edit-distance matching and retrieval. The approach is based on greedy approximation and dynamic programming. We have obtained significant improvements in both retrieval and the translation of retrieved segments. More details about the approach and the evaluations are given in the following publications:

Approach: Rohit Gupta and Constantin Orasan. 2014. Incorporating Paraphrasing in Translation Memory Matching and Retrieval. In Proceedings of the European Association for Machine Translation (EAMT-2014).

Human Evaluations: Rohit Gupta, Constantin Orasan, Marcos Zampieri, Mihaela Vela and Josef van Genabith. 2015. Can Translation Memories afford not to use paraphrasing? In Proceedings of EAMT-2015, Antalya, Turkey.

The tool was developed as part of the EXPERT project.
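
To make the matching idea described above concrete, here is a minimal sketch of word-level Levenshtein matching in which a small paraphrase table lets listed alternatives substitute at no cost. It is an illustration only and is not the algorithm implemented in TMAdvanced, which handles paraphrases through greedy approximation and dynamic programming as described in the publications. The function names, the paraphrase table format and the length-normalised score are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Set


def edit_distance(source: List[str], target: List[str],
                  paraphrases: Optional[Dict[str, Set[str]]] = None) -> int:
    """Word-level Levenshtein distance.

    `paraphrases` is a hypothetical table mapping a source token to the
    target tokens that count as equivalent (single words only).
    """
    paraphrases = paraphrases or {}
    # previous[j] holds the distance between the source prefix processed
    # so far and the first j target tokens.
    previous = list(range(len(target) + 1))
    for i, s_tok in enumerate(source, start=1):
        current = [i]
        for j, t_tok in enumerate(target, start=1):
            same = s_tok == t_tok or t_tok in paraphrases.get(s_tok, set())
            current.append(min(
                previous[j] + 1,                       # delete s_tok
                current[j - 1] + 1,                    # insert t_tok
                previous[j - 1] + (0 if same else 1),  # match or substitute
            ))
        previous = current
    return previous[-1]


def fuzzy_match_score(source: List[str], target: List[str],
                      paraphrases: Optional[Dict[str, Set[str]]] = None) -> float:
    """Similarity in [0, 1]: 1 minus the distance over the longer length."""
    longest = max(len(source), len(target))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(source, target, paraphrases) / longest


if __name__ == "__main__":
    new_segment = "please close the window".split()
    tm_segment = "please shut the window".split()
    toy_table = {"close": {"shut"}}  # made-up paraphrase entry

    print(fuzzy_match_score(new_segment, tm_segment))
    print(fuzzy_match_score(new_segment, tm_segment, toy_table))
```

With the toy table, the stored segment is treated as an exact match even though its surface form differs by one word, which is the kind of match a purely surface-level comparison would rank lower. Handling multi-word paraphrases efficiently is the harder problem that the published approach addresses.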