Natural Language Processing for Translation Memories (NLP4TM)

16 June 2015: Authors of a number of selected papers will be invited to submit extended versions to a special issue of Machine Translation.

Translation Memories (TMs) are among the tools most widely used by professional translators. The underlying idea of TMs is that a translator should benefit as much as possible from previous translations by being able to retrieve how a similar sentence was translated before. Although the core idea of these systems relies on comparing segments (typically of sentence length) from the document to be translated with segments from previous translations, most existing TM systems hardly use any language processing for this. Instead of addressing this issue, most of the work on translation memories has focused on improving the user experience by supporting a variety of document formats, providing intuitive user interfaces, etc.
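To make the segment comparison described above concrete, the following is a minimal sketch of surface-level fuzzy matching, assuming a TM stored as a list of (source, target) pairs. The names fuzzy_score and best_match, the 0.7 threshold and the sample segments are illustrative assumptions, not taken from any particular TM tool. The similarity is computed over lowercased word tokens with Python's standard difflib module, which is a stand-in for whatever edit distance measure a real system uses.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Word-level surface similarity in [0, 1]; no linguistic processing."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_match(segment, memory, threshold=0.7):
    """Return (score, source, target) for the most similar stored segment,
    or None if no stored segment reaches the (assumed) threshold."""
    best = None
    for source, target in memory:
        score = fuzzy_score(segment, source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Hypothetical English-French memory, for illustration only.
memory = [
    ("Press the red button to stop the machine.",
     "Appuyez sur le bouton rouge pour arrêter la machine."),
    ("Close the cover before starting.",
     "Fermez le capot avant de démarrer."),
]

match = best_match("Press the green button to stop the machine.", memory)
if match is not None:
    score, source, target = match
    print(f"{score:.3f}  {source}  ->  {target}")
```

Because the comparison is purely token based, a segment that differs only in inflection or word order from a stored one can score poorly. This is the kind of gap that the language processing discussed in this call is meant to close.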

The term "second generation translation memories" has been around for more than ten years, and it promises translation memory software that integrates linguistic processing in order to improve the translation process. This linguistic processing can involve matching of subsentential chunks, edit distance operations between syntactic trees, and incorporation of semantic and discourse information in the matching process. This workshop invites papers presenting second generation translation memories and related initiatives.

Terminologies, glossaries and ontologies are also very useful for translation memories, as they facilitate the task of the translator and help ensure a consistent translation. The field of Natural Language Processing (NLP) has proposed numerous methods for terminology extraction and ontology extraction. Researchers are encouraged to submit papers to the workshop which show how these methods are being successfully applied to translation memories. In addition, papers discussing the integration of Machine Translation and Translation Memories, or studies on the automatic building of translation memories from corpora, are also welcome.

This workshop invites original papers which show how language processing can help translation memories. Topics of interest include, but are not limited to:

  • improving matching and retrieval of segments by using morphological, syntactic, semantic and discourse information
  • automatic extraction of terminologies and ontologies for translation memories
  • integration of named entity recognition and terminologies in matching and retrieval
  • using natural language processing for automatic construction of translation memories
  • extracting and aligning TM segments from a parallel or comparable corpus
  • construction of translation memories using the Internet
  • corpus-based studies about the usefulness of TMs for specific domains
  • development of hybrid TM and MT translation systems
  • study of NLP techniques used by TM tools available on the market

Authors can submit full papers describing original completed research, short papers presenting ongoing research ideas, and demos of working systems.

This workshop is partially supported by the EXPERT project.