"Don’t get me wrong…What’s next? Intelligent Translation Memory Systems…"

by Prof Ruslan Mitkov, University of Wolverhampton

Update: the event has now finished (Jan 15th 2021).

Abstract

We witnessed the birth of the modern computer between 1943 and 1946; not long after, in 1949, Warren Weaver wrote his famous memorandum suggesting that translation by machine would be possible. Weaver’s dream did not quite come true. Automatic translation went on to work reasonably well in some scenarios and to do well for gisting purposes. Yet even today, against the background of the promising results recently delivered by statistical Machine Translation (MT) systems such as Google Translate and of the latest developments in Neural Machine Translation and Deep Learning for MT in general, automatic translation often gets it wrong and is not good enough for professional translation.

Consequently, there has been a pressing need for a new generation of tools that assist professional translators reliably and speed up the translation process. Krollman first put forward the reuse of existing human translations in 1971. In 1979, Arthern went further and proposed the retrieval and reuse not only of identical text fragments (exact matches) but also of similar source sentences and their translations (fuzzy matches). It took another decade before the ideas sketched by Krollman and Arthern were commercialised, through the development of computer-aided translation (CAT) tools such as Translation Memory (TM) systems in the early 1990s. These translation tools revolutionised the work of translators, and the last two decades saw dramatic changes in the translation workflow.

TM systems indeed revolutionised the work of translators, and translators who do not benefit from these tools are now a tiny minority. However, while these tools have proven to be very efficient for repetitive and voluminous texts, are they intelligent enough? Unfortunately, they operate mostly on fuzzy (surface) matching, cannot benefit from already translated texts which are synonymous with (or paraphrased versions of) the text to be translated, and can be ‘fooled’ on numerous occasions.
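
To make the limitation concrete, here is a minimal sketch of surface-level fuzzy matching. It is not the matching algorithm of any particular TM product: it assumes word-level similarity computed with Python's standard difflib module, and the function names, the 0.7 threshold and the single-entry memory are hypothetical choices made purely for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_score(segment: str, source: str) -> float:
    """Surface similarity in [0, 1], computed over lowercased word tokens."""
    return SequenceMatcher(None, segment.lower().split(), source.lower().split()).ratio()


def best_match(segment, memory, threshold=0.7):
    """Return (score, source, target) for the closest TM entry,
    or None when no entry reaches the (hypothetical) threshold."""
    scored = [(fuzzy_score(segment, src), src, tgt) for src, tgt in memory]
    best = max(scored, key=lambda item: item[0])
    return best if best[0] >= threshold else None


# Illustrative one-entry translation memory (English source, German target).
memory = [
    ("Press the red button to stop the machine.",
     "Drücken Sie den roten Knopf, um die Maschine anzuhalten."),
]

# One word differs from the stored source: a typical fuzzy match.
near = "Press the green button to stop the machine."
# Same meaning as the stored source, but a very different surface form.
paraphrase = "To halt the device, push the red button."

print(best_match(near, memory))        # retrieved: the surface overlap is high
print(best_match(paraphrase, memory))  # None: the paraphrase is missed
```

The near-identical sentence is retrieved, whereas the paraphrase, although it means the same as the stored source, shares too few words in the same order to reach the threshold.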

What is next in the translation world? We cannot get it wrong, as we cannot let the translation go wrong: it is obvious that the next generation of TM systems will have to be more intelligent. A way forward would be to equip TM tools with Natural Language Processing (NLP) capabilities, which can offer solutions towards this objective. The invited talk will present the latest work by the speaker and his research group towards achieving this. More specifically, the speaker will explain how two NLP methods/tasks, namely paraphrasing and clause splitting, make it possible for TM systems to identify semantically equivalent sentences which are not necessarily identical or syntactically close, and thereby enhance performance. The first evaluation results of this new generation of TM matching technology are already promising…

Speaker’s bio

Prof Dr Ruslan Mitkov has been working in Natural Language Processing (NLP), Computational Linguistics, Corpus Linguistics, Machine Translation, Translation Technology and related areas since the early 1980s. Whereas Prof Mitkov is best known for his seminal contributions to the areas of anaphora resolution and automatic generation of multiple-choice tests, his extensively cited research (more than 260 publications including 17 books, 40 journal articles and 40 book chapters) also covers topics such as machine translation, translation memory and translation technology in general, bilingual term extraction, automatic identification of cognates and false friends, natural language generation, automatic summarisation, computer-aided language processing, centering, evaluation, corpus annotation, NLP-driven corpus-based study of translation universals, text simplification, NLP for people with language disabilities and computational phraseology. Current topics of research interest include the employment of deep learning techniques in translation and interpreting technology as well as conceptual difficulty for text processing and translation. Mitkov is the author of the monograph Anaphora Resolution (Longman) and Editor of the most successful Oxford University Press Handbook, The Oxford Handbook of Computational Linguistics. Current prestigious projects include his role as Executive Editor of the Journal of Natural Language Engineering published by Cambridge University Press and Editor-in-Chief of the Natural Language Processing book series of John Benjamins publishers. Dr Mitkov is also working on the forthcoming Oxford Dictionary of Computational Linguistics (Oxford University Press, co-authored with Patrick Hanks) and the forthcoming second, substantially revised edition of the Oxford Handbook of Computational Linguistics.
Prof Mitkov designed the first international Erasmus Mundus Master programme on Technology for Translation and Interpreting, which was awarded competitive EC funding and which he leads as Project Coordinator. Dr Mitkov has been invited as a keynote speaker at a number of international conferences including conferences on translation and translation technology; he has acted as Programme Chair of various international conferences on Natural Language Processing (NLP), Machine Translation, Translation Technology (including the annual London conference ‘Translation and the Computer’), Translation Studies, Corpus Linguistics and Anaphora Resolution. Dr Mitkov is asked on a regular basis to review for leading international funding bodies and organisations and to act as a referee for applications for Professorships in both North America and Europe. Ruslan Mitkov is regularly asked to review for leading journals, publishers and conferences and to serve as a member of Programme Committees or Editorial Boards. Prof Mitkov has been an external examiner of many doctoral theses and curricula in the UK and abroad, including Master’s programmes related to NLP, Translation and Translation Technology. Dr Mitkov has considerable external funding to his credit (more than €25,000,000) and is currently acting as Principal Investigator of several large projects, some of which are funded by UK research councils, by the EC as well as by companies and users from the UK and USA. Ruslan Mitkov received his MSc from the Humboldt University in Berlin and his PhD from the Technical University in Dresden, and worked as a Research Professor at the Institute of Mathematics, Bulgarian Academy of Sciences, Sofia. Mitkov is Professor of Computational Linguistics and Language Engineering at the University of Wolverhampton, which he joined in 1995 and where he set up the Research Group in Computational Linguistics.
His Research Group has emerged as an internationally leading unit in applied Natural Language Processing and has been ranked as world No.1 in different international NLP competitions. In addition to being Head of the Research Group in Computational Linguistics, Prof Mitkov is also Director of the Research Institute in Information and Language Processing. The Research Institute consists of the Research Group in Computational Linguistics and the Research Group in Statistical Cybermetrics, which is another top performer internationally. Ruslan Mitkov is Vice President of ASLING, an international Association for promoting Language Technology. Dr Mitkov is a Fellow of the Alexander von Humboldt Foundation, Germany, a Marie Curie Fellow and Distinguished Visiting Professor at the University of Franche-Comté in Besançon, France; he also serves as Vice-Chair for the prestigious EC funding programme ‘Future and Emerging Technologies’. In recognition of his outstanding professional/research achievements, Prof Mitkov was awarded the title of Doctor Honoris Causa at Plovdiv University in November 2011. At the end of October 2014 Dr Mitkov was also awarded the title of Professor Honoris Causa at Veliko Tarnovo University.

CONTACT DETAILS


RGCL
University of Wolverhampton
Wulfruna Street
Wolverhampton, WV1 1LY
United Kingdom