At the end of February, RGCL welcomed Sheila Castilho from Dublin City University. During her visit she gave a lecture comparing phrase-based statistical machine translation (PBSMT) and neural machine translation (NMT) systems. The lecture was well received and was also attended by the Research Group’s MA students.
TITLE: A multifaceted comparison between PBSMT and NMT systems
ABSTRACT
Since the inception of machine translation (MT), new techniques have regularly generated high expectations, often followed by disappointing results. Qualitative improvements have tended to be incremental rather than exponential.
Statistical machine translation (SMT) became the dominant MT paradigm in the 2000s and, since then, MT for production has become mainstream. Recently, neural MT has emerged as a promising new MT paradigm. Its impressive results in automatic evaluation, outperforming phrase-based statistical MT systems despite many years of SMT development, have raised interest in both academia and industry.
This presentation will report a multifaceted comparison between statistical and neural machine translation systems that were developed for translating data from Massive Open Online Courses. The study covers four language pairs: English to German, Greek, Portuguese, and Russian.
Translation quality is evaluated using automatic metrics and human evaluation, carried out by professional translators. Results show that neural MT is preferred in side-by-side ranking, and is found to contain fewer overall errors. Results are less clear-cut for some error categories, and for temporal and technical post-editing effort. In addition, results are reported based on sentence length, showing advantages and disadvantages depending on the particular language pair and MT paradigm.