Although Machine Translation (MT) quality has improved considerably in recent years, most notably through advances in Neural Machine Translation (NMT), Translation Memories (TMs) still offer some advantages over MT systems. They not only translate previously seen sentences ‘perfectly’ but also offer ‘near perfect’ translation quality when highly similar source sentences are retrieved from the TM. As a result, in Computer-Assisted Translation (CAT) workflows, the MT system is often used as a back-off mechanism when the TM fails to retrieve a fuzzy match above a certain similarity threshold, even though it has been shown that this basic integration method is not always the optimal TM-MT combination strategy.
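The back-off integration described above can be illustrated with a minimal sketch. It assumes a token-level similarity ratio from Python's `difflib` as a stand-in for a real fuzzy match metric, a TM held as a list of (source, target) pairs, and a hypothetical `mt_translate` callable; the function names and the 0.8 threshold are illustrative choices, not taken from any particular CAT tool.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Token-level similarity in [0, 1]; a stand-in for a real fuzzy match metric."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def best_match(source, tm):
    """Return (score, tm_source, tm_target) for the most similar TM entry, or None if the TM is empty."""
    best = None
    for tm_src, tm_tgt in tm:
        score = fuzzy_score(source, tm_src)
        if best is None or score > best[0]:
            best = (score, tm_src, tm_tgt)
    return best


def translate_with_backoff(source, tm, mt_translate, threshold=0.8):
    """Use the TM target if the best fuzzy match reaches the threshold; otherwise back off to MT."""
    match = best_match(source, tm)
    if match is not None and match[0] >= threshold:
        return match[2], "TM"
    return mt_translate(source), "MT"


if __name__ == "__main__":
    # Toy TM and a placeholder MT system (both hypothetical).
    tm = [("the cat sat on the mat", "de kat zat op de mat")]
    fake_mt = lambda s: "<MT output for: " + s + ">"
    print(translate_with_backoff("the cat sat on the rug", tm, fake_mt))
    print(translate_with_backoff("completely different sentence here", tm, fake_mt))
```

In this sketch the TM and the MT system never interact: each sentence is served by exactly one of the two, which is the limitation that motivates looking for tighter TM-MT combinations.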
We present a simple yet powerful data augmentation method for boosting Neural Machine Translation (NMT) performance by leveraging information retrieved from a Translation Memory (TM). Tests on the DGT-TM data set for multiple language pairs show consistent and substantial improvements over a range of baseline systems. Given its ease of implementation, the method appears promising for any translation environment in which a sizeable TM is available and a certain amount of repetition across translations is to be expected.
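As a rough illustration of what TM-based data augmentation for NMT could look like, the sketch below appends the target side of the best fuzzy match to the source sentence before it is passed to the NMT system. This is an assumption made for illustration only: the text above does not specify the augmentation format, and the separator token `@@@`, the 0.5 threshold, the `difflib`-based similarity, and all function names are hypothetical.

```python
from difflib import SequenceMatcher

SEPARATOR = "@@@"  # hypothetical boundary token between source text and retrieved translation


def similarity(a: str, b: str) -> float:
    """Token-level similarity in [0, 1]; a stand-in for a real fuzzy match metric."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def augment_source(source, tm, threshold=0.5, allow_exact=False):
    """Append the target side of the best fuzzy match to the source sentence.

    With allow_exact=False, TM entries whose source is identical to the input
    are skipped, which is one way to avoid a training sentence retrieving itself.
    If no match reaches the threshold, the source is returned unchanged.
    """
    best_score, best_target = 0.0, None
    for tm_src, tm_tgt in tm:
        if not allow_exact and tm_src == source:
            continue
        score = similarity(source, tm_src)
        if score >= threshold and score > best_score:
            best_score, best_target = score, tm_tgt
    if best_target is None:
        return source
    return f"{source} {SEPARATOR} {best_target}"


if __name__ == "__main__":
    tm = [
        ("the cat sat on the mat", "de kat zat op de mat"),
        ("good morning", "goedemorgen"),
    ]
    print(augment_source("the cat sat on the rug", tm))
    print(augment_source("unrelated words only", tm))
```

Under this assumed scheme the NMT architecture itself stays untouched; only the training and test inputs change, which is consistent with the ease of implementation claimed above.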