RGCL Anniversary Highlights, Day 19
Published on Sep 1, 2022 by RGCL.
Tools & Resources
In our highlight series so far, we have talked about awards, news coverage, research funding and projects, as well as the numerous collaborations RGCL has had throughout the years. In keeping with our group's research excellence, we could not leave out the many tools, datasets and demos RGCL has produced so far.
In a dedicated section of our website, called “Tools, Demos & Resources”, we indexed about 50 resources, available to all, organised in the following 8 categories:
- Core NLP
- Core NLP (utility)
- Language Processing for Assistive Technologies
- Lexicography (Applied NLP)
- NLP for Social Media
- NLP for Technology-Enhanced Learning
- Technology-Enhanced Learning
- Translation Technologies
Each resource is listed under its published name with a link to access it. Complementary information, such as the contact person, a brief description and the language(s) it covers, is also provided. Some of these resources are web demos that everyone can use. Here is a selection of some of the most recent ones:
Turkish Delight NLP Toolkit
Screenshot of the web interface of the TurkishDelightNLP toolkit
TurkishDelightNLP was designed by Dr Burcu Can and her student:
TurkishDelightNLP is a neural Turkish NLP toolkit that performs computational linguistic analyses from morphological level to semantic level that involves tasks such as stemming, morphological segmentation, morphological tagging, part-of-speech tagging, dependency parsing, and semantic parsing, as well as high-level NLP tasks such as named entity recognition. We publicly share the open-source Turkish NLP toolkit through a web interface that allows an input text to be analysed in real-time, as well as the open source implementation of the components provided in the toolkit, an API, and several annotated datasets such as word similarity test set to evaluate word embeddings and UCCA-based semantic annotation in Turkish. This is the first open-source Turkish NLP toolkit that involves a range of NLP tasks in all levels.
Screenshot of the TransQuest’s website where one can find the documentation as well as pre-trained models
TransQuest is an open-source toolkit for Translation Quality Estimation with Cross-lingual Transformers, designed and edited by Dr Tharindu Ranasinghe:
With TransQuest, we have opensourced our research in translation quality estimation which also won the sentence-level direct assessment quality estimation shared task in WMT 2020.
Screenshot of the page hosting information about Inteliterm
Inteliterm is an intelligent multilingual dictionary related to the health and beauty tourism sector designed by Prof Corpas and her team:
Inteliterm allows users to quickly display information about selected terms that are included in the Inteliterm database. It also has a TBX database management module and is linked to a corpus manager that allows searching for concordances, n-grams, etc.