Monthly Archives: May 2017

Short term job opportunity: Research Associate – AUTOR

This post is being offered on a casual basis until 31 July 2017

The Research Group in Computational Linguistics at the University of Wolverhampton is currently recruiting a Research Associate to conduct research on the AUTOR project, which aims to help people with autism read and understand text better (for more info on this project, please visit

As a Research Associate you will use relevant NLP technologies, such as lexical, syntactic and semantic processing, to design and implement applications that help AUTOR advance its core mission of developing educational assistance for people with autism.

You should hold a Bachelor’s or Master’s degree, or ideally a PhD, in Information Science, Computer Science or Natural Language Processing, and have experience in software development or employment in these fields. You should have experience of language technologies and resources and be willing to work as part of an extended team researching computational linguistics approaches that support the development of education-assistance tools for people with autism. Knowledge of machine learning is required.

Interview dates are to be confirmed. The start date of the post is to be agreed with the successful candidates. This is a temporary, zero-hours contract.

For informal discussion about the role please contact Dr Victoria Yaneva (

For more information and how to apply online: click here

RGCL Staff Research Seminar

This week Dr Constantin Orasan gave a staff research seminar profiling his current and future research on the Feedback Analysis Tool. The talk was well received and was followed by an interesting debate and questions.

Title:  Presentation of the Feedback Analysis Tool


The Feedback Analyser is an open source intelligent tool designed to analyse feedback provided by participants in various activities. The tool relies on a set of modules to analyse the sentiment in unstructured texts, identify recurring themes that occur in them and allow easy comparison between various activities and the users involved in these activities. The tool produces reports fully automatically, but its real strength is that it allows an analyst to drill down into the data and identify information that could not otherwise be found without significant effort. The idea for the tool started from a discussion with the University Outreach team, who wanted to extract changes in feelings and aspirations towards Higher Education by processing hundreds of pieces of free-text student data in a matter of minutes.

This talk will provide an overview of the modules currently incorporated in the system and present the results of a small-scale pilot. The possibility of developing this tool further will be discussed, and the audience will be invited to give suggestions.
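As a rough illustration of the three functions described in the abstract (sentiment analysis, recurring themes and comparison between activities), here is a minimal Python sketch. It is not the Feedback Analyser's actual implementation: the word lists, the function names and the sample comments are all hypothetical, and a simple lexicon count stands in for whatever sentiment and theme modules the real tool uses.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon and stopword list, for illustration only.
POSITIVE = {"enjoyed", "inspiring", "helpful", "great", "confident"}
NEGATIVE = {"boring", "confusing", "worried", "difficult"}
STOPWORDS = {"the", "a", "an", "i", "it", "was", "and", "to", "of",
             "about", "but", "more", "feel"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Count positive words minus negative words in one comment."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


def recurring_themes(comments, min_count=2):
    """Return content words that appear in at least min_count comments."""
    counts = Counter()
    for comment in comments:
        # Count each word once per comment so one long comment cannot dominate.
        counts.update(set(tokenize(comment)) - STOPWORDS - POSITIVE - NEGATIVE)
    themes = [(word, n) for word, n in counts.items() if n >= min_count]
    # Most frequent first, ties broken alphabetically for a stable report.
    return sorted(themes, key=lambda item: (-item[1], item[0]))


def compare_activities(feedback):
    """Summarise (activity, comment) pairs into a per-activity report."""
    by_activity = defaultdict(list)
    for activity, comment in feedback:
        by_activity[activity].append(comment)
    return {
        activity: {
            "mean_sentiment": sum(map(sentiment_score, comments)) / len(comments),
            "themes": recurring_themes(comments),
        }
        for activity, comments in by_activity.items()
    }


# Made-up sample feedback, not data from the pilot.
feedback = [
    ("campus visit", "I enjoyed the campus tour and feel more confident about university"),
    ("campus visit", "The campus tour was great"),
    ("taster lecture", "The lecture was confusing and boring"),
    ("taster lecture", "The lecture was difficult but helpful"),
]
report = compare_activities(feedback)
for activity, summary in report.items():
    print(activity, summary)
```

Keeping the per-comment scores grouped by activity is what would let an analyst drill down from a summary figure to the individual comments behind it.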

RGCL Welcomes Lut Colman

Last week Lut Colman visited RGCL from the Instituut voor de Nederlandse Taal, Leiden (INT).

The main objective of Lut’s visit was to gain a deeper understanding of Corpus Pattern Analysis (CPA), a corpus-driven technique developed by Prof. Hanks and implemented in the Pattern Dictionary of English Verbs (PDEV), and to test the lexicographic tools used for PDEV in order to establish whether or not they are suitable for her Dutch pilot project.  Whilst Lut was here, she gave a talk on her upcoming research project.

Title: Dutch Verb Patterns Online: A Collocation and Pattern Dictionary of Dutch Verbs


Dutch Verb Patterns Online is a project to be developed at the Dutch Language Institute (INT) in Leiden. A pilot will consist of a collocation and pattern dictionary of a selection of verbs for advanced learners of Dutch as a second language. For that purpose, the institute will form a consortium with two partners who have expertise in developing e-learning material for language learners.

The aim of the project is a database and web application with information sections on verbs for language learners:

1) collocations: semi-fixed lexical combinations and fixed grammatical collocations that need not be defined, such as een fout {maken, begaan} (make a mistake), vertrouwen op (rely on), etc.

2) idioms: expressions that have to be defined because the meaning is opaque, such as de strijdbijl begraven (bury the hatchet)

3) GDEX examples: GDEX stands for "good dictionary examples", i.e. short, representative and illustrative example sentences from a corpus

4) verb patterns: semantically motivated pieces of phraseology in which the valency slots of the verb are occupied by arguments of a particular semantic type (e.g. human, location). Semantic types are realized by lexical sets: lists of words and phrases that occur as collocates. Each pattern corresponds to a meaning. Patterns are identified by means of Corpus Pattern Analysis (CPA), a lexicographical technique used by Patrick Hanks in the Pattern Dictionary of English Verbs (PDEV) and based on his Theory of Norms and Exploitations (Hanks 2013).

The Dutch project aims to combine a pattern dictionary with a collocation application like SketchEngine for Language Learners (SkELL) (Baisa & Suchomel, n.d.). A SkELL for Dutch can be developed before we get started with the more labour-intensive pattern descriptions. Eventually, both functionalities can be merged and included as a plug-in resource in the language material for second language learners. Students will not only have access to patterns and collocation lists separately, but will also be able to see which collocations fill a semantic type in a pattern.
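To make the link between patterns and collocations concrete, here is a minimal Python sketch of how a verb pattern with typed valency slots might be stored and joined to a collocation list. It is an illustration under stated assumptions, not the project's design: the class and function names, the semantic type labels "Error" and "Artifact", the lexical sets and the two patterns given for maken are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical lexical sets: each semantic type is realised by a set of collocates.
LEXICAL_SETS = {
    "Human": {"student", "leraar"},
    "Error": {"fout", "vergissing"},
    "Artifact": {"taart", "foto"},
}


@dataclass
class Pattern:
    """A verb pattern: valency slots mapped to semantic types, plus a meaning."""
    verb: str
    slots: dict   # e.g. {"subject": "Human", "object": "Error"}
    meaning: str


def slot_fillers(pattern, slot, collocates):
    """Return the collocates that belong to the semantic type of one slot."""
    allowed = LEXICAL_SETS[pattern.slots[slot]]
    return sorted(c for c in collocates if c in allowed)


def matching_patterns(patterns, subject, obj):
    """Return the patterns whose subject and object slots accept the given words."""
    return [
        p for p in patterns
        if subject in LEXICAL_SETS[p.slots["subject"]]
        and obj in LEXICAL_SETS[p.slots["object"]]
    ]


# Two illustrative (invented) patterns for the verb "maken".
MAKEN = [
    Pattern("maken", {"subject": "Human", "object": "Error"}, "make a mistake"),
    Pattern("maken", {"subject": "Human", "object": "Artifact"}, "create something"),
]

# Object collocates as a collocation tool might list them (made-up list).
object_collocates = ["fout", "foto", "taart", "vergissing", "ruzie"]

for pattern in MAKEN:
    print(pattern.meaning, slot_fillers(pattern, "object", object_collocates))
```

In this sketch a collocate such as ruzie that belongs to no lexical set is simply left out, which hints at why the pattern descriptions are the more labour-intensive part: every collocate has to be assigned to a semantic type before it can be shown inside a pattern.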


Baisa, V., & Suchomel, V. (n.d.). SkELL: Web Interface for English Language Learning.

Hanks, P. (2013). Lexical Analysis: Norms and Exploitations. MIT Press.