Natural Language Processing

Dr Sanja Štajner, ReadableAI

Automatic Assessment of Conceptual Text Complexity Using Knowledge Graphs

8 March 2021

Abstract

In this talk, I will present in depth the ideas behind our position paper “Automatic Assessment of Conceptual Text Complexity Using Knowledge Graphs”, published at COLING 2018. First, I will define what we mean by the conceptual complexity of texts, what role it plays in text understanding, and why it is important to have an automatic way of assessing it. Next, I will introduce in detail a number of graph-based measures on a large knowledge base that we proposed as features for the automatic assessment of conceptual complexity, and discuss the experimental setup and results. Using a high-quality language learners' corpus for English, we showed that graph-based measures of individual text concepts, as well as the way they relate to each other in the knowledge graph, have high discriminative power when distinguishing between two versions of the same text. Furthermore, when used as features in a binary classification task that aims to choose the simpler of two versions of the same text, our measures achieved high performance even in a default setup.

Bio

Sanja Štajner is currently Chief Research Scientist at ReadableAI and Senior Research Scientist at Symanto. She obtained her PhD in Computer Science, on the topic of Data-Driven Text Simplification, from the University of Wolverhampton (UK), and holds a multiple Master's degree in Natural Language Processing and Human Language Technologies.

Sanja is one of the most cited researchers in the field of Text Simplification, with over 80 peer-reviewed articles in international journals and top-tier NLP/AI conferences. She has received several awards at international conferences for her work in text simplification and is regularly invited as a speaker across academia and industry. She has served as an area chair, reviewer, and member of the scientific committee for many top-tier international NLP/AI conferences, and was the lead organizer of two shared tasks in the field of text simplification (QATS 2016 and CWI 2018).