MA in Corpus Lexicography


The Research Group in Computational Linguistics is looking to develop a new Master's course in Corpus Lexicography. To ensure we get the content just right, we need you! Please help us by giving your input via the following short questionnaire. Thank you!

What is Corpus Lexicography?

Corpus Lexicography is an interdisciplinary area of Linguistics that is concerned with the design and construction of modern electronic dictionaries using sophisticated computer tools and large, electronic collections of written and/or spoken text (also known as ‘corpora’).

Why study Corpus Lexicography?

Computers were first employed in the dictionary-making process as early as the 1960s, and the role of technology has become ever more central since then. Long gone are the days when dictionaries were compiled by gathering linguistic evidence on manually written index cards (also known as ‘slips’). In the twenty-first century, all good dictionaries take corpus data as their starting point, and lexicographers depend on a number of technologies – most of them of recent origin – to query corpora online and record dictionary entries in a structured database. These include:

  • personal computers with vast storage capacity, powerful processors, and a fast broadband connection,
  • corpus data, processed using software tools developed in a field of Computer Science called Natural Language Processing and accessed through dedicated querying programs,
  • highly specialized software that allows lexicographers to write dictionary entries, and databases that store and manage this newly created content.

The present course will enable students to learn about the theoretical and practical implications of writing dictionaries from a modern-day, computational perspective. Students will not only learn how to use readily available corpus and lexicography tools (e.g. Sketch Engine and dictionary writing systems), but will also develop advanced computational skills that give them a deeper understanding of the ever-changing field of Lexicography and make them competitive in the workplace.

Who is this course for?

This MA course is a good fit for students who:

  • are passionate about real language and want to explore how people use words to make meanings,
  • are more interested in practical approaches than abstract linguistic theories and speculations about language,
  • have a keen interest in computer science and would like to further develop their skillset (e.g. by learning how to program and by focusing on quantitative skills),
  • are potentially interested in comparing the vocabulary of two or more languages or studying domain-specific vocabulary (e.g. the terminology used in law, medicine, sports, or art),
  • have a degree in Linguistics, Translation or Computer Science.


The MA in Corpus Lexicography will provide specialized training to students who wish to work as lexicographers, consultants, editors or computational linguists at publishing houses or IT companies that specialize in the development of linguistic resources and tools (e.g. language learning apps).

The MA will also provide a sound intellectual platform for students to progress to doctoral-level study and a career in higher education. The Research Institute of Information and Language Processing (RIILP) offers a wide range of possible PhD topics, from Corpus Linguistics and Lexicography to Computational Linguistics, Machine Translation, and Cross-linguistic Studies, which will be of particular interest to prospective MA graduates.