Search Solutions Tutorial on Natural Language Processing.

Dr Michael Oakes

Search Solutions is an annual event run by the Information Retrieval Specialist Group, the section of the British Computer Society with a special interest in search engines. This year it took place on Wednesday 24th November and was held online, with invited speakers from industry talking about their work in information retrieval. The British Computer Society has new offices at 25 Copthall Avenue in London, near the Bank of England. On the day before, a series of tutorials designed to introduce people to related topics was held, including one on search engine evaluation given by Ingo Frommholz from our own computing department.

I gave the tutorial on Natural Language Processing. Unlike the others, it was an all-day event and was held face-to-face. Having taught online during the pandemic, I know that I prefer the closer interaction with students that comes with face-to-face teaching.

The contents of the tutorial were almost the same as the first three weeks of lectures that I give on the MA Computational Linguistics module in RIILP. I used the structure of the textbook by Jurafsky and Martin as a skeleton, but brought in other material, such as the practical exercises on stemming and automatic part-of-speech tagging from the Edinburgh Textbooks in Empirical Linguistics. Stemming covers techniques for treating different grammatical forms of a word as related to each other, and part-of-speech tagging assigns a part-of-speech category (such as noun or verb) to each word in the input sentence. Following the first edition of Jurafsky and Martin, I opened the discussion with a short dialogue between Dave the astronaut and HAL the computer from the film “2001: A Space Odyssey”. What natural language techniques would HAL need in order to carry out this conversation?
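For readers who have not met stemming or part-of-speech tagging before, the following is a minimal Python sketch of both ideas. It is not taken from the tutorial exercises or from either textbook: the suffix list, the toy lexicon, the tag names and the function names `naive_stem` and `lookup_tag` are all assumptions made here for illustration, and real stemmers and taggers are considerably more sophisticated.

```python
# Illustrative sketch only: a crude suffix-stripping stemmer and a
# dictionary-lookup tagger. All rules and names below are hypothetical.

# Hypothetical suffix list, checked in this order (longer suffixes first).
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


# Hypothetical toy lexicon mapping lower-cased words to a part-of-speech tag.
TOY_LEXICON = {
    "the": "DET",
    "open": "VERB",
    "pod": "NOUN",
    "bay": "NOUN",
    "doors": "NOUN",
}


def lookup_tag(tokens, default="NOUN"):
    """Tag each token by lexicon lookup, falling back to a default tag."""
    return [(token, TOY_LEXICON.get(token.lower(), default)) for token in tokens]


if __name__ == "__main__":
    # Different grammatical forms are reduced to one shared stem.
    print([naive_stem(w) for w in ["opened", "opening", "opens"]])
    # Each word in the input receives a part-of-speech category.
    print(lookup_tag(["Open", "the", "pod", "bay", "doors"]))
```

The stemmer treats "opened", "opening" and "opens" as forms of the same word by reducing each to "open", while the tagger simply looks each word up and guesses "NOUN" for anything it does not know. A lookup of this kind cannot resolve words that belong to more than one category, which is one reason practical taggers also take the surrounding context into account.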

At the event, I was pleased to see some old friends in the audience, including Ingo in the morning, before his own workshop began.

More details are available at: