Author Archives: c.orasan

Datasets annotated with signs of syntactic complexity

In our paper:

Evans, R., & Orasan, C. (2013). Annotating signs of syntactic complexity to support sentence simplification. In I. Habernal & V. Matousek (Eds.), Text, Speech and Dialogue. Proceedings of the 16th International Conference TSD 2013. Plzen, Czech Republic: Springer. pp. 92–104.

we present the annotation of a dataset that is used by our syntactic simplification method to identify places where rewriting rules have to be applied in order to produce simpler sentences. 

The datasets are available in XML format as three independent files, each representing a different genre.

Each file contains a list of sentences annotated using the following format:

<S ID="2"><SIGN ID="2" CLASS="SSEV">That</SIGN> is 
<SIGN ID="3" CLASS="HELP">,</SIGN> a high-fibre diet, 
fluid <SIGN ID="4" CLASS="CLN">,</SIGN> etc.</S>

Sentences are marked with the S tag and signs with the SIGN tag. The type of each sign is encoded in the CLASS attribute. The sentences were annotated in isolation, so the files do not contain coherent texts, but sequences of sentences extracted from different files.
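As an illustration of how the format can be read, here is a minimal Python sketch that uses the standard library's xml.etree.ElementTree to list the signs in each sentence. It assumes the S elements sit inside a single root element; the root name SENTENCES and the helper read_signs are hypothetical and are not taken from the released files, so adjust them to match the actual data.

```python
import xml.etree.ElementTree as ET

# Sample sentence from this post, wrapped in a hypothetical root element
# (SENTENCES is an assumed name; the real files may use a different one).
SAMPLE = """<SENTENCES>
<S ID="2"><SIGN ID="2" CLASS="SSEV">That</SIGN> is
<SIGN ID="3" CLASS="HELP">,</SIGN> a high-fibre diet,
fluid <SIGN ID="4" CLASS="CLN">,</SIGN> etc.</S>
</SENTENCES>"""


def read_signs(xml_text):
    """Return one dict per S element with its ID, plain text and signs."""
    root = ET.fromstring(xml_text)
    sentences = []
    for s in root.iter("S"):
        # Each sign is recorded as (ID, CLASS, surface text).
        signs = [
            (sign.get("ID"), sign.get("CLASS"), sign.text)
            for sign in s.iter("SIGN")
        ]
        # Join all text fragments and normalise whitespace and line breaks.
        text = " ".join("".join(s.itertext()).split())
        sentences.append({"id": s.get("ID"), "text": text, "signs": signs})
    return sentences


if __name__ == "__main__":
    for sentence in read_signs(SAMPLE):
        print(sentence["id"], sentence["text"])
        for sign_id, sign_class, sign_text in sentence["signs"]:
            print("  sign", sign_id, sign_class, repr(sign_text))
```

To process one of the released files rather than the inline sample, the file contents can be read into a string and passed to read_signs, provided the file is well-formed XML with a single root element.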

To understand the differences between the classes and how the annotation process was carried out, please consult the annotation guidelines. Specific questions about the annotation should be sent to Richard Evans. A demo of the sign tagger is available at http://rgcl.wlv.ac.uk/demos/SignTaggerWebDemo/

You can find out more about our approach to syntactic simplification in our recent paper:

Evans, Richard, and Constantin Orǎsan. 2018. “Identifying Signs of Syntactic Complexity for Rule-Based Sentence Simplification.” Natural Language Engineering. https://doi.org/10.1017/S1351324918000384.

Lecturer/Senior Lecturer in Translation Technology

The Research Group in Computational Linguistics at the University of Wolverhampton (http://rgcl.wlv.ac.uk) is currently recruiting a Lecturer/Senior Lecturer in Translation Technology (permanent). The purpose of this post is to strengthen the research group by enhancing its research and publications in the field of translation technology. The appointed candidate will be expected to produce REF-returnable outputs, attract external income, seek industrial collaborations, teach at Masters level and supervise PhD students. He/she will join a recently appointed research fellow and two PhD students in translation technology. All these posts are part of a university investment in the area of translation technology.

Continue reading

2 PhD studentships in Translation Technology

The Research Group in Computational Linguistics invites applications for TWO 3-year PhD studentships in the area of translation technology. These two PhD studentships are part of a larger university investment, which includes other PhD students and members of staff, with the aim of strengthening the existing research undertaken by members of the group in this area. These funded student bursaries consist of a stipend towards living expenses (£14,500 per year) and remission of fees.

Continue reading

Journal of Natural Language Engineering: Call for special issue

The area of Natural Language Engineering, and Natural Language Processing in general, is following the trend of many other areas in becoming highly specialised, with a number of application-orientated and narrow-domain topics emerging or growing in importance. These developments, often coinciding with a lack of related literature, necessitate and warrant the publication of specialised volumes focusing on a specific topic of interest to the Natural Language Processing (NLP) research community.

The Journal of Natural Language Engineering (JNLE), which now features six 160-page issues per year and has increased its impact factor for the third consecutive year, invites proposals for special issues on a competitive basis. Proposals may address any topic in applied NLP that has emerged as an important recent development and has attracted the attention of a number of researchers or research groups. In recent years, Calls for Proposals for special issues have resulted in high-quality outputs and this year we look forward to another successful competition.

Continue reading

Jobs in translation technology at the Research Group in Computational Linguistics

The Research Group in Computational Linguistics at the University of Wolverhampton is currently recruiting a Reader in Translation Technology (permanent) and a Research Fellow in Translation Technology (3-year position with the possibility of extension). The purpose of these posts is to strengthen the research group by enhancing its research and publications in the field of translation technology. The appointed candidates will be expected to produce REF-returnable outputs, attract external income, seek industrial collaborations, teach at Masters level and supervise PhD students.

Continue reading

Syntactic complexity sign tagger demo released

The successfully completed FIRST project has developed various components which help users to analyse the complexity of texts and rewrite them in order to make them more accessible for readers with Autistic Spectrum Disorder (ASD). These components were integrated in the OpenBook tool, but they cannot be used in isolation. In an attempt to make some of this technology available to other researchers, we have started releasing some of the components individually. The first component to be released as a web demo is the syntactic complexity sign tagger. This is a tool that assigns words and punctuation marks from a predefined set to categories indicating their syntactic linking and bounding functions. Some of these categories are used by our sentence rewriting algorithm.

Continue reading