Branislava Šandrih, from the University of Belgrade, has spent a week with us at RGCL. During this time, she has formed collaborations with many members of the group. Branislava gave a talk to the group which outlined her research.
Title: Fingerprints in SMS messages
The presentation describes a study that seeks to answer the following questions:
- Is it possible to tell who sent a short message by analysing only the distribution of characters, and not the meaning of the content itself?
- If so, how reliable would that judgment be?
- Are we leaving some kind of ‘fingerprints’ when we text, and can we tell anything about a person based on the way this person writes short messages?
A multilingual corpus of SMS messages was collected from a single smartphone to underpin the development of a methodology to address the above questions. First, a binary classifier was trained to distinguish between messages composed and sent by a public service (e.g. parking service, bank reports, etc.) and messages written by humans. A second classifier addresses the more challenging task of distinguishing between messages written by the owner of the smartphone and messages sent by other senders.
Branislava’s presentation outlined the experiments related to the above classifiers and reported the evaluation results.
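To make the idea of classifying messages by character distribution more concrete, here is a minimal sketch of the first task (service versus human). It is not the method used in the study, whose features and learning algorithm are not described in this announcement. It assumes single-character counts as features and a small hand-written multinomial Naive Bayes model, and the class name `CharNaiveBayes` and the four training messages are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


class CharNaiveBayes:
    """Multinomial Naive Bayes over character counts (illustrative only)."""

    def fit(self, messages, labels):
        self.class_totals = Counter(labels)        # messages per class
        self.char_counts = defaultdict(Counter)    # character counts per class
        for text, label in zip(messages, labels):
            # Only characters are counted; word meaning is never inspected.
            self.char_counts[label].update(text)
        self.vocab = set()
        for counts in self.char_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        n_messages = sum(self.class_totals.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_totals.items():
            counts = self.char_counts[label]
            # Add-one smoothing, with one extra slot for unseen characters.
            denom = sum(counts.values()) + len(self.vocab) + 1
            score = math.log(n_label / n_messages)
            for ch, n in Counter(text).items():
                score += n * math.log((counts.get(ch, 0) + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data, not taken from the corpus described above.
train_messages = [
    "PARKING ZONE 2 PAID UNTIL 14:35",
    "YOUR BALANCE IS 1250.00 RSD",
    "hey are you coming tonight?",
    "ok see you later :)",
]
train_labels = ["service", "service", "human", "human"]

clf = CharNaiveBayes().fit(train_messages, train_labels)
print(clf.predict("CODE 4821 VALID 10 MIN"))
print(clf.predict("are you home yet?"))
```

The same sketch could be applied to the second task by relabelling the messages as written by the phone owner or by someone else. With only four training messages, it says nothing about how reliable such a classifier would be on real data.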