Jacky Casas, S. Berger, Omar Abou Khaled, E. Mugellini, D. Lalanne
Country Localisation of Twitter Users
2021 12th International Conference on Information and Communication Systems (ICICS), published 2021-05-24
https://doi.org/10.1109/ICICS52457.2021.9464545
Abstract
Localising Twitter users is useful when analysing local trends, events, or mood. However, no existing method reaches both high precision and high recall: research projects attempting to localise Twitter users within a precise radius (e.g., 10 km) have correctly localised at most 60% of users. In this paper, we propose classifying users by the country they are located in instead of estimating a precise location. We apply our technique to Switzerland and classify users as being inside or outside the country. Among other features, we use the relations of users to a list of "Swiss Influencers" accounts, that is, accounts which are mostly of interest to Swiss people. A full classification pipeline was implemented and tested. Our best classification models achieved an accuracy of 95%, with a maximum precision of 98% and a maximum recall of 91%. This shows that our binary classification problem, while potentially not specific enough for certain types of applications, can yield significantly more reliable results.
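To make the influencer-relation feature and the reported metrics concrete, the following is a minimal sketch and not the authors' pipeline. It assumes one plausible form of the feature (the share of a user's followed accounts that appear in an influencer list) and computes accuracy, precision, and recall for an inside/outside label. The account names, the function names `influencer_ratio` and `binary_metrics`, and the sample data are all hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def influencer_ratio(followed: Iterable[str], influencers: Set[str]) -> float:
    """Share of a user's followed accounts that are in the influencer list.

    Assumption: the abstract only says "relations of users" to the list;
    a follow ratio is one illustrative way to turn that into a number.
    """
    followed_set = set(followed)
    if not followed_set:
        return 0.0
    return len(followed_set & influencers) / len(followed_set)


def binary_metrics(
    y_true: Sequence[bool], y_pred: Sequence[bool]
) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall), treating True as 'inside the country'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical influencer list and user follow data, for illustration only.
SWISS_INFLUENCERS = {"swiss_news", "sbb_rail", "zurich_events"}

ratio = influencer_ratio(["swiss_news", "sbb_rail", "nasa", "bbc"], SWISS_INFLUENCERS)

# Hypothetical ground truth and predictions for four users.
metrics = binary_metrics(
    [True, True, False, False],
    [True, False, False, False],
)
```

In this toy data the classifier never labels an outside user as inside, so precision is perfect while recall is lower. This mirrors the pattern in the abstract, where maximum precision (98%) exceeds maximum recall (91%).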