Lisa Kaati, Elias Lundeqvist, A. Shrestha, Maria Svensson
Author Profiling in the Wild
2017 European Intelligence and Security Informatics Conference (EISIC), September 2017
DOI: 10.1109/EISIC.2017.32
In this paper, we use machine learning to profile authors of online textual media, focusing on determining an author's gender and age. We use two different approaches: one in which the features are learned from raw data and one in which the features are manually extracted. To understand how well author profiling works in the wild, we test our models on domains other than those they were trained on. Our results show that applying a model to a domain other than the one it was trained on significantly decreases its performance. This suggests that more effort is needed to make models domain independent if techniques such as author profiling are to be used operationally, for example by training on many different datasets and by using domain-independent features.
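The cross-domain evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the three stylistic features, the nearest-centroid classifier, the placeholder labels `class_a` and `class_b`, and the tiny "blog" and "forum" samples are all hypothetical, chosen only to show the evaluation pattern of training on one domain and scoring on both that domain and another.

```python
import math
from collections import Counter

FIRST_PERSON = {"i", "me", "my"}  # assumed feature vocabulary, illustrative only

def features(text):
    """Manually extracted stylistic features (hypothetical choices)."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    words = [w for w in words if w]
    n = max(len(words), 1)
    return [
        sum(len(w) for w in words) / n,            # average word length
        sum(w in FIRST_PERSON for w in words) / n,  # first-person pronoun rate
        text.count("!") / n,                        # exclamation marks per word
    ]

def train_centroids(samples):
    """Map each label to the mean feature vector of its training texts."""
    sums, counts = {}, Counter()
    for text, label in samples:
        vec = features(text)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, text):
    """Return the label whose centroid is closest in Euclidean distance."""
    vec = features(text)
    return min(centroids, key=lambda label: math.dist(vec, centroids[label]))

def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the given label."""
    return sum(predict(centroids, t) == y for t, y in samples) / len(samples)

# Toy, made-up samples: NOT data from the paper.
train_blog = [
    ("I love my garden! It makes me happy!", "class_a"),
    ("My weekend was great! I went hiking!", "class_a"),
    ("Quarterly infrastructure expenditure exceeded projections.", "class_b"),
    ("Regulatory compliance requires extensive documentation.", "class_b"),
]
test_blog = [
    ("I baked bread with my sister!", "class_a"),
    ("Municipal transportation budgets remain constrained.", "class_b"),
]
test_forum = [
    ("anyone know a fix for this driver crash", "class_a"),
    ("reinstall, then check the logs", "class_b"),
]

model = train_centroids(train_blog)
print("in-domain accuracy:   ", accuracy(model, test_blog))
print("cross-domain accuracy:", accuracy(model, test_forum))
```

Comparing the two printed numbers is the core of the evaluation pattern: the same trained model is scored once on held-out text from its training domain and once on text from a different domain. With samples this small the numbers carry no meaning beyond demonstrating the procedure.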