Detection of Twitter Bots using DNA-based Entropy Technique
Rosario Gilmary, Akila Venketesan, M. Praveen, Hari R Prasath, Govindasamy Vaiyapuri
2022 First International Conference on Electrical, Electronics, Information and Communication Technologies (ICEEICT), published 2022-02-16
DOI: 10.1109/ICEEICT53079.2022.9768516 (https://doi.org/10.1109/ICEEICT53079.2022.9768516)
Citations: 0
Abstract
Twitter is an interactive microblogging platform where registered users share their thoughts through tweets. Twitter currently has almost 396.5 million users, and the proportion of bots has grown with its popularity: an estimated 52 million Twitter accounts are bots. Bot identification is important for preventing the spread of false information and malware and for protecting the reliability of online discussions. Most techniques focus on Twitter's topological structure and neglect account heterogeneity. Further, they rely on supervised learning, which demands large training sets. In this paper, user behaviors are modeled as DNA sequences. Information gain-based entropy is computed on fragments of the DNA sequences through term frequency-inverse document frequency (TF-IDF) to determine the DNA patterns that are indicative of bots.
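
The abstract does not give the encoding or the exact scoring formula, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a hypothetical three-letter alphabet (A = tweet, C = reply, T = retweet), treats fixed-length substrings as the "fragments", and uses the textbook definitions of information gain and TF-IDF. The names `ACTION_TO_BASE`, `encode`, `kmers`, `information_gain`, and `tf_idf` are illustrative.

```python
import math
from collections import Counter

# Assumed encoding (not taken from the paper): one base per action type.
ACTION_TO_BASE = {"tweet": "A", "reply": "C", "retweet": "T"}


def encode(actions):
    """Turn a list of action names into a DNA-like string."""
    return "".join(ACTION_TO_BASE[a] for a in actions)


def kmers(seq, k):
    """Return all overlapping length-k fragments of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(seqs, labels, fragment):
    """Entropy reduction from splitting accounts on presence of fragment."""
    present = [lab for s, lab in zip(seqs, labels) if fragment in s]
    absent = [lab for s, lab in zip(seqs, labels) if fragment not in s]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def tf_idf(seqs, index, fragment):
    """TF-IDF of fragment in seqs[index], treating each sequence as a document."""
    frags = kmers(seqs[index], len(fragment))
    tf = frags.count(fragment) / len(frags) if frags else 0.0
    df = sum(1 for s in seqs if fragment in s)  # sequences containing fragment
    return tf * math.log(len(seqs) / df) if df else 0.0


# Toy data, invented for illustration only.
seqs = ["TTTTTT", "TTTTTT", "ACTACA", "CATCAC"]
labels = ["bot", "bot", "human", "human"]
ranked = sorted(set(kmers(seqs[0], 3)) | set(kmers(seqs[2], 3)),
                key=lambda f: information_gain(seqs, labels, f),
                reverse=True)
```

In this toy data the fragment `TTT` occurs only in the bot sequences, so splitting on it removes all label uncertainty and it ranks first. How the paper actually combines the TF-IDF weights with the entropy computation is not stated in the abstract.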