Mariya Raphel, P. J. Parvathi, Rizwana Yasmin Hashim, Rohan J Thevara, P. Deepasree Varma
Title: Analysing Gender and Age Aspects of Cyberbullying through Online Social Media
Published in: 2021 International Conference on Advances in Computing and Communications (ICACC), 2021-10-21
DOI: 10.1109/ICACC-202152719.2021.9708197
Abstract
In this paper, we focus on identifying cyberbullies and categorizing them by age and gender. The dataset we analyze is the MySpace group data labeled for cyberbullying. Machine learning classifiers are trained on this data to detect cyberbullies, and we then analyze the age and gender patterns of the detected cyberbullies. We look for features that are simple to extract and still yield good results. Because appropriate training data is often hard to obtain in machine learning, especially in the domain of cyberbullying detection, we also use cross-validation to examine to what extent smaller amounts of training data can still yield good results. Our findings show that employing a few expressive features has a significant benefit in detecting cyberbullies, particularly when the training set is small.
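The evaluation idea in the abstract, cross-validating a classifier while shrinking the amount of training data, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the data is synthetic, the two features (a profanity count and a second-person pronoun count) are hypothetical, and the nearest-centroid classifier is a stand-in, since the abstract does not name the classifiers or features used.

```python
import random


def make_toy_data(n=200, seed=42):
    """Synthetic stand-in data (NOT the MySpace dataset).

    Each sample has two hypothetical features: a profanity count and a
    second-person pronoun count. Label 1 = bullying, 0 = not bullying.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        base = 3.0 if label else 1.0
        X.append([max(0.0, rng.gauss(base, 1.0)), max(0.0, rng.gauss(base, 1.0))])
        y.append(label)
    return X, y


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_centroids(X, y):
    """Nearest-centroid model: the mean feature vector of each label."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cross_validate(X, y, k=5, train_fraction=1.0, seed=0):
    """Mean k-fold accuracy, training on only a fraction of each training split."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(X), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        rng.shuffle(train_idx)
        keep = max(2, int(len(train_idx) * train_fraction))
        train_idx = train_idx[:keep]
        model = fit_centroids([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(predict(model, X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    X, y = make_toy_data()
    for fraction in (0.05, 0.25, 1.0):
        acc = cross_validate(X, y, k=5, train_fraction=fraction)
        print(f"train fraction {fraction:.2f}: mean accuracy {acc:.3f}")
```

Running the script prints the mean accuracy for each training fraction, which is one way to see how a small feature set behaves as training data becomes scarce. The numbers it produces describe only the synthetic data and say nothing about the paper's results.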