Depression Detection on Social Media With User Network and Engagement Features Using Machine Learning Methods
Aik Seng Liaw, Hui Na Chua
2022 IEEE International Conference on Artificial Intelligence in Engineering and Technology (IICAIET)
Publication date: 2022-09-13
DOI: https://doi.org/10.1109/IICAIET55139.2022.9936814
Citations: 2
Abstract
Depression is a complicated mental health disorder with many different forms and symptoms. Traditional methods of detecting and diagnosing depression face barriers, including social stigma and societal labeling. As social media platforms became commonplace for information sharing, the anonymity they offer lowered these barriers considerably. One alternative under active research is to build machine learning models for depression detection from social media data. To that end, this research incorporates new user network and user engagement features into machine learning models for detecting depression among Twitter users. These two feature groups provide additional insight into users and may significantly affect depression detection. A Twitter dataset is constructed that includes users' following lists and their history of liked tweets, data not examined in prior studies. Ten machine learning models are built from five different machine learning algorithms, each tested on two sets of features. Models with the proposed features outperformed models without them, with the best model reaching 82.05% for both accuracy and F1 score. The study found that the most important feature is the number of depression keywords in liked tweets, which has at least twice the gain of 88% of the other features used. Topic modelling features for liked tweets also have high gain and are important in detecting depression. Additionally, features derived from original tweets, replies, and liked tweets have higher gain, and are therefore more important for detecting depression, than features derived from retweets and quote tweets.
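As a rough illustration of the feature the abstract ranks highest, the number of depression keywords in a user's liked tweets, the following minimal sketch counts keyword occurrences across a list of tweet texts. The keyword list, the tokenizer, and the function name `count_depression_keywords` are assumptions made for this example; the abstract does not specify the lexicon or preprocessing the authors used.

```python
import re
from typing import Iterable

# Hypothetical keyword list, for illustration only. The abstract does not
# give the lexicon used in the study.
DEPRESSION_KEYWORDS = frozenset(
    {"depressed", "depression", "hopeless", "worthless", "lonely"}
)

# Simple lowercase word tokenizer (an assumption, not the paper's method).
_TOKEN = re.compile(r"[a-z']+")


def count_depression_keywords(
    liked_tweets: Iterable[str],
    keywords: frozenset = DEPRESSION_KEYWORDS,
) -> int:
    """Return the total number of keyword occurrences across all liked tweets."""
    total = 0
    for text in liked_tweets:
        tokens = _TOKEN.findall(text.lower())
        total += sum(1 for token in tokens if token in keywords)
    return total


# Made-up example data: three liked tweets for one user.
liked = [
    "Feeling so hopeless and lonely tonight",
    "Great match today!",
    "Depression is not weakness; depressed people need support",
]
print(count_depression_keywords(liked))  # prints 4
```

In a pipeline like the one the abstract describes, a count of this kind would be one column in a user's feature vector, alongside the topic modelling and user network features.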