A Novel Algorithm for Stacked Generalization Approach to Predict Neurological Disorder over Digital Footprints
Tejaswita Garg, Sanjay K. Gupta
International Journal of Modern Education and Computer Science, published 2023-10-08
DOI: 10.5815/ijmecs.2023.05.05 (https://doi.org/10.5815/ijmecs.2023.05.05)
Digital footprints record an individual's online behavior when communicating over social media platforms. In this paper, sentiment classification is carried out over online posts and tweets to detect at an early stage whether a person has a neurological disorder. This study proposes a Hybrid Optimized Model Ensemble STACKed (HOMESTACK) algorithm built on the stacked generalization approach, which uses stacking and blending ensemble learning techniques. The model is evaluated on two datasets (Reddit Dataset1 and Twitter Dataset2) that include varying numbers of tweets. The data are pre-processed and features are extracted to obtain cleaned text and a vector corpus. The proposed HOMESTACK algorithm is then applied to the training data using four base classifiers (Support Vector Machine, Random Forest, K-Nearest Neighbor and CatBoost) with Logistic Regression as the meta-classifier. The testing data are then fed to the tuned model to compare and analyze the classification results. Stacking and blending ensemble frameworks and algorithms are also proposed in this study. Execution time is measured, and performance is evaluated in terms of Accuracy, Precision, Recall and F1-score. The experimental results show that the proposed HOMESTACK algorithm performed better on the chosen datasets than the blending ensemble and the standalone machine learning classifiers.
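The general idea of stacked generalization described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the HOMESTACK algorithm itself, whose details the abstract does not give. The assumptions are: toy two-dimensional vectors stand in for the extracted vector corpus; only K-Nearest Neighbor is taken from the paper's base classifiers, and a nearest-centroid scorer is a stand-in for the others; three folds are used; and all function names (`knn_fit`, `centroid_fit`, `logreg_fit`, `stack_fit`) are hypothetical.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def knn_fit(X, y, k=3):
    """Base learner: share of positive labels among the k nearest training points."""
    def proba(x):
        nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))[:k]
        return sum(y[i] for i in nearest) / len(nearest)
    return proba

def centroid_fit(X, y):
    """Stand-in base learner (assumption): compares distances to the two class centroids."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    pos = mean([x for x, t in zip(X, y) if t == 1])
    neg = mean([x for x, t in zip(X, y) if t == 0])
    def proba(x):
        # Closer to the positive centroid gives a score above 0.5.
        return sigmoid(math.dist(x, neg) - math.dist(x, pos))
    return proba

def logreg_fit(Z, y, lr=0.5, epochs=500):
    """Meta-classifier: logistic regression trained by plain stochastic gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        for z, t in zip(Z, y):
            g = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) - t
            w = [wi - lr * g * zi for wi, zi in zip(w, z)]
            b -= lr * g
    return lambda z: sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)

def stack_fit(X, y, base_fits, k=3):
    """Train base learners out-of-fold, then fit the meta-classifier on their outputs."""
    n = len(X)
    Z = [[0.0] * len(base_fits) for _ in range(n)]  # out-of-fold meta-features
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held = [i for i in range(n) if i % k == fold]
        for j, fit in enumerate(base_fits):
            model = fit([X[i] for i in train], [y[i] for i in train])
            for i in held:
                Z[i][j] = model(X[i])  # prediction for a point the model never saw
    meta = logreg_fit(Z, y)
    bases = [fit(X, y) for fit in base_fits]  # refit each base learner on all data
    def predict(x):
        return int(meta([m(x) for m in bases]) >= 0.5)
    return predict, Z

# Toy feature vectors (assumption): class 0 clusters near (0, 0), class 1 near (1, 1).
X = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9], [0.0, 0.3], [1.0, 0.7],
     [0.3, 0.0], [0.7, 1.0], [0.2, 0.3], [0.9, 1.0], [0.1, 0.0], [1.0, 0.9]]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

predict, Z = stack_fit(X, y, [knn_fit, centroid_fit])
print(predict([0.15, 0.15]), predict([0.85, 0.95]))
```

The meta-classifier is trained only on out-of-fold predictions, so it never sees a base learner's output for a point that learner was fitted on. Blending, by contrast, typically trains the meta-classifier on base predictions for a single held-out validation split.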