{"title":"基于两阶段建模的文本与调查数据组合特征提取与预测","authors":"A. A. Neloy, M. Turgeon","doi":"10.1109/ICDMW58026.2022.00064","DOIUrl":null,"url":null,"abstract":"Deep learning (DL) based natural language processing (NLP) has recently grown as one the fastest research domain and retained remarkable improvement in many applications. Due to the significant amount of data, the adaptation of feature learning and symmetric data efficiency is a critical underlying task in such applications. However, their ability to extract features is limited due to a lack of proper model formation. Moreover, the use of these methods on smaller datasets is unexplored and underdeveloped compared to more popular research areas. This work introduces a two-stage modeling approach to combine classical statistical analysis with NLP problems in a real-world dataset. We effectively layout a combination of the classical statistical model incorporating a stacked ensemble classifier and a DL framework of convolutional neural network (CNN) and Bidirectional Recurrent Neural Networks (Bi-RNN) to structure a more decomposed architecture with lower computational complexity. Additionally, the experimental results illustrating 96.69 % training and 70.56 % testing accuracy and hypothesis testing from our DL models followed by an ablation study empirically demonstrate the validation of our proposed combined modeling technique.","PeriodicalId":146687,"journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Feature Extraction and Prediction of Combined Text and Survey Data using Two-Staged Modeling\",\"authors\":\"A. A. Neloy, M. Turgeon\",\"doi\":\"10.1109/ICDMW58026.2022.00064\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning (DL) based natural language processing (NLP) has recently grown as one the fastest research domain and retained remarkable improvement in many applications. Due to the significant amount of data, the adaptation of feature learning and symmetric data efficiency is a critical underlying task in such applications. However, their ability to extract features is limited due to a lack of proper model formation. Moreover, the use of these methods on smaller datasets is unexplored and underdeveloped compared to more popular research areas. This work introduces a two-stage modeling approach to combine classical statistical analysis with NLP problems in a real-world dataset. We effectively layout a combination of the classical statistical model incorporating a stacked ensemble classifier and a DL framework of convolutional neural network (CNN) and Bidirectional Recurrent Neural Networks (Bi-RNN) to structure a more decomposed architecture with lower computational complexity. Additionally, the experimental results illustrating 96.69 % training and 70.56 % testing accuracy and hypothesis testing from our DL models followed by an ablation study empirically demonstrate the validation of our proposed combined modeling technique.\",\"PeriodicalId\":146687,\"journal\":{\"name\":\"2022 IEEE International Conference on Data Mining Workshops (ICDMW)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE International Conference on Data Mining Workshops (ICDMW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW58026.2022.00064\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW58026.2022.00064","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Feature Extraction and Prediction of Combined Text and Survey Data using Two-Staged Modeling
Deep learning (DL) based natural language processing (NLP) has recently grown as one the fastest research domain and retained remarkable improvement in many applications. Due to the significant amount of data, the adaptation of feature learning and symmetric data efficiency is a critical underlying task in such applications. However, their ability to extract features is limited due to a lack of proper model formation. Moreover, the use of these methods on smaller datasets is unexplored and underdeveloped compared to more popular research areas. This work introduces a two-stage modeling approach to combine classical statistical analysis with NLP problems in a real-world dataset. We effectively layout a combination of the classical statistical model incorporating a stacked ensemble classifier and a DL framework of convolutional neural network (CNN) and Bidirectional Recurrent Neural Networks (Bi-RNN) to structure a more decomposed architecture with lower computational complexity. Additionally, the experimental results illustrating 96.69 % training and 70.56 % testing accuracy and hypothesis testing from our DL models followed by an ablation study empirically demonstrate the validation of our proposed combined modeling technique.