Improved Hybrid Model for Classification of Text Documents

Journal: European Journal of Artificial Intelligence and Machine Learning
Publication type: Journal Article
Published: 2023-04-11
DOI: 10.24018/ejai.2023.2.2.22 (https://doi.org/10.24018/ejai.2023.2.2.22)
Citations: 0

Abstract

Universities around the world have senates whose members deliberate, in senate meetings, on matters that affect the smooth running of the institution, including personnel, management, and student matters. A report is generated at the end of each meeting, but these reports are printed on paper or stored in the system without the matters being properly grouped, because no efficient classification model is available. This paper proposes a hybrid of machine learning and deep learning for building an efficient classification model for text documents, and tests it on reports of senate deliberations from the University of Port Harcourt. A dataset covering more than ten years was collected and pre-processed; noise and other non-alphanumeric values were removed during tokenization. Principal component analysis (PCA), a machine learning technique, was used for feature selection, and a long short-term memory (LSTM) network, a deep learning architecture, was used to build the model. An LSTM can retain content in its memory over long spans, which addresses the memory-retention problem of other models. The resulting model reports a classification accuracy of 99%, and the classification application was able to sort senate decisions into categories, which should help eliminate conflicting decisions on the floor of any university senate.
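The pre-processing step mentioned in the abstract (tokenization that drops noise and non-alphanumeric characters) could look roughly like the sketch below. This is not the authors' code: the function names, the lowercase ASCII token rule, the reserved padding and unknown-word indices, and the sample sentences are all assumptions made for illustration. The PCA and LSTM stages are not shown; the sketch only prepares fixed-length integer sequences of the kind a sequence model typically takes as input.

```python
import re
from collections import Counter

PAD_ID = 0  # assumed index used to pad short sequences
UNK_ID = 1  # assumed index for words not in the vocabulary


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(docs, min_count=1):
    """Map each sufficiently frequent token to an integer id (sorted order)."""
    counts = Counter(tok for doc in docs for tok in tokenize(doc))
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for tok in sorted(counts):
        if counts[tok] >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Turn text into a list of exactly max_len ids (truncate, then pad)."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical senate-report sentences, invented for this example.
docs = [
    "Senate approved staff promotion.",
    "Student exam results approved!",
]
vocab = build_vocab(docs)
encoded = encode("Senate approved budget", vocab, max_len=5)
```

Sorting the tokens before assigning ids keeps the mapping reproducible across runs, which matters when a trained model is later reloaded with the same vocabulary.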