Zsolt T. Kosztyán, Tünde Király, Tibor Csizmadia, Attila Imre Katona, Ágnes Vathy-Fogarassy
Engineering Applications of Artificial Intelligence, Volume 156, Article 111039. Published 2025-05-31. Journal Article.
DOI: 10.1016/j.engappai.2025.111039
URL: https://www.sciencedirect.com/science/article/pii/S0952197625010395
Impact factor: 8.0. JCR: Q1 (Automation & Control Systems). Category: Computer Science, Region 2.
Automated research methodology classification using machine learning
Scientific papers have become the primary means of disseminating scientific research, so the ability to classify research papers by different aspects has become essential. Many works have developed classification approaches; however, they focus solely on topic-based classification. No solution has been developed to classify papers by the applied methodology, and the accuracy of existing paper classification methods is not satisfactory. In this study, a novel automated classification methodology using a refined Extreme Gradient Boosting (XGBoost) model is presented to classify the research methods employed in scientific papers. Three article sets, covering quantitative and qualitative research methods, were collected from the fields of tourism, medical science and information systems, consisting of 229, 557 and 787 papers, respectively. The problem was treated as a binary classification task to maintain interpretability. The model was trained and tested on article sets 1 (tourism) and 2 (medical science) and then applied to article set 3 (information systems and tourism). The high accuracy achieved across research fields (90%–95% on average) indicates that the proposed classification model is generalizable and can be applied successfully across disciplines. The automated classifier enables the rapid acquisition of vital information and the identification of significant differences among the methodologies applied in various research domains. A future development direction is to increase the scalability of the proposed model so that it operates efficiently on large volumes of research papers.
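The abstract does not specify the features or the refinements made to XGBoost, so the following is only a minimal sketch of the general idea: a binary quantitative-versus-qualitative classifier trained on two fields and applied to a third. It assumes word-presence features and a toy boosted-stump learner that uses second-order leaf weights in the style of the XGBoost objective. The example texts, the function names (`tokenize`, `fit_boosted_stumps`, `predict`) and the hyperparameters are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase words of three or more letters (an assumed, very crude tokenizer).
    return re.findall(r"[a-z]{3,}", text.lower())


def build_vocab(texts):
    # Vocabulary of every token seen in the training texts, in sorted order.
    seen = Counter()
    for t in texts:
        seen.update(set(tokenize(t)))
    return sorted(seen)


def featurize(text, vocab):
    # Binary word-presence vector; words outside the vocabulary are ignored.
    words = set(tokenize(text))
    return [1.0 if w in words else 0.0 for w in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_boosted_stumps(X, y, rounds=20, lr=0.5, lam=1.0):
    # Boosting with logistic loss: each round adds one depth-1 tree (stump).
    # Leaf weights are -G / (H + lam), with G and H the summed gradients and
    # Hessians in the leaf; the split score is the sum of G^2 / (H + lam).
    n, d = len(X), len(X[0])
    scores = [0.0] * n
    stumps = []
    for _ in range(rounds):
        p = [sigmoid(s) for s in scores]
        g = [pi - yi for pi, yi in zip(p, y)]   # first-order gradients
        h = [pi * (1.0 - pi) for pi in p]       # second-order gradients
        g_all, h_all = sum(g), sum(h)
        best = None
        for j in range(d):
            g1 = sum(gi for gi, x in zip(g, X) if x[j] > 0.5)
            h1 = sum(hi for hi, x in zip(h, X) if x[j] > 0.5)
            g0, h0 = g_all - g1, h_all - h1
            gain = g1 * g1 / (h1 + lam) + g0 * g0 / (h0 + lam)
            if best is None or gain > best[0]:
                best = (gain, j, -g0 / (h0 + lam), -g1 / (h1 + lam))
        _, j, w0, w1 = best
        stumps.append((j, lr * w0, lr * w1))
        for i, x in enumerate(X):
            scores[i] += lr * (w1 if x[j] > 0.5 else w0)
    return stumps


def predict(stumps, x):
    # Sum the leaf weights of all stumps; a positive score means class 1.
    score = sum(w1 if x[j] > 0.5 else w0 for j, w0, w1 in stumps)
    return 1 if score > 0 else 0


# Hypothetical toy data: 1 = quantitative, 0 = qualitative.
train = [
    ("Regression analysis of survey data from hotel guests", 1),        # tourism
    ("Thematic analysis of interviews with tour guides", 0),            # tourism
    ("Statistical tests on a patient survey sample", 1),                # medical
    ("Focus group interviews with nurses using grounded theory", 0),    # medical
    ("Structural equation model estimated from questionnaire survey data", 1),
    ("Ethnographic case study based on interviews of residents", 0),
]
held_out = [  # a third field (information systems), unseen during training
    ("Survey data on software adoption analysed with regression", 1),
    ("Interviews with developers about agile practices", 0),
]

vocab = build_vocab([text for text, _ in train])
X_train = [featurize(text, vocab) for text, _ in train]
y_train = [label for _, label in train]
stumps = fit_boosted_stumps(X_train, y_train)

predictions = [predict(stumps, featurize(text, vocab)) for text, _ in held_out]
accuracy = sum(p == label for p, (_, label) in zip(predictions, held_out)) / len(held_out)
print("held-out predictions:", predictions, "accuracy:", accuracy)
```

With real corpora, the stump learner would normally be replaced by a full gradient-boosting library and the word-presence vectors by richer text features. The split between training fields and a held-out field mirrors the evaluation protocol described in the abstract.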
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.