Efficient topic identification for urgent MOOC Forum posts using BERTopic and traditional topic modeling techniques
Authors: Nabila Khodeir, Fatma Elghannam
Journal: Education and Information Technologies
Publication date: 2024-09-17
DOI: https://doi.org/10.1007/s10639-024-13003-4
MOOC platforms provide a means of communication through forums, allowing learners to express their difficulties and challenges while studying various courses. Within these forums, some posts require urgent attention from instructors. Failing to respond promptly to these posts can contribute to higher dropout rates and lower course completion rates. While existing research primarily focuses on identifying urgent posts through various classification techniques, it has not adequately addressed the underlying reasons behind them. This research aims to delve into these reasons and assess the extent to which they vary. By understanding the root causes of urgency, instructors can effectively address these issues and provide appropriate support and solutions. BERTopic utilizes the advanced language capabilities of transformer models and represents an advanced approach in topic modeling. In this study, a comparison was conducted to evaluate the performance of BERTopic in topic modeling on MOOC discussion forums, alongside traditional topic models such as LDA, LSI, and NMF. The experimental results revealed that the NMF and BERTopic models outperformed the other models. Specifically, the NMF model demonstrated superior performance when a lower number of topics was required, whereas the BERTopic model excelled in generating topics with higher coherence when a larger number of topics was needed. For all urgent posts in the dataset, the results were as follows: the optimal number of topics was 6 for NMF and 50 for BERTopic; the coherence scores were 0.66 for NMF and 0.616 for BERTopic; and the IRBO scores were 1 for both models. This highlights the BERTopic model's capability to distinguish and extract diverse topics comprehensively and coherently, aiding in the identification of various reasons behind MOOC Forum posts.
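To give a feel for what an IRBO score of 1 means, the sketch below computes a simplified inverted rank-biased overlap over the top-word lists of a set of topics. It is an illustration only, not the authors' evaluation code: the function names `rbo_truncated` and `irbo`, the weight `p=0.9`, the truncated-and-normalized form of RBO, and the toy topics are all assumptions made here. Under this form, topics that share no top words score 1, and identical topics score 0.

```python
from itertools import combinations


def rbo_truncated(a, b, p=0.9):
    """Rank-biased overlap of two ranked word lists.

    Simplified variant (an assumption, not the paper's implementation):
    truncated at the shorter list's length and normalized so that two
    identical lists score 1.0.
    """
    depth = min(len(a), len(b))
    if depth == 0:
        return 0.0
    seen_a, seen_b = set(), set()
    weighted_sum = 0.0
    for d in range(1, depth + 1):
        seen_a.add(a[d - 1])
        seen_b.add(b[d - 1])
        # Fraction of the top-d words that the two lists have in common.
        agreement = len(seen_a & seen_b) / d
        weighted_sum += (p ** (d - 1)) * agreement
    # Sum of the weights p^0 + ... + p^(depth-1), used for normalization.
    norm = (1 - p ** depth) / (1 - p)
    return weighted_sum / norm


def irbo(topics, p=0.9):
    """Inverted RBO: 1 minus the mean pairwise RBO over all topic pairs."""
    pairs = list(combinations(topics, 2))
    if not pairs:
        raise ValueError("need at least two topics")
    mean_rbo = sum(rbo_truncated(a, b, p) for a, b in pairs) / len(pairs)
    return 1.0 - mean_rbo


# Hypothetical top words for three topics (toy data, not from the paper).
toy_topics = [
    ["deadline", "extension", "submit"],
    ["video", "audio", "play"],
    ["certificate", "payment", "refund"],
]
print(irbo(toy_topics))  # 1.0, because no word appears in more than one topic
```

Published implementations typically use an extrapolated form of RBO, so scores for partially overlapping topics may differ from this sketch; the two extremes (fully disjoint and fully identical topics) behave the same way.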
About the journal:
The Journal of Education and Information Technologies (EAIT) is a platform for the range of debates and issues in the field of Computing Education as well as the many uses of information and communication technology (ICT) across many educational subjects and sectors. It probes the use of computing to improve education and learning in a variety of settings, platforms and environments.
The journal aims to provide perspectives at all levels, from the micro level of specific pedagogical approaches in Computing Education and applications or instances of use in classrooms, to macro concerns of national policies and major projects; from pre-school classes to adults in tertiary institutions; from teachers and administrators to researchers and designers; from institutions to online and lifelong learning. The journal is embedded in the research and practice of professionals within the contemporary global context and its breadth and scope encourage debate on fundamental issues at all levels and from different research paradigms and learning theories. The journal does not proselytize on behalf of the technologies (whether they be mobile, desktop, interactive, virtual, games-based or learning management systems) but rather provokes debate on all the complex relationships within and between computing and education, whether they are in informal or formal settings. It probes state of the art technologies in Computing Education and it also considers the design and evaluation of digital educational artefacts. The journal aims to maintain and expand its international standing by careful selection on merit of the papers submitted, thus providing a credible ongoing forum for debate and scholarly discourse. Special Issues are occasionally published to cover particular issues in depth. EAIT invites readers to submit papers that draw inferences, probe theory and create new knowledge that informs practice, policy and scholarship. Readers are also invited to comment and reflect upon the argument and opinions published. EAIT is the official journal of the Technical Committee on Education of the International Federation for Information Processing (IFIP) in partnership with UNESCO.