Modeling Education Studies Indexed in Web of Science Using Natural Language Processing

Tuncer Akbay
Journal: Öğretim Teknolojisi ve Hayat Boyu Öğrenme Dergisi - Instructional Technology and Lifelong Learning
DOI: 10.52911/itall.1193460 (https://doi.org/10.52911/itall.1193460)
Published: 2022-12-31

Abstract

Easier access to information and resources has allowed researchers to conduct more studies and to publish most of them electronically. These publications are indexed in scholarly citation databases such as Web of Science and Scopus, which hold huge volumes of research reports. Although these databases offer search filters, it is still hard to locate publications whose contents are closely related. Artificial intelligence technologies such as Natural Language Processing allow documents to be categorized based on their content. Top2Vec is an unsupervised topic modeling algorithm that lets users group documents semantically. The purpose of the current study is twofold: (1) to give users a way to group documents by applying Natural Language Processing techniques, and (2) to reveal the topics with the highest number of articles indexed in the ‘education scientific disciplines’ category of the Web of Science Core Collection scholarly database in 2021. A Colab notebook was used to write and run the Python code that executes the Top2Vec algorithm. The model yielded 68 distinct topics among the 8125 articles published in 2021 and indexed in the Web of Science database under the Education Scientific Disciplines category. The topics were ranked from the one with the most documents (N=549) to the one with the fewest (N=29), and the findings for the first eight topics are presented and discussed. The eight most-studied topics are: Physics (N=549), Online Education and COVID (N=438), Chemistry (N=381), Math and Reasoning (N=377), Psychology and Emotions (N=257), Educational Diversity (N=228), Health and Life (N=223), and Mentoring and Leadership (N=204).
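
The workflow summarized in the abstract (fit Top2Vec, then rank topics by document count) might look roughly like the sketch below. It is not the author's notebook: the function names `fit_topic_sizes` and `rank_topics` are hypothetical, the model is built with the library's default settings because the abstract does not state the hyperparameters, and the third-party `top2vec` package is assumed to be installed. Only the eight counts reported in the abstract are used as data.

```python
"""Sketch: fit Top2Vec on abstracts, then rank topics by document count."""


def fit_topic_sizes(abstracts):
    """Fit Top2Vec on a list of abstract strings; return {topic_id: size}.

    Assumption: the `top2vec` package is available (for example via
    `pip install top2vec` in a Colab cell). The import is local so that the
    ranking helper below can be used without it.
    """
    from top2vec import Top2Vec

    # Library defaults; the study's actual settings are not given.
    model = Top2Vec(documents=list(abstracts))
    sizes, topic_ids = model.get_topic_sizes()
    return {int(t): int(n) for t, n in zip(topic_ids, sizes)}


def rank_topics(sizes, total_docs, top_n=8):
    """Return the top_n (label, size, share) rows, largest topic first."""
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return [(label, n, n / total_docs) for label, n in ranked[:top_n]]


# Document counts reported in the abstract for the eight largest topics.
REPORTED = {
    "Physics": 549,
    "Online Education and COVID": 438,
    "Chemistry": 381,
    "Math and Reasoning": 377,
    "Psychology and Emotions": 257,
    "Educational Diversity": 228,
    "Health and Life": 223,
    "Mentoring and Leadership": 204,
}

if __name__ == "__main__":
    # Print each topic with its count and its share of the 8125 articles.
    for label, n, share in rank_topics(REPORTED, total_docs=8125):
        print(f"{label:<28} N={n:<4} {share:.1%}")
```

Summing the reported counts shows that the eight largest topics cover 2657 of the 8125 articles, roughly 32.7% of the corpus; the remaining 60 topics share the rest.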