Exploring interdisciplinarity of science projects based on text mining

IF 1.8, Zone 4 (Management), JCR Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS
Zhang Xue, Zhiqiang Zhang, Zhengyin Hu
Journal of Information Science. Published 2023-07-03. DOI: 10.1177/01655515231182075 (https://doi.org/10.1177/01655515231182075)
Citations: 0

Abstract

Interdisciplinary research has gradually become one of the main driving forces of original innovation in scientific research, and how to measure the interdisciplinarity of science projects is becoming an important topic in science foundation management. Existing studies mainly measure interdisciplinarity using methods such as academic degree, institutional discipline or the discipline category mapping of journals. This study proposes an approach to mine and capture the different or complementary characteristics of the interdisciplinarity of projects by combining text mining and machine learning methods. First, we construct the classification system and extract a raw matrix of papers and their disciplines according to the discipline categories of the journals in which the references were published. Second, we cut the matrix to summarise the distribution of key disciplines in each paper and extract text features from the abstract and title to form a training set. Finally, we compare and analyse the classification performance of the Naive Bayes model, the Support Vector Machine and the Bidirectional Encoder Representations from Transformers (BERT) model. The model evaluation indicators show that the BERT model achieved the best classification performance. Therefore, the deep pre-trained language model BERT is chosen to predict the discipline distribution of each project. In addition, the different aspects of interdisciplinarity are measured using network coherence and discipline diversity indicators, and experts are invited to evaluate and interpret the results. The proposed approach could be applied to understand discipline integration more deeply from a new perspective.
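The first two steps and the diversity measurement can be illustrated with a small sketch. The abstract does not say which diversity indicators, cut-off rule or journal classification the authors used, so everything below is an assumption made for illustration: the `JOURNAL_CATEGORIES` mapping is hypothetical, the `min_share` threshold stands in for "cutting" the matrix to key disciplines, and Shannon entropy and Rao-Stirling diversity are simply two indicators commonly used in interdisciplinarity studies.

```python
from collections import Counter
from math import log

# Hypothetical journal -> discipline categories mapping (illustrative only,
# not the classification system built in the paper).
JOURNAL_CATEGORIES = {
    "Journal of Information Science": ["Information Science", "Computer Science"],
    "Scientometrics": ["Information Science"],
    "Bioinformatics": ["Biology", "Computer Science"],
}


def discipline_distribution(reference_journals, min_share=0.1):
    """Return {discipline: share} for one paper, keeping only key disciplines.

    Each reference counts once and is split evenly across the categories of
    its journal. Disciplines below `min_share` (an assumed cut-off) are
    dropped and the remaining shares are renormalised to sum to 1.
    """
    counts = Counter()
    for journal in reference_journals:
        cats = JOURNAL_CATEGORIES.get(journal, [])
        for cat in cats:
            counts[cat] += 1.0 / len(cats)
    total = sum(counts.values())
    if total == 0:
        return {}
    shares = {d: c / total for d, c in counts.items()}
    kept = {d: s for d, s in shares.items() if s >= min_share}
    if not kept:
        return {}
    norm = sum(kept.values())
    return {d: s / norm for d, s in kept.items()}


def shannon_entropy(dist):
    """Shannon entropy (natural log) of a discipline distribution."""
    return -sum(p * log(p) for p in dist.values() if p > 0)


def rao_stirling(dist, distance=None):
    """Rao-Stirling diversity: sum over ordered pairs i != j of p_i * p_j * d_ij.

    `distance` is an assumed callable giving the dissimilarity between two
    disciplines; if omitted, every distinct pair has distance 1.
    """
    if distance is None:
        distance = lambda a, b: 1.0
    items = list(dist.items())
    return sum(
        pi * pj * distance(di, dj)
        for di, pi in items
        for dj, pj in items
        if di != dj
    )


if __name__ == "__main__":
    dist = discipline_distribution(["Scientometrics", "Bioinformatics"])
    print(dist)
    print(shannon_entropy(dist), rao_stirling(dist))
```

In the approach described above, a distribution like `dist` would serve as the training label for a paper, and the same indicators would then be applied to the discipline distribution that the classifier predicts for each project.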
Source journal
Journal of Information Science (Engineering and Technology - Computer Science: Information Systems)
CiteScore: 6.80
Self-citation rate: 8.30%
Articles published: 121
Review time: 4 months
About the journal: The Journal of Information Science is a peer-reviewed international journal of high repute covering topics of interest to all those researching and working in the sciences of information and knowledge management. The Editors welcome material on any aspect of information science theory, policy, application or practice that will advance thinking in the field.