Prior Steps into Knowledge Mapping: Text Mining Application and Comparison

Q3 Social Sciences

Issues in Science and Technology Librarianship, Pub Date: 2023-04-03, DOI: 10.29173/istl2736

Faizhal Arif Santosa


Citations: 1

Abstract


Prior Steps into Knowledge Mapping: Text Mining Application and Comparison

Librarians and the wider knowledge community increasingly use bibliometrics to analyze patterns in knowledge. In practice, data from databases that provide bibliometric information are not always completely clean, so pre-processing is required. Several previous studies have shown that bibliometric analysis begins with a simple pre-processing step. The goal of this research is to use text mining for pre-processing to find the basic terms of the keywords that appear, essentially constructing a controlled vocabulary for a bibliographic dataset. The method is to clean keywords by stemming them in RapidMiner; Bibliometrix was used to compare the results. A total of 85 keywords were combined into basic words. Using the built process, the study finds that the network built from raw data differs from the network built from pre-processed data, which leads to differences in the resulting analysis. The built process can also be reused in a variety of real-world situations.
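As a rough illustration of the idea in the abstract, the Python sketch below merges keyword variants under a shared base form and then compares the co-occurrence edges obtained from raw and normalized keywords. It is not the RapidMiner process used in the study: `crude_stem`, `normalize`, `merge_keywords`, and `cooccurrence_edges` are hypothetical helper names, the suffix rules are a deliberately simple stand-in for a real stemmer, and the sample keywords are invented for the example.

```python
from collections import defaultdict
from itertools import combinations


def crude_stem(word):
    """Very simple suffix stripper (an assumption, not a real stemming algorithm)."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"          # "libraries" -> "library"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]                # "networks" -> "network"
    return word


def normalize(keyword):
    """Lowercase, treat hyphens as spaces, and stem each token."""
    tokens = keyword.replace("-", " ").split()
    return " ".join(crude_stem(token) for token in tokens)


def merge_keywords(keywords):
    """Group raw keywords by their normalized base form."""
    groups = defaultdict(list)
    for keyword in keywords:
        groups[normalize(keyword)].append(keyword)
    return dict(groups)


def cooccurrence_edges(records, transform=lambda keyword: keyword):
    """Return the set of keyword pairs that appear together in any record."""
    edges = set()
    for keywords in records:
        terms = sorted({transform(keyword) for keyword in keywords})
        edges.update(combinations(terms, 2))
    return edges


# Invented sample data: two records that use different spellings of the same terms.
records = [["Libraries", "Text Mining"], ["library", "text-mining"]]
groups = merge_keywords(kw for record in records for kw in record)
raw_edges = cooccurrence_edges(records)
clean_edges = cooccurrence_edges(records, normalize)
print(groups)
print(len(raw_edges), len(clean_edges))
```

In this toy case the raw keywords yield two separate edges while the normalized keywords collapse into one, which is the kind of difference between raw and pre-processed networks that the abstract reports. A production workflow would more plausibly use an established stemmer and a curated list of exceptions.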


Source journal

Issues in Science and Technology Librarianship Social Sciences-Library and Information Sciences

CiteScore

1.00

Self-citation rate

0.00%

Number of publications