A review of cluster analysis techniques and their uses in library and information science research: k-means and k-medoids clustering

IF 1.8 Q2 INFORMATION SCIENCE & LIBRARY SCIENCE
B. Lund, Jinxuan Ma
Journal: Performance Measurement and Metrics
Publication date: 2021-10-13
Publication type: Journal Article
DOI: 10.1108/pmm-05-2021-0026 (https://doi.org/10.1108/pmm-05-2021-0026)
Citations: 12

Abstract

Purpose: This literature review explores the definitions and characteristics of cluster analysis, a machine-learning technique that is frequently implemented to identify groupings in big datasets, and its applicability to library and information science (LIS) research. This overview is intended for researchers who are interested in expanding their data analysis repertory to include cluster analysis, rather than for existing experts in this area.

Design/methodology/approach: A review of LIS articles included in the Library and Information Source (EBSCO) database that employ cluster analysis is performed. An overview of cluster analysis in general (how it works from a statistical standpoint, and how it can be performed by researchers), the most popular cluster analysis techniques and the uses of cluster analysis in LIS is presented.

Findings: The number of LIS studies that employ a cluster analytic approach has grown from about 5 per year in the early 2000s to an average of 35 studies per year in the mid- and late-2010s. The journal Scientometrics has the most articles published within LIS that use cluster analysis (102 studies). Scientometrics is the most common subject area to employ a cluster analytic approach (152 studies). The findings of this review indicate that cluster analysis could make LIS research more accessible by providing an innovative and insightful process of knowledge discovery.

Originality/value: This review is the first to present cluster analysis as an accessible data analysis approach, specifically from an LIS perspective.
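For readers new to the technique named in the title, the following is a minimal sketch of k-means clustering, not the authors' procedure. The function names `k_means` and `squared_distance` and the six toy observations are hypothetical and chosen only for illustration. The sketch alternates between assigning each observation to its nearest centroid and moving each centroid to the mean of its members. K-medoids follows the same outline but restricts each cluster center to be one of the actual observations.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Basic k-means sketch; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k observations chosen at random.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Hypothetical toy data: two visibly separate groups of 2-D observations.
points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (8.5, 7.6), (7.8, 8.3)]
centroids, labels = k_means(points, k=2)
print(labels)
print(centroids)
```

On data this well separated, the first three observations end up in one cluster and the last three in the other. Real datasets usually need feature scaling, several random restarts, and a principled choice of k, none of which this sketch attempts.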
Source journal
Performance Measurement and Metrics (Information Science & Library Science)
CiteScore: 2.20
Self-citation rate: 0.00%
Articles published: 1
Journal description:
■ Quantitative and qualitative analysis
■ Benchmarking
■ The measurement and role of information in enhancing organizational effectiveness
■ Quality techniques and quality improvement
■ Training and education
■ Methods for performance measurement and metrics
■ Standard assessment tools
■ Using emerging technologies
■ Setting standards or service quality