Text document clustering using self organizing map: Theses and dissertations of Universitas Indonesia

Yantine Arsita Br. Panjaitan, I. Surjandari, Asma Rosyidah
Published in: 2017 3rd International Conference on Science in Information Technology (ICSITech), 2017-10-01
DOI: 10.1109/ICSITECH.2017.8257096 (https://doi.org/10.1109/ICSITECH.2017.8257096)

Abstract

Accessibility is a critical aspect for a college library to consider in order to help users search its collections. The Library of Universitas Indonesia, one of Asia's largest libraries with a collection of more than 1,500,000 books, should also attend to accessibility given the size of its holdings. The UI-ana collections, that is, works produced by and associated with Universitas Indonesia, in particular theses (undergraduate and graduate) and dissertations, are among the largest collections in the Universitas Indonesia Library. However, the current collection management system is still based on the submission of items to the library. Because these collections are arranged with no exact criterion, it is hard for users to find theses and dissertations on the same topic. Managing these collections according to a defined criterion is therefore needed to help users search them. This research aims to determine the categories that can represent the theses and dissertations by text mining the abstract of each item from 2005–2015 with a clustering algorithm, the Self-Organizing Map. The study found 139 categories, which will be used to classify the theses and dissertations of Universitas Indonesia.
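The abstract does not describe the preprocessing, map size, or training parameters used in the study, so the following is only a minimal sketch of how abstracts could be clustered with a Self-Organizing Map, not the authors' implementation. It assumes whitespace tokenization, TF-IDF weighting, a small 2x2 grid, and linearly decaying learning rate and neighborhood radius; the sample abstracts and every function name are hypothetical.

```python
import math
import random
from collections import Counter


def tfidf_vectors(docs):
    """Turn whitespace-tokenized documents into L2-normalized TF-IDF vectors."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF (an assumption; other weightings are equally plausible).
        vec = [counts[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, vec):
    """Return the grid coordinate of the unit whose weights are closest to vec."""
    return min(weights, key=lambda unit: sq_dist(weights[unit], vec))


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM with a Gaussian neighborhood and linear decay."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.01)  # neighborhood radius shrinks
            vec = vectors[i]
            bmu = best_matching_unit(weights, vec)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2 * sigma * sigma))
                # Pull each unit toward the input, scaled by grid proximity to the BMU.
                for k in range(dim):
                    w[k] += lr * h * (vec[k] - w[k])
            step += 1
    return weights


# Hypothetical toy abstracts, invented purely for illustration.
docs = [
    "supply chain optimization for manufacturing plants",
    "inventory optimization in supply chain networks",
    "protein expression in tropical plant cells",
    "gene expression analysis of plant cells",
]
vocab, vectors = tfidf_vectors(docs)
weights = train_som(vectors, rows=2, cols=2)

# Each map unit acts as one candidate category; group documents by their BMU.
clusters = {}
for doc, vec in zip(docs, vectors):
    clusters.setdefault(best_matching_unit(weights, vec), []).append(doc)
for unit, members in sorted(clusters.items()):
    print(unit, members)
```

In a sketch like this, the number of map units sets an upper bound on the number of categories, so a study reporting 139 categories would need a considerably larger grid than the 2x2 one assumed here.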