Disambiguating publication venue titles using association rules

D. Pereira, Eduardo Emanuel Braga da Silva, A. Esmin
Published in: Proceedings of the ... ACM/IEEE Joint Conference on Digital Libraries, pp. 77-86
Publication date: 2014-09-08
DOI: 10.1109/JCDL.2014.6970153 (https://doi.org/10.1109/JCDL.2014.6970153)
Citations: 14

Abstract

Research agencies in several countries evaluate the impact of the scientific publications of research groups to decide where to invest, and one of the most widely used metrics is the quality of the venues in which those groups publish. Several bibliometric indexes have been formulated to measure the quality of a publication venue. However, given a set of citations extracted, for example, from the curricula vitae of a research group, using these indexes effectively requires correctly identifying the publication venue title of each citation. This task is not easy, since publication venues have no unique identifiers. Citations frequently contain abbreviated forms and acronyms, different venues share similar titles, and venues sometimes change their titles, split, or merge, creating new ones. Traditional digital libraries deal with this problem by creating authority files. In this work, we present a twofold contribution: (i) the creation of an authority file for Computer Science publication venues and (ii) a method that uses association rules to disambiguate publication venue titles that originate from citations. The disambiguator is a supervised learning method that uses the authority file to train a classifier, whose generated model is a set of association rules that identify publication venues. Experiments show that our method obtains better results than three state-of-the-art baselines.
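To make the idea of an association-rule classifier trained on an authority file more concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. The toy authority file, the function names (`tokenize`, `mine_rules`, `disambiguate`), the confidence threshold, and the scoring scheme (summing the confidences of matching rules) are all assumptions made for illustration.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy authority file: canonical venue -> known title variants.
AUTHORITY_FILE = {
    "JCDL": [
        "ACM/IEEE Joint Conference on Digital Libraries",
        "Joint Conference on Digital Libraries",
        "JCDL",
    ],
    "SIGIR": [
        "International ACM SIGIR Conference on Research and Development in Information Retrieval",
        "ACM SIGIR Conference",
        "SIGIR",
    ],
}


def tokenize(title):
    """Lowercase a title and return its set of alphanumeric tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", title.lower()))


def mine_rules(authority, max_len=2, min_conf=0.6):
    """Mine rules of the form {tokens} -> venue from the authority file.

    confidence = (#variants containing the tokens AND labeled with venue)
                 / (#variants containing the tokens)
    """
    itemset_count = Counter()  # how many variants contain each itemset
    joint_count = Counter()    # how many of those belong to each venue
    for venue, variants in authority.items():
        for variant in variants:
            tokens = sorted(tokenize(variant))
            for size in range(1, max_len + 1):
                for itemset in combinations(tokens, size):
                    itemset_count[itemset] += 1
                    joint_count[(itemset, venue)] += 1

    rules = []
    for (itemset, venue), support in joint_count.items():
        confidence = support / itemset_count[itemset]
        if confidence >= min_conf:
            rules.append((frozenset(itemset), venue, confidence))
    return rules


def disambiguate(citation_title, rules):
    """Return the venue whose matching rules have the highest total confidence."""
    tokens = tokenize(citation_title)
    scores = defaultdict(float)
    for itemset, venue, confidence in rules:
        if itemset <= tokens:  # the rule fires only if all its tokens appear
            scores[venue] += confidence
    if not scores:
        return None  # no rule matched; the venue stays unresolved
    return max(scores, key=scores.get)


rules = mine_rules(AUTHORITY_FILE)
print(disambiguate("Proc. of the Joint Conf. on Digital Libraries", rules))
print(disambiguate("SIGIR 2014", rules))
```

With this toy data, the abbreviated citation string still resolves to `JCDL`, because tokens such as `joint`, `digital`, and `libraries` appear only in JCDL variants. Ambiguous tokens such as `conference` fall below the confidence threshold and produce no rule. A real system would need a much larger authority file and a pruning strategy for the rule set.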