Siamese Long Short-Term Memory for Detecting Conflict of Interest on Scientific Papers

Akhmad Bakhrul Ilmi, D. Purwitasari, C. Fatichah
Journal: IPTEK: The Journal for Technology and Science. Published: 2019-07-26. DOI: 10.12962/j20882033.v30i2.5008 (https://doi.org/10.12962/j20882033.v30i2.5008)
Citation count: 1

Abstract

Siamese Long Short-Term Memory for Detecting Conflict of Interest on Scientific Papers
Scientific articles that are cited by other researchers increase their authors' credibility. However, the citation process can be misused to artificially raise a bibliometric indicator such as a researcher's h-index. Researchers may excessively cite their own works, a practice referred to as self-citation, even when the topics of the references are unrelated to the current article. A further form of misconduct is excessive citation of works by people connected to the researcher, whether coerced or not, referred to as conflict of interest (CoI). The proposed method uses a deep learning approach, Siamese Long Short-Term Memory (LSTM), to recognize subject similarities between a scientific article and its references. Standard text similarity fails to do so because the contextual relatedness of sentences in the articles must be learned. Siamese-LSTM learns the contextual relatedness of sentences in the article using two identical LSTMs. The steps of the proposed method are: (i) word embedding, to obtain weight values for terms while still accounting for their semantic relations; (ii) k-means clustering, to generate training data and reduce the time complexity of Siamese-LSTM learning on scientific articles; (iii) learning the Siamese-LSTM weights from the training data to identify the contextual relatedness of sentences; and (iv) calculating the similarity of a scientific article to its references based on Siamese-LSTM. Empirical experiments are used to analyze the similarity values and the possibility of conflict of interest in an article.
Keywords: Citation, Conflict of Interest, Scientific Text, Deep Learning, Similarity, Text Processing.
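To make the "two identical LSTMs" idea concrete, the following is a minimal sketch of a Siamese comparison, not the authors' implementation. It assumes several things the abstract does not state: the weights are random and untrained, the similarity score is exp(-L1 distance) between the two final hidden states, and the input vectors are made-up stand-ins for word embeddings. The names `TinyLSTM` and `siamese_similarity` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM encoder with random, untrained weights (illustration only)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        cols = input_dim + hidden_dim
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                      for _ in range(hidden_dim)] for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}

    def _gate(self, g, z, act):
        # Affine transform of the concatenated [input, hidden] vector, then activation.
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def encode(self, sequence):
        """Return the final hidden state for a sequence of embedding vectors."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            z = list(x) + h
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("c", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def siamese_similarity(encoder, seq_a, seq_b):
    """Encode both inputs with the SAME encoder; score is exp(-L1 distance) (assumed)."""
    h_a = encoder.encode(seq_a)
    h_b = encoder.encode(seq_b)
    return math.exp(-sum(abs(a - b) for a, b in zip(h_a, h_b)))


# Toy usage: each inner list stands in for one word embedding (values are made up).
encoder = TinyLSTM(input_dim=3, hidden_dim=4)
article = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
reference = [[0.5, -0.2, 0.1]]
score = siamese_similarity(encoder, article, reference)
```

Because both inputs pass through one shared encoder, the score is symmetric and lies in (0, 1], with identical inputs scoring exactly 1. In a trained model the weights would be fitted on labeled pairs, which in the described method come from the k-means clustering step.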