Text duplication of papers in four medical related fields

IF 1.5 | Zone 3 (Management Science) | Q2, INFORMATION SCIENCE & LIBRARY SCIENCE
Ping Ni, Lianhui Shan, Yong Li, Xinying An
Journal of Data and Information Science, published 2023-10-13 (Journal Article)
DOI: 10.2478/jdis-2023-0024 (https://doi.org/10.2478/jdis-2023-0024)

Abstract

Purpose: To reveal the typical features of text duplication in papers from four medical fields (basic medicine, health management, pharmacology and pharmacy, and public health and preventive medicine), to analyze the reasons for duplication, and to provide suggestions for the management of medical academic misconduct.

Design/methodology/approach: In total, 2,469 representative Chinese journal papers, submitted by researchers in 2020 and 2021, were included in our research. A plagiarism check was carried out using the Academic Misconduct Literature Check System (AMLC). We generated a corrected similarity index based on the AMLC general similarity index for further analysis. We compared the similarity indices of papers in the four medical fields and revealed their trends over time; differences in similarity index between review and research articles were also analyzed by field. The 143 papers suspected of plagiarism were analyzed further, both by the sections containing duplication and by field of research.

Findings: Papers in the field of pharmacology and pharmacy had the highest similarity index (8.67 ± 5.92%), which was significantly higher than that in the other fields, except health management. The similarity index of review articles (9.77 ± 10.28%) was significantly higher than that of research articles (7.41 ± 6.26%). In total, 143 papers (5.80%) were suspected of plagiarism, with similarity indices ≥ 15%; most were papers on health management (78, 54.55%), followed by public health and preventive medicine (38, 26.58%). Of the 143 papers, 90.21% had duplication in multiple sections, while only 9.79% had duplication in a single section. The distribution of sections with duplication varied among fields: papers in pharmacology and pharmacy were more likely to have duplication in the data/methods and introduction/background sections, whereas papers in health management were more likely to contain duplication in the introduction/background or results/discussion sections. Different structures for papers in different fields may have caused these differences.

Research limitations: There were three limitations to our research. Firstly, we observed that a small number of papers had already been checked earlier. It is unknown who conducted these plagiarism checks, as a check can also be part of other evaluations, such as applications for science and technology projects or awards. If the authors carried out the check, text with high similarity may have been removed before submission, meaning the similarity index in our research may have been lower than the original value. Secondly, only four medical fields were included in our research; additional analysis on a wider scale is required in the future. Thirdly, only a general similarity index was calculated in our study; other similarity indices were not tested.

Practical implications: A comprehensive analysis of similarity indices in four medical fields was performed. We made several recommendations for the supervision of medical academic misconduct, for the formation of criteria for defining suspected plagiarism in medical papers, and for improving the accuracy of text duplication checks.

Originality/value: We quantified the differences between the AMLC general similarity index and the corrected index, described the situation around text duplication and plagiarism in papers from four medical fields, and revealed differences in similarity indices between different article types. We also revealed differences among fields in the sections containing duplication for papers with suspected plagiarism.
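The screening rule described in the abstract (a paper is suspected of plagiarism when its similarity index is ≥ 15%) and the per-field shares reported in the findings can be illustrated with a minimal Python sketch. The record layout, the function names `flag_suspected` and `share_by_field`, and the sample values are assumptions made for illustration only; they are not the authors' code or data, and the sketch does not reproduce how the corrected similarity index was derived.

```python
from collections import Counter

# Cut-off taken from the abstract: similarity index >= 15% means "suspected".
SUSPECT_THRESHOLD = 15.0


def flag_suspected(papers, threshold=SUSPECT_THRESHOLD):
    """Return the papers whose similarity index is at or above the threshold."""
    return [p for p in papers if p["similarity"] >= threshold]


def share_by_field(papers):
    """Percentage of the given papers in each field, rounded to two decimals."""
    counts = Counter(p["field"] for p in papers)
    total = sum(counts.values())
    return {field: round(100 * n / total, 2) for field, n in counts.items()}


# Hypothetical records for illustration; NOT data from the study.
sample = [
    {"field": "health management", "similarity": 22.4},
    {"field": "health management", "similarity": 15.0},
    {"field": "pharmacology and pharmacy", "similarity": 8.7},
    {"field": "basic medicine", "similarity": 3.1},
    {"field": "public health and preventive medicine", "similarity": 17.9},
]

suspected = flag_suspected(sample)
print(len(suspected), "of", len(sample), "papers flagged")
print(share_by_field(suspected))
```

Note that the comparison is inclusive, so a paper at exactly 15% is flagged, which matches the "≥ 15%" wording in the findings.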
Source journal
Journal of Data and Information Science (INFORMATION SCIENCE & LIBRARY SCIENCE)
CiteScore: 3.50
Self-citation rate: 6.70%
Number of published papers: 495
Journal description: JDIS devotes itself to the study and application of theories, methods, techniques, services, and infrastructural facilities that use big data to support knowledge discovery for decision and policy making. Its basic emphasis is on work that is big data-based, analytics-centered, knowledge discovery-driven, and decision making-supporting. Special effort goes to knowledge discovery that detects and predicts structures, trends, behaviors, relations, evolutions and disruptions in research, innovation, business, politics, security, media and communications, and social development, where the big data may include metadata or full content data, text or non-textual data, structured or unstructured data, domain-specific or cross-domain data, and dynamic or interactive data.

The main areas of interest are: (1) New theories, methods, and techniques of big data-based data mining, knowledge discovery, and informatics, including but not limited to scientometrics, communication analysis, social network analysis, tech and industry analysis, competitive intelligence, knowledge mapping, evidence-based policy analysis, and predictive analysis. (2) New methods, architectures, and facilities to develop or improve knowledge infrastructure capable of supporting knowledge organization and sophisticated analytics, including but not limited to ontology construction, knowledge organization, semantic linked data, knowledge integration and fusion, semantic retrieval, domain-specific knowledge infrastructure, and semantic sciences. (3) New mechanisms, methods, and tools to embed knowledge analytics and knowledge discovery into actual operation, service, or managerial processes, including but not limited to knowledge-assisted scientific discovery and data mining-driven intelligent workflows in learning, communications, and management.

Specific topic areas may include: Knowledge organization; Knowledge discovery and data mining; Knowledge integration and fusion; Semantic Web metrics; Scientometrics; Analytic and diagnostic informetrics; Competitive intelligence; Predictive analysis; Social network analysis and metrics; Semantic and interactively analytic retrieval; Evidence-based policy analysis; Intelligent knowledge production; Knowledge-driven workflow management and decision-making; Knowledge-driven collaboration and its management; Domain knowledge infrastructure with knowledge fusion and analytics; Development of data and information services.