The use of text-matching software's similarity scores.

IF 2.8 · Zone 1 (Philosophy) · Q1 MEDICAL ETHICS

Accountability in Research-Policies and Quality Assurance · Pub Date: 2023-05-01 · DOI: 10.1080/08989621.2021.1986018

Stewart Manley

Accountability in Research-Policies and Quality Assurance, vol. 30, no. 4, pp. 219-245. Publication type: Journal Article. Open access: no. https://doi.org/10.1080/08989621.2021.1986018

Citations: 4

Abstract


The use of text-matching software's similarity scores.

Popular text-matching software generates a percentage of similarity - called a "similarity score" or "Similarity Index" - that quantifies the matching text between a particular manuscript and content in the software's archives, on the Internet and in electronic databases. Many evaluators rely on these simple figures as a proxy for plagiarism and thus avoid the burdensome task of inspecting the longer, detailed Similarity Reports. Yet similarity scores, though alluringly straightforward, are never enough to judge the presence (or absence) of plagiarism. Ideally, evaluators should always examine the Similarity Reports. Given the persistent use of simplistic similarity score thresholds at some academic journals and educational institutions, however, and the time that can be saved by relying on the scores, a method is arguably needed that encourages examining the Similarity Reports but still also allows evaluators to rely on the scores in some instances. This article proposes a four-band method to accomplish this. Used together, the bands oblige evaluators to acknowledge the risk of relying on the similarity scores yet still allow them to ultimately determine whether they wish to accept that risk. The bands - for most rigor, high rigor, moderate rigor and less rigor - should be tailored to an evaluator's particular needs.


Source journal

Accountability in Research-Policies and Quality Assurance (MEDICAL ETHICS)

CiteScore

4.90

Self-citation rate

14.70%

Articles published

Review time

>12 weeks

Journal description: Accountability in Research: Policies and Quality Assurance is devoted to the examination and critical analysis of systems for maximizing integrity in the conduct of research. It provides an interdisciplinary, international forum for the development of ethics, procedures, standards policies, and concepts to encourage the ethical conduct of research and to enhance the validity of research results. The journal welcomes views on advancing the integrity of research in the fields of general and multidisciplinary sciences, medicine, law, economics, statistics, management studies, public policy, politics, sociology, history, psychology, philosophy, ethics, and information science. All submitted manuscripts are subject to initial appraisal by the Editor, and if found suitable for further consideration, to peer review by independent, anonymous expert referees.