{"title":"小组讨论提高了基于系统评价的定性数据的评级类别的可靠性和有效性。","authors":"Jutta Beher, Eric Treml, Brendan Wintle","doi":"10.1371/journal.pone.0326166","DOIUrl":null,"url":null,"abstract":"<p><p>The number of literature reviews in the fields of ecology and conservation has increased dramatically in recent years. Scientists conduct systematic literature reviews with the aim of drawing conclusions based on the content of a representative sample of publications. This requires subjective judgments on qualitative content, including interpretations and deductions. However, subjective judgments can differ substantially even between highly trained experts that are faced with the same evidence. Because classification of content into codes by one individual rater is prone to subjectivity and error, general guidelines recommend checking the produced data for consistency and reliability. Metrics on agreement between multiple people exist to assess the rate of agreement (consistency). These metrics do not account for mistakes or allow for their correction, while group discussions about codes that have been derived from classification of qualitative data have shown to improve reliability and accuracy. Here, we describe a pragmatic approach to reliability testing that gives insights into the error rate of multiple raters. Five independent raters rated and discussed categories for 23 variables within 21 peer-reviewed publications on conservation management plans. Mistakes, including overlooking information in the text, were the most common source of disagreement, followed by differences in interpretation and ambiguity around categories. Discussions could resolve most differences in ratings. We recommend our approach as a significant improvement on current review and synthesis approaches that lack assessment of misclassification.</p>","PeriodicalId":20189,"journal":{"name":"PLoS ONE","volume":"20 6","pages":"e0326166"},"PeriodicalIF":2.6000,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12176165/pdf/","citationCount":"0","resultStr":"{\"title\":\"Group discussions improve reliability and validity of rated categories based on qualitative data from systematic review.\",\"authors\":\"Jutta Beher, Eric Treml, Brendan Wintle\",\"doi\":\"10.1371/journal.pone.0326166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The number of literature reviews in the fields of ecology and conservation has increased dramatically in recent years. Scientists conduct systematic literature reviews with the aim of drawing conclusions based on the content of a representative sample of publications. This requires subjective judgments on qualitative content, including interpretations and deductions. However, subjective judgments can differ substantially even between highly trained experts that are faced with the same evidence. Because classification of content into codes by one individual rater is prone to subjectivity and error, general guidelines recommend checking the produced data for consistency and reliability. Metrics on agreement between multiple people exist to assess the rate of agreement (consistency). These metrics do not account for mistakes or allow for their correction, while group discussions about codes that have been derived from classification of qualitative data have shown to improve reliability and accuracy. 
Here, we describe a pragmatic approach to reliability testing that gives insights into the error rate of multiple raters. Five independent raters rated and discussed categories for 23 variables within 21 peer-reviewed publications on conservation management plans. Mistakes, including overlooking information in the text, were the most common source of disagreement, followed by differences in interpretation and ambiguity around categories. Discussions could resolve most differences in ratings. We recommend our approach as a significant improvement on current review and synthesis approaches that lack assessment of misclassification.</p>\",\"PeriodicalId\":20189,\"journal\":{\"name\":\"PLoS ONE\",\"volume\":\"20 6\",\"pages\":\"e0326166\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2025-06-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12176165/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PLoS ONE\",\"FirstCategoryId\":\"103\",\"ListUrlMain\":\"https://doi.org/10.1371/journal.pone.0326166\",\"RegionNum\":3,\"RegionCategory\":\"综合性期刊\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q1\",\"JCRName\":\"MULTIDISCIPLINARY SCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PLoS ONE","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1371/journal.pone.0326166","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Group discussions improve reliability and validity of rated categories based on qualitative data from systematic review.
The number of literature reviews in the fields of ecology and conservation has increased dramatically in recent years. Scientists conduct systematic literature reviews with the aim of drawing conclusions based on the content of a representative sample of publications. This requires subjective judgments on qualitative content, including interpretations and deductions. However, subjective judgments can differ substantially even between highly trained experts who are faced with the same evidence. Because classification of content into codes by one individual rater is prone to subjectivity and error, general guidelines recommend checking the produced data for consistency and reliability. Metrics exist to assess the rate of agreement (consistency) between multiple raters. These metrics do not account for mistakes or allow for their correction, whereas group discussions about codes derived from classification of qualitative data have been shown to improve reliability and accuracy. Here, we describe a pragmatic approach to reliability testing that gives insights into the error rate of multiple raters. Five independent raters rated and discussed categories for 23 variables within 21 peer-reviewed publications on conservation management plans. Mistakes, including overlooking information in the text, were the most common source of disagreement, followed by differences in interpretation and ambiguity around categories. Discussions could resolve most differences in ratings. We recommend our approach as a significant improvement on current review and synthesis approaches that lack assessment of misclassification.
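The abstract refers to agreement metrics for multiple raters without naming a specific one. As a purely illustrative sketch, not code or data from the paper, the snippet below computes Fleiss' kappa, a standard chance-corrected statistic for agreement among several raters who each assign items to one category. The category labels and ratings are hypothetical.

```python
# Illustrative only: Fleiss' kappa for multi-rater category agreement.
# Not taken from the paper; category names and ratings are invented.

from collections import Counter

def fleiss_kappa(ratings, categories):
    """ratings: list of per-item lists, one category label per rater."""
    n_items = len(ratings)
    n_raters = len(ratings[0])

    # Count how many raters chose each category for each item.
    counts = [Counter(item) for item in ratings]

    # Per-item observed agreement P_i.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_items

    # Expected agreement from overall category proportions.
    p_j = [
        sum(c[cat] for c in counts) / (n_items * n_raters)
        for cat in categories
    ]
    p_e = sum(p ** 2 for p in p_j)

    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: 5 raters classify 4 items into 3 categories.
cats = ["protect", "restore", "monitor"]
example = [
    ["protect", "protect", "protect", "restore", "protect"],
    ["monitor", "monitor", "restore", "monitor", "monitor"],
    ["restore", "restore", "restore", "restore", "restore"],
    ["protect", "monitor", "protect", "protect", "monitor"],
]
print(round(fleiss_kappa(example, cats), 3))  # about 0.474
```

Values near 1 indicate near-perfect agreement and values near 0 indicate agreement no better than chance. As the abstract notes, such metrics quantify consistency but do not identify or correct the mistakes that group discussion can resolve.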
Journal introduction:
PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open-access—freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage