Patterns of observer error in scoring macromorphoscopic traits for population affinity
Leandi Liebenberg, Kyra E Stull, Ericka N L'Abbé
Journal of forensic sciences, published 2025-05-07
DOI: https://doi.org/10.1111/1556-4029.70063
Abstract
Revising methodologies is essential for understanding the limitations and biases inherent in a method, and therefore for obtaining reliable results. Because non-metric methods are subjective, variation in trait scoring and its impact on the accurate classification of biological parameters remain concerns that require further investigation. This study examined the effects of observer experience, familiarity with the method, and different statistical approaches on the repeatability of macromorphoscopic (MMS) traits of the cranium used to estimate population affinity. Seventeen traits were scored on a sample of 10 crania by five observers with varying experience levels. Intra-observer agreement ranged from moderate to perfect, with three traits (inferior nasal margin, nasal bone shape, and nasal overgrowth) demonstrating the lowest agreement. Overall, inter-observer agreement ranged from poor to substantial. After a group discussion on the scoring procedure and subsequent rescoring of the crania, a slight improvement in agreement was observed, with kappa values shifting towards moderate and substantial levels. Each observer exhibited variation in the repeatability of different traits. While general experience did not consistently translate into proficiency with the method, familiarity with the specific traits and scoring procedures contributed to more consistent results. Therefore, method-specific training is crucial before applying the MMS traits in practice. Additionally, the choice of statistical approach, such as the weights applied to Cohen's kappa for a given data type, can influence the perceived reliability of a method. Practitioners should select the weights and tests that are most appropriate for the data type of each trait being analyzed.
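The abstract's point about weighting can be illustrated with a minimal sketch of Cohen's kappa for two observers, computed without weights and with linear and quadratic weights. This is not the authors' code or data: the function name `cohen_kappa`, the five-level ordinal scale, and both score lists are hypothetical and chosen only to show how the same disagreements produce different kappa values under different weighting schemes.

```python
def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's kappa for two raters.

    weights: None (unweighted, suits nominal traits), or "linear" /
    "quadratic" (suit ordinal traits, where near misses count less).
    `categories` must list every possible score in rank order.
    """
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear', or 'quadratic'")
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions and each rater's marginal proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Penalty for a pair of scores: 0 on the diagonal, up to 1 off it.
        if weights is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    obs = sum(disagreement(i, j) * observed[i][j] for i, j in cells)
    exp = sum(disagreement(i, j) * marg_a[i] * marg_b[j] for i, j in cells)
    if exp == 0:
        # Both raters used a single identical category: kappa is undefined.
        return float("nan")
    return 1.0 - obs / exp


# Hypothetical scores for one ordinal trait (scale 1-5) on 10 crania.
categories = [1, 2, 3, 4, 5]
rater_a = [1, 2, 3, 3, 4, 5, 2, 3, 4, 1]
rater_b = [1, 3, 3, 2, 4, 4, 2, 3, 5, 1]

for scheme in (None, "linear", "quadratic"):
    print(scheme, round(cohen_kappa(rater_a, rater_b, categories, scheme), 3))
```

With these made-up scores, where every disagreement is off by a single step, the unweighted kappa is about 0.487, the linear-weighted value about 0.714, and the quadratic-weighted value about 0.872. The observers and scores are identical in all three cases, so the difference comes entirely from the weighting scheme. Weighted kappa assumes ranked categories, so for a nominal trait the unweighted form is the appropriate one.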