Expert consensus-derived evaluation criteria for orthodontic treatment outcomes using a novel ranking method: A retrospective dental cast analysis study

IF 1.9 | JCR Q2 | Dentistry, Oral Surgery & Medicine

International Orthodontics | Pub Date: 2025-09-25 | DOI: 10.1016/j.ortho.2025.101057

Huanhuan Chen, Hanwei Zheng, Yue Lai, Wei Li, Chenda Meng, Tianyi Wang, Guangying Song, Bing Han, Tianmin Xu

Volume 24, Issue 1, Article 101057. Full text: https://www.sciencedirect.com/science/article/pii/S1761722725000920


Abstract

Objective

This retrospective expert consensus study (PKUSSIRB No.202058145) aimed to establish expert consensus-derived evaluation criteria for orthodontic treatment outcomes using the Merge Ranking Method on post-treatment dental casts.

Material and methods

From patients treated at the Department of Orthodontics between January 2018 and December 2022, 216 cases were randomly selected and evaluated by 65 orthodontic experts using the Merge Ranking Method. Concurrently, three researchers measured nine objective indicators on the 216 post-treatment dental casts. Consistency was analyzed separately for the experts' subjective evaluations and for the researchers' objective measurements. Subjective-to-objective correlation analysis and regression analysis were then used to select the objective indicators significantly correlated with the experts' subjective evaluations, to determine their weights, and to identify the threshold values for grading.
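
The expert-consistency step can be illustrated with a minimal sketch of the two agreement statistics reported in the Results: mean pairwise Spearman's ρ and Kendall's W. This is not the authors' analysis code. The function names and toy data are hypothetical, and the formulas assume complete rankings without ties.

```python
from itertools import combinations

def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two complete rankings without ties."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))

def mean_pairwise_spearman(rankings):
    """Average rho over every pair of raters."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)

def kendalls_w(rankings):
    """Kendall's coefficient of concordance W (no tie correction).

    rankings: one list per rater, each a permutation of ranks 1..n.
    """
    m = len(rankings)           # number of raters
    n = len(rankings[0])        # number of ranked cases
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2  # expected rank sum per case
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Toy data (hypothetical): three raters ranking four casts.
toy = [[1, 2, 3, 4], [1, 2, 4, 3], [2, 1, 3, 4]]
print(mean_pairwise_spearman(toy), kendalls_w(toy))
```

For untied rankings the two statistics are linked by mean ρ = (mW − 1)/(m − 1), where m is the number of raters, which gives a quick sanity check on any implementation.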

Results

The 65 orthodontic experts demonstrated: (1) moderate pairwise consistency (mean Spearman's ρ = 0.560, 95% bootstrap CI: 0.556–0.564), (2) significant group-level concordance across two independent panels (Kendall's W = 0.544–0.606, all P < 0.001), and (3) near-perfect cross-panel reliability for 24 overlapping cases (Kendall's τ-b = 0.833–0.880, P < 0.001), confirming panel homogeneity for subsequent analyses. Inter-rater reliability among the three researchers was excellent (mean ICC = 0.835, 95% CI: 0.788–0.882, range: 0.736–0.920), paralleled by high intra-rater reliability (mean ICC = 0.832, 95% CI: 0.806–0.858, range: 0.715–0.948) across all 216 cases. Six objective indicators (occlusal relationship, overbite, alignment, overjet, occlusal contact, and buccal-lingual inclination) significantly predicted expert evaluations in a regression model (cumulative R² = 0.598, P < 0.001). The four threshold values separating the five outcome grades (Excellent, Good, Fair, Poor, and Worst) were 1.846, 2.454, 3.492, and 4.312.
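
As an illustration of how four cut-offs partition a score into five grades, the sketch below maps a composite score to a grade label. It rests on assumptions not stated in the abstract: that lower scores indicate better outcomes (suggested only by the ordering of the grades and thresholds), and that a score exactly equal to a cut-off falls in the better grade. The name `grade_outcome` and the input score are hypothetical; the abstract does not give the indicator weights needed to compute the score itself.

```python
from bisect import bisect_left

# Cut-offs and grade labels as reported in the abstract.
CUTOFFS = [1.846, 2.454, 3.492, 4.312]
GRADES = ["Excellent", "Good", "Fair", "Poor", "Worst"]

def grade_outcome(score: float) -> str:
    """Map a composite score to one of five grades.

    Assumption: lower score = better outcome, and a score equal to a
    cut-off is assigned to the better (lower) grade.
    """
    # bisect_left counts how many cut-offs lie strictly below the score.
    return GRADES[bisect_left(CUTOFFS, score)]

for s in (1.5, 2.0, 3.0, 4.0, 4.5):
    print(s, grade_outcome(s))
```

Swapping `bisect_left` for `bisect_right` would instead assign boundary scores to the worse grade; the abstract does not say which convention the authors used.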

Conclusions

This expert consensus study demonstrated moderate consistency in subjective orthodontic outcome evaluation, with the occlusal relationship emerging as the primary quality determinant. The developed Merge Ranking Method addressed conventional ranking limitations through its innovative two-stage approach: initial segmented evaluation reduced expert fatigue, while subsequent dynamic adjustments improved borderline case classification.
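
The abstract does not spell out the Merge Ranking Method procedure. The sketch below is therefore not the authors' protocol but a generic illustration of a two-stage idea: rank cases within small segments first, then merge the ranked segments using pairwise comparisons, as in a standard merge of sorted lists. The names `rank_segment`, `merge_ranked`, `two_stage_rank`, and the `better` callback (standing in for an expert's pairwise judgement) are all hypothetical, and the "dynamic adjustments" for borderline cases are not modeled.

```python
from functools import cmp_to_key, reduce

def rank_segment(segment, better):
    """Stage 1: order a small segment from best to worst."""
    def cmp(a, b):
        if better(a, b):
            return -1
        if better(b, a):
            return 1
        return 0
    return sorted(segment, key=cmp_to_key(cmp))

def merge_ranked(left, right, better):
    """Stage 2: merge two ranked lists by comparing their front items."""
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if better(right[j], left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

def two_stage_rank(cases, segment_size, better):
    """Rank within fixed-size segments, then merge segments one by one."""
    if segment_size < 1:
        raise ValueError("segment_size must be positive")
    segments = [
        rank_segment(cases[k:k + segment_size], better)
        for k in range(0, len(cases), segment_size)
    ]
    return reduce(lambda a, b: merge_ranked(a, b, better), segments, [])

# Toy example: numbers stand in for casts, smaller meaning "better".
print(two_stage_rank([5, 3, 9, 1, 7, 2, 8], 3, lambda a, b: a < b))
```

Ranking short segments keeps each individual judgement task small, which is in line with the fatigue-reduction rationale the abstract gives for the segmented first stage.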


Source journal

International Orthodontics (Dentistry, Oral Surgery & Medicine)

CiteScore

2.50

Self-citation rate

13.30%

Review time

26 days

Journal description: A reference journal in orthodontics and its neighboring disciplines. Your reference in dentofacial orthopedics. International Orthodontics is addressed to orthodontists, dentists, stomatologists, maxillofacial surgeons and facial plastic surgeons, as well as their assistants.