Inter-Rater Reliability of EEG-Based Encephalopathy Grading
Ryan A Tesh, Anika Zahoor, Jayme Banks, Kaileigh Gallagher, Christine A Eckhardt, Haoqi Sun, Ioannis Karakis, Roohi Katyal, Jonathan Williams, Chetan Nayak, Aline Herlopian, Marcus C Ng, Adam S Greenblatt, Emma Meyers, Mike Westmeijer, Daniel S Harrison, Wolfgang Ganglberger, Galina Gheihman, Tracey Fan, Aaron F Struck, Irfan S Sheikh, Fábio A Nascimento, M Brandon Westover
Journal of Clinical Neurophysiology. Published 2025-07-02. DOI: 10.1097/WNP.0000000000001185 (https://doi.org/10.1097/WNP.0000000000001185)
Abstract
Purpose: Visual EEG Confusion Assessment Method-Severity (VE-CAM-S) quantifies encephalopathy severity based on electroencephalography features. This study evaluated inter-rater reliability among experts using the VE-CAM-S scale.
Methods: Nine experts from six institutions independently reviewed 32 fifteen-second electroencephalography samples in an online test, assessing 29 features (16 in the VE-CAM-S and 13 additional, or "VE-CAM-S+"). A consensus of three experts served as the gold standard. Performance was measured by the median Matthews correlation coefficient between expert and gold-standard VE-CAM-S+ scores, along with average sensitivity and specificity. Qualitative analysis identified common feature-recognition errors affecting scores.
Results: Experts achieved a median Matthews correlation coefficient of 0.82 [95% CI: 0.74-0.99]. Specificity exceeded 90% for most features except background β (87%) and generalized delta (71%). Sensitivity was ≥65% except for burst suppression with epileptiform activity (61%), extreme delta brush (EDB; 61%), posterior dominant rhythm (50%), background α (59%) and β (42%). Common errors included missing subtle findings, confusing features, and misidentifying extreme delta brush.
Conclusions: This pilot study offers some initial support for the reliability of VE-CAM-S+ scoring. The largest errors occurred when experts missed or falsely identified features with higher weight in the VE-CAM-S. Encephalopathy grading through VE-CAM-S may be improved by breaking high-stakes features into smaller parts, creating a "cheat sheet" with scored examples, and designing teaching materials.
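The performance measures named in the Methods (Matthews correlation coefficient, sensitivity, specificity) can be illustrated with a minimal Python sketch. It assumes each rating is a binary present/absent call compared against the gold standard; the abstract does not specify the exact computation, so this is not the authors' pipeline. The function names and the toy ratings for three raters are hypothetical and are not study data.

```python
import math
from statistics import median


def confusion_counts(expert, gold):
    """Count TP, TN, FP, FN for binary (0/1) ratings against a gold standard."""
    tp = sum(1 for e, g in zip(expert, gold) if e and g)
    tn = sum(1 for e, g in zip(expert, gold) if not e and not g)
    fp = sum(1 for e, g in zip(expert, gold) if e and not g)
    fn = sum(1 for e, g in zip(expert, gold) if not e and g)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def sensitivity(tp, fn):
    """Fraction of gold-standard positives that the rater also marked present."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(tn, fp):
    """Fraction of gold-standard negatives that the rater also marked absent."""
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Toy data (hypothetical, NOT from the study): one feature rated on 8 samples.
gold = [1, 1, 1, 0, 0, 0, 0, 1]
raters = {
    "rater_a": [1, 1, 0, 0, 0, 0, 1, 1],  # one miss, one false alarm
    "rater_b": [1, 1, 1, 0, 0, 0, 0, 1],  # matches the gold standard
    "rater_c": [1, 0, 0, 0, 0, 0, 0, 1],  # two misses, no false alarms
}

# One MCC per rater, then the median across raters as the summary statistic.
mcc_per_rater = {name: mcc(*confusion_counts(r, gold)) for name, r in raters.items()}
median_mcc = median(mcc_per_rater.values())

tp_a, tn_a, fp_a, fn_a = confusion_counts(raters["rater_a"], gold)
sens_a = sensitivity(tp_a, fn_a)
spec_a = specificity(tn_a, fp_a)

if __name__ == "__main__":
    for name, value in mcc_per_rater.items():
        print(f"{name}: MCC = {value:.3f}")
    print(f"median MCC = {median_mcc:.3f}")
    print(f"rater_a sensitivity = {sens_a:.2f}, specificity = {spec_a:.2f}")
```

Unlike raw percent agreement, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it stays informative when a feature is rare in the sample set.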
About the journal:
The Journal of Clinical Neurophysiology features topical reviews and original research in both central and peripheral neurophysiology, as related to patient evaluation and treatment.
Official Journal of the American Clinical Neurophysiology Society.