José Sá Silva, Ana Pereira, Vasco Abreu, João Pedro Filipe
European Spine Journal, pp. 869-873. Published 2025-03-01 (Epub 2024-12-20). DOI: 10.1007/s00586-024-08612-z
Inter-observer variability in the classification of lumbar foraminal stenosis in magnetic resonance imaging using different evaluation scales.
Background: The evaluation of lumbar spine degeneration on magnetic resonance imaging (MRI) is prone to inter-reader variability, including when assessing foraminal changes. This variability, often due to subjective criteria and inconsistent terminology, may affect clinical correlations. Standardized criteria could help improve agreement among readers.
Materials and methods: Lumbar spine MRI examinations of 50 randomly selected patients were evaluated by 12 independent readers. For each patient, foraminal stenosis was assessed using four different rating scales. The first scale classified stenosis as presence or absence of neurologic compromise of the spinal nerve root at the foramen; the second as absent, mild, moderate, or severe; the third as normal, contact of disk or osteophyte with the nerve root, deviation of the nerve root, or compression of the nerve root; and the fourth used the Lee et al. criteria. Agreement was analyzed using Fleiss' kappa coefficients.
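For readers unfamiliar with the statistic, the following is a minimal sketch of how Fleiss' kappa is computed from a table of rating counts. It is not the authors' analysis code; the function name `fleiss_kappa` and the layout of `counts` (one row per rated foramen or patient, one column per scale category, each cell holding the number of readers who chose that category) are assumptions made for illustration.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of readers who assigned subject i to
    category j. Every subject must be rated by the same number of readers.
    (Hypothetical helper, written only to illustrate the formula.)
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    if n_raters < 2:
        raise ValueError("at least two readers per subject are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")

    # Proportion of all ratings that fell into each category.
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement: for each subject, the fraction of reader pairs
    # that agree, averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Agreement expected by chance from the category proportions.
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        return float("nan")  # all ratings in one category: kappa undefined

    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Toy data, not from the study: 4 subjects, 3 readers, 2 categories.
    toy = [[2, 1], [1, 2], [3, 0], [0, 3]]
    print(round(fleiss_kappa(toy), 3))
```

In the study's setting each row would sum to 12 (the number of readers), and the table would have two columns for the first scale and one column per grade for the other scales.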
Results: Agreement was moderate using the first scale (k = 0.439), and significantly lower using the second, third and fourth scales (k = 0.310, k = 0.311, k = 0.295, respectively). When comparing the agreement among board-certified neuroradiologists with that among neuroradiology residents, there were statistically significant differences for the third and fourth scales, where agreement among board-certified neuroradiologists was higher, but still only fair. Individual kappas showed that in the second, third, and fourth scales the levels of agreement were higher at the extremes of the scale, namely, when there was no stenosis or when the stenosis was maximal with nerve compression.
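The "individual kappas" and the verbal labels "moderate" and "fair" can be illustrated with a short sketch. Two assumptions are made here that the abstract does not state: that the individual kappas are per-category Fleiss kappas, and that the labels follow the commonly used Landis and Koch bands. The function names `category_kappas` and `landis_koch_label` are hypothetical.

```python
from typing import List, Sequence


def category_kappas(counts: Sequence[Sequence[int]]) -> List[float]:
    """Per-category Fleiss kappa for a subjects-by-categories count table.

    counts[i][j] is the number of readers who assigned subject i to
    category j; every row must sum to the same number of readers.
    (Assumed to correspond to the abstract's "individual kappas".)
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two readers per subject are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")

    total = n_subjects * n_raters
    kappas: List[float] = []
    for j in range(len(counts[0])):
        p_j = sum(row[j] for row in counts) / total
        q_j = 1.0 - p_j
        if p_j == 0.0 or q_j == 0.0:
            kappas.append(float("nan"))  # category never or always used
            continue
        # Number of disagreeing reader pairs on "category j vs. not j".
        disagreement = sum(row[j] * (n_raters - row[j]) for row in counts)
        kappas.append(1.0 - disagreement / (total * (n_raters - 1) * p_j * q_j))
    return kappas


def landis_koch_label(kappa: float) -> str:
    """Verbal label for a kappa value using the Landis and Koch bands."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Overall kappas reported in the abstract for scales 1 to 4.
    for k in (0.439, 0.310, 0.311, 0.295):
        print(k, landis_koch_label(k))
```

Under these bands the reported value for the first scale falls in the "moderate" range and the other three fall in the "fair" range, which matches the wording of the results.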
Conclusions: Levels of agreement can differ depending on the scale used. Simpler dichotomous scales may yield higher levels of agreement than more complex ones. Among the non-dichotomous scales, the choice of scale may not change the overall level of agreement. Given the overall low inter-rater agreement observed, there is probably significant potential to enhance agreement through more rigorous training and consensus-building.
About the journal:
"European Spine Journal" is a publication founded in response to the increasing trend toward specialization in spinal surgery and spinal pathology in general. The Journal is devoted to all spine-related disciplines, including functional and surgical anatomy of the spine, biomechanics and pathophysiology, diagnostic procedures, and neurology, surgery and outcomes. The aim of "European Spine Journal" is to support the further development of highly innovative spine treatments, including but not restricted to surgery, and to provide an integrated and balanced view of diagnostic, research and treatment procedures as well as outcomes that will enhance effective collaboration among specialists worldwide. The "European Spine Journal" also participates in education by means of videos, interactive meetings and the endorsement of educational efforts.
Official publication of EUROSPINE, The Spine Society of Europe