Sam Mitchinson, Jessica H Johnson, Ben Milner, Oliver Lamb, Yannik Behr
Capturing expert uncertainty: ICC-informed soft labelling for volcano-seismicity
Bulletin of Volcanology, 87(10), 84 (2025). Published online 16 September 2025.
DOI: https://doi.org/10.1007/s00445-025-01875-4
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12441074/pdf/
Abstract
Reliable classification of volcano-seismic signals underpins monitoring and eruption forecasting and is an essential tool for advancing understanding of subsurface processes. However, traditional approaches may overlook the inherent uncertainty and variability between expert judgments. We introduce an innovative method that explicitly quantifies inter-expert agreement using the intraclass correlation coefficient (ICC) and incorporates this measure into probabilistic, ICC-informed soft labels, which can be fed into machine learning pipelines. We conducted a global survey involving 89 experts who classified a set of 80 volcano-seismic events from Ruapehu, New Zealand, providing continuous ratings for standard categories: volcano tectonic (VT), hybrid (HYB), long-period (LP), and other (OT). ICC agreement scores revealed that single-rater scores produce poor agreement between experts even for well-established VT and LP classifications. However, reliability significantly improved for these classifications when multiple expert ratings were combined, although, for HYB and OT categories, expert disagreement remained substantial. We developed a soft labelling methodology that weights class probabilities by their respective ICC scores, resulting in a distribution that naturally reflects expert uncertainty. This demonstrates that ICC-informed soft labels could provide a robust alternative to the hard label standard by explicitly capturing classification uncertainty and variability. Our fully probabilistic view has the potential to significantly enhance machine learning model accuracy, robustness, and transferability across volcanic systems and should provide a fundamental shift in how volcano-seismic data are labelled and interpreted within automated monitoring frameworks.
Supplementary information: The online version contains supplementary material available at 10.1007/s00445-025-01875-4.
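The two steps the abstract describes, measuring inter-expert agreement with an ICC and then weighting class probabilities by that agreement, can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not state which ICC form was used or the exact weighting formula. The code assumes the two-way random-effects, absolute-agreement forms ICC(2,1) and ICC(2,k), and assumes a simple multiply-and-renormalise weighting. The function names `icc_two_way_random` and `icc_weighted_soft_label`, and all numbers in the demo, are hypothetical.

```python
import numpy as np


def icc_two_way_random(ratings):
    """Return (ICC(2,1), ICC(2,k)) for an (n_events, n_raters) matrix.

    Assumed form: two-way random effects, absolute agreement.
    The paper may use a different ICC variant.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for events (rows), raters (columns), and residual.
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-event mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    average = (msr - mse) / (msr + (msc - mse) / n)
    return single, average


def icc_weighted_soft_label(mean_ratings, icc_per_class):
    """Weight per-class mean ratings by per-class ICC and renormalise.

    Hypothetical weighting scheme: negative ICCs are clipped to zero,
    and a uniform distribution is returned if all weight vanishes.
    """
    r = np.asarray(mean_ratings, dtype=float)
    w = np.clip(np.asarray(icc_per_class, dtype=float), 0.0, None)
    weighted = r * w
    total = weighted.sum()
    if total <= 0:
        return np.full(r.shape, 1.0 / r.size)
    return weighted / total


if __name__ == "__main__":
    # Made-up ratings: 6 events (rows) scored by 4 raters (columns).
    toy = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
    print(icc_two_way_random(toy))
    # Made-up mean ratings and ICCs in the order VT, HYB, LP, OT.
    print(icc_weighted_soft_label([0.6, 0.2, 0.2, 0.0],
                                  [0.5, 0.25, 0.5, 0.25]))
```

Under this assumed scheme, classes on which experts agree poorly contribute less probability mass to the final label. The single-rater and averaged ICC values are returned together because the abstract contrasts the two.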
About the journal:
Bulletin of Volcanology was founded in 1922, as Bulletin Volcanologique, and is the official journal of the International Association of Volcanology and Chemistry of the Earth’s Interior (IAVCEI). The Bulletin of Volcanology publishes papers on volcanoes, their products, their eruptive behavior, and their hazards. Papers aimed at understanding the deeper structure of volcanoes, and the evolution of magmatic systems using geochemical, petrological, and geophysical techniques are also published. Material is published in four sections: Review Articles; Research Articles; Short Scientific Communications; and a Forum that provides for discussion of controversial issues and for comment and reply on previously published Articles and Communications.