Discernibility in explanations: Designing more acceptable and meaningful machine learning models for medicine.

IF 4.4, Zone 2 (Biology), JCR Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY
Computational and structural biotechnology journal Pub Date : 2025-04-23 eCollection Date: 2025-01-01 DOI:10.1016/j.csbj.2025.04.021
Haomiao Wang, Julien Aligon, Julien May, Emmanuel Doumard, Nicolas Labroche, Cyrille Delpierre, Chantal Soulé-Dupuy, Louis Casteilla, Valérie Planat-Benard, Paul Monsarrat
Volume 27, pages 1800-1808. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12127544/pdf/
Citations: 0

Abstract

Discernibility in explanations: Designing more acceptable and meaningful machine learning models for medicine.

Although the benefits of machine learning are undeniable in healthcare, explainability plays a vital role in improving transparency and understanding the most decisive and persuasive variables for prediction. The challenge is to identify explanations that make sense to the biomedical expert. This work proposes discernibility as a new approach to faithfully reflect human cognition, based on the user's perception of a relationship between explanations and data for a given variable. A total of 50 participants (19 biomedical experts and 31 data scientists) evaluated their perception of the discernibility of explanations from both synthetic and human-based datasets (National Health and Nutrition Examination Survey). The low inter-rater reliability of discernibility (Intraclass Correlation Coefficient < 0.5), with no significant difference between areas of expertise or levels of education, highlights the need for an objective metric of discernibility. Thirteen statistical coefficients were evaluated for their ability to capture, for a given variable, the relationship between its values and its explanations using Passing-Bablok regression. Among these, dcor was shown to be a reliable metric for assessing the discernibility of explanations, effectively capturing the clarity of the relationship between the data and their explanations, and providing clues to underlying pathophysiological mechanisms not immediately apparent when examining individual predictors. Discernibility can also serve as an evaluation metric for model quality, helping to prevent overfitting and aiding in feature selection, ultimately providing medical practitioners with more accurate and persuasive results.
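
The abstract names dcor without defining it. Assuming it denotes the distance correlation coefficient, the sketch below shows how such a score could be computed between one variable's values and that variable's explanation values. It is an illustrative sketch, not the authors' implementation: the function `distance_correlation`, the synthetic feature, and the simulated explanation values (stand-ins for per-sample attributions) are all hypothetical. It uses the plain (biased) sample estimator.

```python
import numpy as np


def distance_correlation(x, y):
    """Biased sample distance correlation between two 1-D samples.

    Returns a value in [0, 1]; values near 0 suggest no dependence,
    values near 1 suggest a strong (possibly non-linear) dependence.
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of samples")

    # Pairwise absolute distances within each sample.
    a = np.abs(x - x.T)
    b = np.abs(y - y.T)

    # Double-centre each distance matrix (row, column and grand means).
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()

    dcov2 = (A * B).mean()    # squared distance covariance
    dvar_x = (A * A).mean()   # squared distance variance of x
    dvar_y = (B * B).mean()   # squared distance variance of y

    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # one of the samples is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Hypothetical data: one feature and two sets of simulated explanation values.
rng = np.random.default_rng(0)
feature = rng.uniform(-3.0, 3.0, size=300)

# "Clear" explanations: a U-shaped relationship with the feature.
clear_explanations = feature ** 2 + rng.normal(0.0, 0.3, size=300)
# "Unclear" explanations: unrelated to the feature.
noisy_explanations = rng.normal(0.0, 1.0, size=300)

dcor_clear = distance_correlation(feature, clear_explanations)
dcor_noise = distance_correlation(feature, noisy_explanations)
pearson_clear = float(np.corrcoef(feature, clear_explanations)[0, 1])

print(f"dcor (U-shaped explanations): {dcor_clear:.3f}")
print(f"dcor (unrelated explanations): {dcor_noise:.3f}")
print(f"Pearson r (U-shaped explanations): {pearson_clear:.3f}")
```

The U-shaped case is chosen because a linear coefficient such as Pearson's r can miss a symmetric non-linear pattern, whereas distance correlation is designed to detect dependence of any form.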

Source journal
Computational and structural biotechnology journal
Subject area: Biochemistry, Genetics and Molecular Biology - Biophysics
CiteScore: 9.30
Self-citation rate: 3.30%
Articles published: 540
Review time: 6 weeks
Journal description: Computational and Structural Biotechnology Journal (CSBJ) is an online gold open access journal publishing research articles and reviews after full peer review. All articles are published, without barriers to access, immediately upon acceptance. The journal places a strong emphasis on functional and mechanistic understanding of how molecular components in a biological process work together through the application of computational methods. Structural data may provide such insights, but they are not a prerequisite for publication in the journal. Specific areas of interest include, but are not limited to: structure and function of proteins, nucleic acids and other macromolecules; structure and function of multi-component complexes; protein folding, processing and degradation; enzymology; computational and structural studies of plant systems; microbial informatics; genomics; proteomics; metabolomics; algorithms and hypothesis in bioinformatics; mathematical and theoretical biology; computational chemistry and drug discovery; microscopy and molecular imaging; nanotechnology; systems and synthetic biology.