{"title":"Model interpretability enhances domain generalization in the case of textual complexity modeling.","authors":"Frans van der Sluis, Egon L van den Broek","doi":"10.1016/j.patter.2025.101177","DOIUrl":null,"url":null,"abstract":"<p><p>Balancing prediction accuracy, model interpretability, and domain generalization (also known as [a.k.a.] out-of-distribution testing/evaluation) is a central challenge in machine learning. To assess this challenge, we took 120 interpretable and 166 opaque models from 77,640 tuned configurations, complemented with ChatGPT, 3 probabilistic language models, and Vec2Read. The models first performed text classification to derive principles of textual complexity (task 1) and then generalized these to predict readers' appraisals of processing difficulty (task 2). The results confirmed the known accuracy-interpretability trade-off on task 1. However, task 2's domain generalization showed that interpretable models outperform complex, opaque models. Multiplicative interactions further improved interpretable models' domain generalization incrementally. We advocate for the value of big data for training, complemented by (1) external theories to enhance interpretability and guide machine learning and (2) small, well-crafted out-of-distribution data to validate models-together ensuring domain generalization and robustness against data shifts.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 2","pages":"101177"},"PeriodicalIF":6.7000,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873011/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Patterns","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1016/j.patter.2025.101177","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/2/14 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
Balancing prediction accuracy, model interpretability, and domain generalization (also known as out-of-distribution evaluation) is a central challenge in machine learning. To assess this challenge, we took 120 interpretable and 166 opaque models from 77,640 tuned configurations, complemented by ChatGPT, 3 probabilistic language models, and Vec2Read. The models first performed text classification to derive principles of textual complexity (task 1) and then generalized these to predict readers' appraisals of processing difficulty (task 2). The results confirmed the known accuracy-interpretability trade-off on task 1. However, task 2's domain generalization showed that interpretable models outperform complex, opaque models. Multiplicative interactions incrementally improved the interpretable models' domain generalization. We advocate for the value of big data for training, complemented by (1) external theories to enhance interpretability and guide machine learning and (2) small, well-crafted out-of-distribution data to validate models, together ensuring domain generalization and robustness against data shifts.
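The evaluation protocol the abstract describes can be illustrated with a minimal sketch. This is not the authors' pipeline: the synthetic features, the logistic-regression/gradient-boosting pairing, and the covariate shift below are all assumptions standing in for the paper's 286 tuned models and its two text-based tasks. It shows the core idea, though: fit an interpretable model (here augmented with multiplicative interaction features) and an opaque model on one distribution, then compare them on out-of-distribution data.

```python
# Illustrative sketch only -- not the paper's actual models or data.
# Compares an interpretable model (with multiplicative interactions)
# against an opaque model, in-distribution vs. out-of-distribution.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)

# "Task 1" stand-in: synthetic complexity-like features whose label
# depends on a multiplicative interaction between two of them.
X_train = rng.normal(size=(1000, 3))
y_train = (X_train[:, 0] + X_train[:, 1] * X_train[:, 2] > 0).astype(int)

# "Task 2" stand-in: same underlying concept under a covariate shift.
X_ood = rng.normal(loc=0.8, scale=1.5, size=(300, 3))
y_ood = (X_ood[:, 0] + X_ood[:, 1] * X_ood[:, 2] > 0).astype(int)

# Interpretable model: logistic regression over pairwise products, so
# each learned coefficient still maps to a readable feature term.
interpretable = make_pipeline(
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)

# Opaque reference model.
opaque = GradientBoostingClassifier(random_state=0)

for name, model in [("interpretable", interpretable), ("opaque", opaque)]:
    model.fit(X_train, y_train)
    acc_in = accuracy_score(y_train, model.predict(X_train))
    acc_ood = accuracy_score(y_ood, model.predict(X_ood))
    print(f"{name}: in-distribution={acc_in:.2f}, OOD={acc_ood:.2f}")
```

The interaction features are the point of the sketch: multiplying feature pairs lets a linear model capture the kind of multiplicative structure the abstract credits with improving domain generalization, while each coefficient remains attributable to a named feature product.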