Uncertainty estimation for time series classification

IF 5.8 · Zone 2 (Physics and Astrophysics) · Q1 (Astronomy & Astrophysics)
M. Cádiz-Leyton, G. Cabrera-Vives, P. Protopapas, D. Moreno-Cartagena, C. Donoso-Oliva, I. Becker
Journal: Astronomy & Astrophysics. Published: 2025-07-07. DOI: 10.1051/0004-6361/202453388
https://doi.org/10.1051/0004-6361/202453388

Abstract

Context. Classifying variable stars is key to understanding stellar evolution and galactic dynamics. With the demands of large astronomical surveys, machine learning models, especially attention-based neural networks, have become the state of the art. While achieving high accuracy is crucial, improving model interpretability and uncertainty estimation is equally important to ensuring that insights are both reliable and comprehensible.
Aims. We aim to enhance transformer-based models for classifying astronomical light curves by incorporating uncertainty estimation techniques to detect misclassified instances. We tested our methods on labeled datasets from MACHO, OGLE-III, and ATLAS, introducing a framework that significantly improves the reliability of automated classification for next-generation surveys.
Methods. We used Astromer, a transformer-based encoder designed to capture representations of single-band light curves. We enhanced its capabilities by applying three methods for quantifying uncertainty: Monte Carlo dropout (MC Dropout), hierarchical stochastic attention, and a novel hybrid method that combines the two approaches (HA-MC Dropout). We compared these methods against a baseline of deep ensembles. To estimate uncertainty scores for the misclassification task, we used the following uncertainty estimates: the sampled maximum probability, probability variance (PV), and Bayesian active learning by disagreement.
Results. In predictive performance tests, HA-MC Dropout outperforms the baseline, achieving macro F1-scores of 79.8 ± 0.5 on OGLE, 84 ± 1.3 on ATLAS, and 76.6 ± 1.8 on MACHO. When comparing the PV score values, the quality of uncertainty estimation by HA-MC Dropout surpasses that of all other methods, with improvements of 2.5 ± 2.3 for MACHO, 3.3 ± 2.1 for ATLAS, and 8.5 ± 1.6 for OGLE-III.
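The three uncertainty scores named in the Methods can be illustrated with a small sketch. It assumes the commonly used textbook definitions of sampled maximum probability (SMP), probability variance (PV), and Bayesian active learning by disagreement (BALD), computed from T stochastic forward passes such as those produced by MC Dropout. The abstract does not give formulas, so the exact definitions in the paper may differ. The function names and the toy inputs below are hypothetical and are not taken from Astromer or from the authors' code.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of one probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def uncertainty_scores(samples):
    """Compute SMP, PV and BALD for a single input.

    samples: list of T class-probability vectors, one per stochastic
    forward pass (assumed to come from dropout kept active at inference).
    """
    T = len(samples)
    C = len(samples[0])
    # Predictive mean over the T stochastic passes.
    mean_p = [sum(s[c] for s in samples) / T for c in range(C)]
    # SMP: one minus the largest mean class probability.
    smp = 1.0 - max(mean_p)
    # PV: per-class variance across passes, averaged over classes.
    pv = sum(
        sum((s[c] - mean_p[c]) ** 2 for s in samples) / T for c in range(C)
    ) / C
    # BALD: entropy of the mean minus the mean of the per-pass entropies.
    bald = entropy(mean_p) - sum(entropy(s) for s in samples) / T
    return {"smp": smp, "pv": pv, "bald": bald}


# Toy inputs (made up for illustration): passes that agree vs. disagree.
agree = [[0.9, 0.05, 0.05]] * 4
disagree = [
    [0.9, 0.05, 0.05],
    [0.05, 0.9, 0.05],
    [0.9, 0.05, 0.05],
    [0.05, 0.9, 0.05],
]

print(uncertainty_scores(agree))
print(uncertainty_scores(disagree))
```

In this toy case the passes that disagree receive higher values on all three scores, which is the property a misclassification detector relies on: inputs with high scores are flagged as likely errors.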
Source journal: Astronomy & Astrophysics (Earth Science and Astronomy: Astronomy and Astrophysics)
CiteScore: 10.20 · Self-citation rate: 27.70% · Articles published: 2105 · Review time: 1-2 weeks
Journal description: Astronomy & Astrophysics is an international journal that publishes papers on all aspects of astronomy and astrophysics (theoretical, observational, and instrumental) independently of the techniques used to obtain the results.