Randomized Spline Trees for Functional Data Classification: Theory and Application to Environmental Time Series

Donato Riccio, Fabrizio Maturo, Elvira Romano
arXiv - STAT - Methodology, published 2024-09-12. DOI: arxiv-2409.07879 (https://doi.org/arxiv-2409.07879)
Citations: 0

Abstract

Randomized Spline Trees for Functional Data Classification: Theory and Application to Environmental Time Series
Functional data analysis (FDA) and ensemble learning can be powerful tools for analyzing complex environmental time series. Recent literature has highlighted the key role of diversity in enhancing accuracy and reducing variance in ensemble methods. This paper introduces Randomized Spline Trees (RST), a novel algorithm that bridges these two approaches by incorporating randomized functional representations into the Random Forest framework. RST generates diverse functional representations of input data using randomized B-spline parameters, creating an ensemble of decision trees trained on these varied representations. We provide a theoretical analysis of how this functional diversity contributes to reducing generalization error and present empirical evaluations on six environmental time series classification tasks from the UCR Time Series Archive. Results show that RST variants outperform standard Random Forests and Gradient Boosting on most datasets, improving classification accuracy by up to 14%. The success of RST demonstrates the potential of adaptive functional representations in capturing complex temporal patterns in environmental data. This work contributes to the growing field of machine learning techniques focused on functional data and opens new avenues for research in environmental time series analysis.
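The core idea in the abstract (one randomized B-spline representation per tree, then an ensemble vote) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `RandomizedSplineTreesSketch`, the helper names, the sampling ranges for spline degree and knot count, the bootstrap resampling, and the majority vote are all assumptions made for illustration. It relies on `scipy.interpolate.make_lsq_spline` and `sklearn.tree.DecisionTreeClassifier`.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline
from sklearn.tree import DecisionTreeClassifier


def sample_spline_params(t_grid, rng):
    """Draw a random spline degree and knot vector (ranges are assumed, not from the paper)."""
    k = int(rng.integers(1, 4))            # degree in {1, 2, 3}
    n_interior = int(rng.integers(3, 9))   # 3 to 8 interior knots
    interior = np.linspace(t_grid[0], t_grid[-1], n_interior + 2)[1:-1]
    # Clamped knot vector: each boundary knot is repeated k + 1 times.
    knots = np.r_[[t_grid[0]] * (k + 1), interior, [t_grid[-1]] * (k + 1)]
    return knots, k


def project(X, t_grid, knots, k):
    """Least-squares B-spline coefficients for each series (row of X)."""
    spline = make_lsq_spline(t_grid, X.T, knots, k=k)
    return spline.c.T                      # shape: (n_samples, n_basis)


class RandomizedSplineTreesSketch:
    """Hypothetical sketch: each tree sees its own randomized B-spline representation."""

    def __init__(self, n_trees=25, seed=0):
        self.n_trees = n_trees
        self.rng = np.random.default_rng(seed)
        self.members = []

    def fit(self, X, y, t_grid):
        self.t_grid = t_grid
        self.classes_ = np.unique(y)
        for _ in range(self.n_trees):
            knots, k = sample_spline_params(t_grid, self.rng)
            Z = project(X, t_grid, knots, k)
            # Bootstrap resampling, as in a Random Forest (assumed here).
            boot = self.rng.integers(0, len(X), len(X))
            tree = DecisionTreeClassifier(
                random_state=int(self.rng.integers(1 << 31))
            )
            tree.fit(Z[boot], y[boot])
            self.members.append((knots, k, tree))
        return self

    def predict(self, X):
        # Each tree predicts on data projected with its own knots and degree.
        votes = np.stack([
            tree.predict(project(X, self.t_grid, knots, k))
            for knots, k, tree in self.members
        ])                                 # shape: (n_trees, n_samples)
        # Majority vote across trees.
        counts = (votes[:, :, None] == self.classes_[None, None, :]).sum(axis=0)
        return self.classes_[counts.argmax(axis=1)]
```

Given a matrix `X` of series sampled on a common grid `t_grid` and labels `y`, a call such as `RandomizedSplineTreesSketch(n_trees=15).fit(X, y, t_grid).predict(X_new)` returns one label per series. Because every tree is trained on different spline coefficients, the ensemble members disagree in ways that plain bootstrap resampling alone would not produce, which is the kind of diversity the abstract refers to.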