Machine Learning for Earnings Prediction: A Nonlinear Tensor Approach for Data Integration and Completion

Ajim Uddin, Xinyuan Tao, Chia-Ching Chou, Dantong Yu
Published in: Proceedings of the Third ACM International Conference on AI in Finance, 2022-10-26
DOI: 10.1145/3533271.3561677 (https://doi.org/10.1145/3533271.3561677)
Citations: 2

Abstract

Successful predictive models for financial applications often require harnessing complementary information from multiple datasets. Incorporating data from different sources into a single model can be challenging because the sources vary in structure, dimensions, quality, and completeness. Simply merging the datasets can cause redundancy, discrepancy, and information loss. This paper proposes a convolutional neural network-based nonlinear tensor coupling and completion framework (NLTCC) to combine heterogeneous datasets without compromising data quality. We demonstrate the effectiveness of NLTCC on a specific business problem: predicting firms’ earnings from financial analysts’ earnings forecasts. First, we apply NLTCC to fuse firm characteristics and stock market information into the financial analysts’ earnings forecast data to impute missing values and improve data quality. Subsequently, we predict the next quarter’s earnings based on the imputed data. The experiments reveal that the prediction error decreases by 65% compared with the benchmark analysts’ consensus forecast. Long-short portfolio returns based on NLTCC outperform both the analysts’ consensus forecast and the S&P 500 index for holding periods from three days up to two months. The improvement in prediction accuracy is robust across different performance metrics and industry sectors. Notably, it is more salient for sectors with higher heterogeneity.
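To make the coupling-and-completion idea concrete, the following is a minimal sketch, not the authors' NLTCC model. It assumes a plain linear low-rank factorization in place of the CNN-based nonlinear model, two-dimensional matrices in place of tensors, and synthetic data. The function name `coupled_completion`, its parameters, and all array names are hypothetical. A sparse firm-by-analyst forecast matrix and a fully observed firm-by-characteristic matrix share one set of firm factors, so the auxiliary data helps fill in the missing forecasts.

```python
import numpy as np

def coupled_completion(F, C, rank=3, lr=0.005, epochs=2000, reg=0.01, seed=0):
    """Impute NaNs in F (firms x analysts) using side data C (firms x features).

    Assumption: both matrices are approximately low rank and share the firm
    factor matrix U. This is a linear stand-in, not the paper's CNN model.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(F)                      # True where a forecast is observed
    F0 = np.where(mask, F, 0.0)              # zero-fill so arithmetic is NaN-free
    U = 0.1 * rng.standard_normal((F.shape[0], rank))   # shared firm factors
    V = 0.1 * rng.standard_normal((F.shape[1], rank))   # analyst factors
    W = 0.1 * rng.standard_normal((C.shape[1], rank))   # characteristic factors
    losses = []
    for _ in range(epochs):
        RF = mask * (U @ V.T - F0)           # residual on observed forecasts only
        RC = U @ W.T - C                     # residual on firm characteristics
        losses.append(float((RF ** 2).sum() + (RC ** 2).sum()))
        # Gradients of the (halved) squared error plus L2 regularization.
        gU = RF @ V + RC @ W + reg * U
        gV = RF.T @ U + reg * V
        gW = RC.T @ U + reg * W
        U -= lr * gU
        V -= lr * gV
        W -= lr * gW
    # Keep observed entries as they are; fill only the missing ones.
    return np.where(mask, F, U @ V.T), losses

# Synthetic example: 30 firms, 8 analysts, 5 characteristics, true rank 3.
rng = np.random.default_rng(42)
U_true = rng.standard_normal((30, 3))
F_true = U_true @ rng.standard_normal((8, 3)).T
C = U_true @ rng.standard_normal((5, 3)).T
F_missing = F_true.copy()
F_missing[rng.random(F_true.shape) < 0.3] = np.nan   # hide about 30% of forecasts

F_imputed, losses = coupled_completion(F_missing, C)
```

The imputed matrix could then feed a downstream earnings predictor, as in the second stage the abstract describes. Sharing `U` across both reconstruction terms is the design choice that lets the side data constrain firms with few observed forecasts.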