Managing Editor's Letter
F. Fabozzi
The Journal of Financial Data Science, published 2022-04-30
DOI: 10.3905/jfds.2022.4.2.001 (https://doi.org/10.3905/jfds.2022.4.2.001)
Citations: 0
Abstract
Cathy Scott, General Manager and Publisher

The lead article in this issue, “Machine Learning for Econometricians: The Readme Manual,” is by the co-editor of this journal, Marcos López de Prado. As he notes, econometric tools are routinely applied in investment research even though they are poorly suited to uncovering statistical patterns in financial data, because financial datasets are unstructured and the relationships in financial markets are complex. Researchers and analysts working for asset managers overlook these limitations because they regard econometric approaches as more appropriate than machine learning methods. One of their objections to machine learning is that its tools are not transparent (i.e., that it is a black-box approach to problem solving). López de Prado demonstrates why machine learning is not a black box: for each analytical step of the econometric process, he identifies a corresponding step in machine learning analysis. By stating this correspondence clearly, he has eased the adoption of machine learning techniques among econometricians and offered a bridge from classical statistics to machine learning.

Meta-labeling, a process introduced by López de Prado, serves as the machine learning layer of an investment strategy; it can determine the size of positions, filter out false-positive signals from backtests, and improve performance metrics. In “Meta-Labeling: Theory and Framework,” Jacques Francois Joubert provides an overview of meta-labeling’s theoretical framework, including its architecture and applications. The author then describes the methodology for three controlled experiments designed to break meta-labeling down into three components: information advantage, modeling for false positives, and position sizing.
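
The layered idea behind meta-labeling can be illustrated with a minimal Python sketch. It is not Joubert's or López de Prado's implementation: the function names, the toy trade history, and the "regime" feature are all hypothetical, and a simple per-bucket hit rate stands in for the secondary classifier that a real strategy would train. The sketch assumes the usual construction in which a primary model chooses the side of a bet and the secondary layer only estimates whether that bet will pay off.

```python
from collections import defaultdict


def meta_labels(sides, returns):
    """Label each primary bet 1 if side * realized return was positive, else 0."""
    return [1 if s * r > 0 else 0 for s, r in zip(sides, returns)]


def fit_hit_rates(features, labels):
    """Toy secondary model (assumption): empirical hit rate per feature bucket."""
    wins, counts = defaultdict(int), defaultdict(int)
    for f, y in zip(features, labels):
        wins[f] += y
        counts[f] += 1
    return {f: wins[f] / counts[f] for f in counts}


def position_size(side, feature, hit_rates, threshold=0.5):
    """Drop likely false positives; scale the remaining bets by estimated confidence."""
    p = hit_rates.get(feature, 0.0)  # unseen bucket: do not trade
    return side * p if p > threshold else 0.0


# Hypothetical history: primary side (+1 long, -1 short), realized return, market regime.
sides = [+1, +1, -1, +1, -1, +1]
returns = [0.02, -0.01, -0.03, 0.01, 0.02, -0.02]
regimes = ["calm", "stormy", "calm", "calm", "stormy", "stormy"]

labels = meta_labels(sides, returns)
hit_rates = fit_hit_rates(regimes, labels)

print(position_size(+1, "calm", hit_rates))    # primary signal kept and sized
print(position_size(-1, "stormy", hit_rates))  # primary signal filtered out
```

In this toy history the primary signal was right in every "calm" observation and wrong in every "stormy" one, so the secondary layer keeps the first bet and discards the second. That loosely mirrors two of the components named above: modeling for false positives and position sizing.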
The three experiments validated that meta-labeling not only improves classification metrics but also significantly improves the performance of various types of primary investment strategies. For this reason, the article provides a good case study of how machine learning can be applied in financial markets.

Studies have shown that security prices are driven by information beyond the financial information that companies report in their filings with the Securities and Exchange Commission, including news and investor sentiment. In “FinEAS: Financial Embedding Analysis of Sentiment,” Asier Gutiérrez-Fandiño, Petter N. Kolm, Miquel Noguer i Alonso, and Jordi Armengol-Estapé introduce a new language representation model for sentiment analysis of financial text, called financial embedding analysis of sentiment (FinEAS). Their approach is based on transformer language models developed explicitly for sentence-level analysis and builds on Sentence-BERT, a sentence-level extension of vanilla BERT. The authors argue that the new approach generates higher-quality sentence embeddings, which significantly improve performance on sentence- and document-level tasks such as financial sentiment analysis. Using a large-scale financial news dataset from RavenPack, they demonstrate that the new model outperforms several state-of-the-art models for financial sentiment analysis. The authors make the model code publicly available.

Deep reinforcement learning (DRL) has attracted substantial interest from practitioners. However, its application has been limited by the need for practitioners to