Predicting Secondary Equity Offerings (SEOs) Using Machine Learning
Linlin Cui, Jianhua Chen, Wentao Wu
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1219-1224
Published: 2018-12-01
DOI: 10.1109/ICMLA.2018.00198 (https://doi.org/10.1109/ICMLA.2018.00198)
Citation count: 0
Abstract
This paper explores the application of machine learning techniques in finance to predict whether a publicly traded firm will issue a Seasoned Equity Offering (SEO) by analyzing the firm's 10-Q filings with the Securities and Exchange Commission (SEC). Specifically, using the information content of the Management's Discussion and Analysis (MD&A) section of 10-Q filings, we train five different algorithms: Logistic Regression (LR), Support Vector Classification (SVC), Multinomial Naïve Bayes (NB), Artificial Neural Network (ANN), and Random Forest (RF). Two types of features, unigrams and phrases, are considered. Term frequency-inverse document frequency (TF-IDF) scores are used as the independent variables in these models. Experimental results show that phrases-only models improve accuracy by 0-2% over unigrams-only models for LR, NB, and RF. For SVC, the accuracy of the phrases-only model is close to that of the unigrams-only model. The unigrams-only SVC model achieves the highest accuracy among all tested classifiers, 74.53%. The precision of all models varies between 60% and 75%, while the recall varies between 55% and 85%. Further, we tune the parameters of one linear model (LR) and one non-linear model (RF) to see how these parameters affect model performance. Finally, we use RF to rank feature importance for prediction and find that "merger" is the most important feature in both the unigrams-only model and the phrases-only model. We conclude that text mining of SEC financial filings could be an effective tool for predicting important corporate events such as SEOs.
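To make the feature construction concrete, the following is a minimal sketch of TF-IDF scoring over unigrams and over phrases. It rests on assumptions that the abstract does not confirm: the weighting is raw term frequency times ln(N/df), phrases are approximated by word bigrams, and the helper names `ngrams` and `tfidf` and the toy documents are hypothetical. It does not reproduce the authors' preprocessing or their exact TF-IDF variant.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Lowercase, split on whitespace, and return the list of word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {term: score} dict per document.

    Assumed weighting (not taken from the paper):
      tf  = occurrences of the term in the document / terms in the document
      idf = ln(N / df), with N documents and df documents containing the term
    """
    term_lists = [ngrams(doc, n) for doc in docs]

    # Document frequency: count each term once per document.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    scores = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy snippets, not real MD&A text.
docs = [
    "we completed the merger agreement",
    "we expect cash flow to improve",
    "the merger agreement requires additional capital",
]

unigram_scores = tfidf(docs, n=1)  # unigrams-only features
phrase_scores = tfidf(docs, n=2)   # bigrams as a stand-in for phrases
```

Each resulting dictionary corresponds to one row of the feature matrix that a classifier such as LR or RF would consume. Terms that occur in every document receive a score of zero under this weighting, which is why library implementations often apply a smoothed IDF instead.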