Jayanth Rao, V. Ramaraju, James Smith, Ajay Bansal
2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), published 2022-09-01. DOI: 10.1109/AIKE55402.2022.00020 (https://doi.org/10.1109/AIKE55402.2022.00020)
A Sentiment Analysis Based Stock Recommendation System
There is tremendous value in the ability to predict stock market trends and outcomes. The public sentiment surrounding a stock is unquestionably a vital factor contributing to the rise or fall of its price. This paper details how public sentiment data can be integrated into traditional stock analyses and how these analyses can then be used to predict stock price trends. Headlines from seven news publications and posts from Yahoo! Finance's conversations forum were processed with the Valence Aware Dictionary and sEntiment Reasoner (VADER) natural language processing package to determine numerical polarities, which represent positive, negative, or neutral public sentiment around a stock ticker. The resulting polarities were paired with popular stock-table metrics (PEG Ratio, Forward EPS, etc.) to create a dataset for a logistic regression machine learning model. The model was trained on approximately 4400 major stocks to produce a binary “Buy” (1) or “Not Buy” (0) recommendation for each stock. The model achieved an F1 score of 82.5%, and for most major stocks its recommendations aligned with the stock analysts' ratings from the NASDAQ website. The logistic regression model would likely improve by leveraging a broader historical range of data, given the hive-mind behavior that online discussion forums exhibit.
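The pipeline described in the abstract (sentiment polarities joined with stock-table metrics, fed to a logistic regression, scored with F1) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the polarities are VADER-style compound scores in [-1, 1], uses only a mean polarity and a PEG ratio as features, trains with plain batch gradient descent, and runs on a handful of invented toy rows. All function names and values below are hypothetical.

```python
import math

def mean_polarity(scores):
    """Average a list of polarity scores (assumed in [-1, 1]); 0.0 if empty."""
    return sum(scores) / len(scores) if scores else 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=5000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 ("Buy") if the predicted probability is at least 0.5, else 0."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented toy rows, for illustration only: [mean polarity, PEG ratio].
X = [
    [mean_polarity([0.70, 0.54]), 0.9],
    [0.45, 1.2],
    [0.30, 0.8],
    [-0.40, 2.5],
    [-0.55, 3.0],
    [-0.10, 2.2],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = "Buy", 0 = "Not Buy" (made-up labels)

w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
print("predictions:", predictions)
print("training F1:", f1_score(y, predictions))
```

A real evaluation would score the model on held-out stocks rather than on the training rows, and would typically scale features such as the PEG ratio before fitting.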