Examining Stock Price Movements on Prague Stock Exchange Using Text Classification
Authors: Jonás Petrovský, Frantisek Darena, Pavel Netolický
Journal: International journal of new computer architectures and their applications
DOI: https://doi.org/10.17781/P002293
Abstract
This article examines the relationship between the content of text documents published on the Internet and the direction of stock price movements on the Prague Stock Exchange. The relationship was modeled as a text classification task. The data consisted of news articles and discussion posts from Czech websites, together with the values of the PX stock index and the stock price of the company CEZ. Each document's class (plus/minus/constant) was determined by the relative price change between the document's publication date and the next working day. Classification accuracy reached 75% for discussion posts but only about 60% for news articles. We tried both binary classification (documents in the constant class were discarded) and ternary classification; the binary variant was more successful in all cases.
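
The labeling rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how large a change must be to leave the constant class, so the `threshold` parameter and its default of 0.5% are assumptions, as are the function names `label_document` and `to_binary`.

```python
from typing import List, Tuple


def label_document(price_on_publication: float,
                   price_next_day: float,
                   threshold: float = 0.005) -> str:
    """Assign plus/minus/constant from the relative price change.

    `threshold` is a hypothetical cut-off (0.5% here); the abstract
    does not state the value that was actually used.
    """
    if price_on_publication <= 0:
        raise ValueError("price_on_publication must be positive")
    # Relative change between the publication date and the next working day.
    relative_change = (price_next_day - price_on_publication) / price_on_publication
    if relative_change > threshold:
        return "plus"
    if relative_change < -threshold:
        return "minus"
    return "constant"


def to_binary(labelled_docs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Binary variant: discard documents labelled as constant."""
    return [(text, label) for text, label in labelled_docs if label != "constant"]


# Made-up prices, used only to show the three possible classes.
docs = [
    ("post A", label_document(100.0, 101.0)),
    ("post B", label_document(100.0, 99.0)),
    ("post C", label_document(100.0, 100.2)),
]
binary_docs = to_binary(docs)
```

In the ternary setting all three labels are kept as classifier targets. In the binary setting only the documents that survive `to_binary` would be passed to the classifier.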