How efficient is Twitter: Predicting 2012 U.S. presidential elections using Support Vector Machine via Twitter and comparing against Iowa Electronic Markets

2017 Intelligent Systems Conference (IntelliSys). Pub Date: 2017-09-01. DOI: 10.1109/INTELLISYS.2017.8324363

A. Attarwala, Stanko Dimitrov, Amer Obeidi


Citations: 15


How efficient is Twitter: Predicting 2012 U.S. presidential elections using Support Vector Machine via Twitter and comparing against Iowa Electronic Markets

We test the efficient market hypothesis to see whether Twitter aggregates information faster than a real-money prediction market. We use Support Vector Machines (SVMs), a supervised learning algorithm, to predict the outcome of the 2012 U.S. presidential election from Twitter data, and then compare the SVM prediction against the Iowa Electronic Markets (IEM). A total of 40 million unique tweets were collected and analyzed between September 29, 2012 and November 6, 2012. We observe: 1) The IEM is efficient on all of these days under the semi-strong definition of the efficient market hypothesis [1]; the SVM does not outpredict the IEM. 2) The SVM predictions are positively correlated with the IEM and predict Obama winning the election, implying that Twitter can be considered a valid source for predicting U.S. presidential election outcomes. Using the Granger causality test, no causal relationship was inferred between the two time series. 3) The candidate frequency count distribution, independent of any sentiment analysis, is also positively correlated with the IEM and the SVM on all days. Using the Granger causality test, we determined that the IEM Granger-causes the candidate frequency count distribution on Twitter at the 1% level.
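The two statistical comparisons named in the abstract, correlation between daily series and a Granger causality test, can be illustrated with a minimal sketch. This is not the authors' code or data. The function names `pearson` and `granger_f_lag1`, the single lag, and the synthetic series `x` and `y` are all assumptions made for illustration; the abstract does not state the lag order or the software used. The sketch computes only the F statistic that compares a model of `y` on its own past against a model that also includes the past of `x`; turning it into a p-value would require the F distribution, which is omitted here.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = s / m[r][r]
    return coef


def rss(rows, target):
    """Residual sum of squares of an OLS fit (rows include the intercept)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum(
        (t - sum(c * v for c, v in zip(coef, r))) ** 2
        for r, t in zip(rows, target)
    )


def granger_f_lag1(x, y):
    """F statistic for 'x Granger-causes y' using one lag (an assumption)."""
    target = y[1:]
    # Restricted model: y_t explained by its own previous value only.
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    # Unrestricted model: additionally uses the previous value of x.
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r = rss(restricted, target)
    rss_u = rss(unrestricted, target)
    dof = len(target) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)


# Synthetic toy data (not from the paper): y follows x with a one-day delay.
rng = random.Random(0)
n = 40
x = [rng.random() for _ in range(n)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.random() for t in range(1, n)]

print("correlation of y with lagged x:", pearson(x[:-1], y[1:]))
print("F statistic, x -> y:", granger_f_lag1(x, y))
print("F statistic, y -> x:", granger_f_lag1(y, x))
```

On this toy data the F statistic is far larger in the direction x to y than in the reverse direction, which is the kind of asymmetry a Granger test looks for. In practice one would choose the lag order from the data and use an established statistics package rather than hand-written least squares.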
