Eric Shook, Kalev H. Leetaru, G. Cao, Anand Padmanabhan, Shaowen Wang
{"title":"Happy or not: Generating topic-based emotional heatmaps for Culturomics using CyberGIS","authors":"Eric Shook, Kalev H. Leetaru, G. Cao, Anand Padmanabhan, Shaowen Wang","doi":"10.1109/ESCIENCE.2012.6404440","DOIUrl":null,"url":null,"abstract":"The field of Culturomics exploits “big data” to explore human society at population scale. Culturomics increasingly needs to consider geographic contexts and, thus, this research develops a geospatial visual analytical approach that transforms vast amounts of textual data into emotional heatmaps with fine-grained spatial resolution. Fulltext geocoding and sentiment mining extract locations and latent “tone” from text-based data, which are combined with spatial analysis methods - kernel density estimation and spatial interpolation - to generate heatmaps that capture the interplay of location, topic, and tone toward narrative impacts. To demonstrate the effectiveness of the approach, the complete English edition of Wikipedia is processed using a supercomputer to extract all locations and tone associated with the year of 2003. An emotional heatmap of Wikipedia's discussion of “armed conflict” for that year is created using the spatial analysis methods. Unlike previous research, our approach is designed for exploratory spatial analysis of topics in text archives by incorporating multiple attributes including the prominence of each location mentioned in the text, the density of a topic at each location compared to other topics, and the tone of the topics of interest into a single analysis. The generation of such fine-grained emotional heatmaps is computationally intensive particularly when accounting for the multiple attributes at fine scales. Therefore a CyberGIS platform based on national cyberinfrastructure in the United States is used to enable the computationally intensive visual analytics.","PeriodicalId":6364,"journal":{"name":"2012 IEEE 8th International Conference on E-Science","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2012-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 8th International Conference on E-Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ESCIENCE.2012.6404440","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 32
Abstract
The field of Culturomics exploits “big data” to explore human society at population scale. Culturomics increasingly needs to consider geographic contexts and, thus, this research develops a geospatial visual analytical approach that transforms vast amounts of textual data into emotional heatmaps with fine-grained spatial resolution. Fulltext geocoding and sentiment mining extract locations and latent “tone” from text-based data, which are combined with spatial analysis methods - kernel density estimation and spatial interpolation - to generate heatmaps that capture the interplay of location, topic, and tone toward narrative impacts. To demonstrate the effectiveness of the approach, the complete English edition of Wikipedia is processed using a supercomputer to extract all locations and tone associated with the year of 2003. An emotional heatmap of Wikipedia's discussion of “armed conflict” for that year is created using the spatial analysis methods. Unlike previous research, our approach is designed for exploratory spatial analysis of topics in text archives by incorporating multiple attributes including the prominence of each location mentioned in the text, the density of a topic at each location compared to other topics, and the tone of the topics of interest into a single analysis. The generation of such fine-grained emotional heatmaps is computationally intensive particularly when accounting for the multiple attributes at fine scales. Therefore a CyberGIS platform based on national cyberinfrastructure in the United States is used to enable the computationally intensive visual analytics.