Yuta Taniguchi, Daiki Monzen, Lutfiana Sari Ariestien, Daisuke Ikeda
2015 IEEE 29th International Conference on Advanced Information Networking and Applications Workshops, pp. 552-557, 2015-03-24. DOI: 10.1109/WAINA.2015.85 (https://doi.org/10.1109/WAINA.2015.85)
Discover Overlapping Topical Regions by Geo-Semantic Clustering of Tweets
Geotagging is a feature of social media services that attaches geographical location metadata to photos, web sites, or messages. Seen from the opposite direction, geotagging annotates geographical locations with images or texts. Summarizing such annotations to uncover topical regions, that is, geographical regions characterized locally by specific topics, is a challenging task, and the resulting knowledge is useful for applications such as location-based advertising. Determining topical regions is not trivial because a region's topic and its geographical area depend on each other. In this paper, we aim to discover overlapping topical regions from geotagged text messages (tweets) collected from Twitter. To this end, we employ the Mean Shift clustering algorithm on a vector space that integrates a geographic vector space and a semantic vector space. Running Mean Shift on this integrated space lets us evaluate the geographical density and the semantic density of tweets simultaneously. Our method then determines the region of each cluster detected by Mean Shift by applying kernel density estimation to the clustered tweets in the geographical space. Our experiments show that clusters break into several mutually overlapping sub-clusters when the weight of semantic density is increased relative to that of geographical density.
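The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how the two spaces are integrated or weighted, so the linear weighting in `combine`, the Gaussian kernel, the mode-merging threshold, and the toy data `GEO` and `SEM` are all assumptions made for illustration. The sketch runs Mean Shift on concatenated geographic and semantic coordinates and then estimates a per-cluster density in the geographical space only.

```python
import math

def combine(geo, sem, w):
    # Assumed integration scheme: scale the geographic part by (1 - w) and the
    # semantic part by w, then concatenate. w is the hypothetical semantic weight.
    return [(1.0 - w) * g for g in geo] + [w * s for s in sem]

def gaussian(d2, h):
    # Unnormalized Gaussian kernel on a squared distance d2 with bandwidth h.
    return math.exp(-d2 / (2.0 * h * h))

def mean_shift(points, h, iters=200, tol=1e-7):
    # Move every point uphill to a mode of the kernel density estimate.
    modes = []
    for p in points:
        x = list(p)
        for _ in range(iters):
            ks = [gaussian(sum((a - b) ** 2 for a, b in zip(x, q)), h) for q in points]
            total = sum(ks)
            new_x = [sum(k * q[d] for k, q in zip(ks, points)) / total
                     for d in range(len(x))]
            moved = math.dist(x, new_x)
            x = new_x
            if moved < tol:
                break
        modes.append(x)
    # Points whose modes lie closer than h / 2 share a cluster label
    # (the threshold is an assumption of this sketch).
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if math.dist(m, c) < h / 2.0:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels

def cluster_tweets(geo, sem, w, h):
    # Cluster tweets in the integrated geo-semantic space.
    return mean_shift([combine(g, s, w) for g, s in zip(geo, sem)], h)

def geo_density(cluster_geo, loc, h):
    # 2-D Gaussian kernel density estimate at loc, using one cluster's tweets
    # in the geographical space only.
    total = sum(gaussian(sum((a - b) ** 2 for a, b in zip(loc, g)), h)
                for g in cluster_geo)
    return total / (len(cluster_geo) * 2.0 * math.pi * h * h)

# Toy data (hypothetical): six tweets posted in almost the same place,
# three about one topic and three about another.
GEO = [(0.00, 0.00), (0.01, 0.02), (0.02, 0.01),
       (0.01, 0.00), (0.00, 0.01), (0.02, 0.02)]
SEM = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0),
       (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]

print(cluster_tweets(GEO, SEM, w=0.0, h=0.2))  # geography only
print(cluster_tweets(GEO, SEM, w=0.5, h=0.2))  # geography and semantics
```

With the semantic weight at zero, the six toy tweets fall into one cluster because they are geographically dense. Raising the weight separates them by topic into two sub-clusters whose geographic densities, as estimated by `geo_density`, cover the same area. This mirrors the overlapping sub-clusters reported in the abstract, though only on synthetic data.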