A Machine Learning Approach to Demographic Prediction using Geohashes
Authors: Avipsa Roy, E. Pebesma
Published in: Proceedings of the 2nd International Workshop on Social Sensing, 2017-04-18
DOI: 10.1145/3055601.3055603 (https://doi.org/10.1145/3055601.3055603)
With the rapid proliferation of smartphones, human beings act as social sensors by carrying GPS-enabled devices that share location data. This has resulted in an abundance of sensor data gathered over long periods of time. Gaining meaningful insights from such massive amounts of spatio-temporal data, accumulated from several disparate sources, is often a challenge for organizations. Identifying the demographics of mobile phone users by telecommunication providers is one such example. Demographic information plays a significant role in targeting online advertisements to focused user groups by providing insights into users' mobility patterns. In practice, however, demographic information such as age and gender is mostly unavailable to app developers for open access due to privacy concerns. In this paper, we try to address the gap of how to enrich location data with demographics, which could be valuable for app developers. In our study, we use a machine learning approach to predict the gender and age of mobile phone users from a set of 3,252,950 anonymised GPS trajectories from 60,865 unique devices, using a predictive model based on the concept of Geohashes. We study to what extent users' demographics can be inferred from their frequently visited locations by encoding those locations as Geohashes and formulating a multi-level classification algorithm that finds the most frequently visited Geohashes and associates them with the nearest points of interest. This enables predicting, in a sequential manner, the age group and gender of users who prefer to visit a specific location. Experiments are conducted on a real dataset of mobile phone users collected and shared by a telecommunication provider. The experimental results show that the proposed algorithm achieves mean prediction accuracy scores of 71.62% and 96.75% for predicting the gender and age groups of the users, respectively.
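To make the Geohash step concrete, the following is a minimal sketch of the standard Geohash encoding together with a per-device count of the most frequently visited cells. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not state the Geohash precision, the input schema, or how points of interest and the classifier are attached, so the function names `geohash_encode` and `top_geohashes`, the `(device_id, lat, lon)` tuple layout, and the default `precision` and `k` values are all hypothetical.

```python
from collections import Counter

# Standard Geohash base-32 alphabet (digits plus lowercase letters without a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=6):
    """Encode a latitude/longitude pair as a Geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def top_geohashes(points, precision=6, k=3):
    """Return, per device, the k most frequently visited Geohash cells.

    `points` is an iterable of (device_id, lat, lon) tuples; this layout is an
    assumption made for the example, not the dataset's actual schema.
    """
    counts = {}
    for device_id, lat, lon in points:
        cell = geohash_encode(lat, lon, precision)
        counts.setdefault(device_id, Counter())[cell] += 1
    return {
        device_id: [cell for cell, _ in counter.most_common(k)]
        for device_id, counter in counts.items()
    }


# Hypothetical sample: one device seen three times near one spot, once elsewhere.
sample = [("dev-1", 57.64911, 10.40744)] * 3 + [("dev-1", -33.0, 151.0)]
frequent = top_geohashes(sample, precision=2, k=1)
```

In a pipeline like the one the abstract describes, the resulting per-device cells would then be matched to nearby points of interest and passed to a classifier as features; those later stages are not sketched here because the abstract gives no details about them.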