{"title":"你的(出租车)司机安全吗?","authors":"R. Stanojevic","doi":"10.1145/3132847.3133068","DOIUrl":null,"url":null,"abstract":"For an auto insurer, understanding the risk of individual drivers is a critical factor in building a healthy and profitable portfolio. For decades, assessing the risk of drivers has relied on demographic information which allows the insurer to segment the market in several risk groups priced with an appropriate premium. In the recent years, however, some insurers started experimenting with so called Usage-Based Insurance (UBI) in which the insurer monitors a number of additional variables (mostly related to the location) and uses them to better assess the risk of the drivers. While several studies have reported results on the UBI trials these studies keep the studied data confidential (for obvious privacy and business concerns) which inevitably limits their reproducibility and interest by the data-mining community. In this paper we discuss a methodology for studying driver risk assessment using a public dataset of 173M taxi rides in NYC with over 40K drivers. Our approach for risk assessment utilizes not only the location data (which is significantly sparser than what is normally exploited in UBI) but also the revenue, tips and overall activity of the drivers (as proxies of their behavioral traits) and obtain risk scoring accuracy on par with the reported results on non-professional driver cohorts in spite of sparser location data and no demographic information about the drivers.","PeriodicalId":20449,"journal":{"name":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","volume":"72 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"How Safe is Your (Taxi) Driver?\",\"authors\":\"R. Stanojevic\",\"doi\":\"10.1145/3132847.3133068\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"For an auto insurer, understanding the risk of individual drivers is a critical factor in building a healthy and profitable portfolio. For decades, assessing the risk of drivers has relied on demographic information which allows the insurer to segment the market in several risk groups priced with an appropriate premium. In the recent years, however, some insurers started experimenting with so called Usage-Based Insurance (UBI) in which the insurer monitors a number of additional variables (mostly related to the location) and uses them to better assess the risk of the drivers. While several studies have reported results on the UBI trials these studies keep the studied data confidential (for obvious privacy and business concerns) which inevitably limits their reproducibility and interest by the data-mining community. In this paper we discuss a methodology for studying driver risk assessment using a public dataset of 173M taxi rides in NYC with over 40K drivers. Our approach for risk assessment utilizes not only the location data (which is significantly sparser than what is normally exploited in UBI) but also the revenue, tips and overall activity of the drivers (as proxies of their behavioral traits) and obtain risk scoring accuracy on par with the reported results on non-professional driver cohorts in spite of sparser location data and no demographic information about the drivers.\",\"PeriodicalId\":20449,\"journal\":{\"name\":\"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management\",\"volume\":\"72 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3132847.3133068\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132847.3133068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
For an auto insurer, understanding the risk of individual drivers is a critical factor in building a healthy and profitable portfolio. For decades, assessing the risk of drivers has relied on demographic information which allows the insurer to segment the market in several risk groups priced with an appropriate premium. In the recent years, however, some insurers started experimenting with so called Usage-Based Insurance (UBI) in which the insurer monitors a number of additional variables (mostly related to the location) and uses them to better assess the risk of the drivers. While several studies have reported results on the UBI trials these studies keep the studied data confidential (for obvious privacy and business concerns) which inevitably limits their reproducibility and interest by the data-mining community. In this paper we discuss a methodology for studying driver risk assessment using a public dataset of 173M taxi rides in NYC with over 40K drivers. Our approach for risk assessment utilizes not only the location data (which is significantly sparser than what is normally exploited in UBI) but also the revenue, tips and overall activity of the drivers (as proxies of their behavioral traits) and obtain risk scoring accuracy on par with the reported results on non-professional driver cohorts in spite of sparser location data and no demographic information about the drivers.