A Geocoding Framework Powered by Delivery Data

Vishal Srivastava, Priyam Tejaswin, Lucky Dhakad, Mohit Kumar, Amar Dani
Proceedings of the 28th International Conference on Advances in Geographic Information Systems. Published 2020-11-03. DOI: 10.1145/3397536.3422254 (https://doi.org/10.1145/3397536.3422254)
Citations: 13

Abstract

Over the last decade, India has witnessed an explosion in the e-commerce industry. Adoption of e-commerce is increasing in smaller towns and cities, over and above the densely populated urban centers. In this paper, we discuss the practical challenges involved in developing high-precision geocoding engines for these geographical regions in India. These challenges motivate the next iteration of our geocoding framework. In particular, we focus on three core areas of improvement: 1) leveraging customer delivery data for geocoding, 2) understanding and solving for the diversity and variations in addresses for these new regions, and 3) overcoming the limited coverage of our reference corpus. To this end, we present GeoCloud. The key contributions of GeoCloud are 1) a training algorithm for learning reference-representations from delivery coordinates and 2) a retrieval algorithm for geocoding new addresses. We perform extensive testing of GeoCloud across India to capture the regional, socio-economic and linguistic diversity of our country. Our evaluation data is sampled from the delivery addresses of a large e-commerce platform in India and covers 72 cities and 21 states. The results show a significant improvement in precision and recall over the state-of-the-art geocoding system for India, and demonstrate the effectiveness of our intuitive, robust and generic approach. While we have shown the effectiveness of the framework for Indian addresses, we believe it can be applied to other countries as well, particularly where addresses are unstructured. To the best of our knowledge, this is the first instance of geocoding by learning reference-representations from large-scale delivery data.