Gaetano Carmelo La Delfa, Javier Prieto, Salvatore Monteleone, Hamaad Rafique, Maurizio Palesi, Davide Patti
Title: Survey of smartphone-based datasets for indoor localization: A machine learning perspective
Journal: Internet of Things, volume 34, Article 101753
Publication date: 2025-08-30
Publication type: Journal Article
DOI: 10.1016/j.iot.2025.101753
URL: https://www.sciencedirect.com/science/article/pii/S2542660525002665
Impact factor: 7.6 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
Indoor localization has gained significant attention in recent years due to its applications across sectors such as healthcare, logistics, manufacturing, and retail. However, while outdoor localization has been effectively addressed with GPS, indoor localization remains challenging despite significant research progress. Many studies have explored the capabilities of modern smartphones, equipped with a variety of sensors, to develop machine learning (ML) methods for indoor localization, ranging from classical fingerprinting to deep sequence models and transformers. Nevertheless, most rely on small, proprietary datasets that are not publicly available. Large, high-quality public datasets are essential for researchers to efficiently test, refine, and validate algorithms, compare different approaches, and develop robust and accurate localization solutions. To reduce data collection time and costs and help researchers find the most appropriate datasets for their needs, this paper surveys 20 publicly available, high-quality indoor localization datasets suitable for machine learning, released between 2014 and 2024, that cover various sensing technologies. The survey reveals a shift toward multi-sensor data collection, extending beyond Wi-Fi and Bluetooth signals to include inertial sensors such as accelerometers and gyroscopes, as well as magnetic fields. It also highlights that while over 75% of datasets cover multi-floor structures or multiple buildings, there is a scarcity of datasets covering diverse types of indoor environments, with most focused on office or academic settings. Moreover, the temporal dimension, crucial in dynamic indoor scenarios, remains largely underrepresented, limiting the development of ML models for tracking dynamic trajectories or adapting to evolving signal patterns.
About the journal:
Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross collaboration between researchers, engineers and practitioners in the field of IoT & Cyber Physical Human Systems. The journal offers a unique platform to exchange scientific information on the entire breadth of technology, science, and societal applications of the IoT.
The journal will place a high priority on timely publication and provide a home for high-quality research.
Furthermore, the journal is interested in publishing topical Special Issues on any aspect of IoT.