Vision-Based Time Series Crowd Forecasting Using Semi-Supervised Learning

IF 3.6, Zone 3 (Computer Science), JCR Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS)

IEEE Access, vol. 13, pp. 153523-153541. Pub Date: 2025-09-01. DOI: 10.1109/ACCESS.2025.3604713

Salma Saud Alghamdi; Lama Al Khuzayem; Ohoud Alzamzami

Citations: 0

Abstract

Vision-Based Time Series Crowd Forecasting Using Semi-Supervised Learning

Crowd forecasting is a crucial component of public safety, urban planning, and event management, enabling proactive decision-making based on anticipated crowd dynamics. Traditional sensor-based approaches, such as WiFi-based methods, suffer from accuracy issues because of limited device penetration. Vision-based approaches, while more precise, typically require extensive fully labeled data and high computational resources. These demands restrict their use in forecasting, which is often limited to predicting the next frame or a few seconds ahead. To overcome these challenges, this research presents a vision-based time series forecasting framework that uses a semi-supervised deep learning approach. A semi-supervised crowd counting model, trained on just 5% of labeled images from a single day, is used to extract time series crowd counts from images captured over 16 days at 5-minute intervals. The extracted time series are then used to train multiple Long Short-Term Memory (LSTM) variants to analyze the dynamics of crowd forecasting. Experimental results demonstrate that the proposed framework enables accurate crowd forecasting while reducing annotation costs. Unlike existing vision-based approaches, which are constrained to forecasting seconds ahead, our approach can forecast one hour ahead. Notably, the CNN Autoencoder LSTM and ConvLSTM models achieved an RMSE of 61.93 and a MAPE of 26.13%. These findings highlight the effectiveness of semi-supervised learning with minimal labeled data in vision-based crowd forecasting. Future work will focus on improving generalizability and robustness across different urban environments.
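
As a rough illustration of the quantities named in the abstract, the sketch below shows how a series of counts sampled every 5 minutes might be turned into supervised pairs for a one-hour horizon, and how RMSE and MAPE are conventionally computed. It is not the authors' code: the function names, the lookback length, the toy series, and the persistence baseline are all assumptions made for the example, and the paper's own windowing scheme may differ.

```python
import math

STEP_MINUTES = 5                    # sampling interval stated in the abstract
HORIZON_STEPS = 60 // STEP_MINUTES  # one hour ahead corresponds to 12 steps


def make_windows(series, lookback, horizon=HORIZON_STEPS):
    """Pair each input window with the count observed `horizon` steps after it.

    `lookback` (number of past samples fed to a model) is a hypothetical
    parameter; the abstract does not state the value the authors used.
    """
    pairs = []
    for start in range(len(series) - lookback - horizon + 1):
        window = series[start:start + lookback]
        target = series[start + lookback + horizon - 1]
        pairs.append((window, target))
    return pairs


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the counts."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual count is 0."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Toy, made-up counts (not data from the paper).
    counts = [100 + 10 * i for i in range(30)]
    pairs = make_windows(counts, lookback=12)
    # Persistence baseline: predict that the count stays at the last observed value.
    actual = [target for _, target in pairs]
    predicted = [window[-1] for window, _ in pairs]
    print(len(pairs), rmse(actual, predicted), mape(actual, predicted))
```

A persistence baseline like the one in the demo is only a reference point for judging forecast error; the LSTM variants described in the abstract would take the place of the `window[-1]` prediction.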

Source journal

IEEE Access (COMPUTER SCIENCE, INFORMATION SYSTEMS; ENGINEERING, ELECTRICAL & ELECTRONIC)

CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks

Journal description: IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest. IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on: Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals. Practical articles discussing new experiments or measurement techniques, interesting solutions to engineering. Development of new or improved fabrication or manufacturing techniques. Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.