Bus travel feature inference with small samples based on multi-clustering topic model over Internet of Things
Journal: Future Generation Computer Systems-The International Journal of Escience (JCR Q1, Computer Science, Theory & Methods; impact factor 6.2)
DOI: 10.1016/j.future.2024.107525
Publication date: 2024-09-11
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S0167739X24004898
Citations: 0
Abstract
With the widespread application of Internet of Things (IoT) technology, traffic optimization has shifted from a broad-brush approach to a more refined one. A growing volume of IoT data is used in trajectory mining and inference, providing more precise characteristic information for optimizing public transportation. Services that optimize public transit based on inferred travel characteristics can enhance the appeal of public transport, increase the likelihood that it is chosen as a travel mode, alleviate traffic congestion, and reduce carbon emissions. However, public transportation data are disorganized and unstructured, which makes extracting travel features difficult. This study explores how positioning systems, IoT, and AI can be integrated to infer features from public transportation data and thereby improve bus travel. It introduces MK-LDA (MeanShift Kmeans Latent Dirichlet Allocation), a novel topic modeling technique for inferring characteristics of public transit travel from limited travel trajectory data. The model uses a staged inference methodology: it first applies the Mean-shift clustering algorithm to create point-of-interest (POI) seeds, then applies the P-K-means algorithm to discern patterns in user travel behavior and extract travel modalities. In addition, a P-LDA (POI-Latent Dirichlet Allocation) inference algorithm is proposed to examine the interplay between travel characteristics and behaviors, targeting attributes significantly correlated with public transit usage, including age, occupation, gender, activity levels, cost, safety, and personality traits. Empirical validation indicates that this topic-model-based inference technique is effective at identifying and predicting travel characteristics and patterns, offers better interpretability, and outperforms conventional benchmarks.
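The first two stages named in the abstract (Mean-shift to produce seeds, then a K-means pass started from those seeds) can be illustrated with a minimal sketch. The abstract does not specify the paper's P-K-means or P-LDA variants, so what follows is plain flat-kernel mean shift feeding plain K-means, with no topic-model stage. The function names, the `bandwidth` value, and the stop coordinates are all hypothetical and chosen only for illustration.

```python
import math

def mean_shift_seeds(points, bandwidth, iters=50):
    """Flat-kernel mean shift: shift every point to the mean of its
    neighbours within `bandwidth`, then merge modes that coincide."""
    modes = [tuple(p) for p in points]
    for _ in range(iters):
        new_modes = []
        for m in modes:
            # Neighbours of the current mode; fall back to the mode itself.
            near = [p for p in points if math.dist(m, p) <= bandwidth] or [m]
            new_modes.append(tuple(sum(c) / len(near) for c in zip(*near)))
        if new_modes == modes:  # no mode moved: converged
            break
        modes = new_modes
    seeds = []
    for m in modes:
        # Keep a mode only if it is not close to an already accepted seed.
        if all(math.dist(m, s) > bandwidth / 2 for s in seeds):
            seeds.append(m)
    return seeds

def kmeans_from_seeds(points, seeds, iters=100):
    """Standard Lloyd iterations, initialised from the given seeds
    instead of random centres."""
    centers = [tuple(s) for s in seeds]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centre for each point.
        labels = [min(range(len(centers)),
                      key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: each centre becomes the mean of its members.
        new_centers = []
        for k, c in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else c)
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical boarding locations forming two tight groups.
stops = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
         (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
seeds = mean_shift_seeds(stops, bandwidth=1.0)
labels, centers = kmeans_from_seeds(stops, seeds)
```

Seeding K-means from mean-shift modes removes the need to fix the number of clusters in advance, since the number of seeds follows from the bandwidth. Whether the paper makes the same choice for the same reason is not stated in the abstract.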
About the journal:
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.