{"title":"An Optimal Reverse Affine Maximizer Auction Mechanism for Task Allocation in Mobile Crowdsensing","authors":"Jixian Zhang;Peng Chen;Xuelin Yang;Hao Wu;Weidong Li","doi":"10.1109/TMC.2025.3549504","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549504","url":null,"abstract":"Mobile crowdsensing service (MCS) providers recruit users to complete data collection tasks with an incentive mechanism. How to maximize the utility of service providers has long been a popular topic in MCS research. Applying the existing reverse auction mechanism to an MCS may result in excessively high payments, thereby reducing the utility of the MCS provider. The affine maximizer auction (AMA) mechanism increases the revenue of service providers and meets dominant-strategy incentive-compatible (DSIC) characteristics. However, the AMA mechanism is a forward auction mechanism and cannot be applied to MCSs. Inspired by the AMA mechanism, this paper innovatively proposes a reverse affine maximizer auction (RAMA) mechanism to solve the task allocation problem of MCSs, effectively improving the MCS provider utility. Specifically, we construct a RAMA theoretical model and prove that the mechanism satisfies DSIC characteristics. For the discrete MCS task allocation problem, we use the reverse virtual valuation combinatorial auction (RVVCA) mechanism, a subclass of RAMA, to design a random mechanism RVVCA<inline-formula><tex-math>$^{t}$</tex-math></inline-formula> and prove that the RVVCA<inline-formula><tex-math>$^{t}$</tex-math></inline-formula> has a logarithmic approximate ratio. For the differentiable MCS task allocation problem, we use the deep learning transformer framework to design RAMANet, which can fit an exponential number of allocation solutions and output the optimal allocation and payment. We experimentally compare the algorithms of the RAMA family we propose, which use affine maximization, with existing state-of-the-art algorithms, demonstrating that the proposed algorithms significantly improve MCS provider utility.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7475-7488"},"PeriodicalIF":7.7,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FedHMIR: Unified Framework for Federated Human-Machine Synergy in Personalization-Generalization Balancing Identity Recognition","authors":"Qingyang Li;Yuanjiang Cao;Qianru Wang;Lina Yao;Zhiwen Yu;Jiangtao Cui","doi":"10.1109/TMC.2025.3549925","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549925","url":null,"abstract":"As device-free identity recognition (IR) gains popularity and the demand for the Internet of Things (IoT) continues to grow, a new-era IR system featuring multiple distributed recognition devices and edge servers faces two main challenges: model adaptability and balancing the personalization of devices with the generalization of the system. This research introduces <italic>FedHMIR</i>, a federated framework designed to simultaneously address these challenges by harmonizing human-machine collaboration with personalization-generalization trade-offs. The proposed framework features a human-machine cooperative online internal update mechanism, leveraging reinforcement learning to maintain the adaptability of personalized local IR models. To counter overfitting and enhance the generalization of the overall IR system, an external update process incorporating a confidence index is introduced. Additionally, the framework employs asynchronous internal and external update procedures to effectively balance personalization and generalization between local and global models. Finally, extensive experiments on three diverse real-world datasets demonstrate the effectiveness and advantages of <italic>FedHMIR</i> compared to state-of-the-art baselines.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7406-7422"},"PeriodicalIF":7.7,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multimodal Scale Normalization Framework for Vision-Radar Small UAV Positioning","authors":"Yiyao Wan;Jiahuan Ji;Wenqing Xie;Guangyu Wu;Fuhui Zhou;Qihui Wu","doi":"10.1109/TMC.2025.3549620","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549620","url":null,"abstract":"Uncrewed aerial vehicles (UAVs) positioning is of crucial importance in diverse applications. However, it is extremely challenging to realize the precise UAVs positioning over long distances due to the small size and dramatic scale variations associated with the high mobility in the wide area. To tackle this issue, a multimodal scale normalization framework is proposed for the scale-robust precise pixel-level UAV positioning. The framework exploits our proposed distance-aware image slicing and distance-aware scale normalization module. Moreover, a modal fusion-based scale normalization network is proposed that can accept arbitrary low-resolution UAV patches and produce the consistent high-resolution images at a uniform UAV instance scale with a single learnable model. The proposed framework is generic and can be directly used in the existing pixel-level positioning pipelines to improve the positioning performance and scale robustness. To verify the proposed framework in the real application, a practical vision-radar UAV positioning system is developed. Experimental results on the real-world dataset demonstrate the generality and effectiveness of our framework. Moreover, the ablation experiments also confirm the contribution of each module in the framework.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"6978-6995"},"PeriodicalIF":7.7,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint RIS and Beamforming Design for Secure and Energy-Efficient Two-Way Relay Communications","authors":"Shuangrui Zhao;Xinghui Zhu;Yuanyu Zhang;Zhiwei Zhang;Yulong Shen","doi":"10.1109/TMC.2025.3549445","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549445","url":null,"abstract":"This paper examines the enhancement of secrecy energy efficiency (SEE) in a reconfigurable intelligent surface (RIS)-assisted two-way relay (TWR) system. We first establish a theoretical model for the system's secrecy rate, energy consumption, and SEE, and formulate the SEE maximization problem through the joint design of the RIS phase shifts and beamforming matrix. Using techniques such as weighted minimum mean square error (WMMSE), alternating optimization, and the augmented Lagrange method, we then develop a theoretical framework that identifies locally optimal solutions for the RIS and beamforming settings under unit-modulus and power constraints. The proposed framework is also shown to be applicable to solving the system's secrecy rate maximization problem. To address the computational complexity involved in optimizing the RIS phase shifts, we further propose a suboptimal scheme leveraging the Newton's method, which significantly reduces the computational burden while achieving performance close to the optimal SEE. Extensive numerical results validate the effectiveness of the proposed schemes, showing significant SEE improvements compared to traditional channel-capacity-based secure transmission scheme.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7440-7457"},"PeriodicalIF":7.7,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NeFL: Nested Model Scaling for Federated Learning With System Heterogeneous Clients","authors":"Honggu Kang;Seohyeon Cha;Jinwoo Shin;Jongmyeong Lee;Joonhyuk Kang","doi":"10.1109/TMC.2025.3549600","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549600","url":null,"abstract":"Federated learning (FL) enables distributed training while preserving data privacy, but stragglers—slow or incapable clients can significantly slow down the total training time and degrade performance. To mitigate the impact of stragglers, system heterogeneity, including heterogeneous computing and network bandwidth, has been addressed. While previous studies have addressed system heterogeneity by splitting models into submodels, they offer limited flexibility in model architecture design, without considering potential inconsistencies arising from training multiple submodel architectures. We propose <italic>nested federated learning (NeFL)</i>, a generalized framework that efficiently divides deep neural networks into submodels using both depthwise and widthwise scaling. To address the <italic>inconsistency</i> arising from training multiple submodel architectures, NeFL decouples a subset of parameters from those being trained for each submodel. An averaging method is proposed to handle these decoupled parameters during aggregation. NeFL enables resource-constrained devices to effectively participate in the FL pipeline, facilitating larger datasets for model training. Experiments demonstrate that NeFL achieves performance gain, especially for the worst-case submodel compared to baseline approaches (7.63% improvement on CIFAR-100). Furthermore, NeFL aligns with recent advances in FL, such as leveraging pre-trained models and accounting for statistical heterogeneity.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"6734-6746"},"PeriodicalIF":7.7,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMRE: Adaptive Multilevel Redundancy Elimination for Multimodal Mobile Inference","authors":"Qixuan Cai;Ruikai Chu;Kaixuan Zhang;Xiulong Liu;Xinyu Tong;Xin Xie;Jiancheng Chen;Keqiu Li","doi":"10.1109/TMC.2025.3549422","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549422","url":null,"abstract":"Given privacy and network load concerns, employing on-device multimodal neural networks (MNNs) for IoT data is a growing trend. However, the high computational demands of MNNs clash with limited on-device resources. MNNs involve input and model redundancies during inference, wasting resources to process redundant input components and run excess model parameters. Model Redundancy Elimination (MRE) reduces redundant parameters but cannot bypass inference for unnecessary input components. Input Redundancy Elimination (IRE) skips inference for redundant input components but cannot reduce computation for the remaining parts. MRE and IRE independently fail to meet the diverse computational needs of multimodal inference. To address these issues, we aim to combine the advantages of MRE and IRE to achieve a more efficient inference. We propose an <underline><b>a</b></u>daptive <underline><b>m</b></u>ultilevel <underline><b>r</b></u>edundancy <underline><b>e</b></u>limination framework (<italic>AMRE</i>), which supports both IRE and MRE. <italic>AMRE</i> first establishes a collaborative inference mechanism for IRE and MRE. We then propose a multifunctional, lightweight policy model that adaptively controls the inference logic for each instance. Moreover, a three-stage training method is proposed to ensure the performance of collaborative inference in <italic>AMRE</i>. We validate <italic>AMRE</i> in three scenarios, achieving up to 52.91% lower latency, 56.79% lower energy cost, and a slight accuracy gain compared to state-of-the-art baselines.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7568-7583"},"PeriodicalIF":7.7,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Streamlining Data Transfer in Collaborative SLAM Through Bandwidth-Aware Map Distillation","authors":"Rui Ge;Huanghuang Liang;Zheng Gong;Chuang Hu;Xiaobo Zhou;Dazhao Cheng","doi":"10.1109/TMC.2025.3549367","DOIUrl":"https://doi.org/10.1109/TMC.2025.3549367","url":null,"abstract":"Edge intelligence offers a promising solution for Simultaneous Localization and Mapping (SLAM) in large-scale scenarios, where multiple robots collaboratively perceive the environment and upload their local maps to an edge server. However, maintaining mapping accuracy under constrained and dynamic communication resources remains a significant challenge for the practical deployment of robot swarms. Concurrent data uploads from multiple agents can exacerbate network congestion, leading to the loss of critical information, delayed updates, and, ultimately, the inconsistency of the generated maps. This paper presents Hermes, an edge-assisted collaborative mapping system designed for communication-constrained environments. Hermes streamlines data transfer through bandwidth-aware map distillation, ensuring only the most crucial messages are transmitted to the edge server. We quantify the importance of keyframes and landmarks based on their information entropy gain in pose estimation. By selectively sharing essential submaps, Hermes adaptively balances communication bandwidth and information richness during the mapping process. We implemented Hermes on heterogeneous platforms and conducted experiments using public datasets and self-collected campus data. Hermes exceeds SwarmMap by 50% in bandwidth utilization with similar accuracy and surpasses COVINS-G by 65% in trajectory error under highly constrained network resources.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7554-7567"},"PeriodicalIF":7.7,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MASA: Multimodal Federated Learning Through Modality-Aware and Secure Aggregation","authors":"Jialin Guo;Yongjian Fu;Zhiwei Zhai;Xinyi Li;Yongheng Deng;Sheng Yue;Lili Chen;Hao Pan;Ju Ren","doi":"10.1109/TMC.2025.3548954","DOIUrl":"https://doi.org/10.1109/TMC.2025.3548954","url":null,"abstract":"As a promising paradigm, federated learning has been applied to multimodal sensing tasks due to its deployment convenience. However, the recent advances in multimodal federated learning emphasize learning a high-quality multimodal model but overlook the model usage requirements of massive unimodal clients. Moreover, the privacy risk in model sharing and client data heterogeneity impact the efficacy of federated learning. In this paper, we propose a novel multimodal federated learning system named MASA. As a departure from existing approaches, MASA simultaneously enhances the model learning efficiency of both multimodal and unimodal clients while ensuring their data privacy. First, we employ a gated cross-modal distillation scheme to achieve performance-aware knowledge transfer across modality-heterogeneous clients. To enhance the system security, MASA integrates a lightweight split-shuffle mechanism to realize the anonymization and encryption of model aggregation. Moreover, to reach personalized collaboration while protecting privacy, MASA features an attention-based spontaneous client clustering mechanism to form client cluster structures securely and distributedly. We evaluate our MASA on four public multimodal datasets for human activity recognition. The results show that our MASA outperforms leading multimodal federated learning methods on the model performance of both multimodal and unimodal clients.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7328-7344"},"PeriodicalIF":7.7,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sensing Metal Coil Vibration of Headsets for Eavesdropping on Online Conversations With Out-of-Vocabulary Words Using RFID","authors":"Yunzhong Chen;Jiadi Yu;Yingying Chen;Linghe Kong;Yanmin Zhu;Yichao Chen","doi":"10.1109/TMC.2025.3548980","DOIUrl":"https://doi.org/10.1109/TMC.2025.3548980","url":null,"abstract":"As one of the most essential accessories, headsets have been widely used in common online conversations. The metal coil vibration patterns of headset speakers/microphones have been proven to be highly correlated with the speaker-produced/microphone-received sound. This paper presents an online conversation eavesdropping system, <italic>RFSpy</i>, which uses only one RFID tag attached on a headset to alternately sense metal coil vibrations of headset speaker and microphone for eavesdropping on speaker-produced and microphone-received sound. In some accessible scenarios, assuming attackers secretly attach a small, battery-free RFID tag under one ear cushion of an eavesdropped user’s headset without being noticed. Meanwhile, RFID readers are camouflaged as decorations placed in/out of rooms to transmit and receive RF signals. When the eavesdropped user talks with other users online through the headset, <italic>RFSpy</i> first activates the RFID tag to capture the metal coil vibration patterns of headset speaker and microphone upon RF signals. Then, <italic>RFSpy</i> reconstructs sound spectrograms from the RF signal-based vibration patterns for not only trained words but also untrained (i.e., out-of-vocabulary) words utilizing designed SSR network. Finally, <italic>RFSpy</i> converts the sound spectrograms to conversation content through sound recognition API. Extensive experiments demonstrate that <italic>RFSpy</i> can eavesdrop on online conversations with out-of-vocabulary words effectively.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 8","pages":"7107-7120"},"PeriodicalIF":7.7,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144550523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CHAR: Composite Head-Body Activities Recognition With a Single Earable Device","authors":"Peizhao Zhu;Yuzheng Zhu;Wenyuan Li;Yanbo He;Yongpan Zou;Kaishun Wu;Victor C. M. Leung","doi":"10.1109/TMC.2025.3548647","DOIUrl":"https://doi.org/10.1109/TMC.2025.3548647","url":null,"abstract":"The increasing popularity of earable devices stimulates great academic interest to design novel head gesture-based interaction technologies. But existing works simply consider it as a singular activity recognition problem. This is not in line with practice since users may have different body movements such as walking and jogging along with head gestures. It is also beneficial to recognize body movements during human-device interaction since it provides useful context information. As a result, it is significant to recognize such composite activities in which actions of different body parts happen simultaneously. In this paper, we propose a system called CHAR to recognize composite head-body activities with a single IMU sensor. The key idea of our solution is to make use of the inter-correlation of different activities and design a multi-task learning network to extract shared and specific representations. We implement a real-time prototype and conduct extensive experiments to evaluate it. The results show that CHAR can recognize 60 kinds of composite activities (12 head gestures and 5 body movements) with high accuracies of 89.7% and 85.1% in sufficient data and insufficient data cases, respectively.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"24 7","pages":"6532-6549"},"PeriodicalIF":7.7,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144219762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}