PmTrack
Hankai Liu, Xiulong Liu, Xin Xie, Xinyu Tong, Keqiu Li
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pages 1-30, published 2024-01-12. DOI: 10.1145/3631433
Abstract: The difficulty in obtaining targets' identities poses a significant obstacle to personalized and customized millimeter-wave (mmWave) sensing. Existing solutions that learn individual differences from signal features have limitations in practical applications. This paper presents PmTrack, a personalized mmWave-based human tracking system that introduces inertial measurement units (IMUs) as identity indicators. Widely available in portable devices such as smartwatches and smartphones, IMUs can upload identity information and sensor data over existing wireless networks, and can therefore assist radar target identification in a lightweight manner with little deployment or carrying burden for users. PmTrack innovatively adopts orientation as the matching feature, thereby overcoming the data heterogeneity between radar and IMU while avoiding the effect of cumulative errors. In implementing PmTrack, we propose a comprehensive set of optimization methods for detection enhancement, interference suppression, continuity maintenance, and trajectory correction, which solve a series of practical problems caused by the three major challenges of weak reflection, point cloud overlap, and body-bounce ghosts in multi-person tracking. In addition, an orientation correction method is proposed to overcome IMU gimbal lock. Extensive experimental results demonstrate that PmTrack achieves identification accuracies of 98% and 95% with five people in a hall and a meeting room, respectively.
AdaStreamLite
Yuheng Wei, Jie Xiong, Hui Liu, Yingtao Yu, Jiangtao Pan, Junzhao Du
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pages 1-29, published 2024-01-12. DOI: 10.1145/3631460
Abstract: Streaming speech recognition aims to transcribe speech to text in a streaming manner, providing real-time speech interaction for smartphone users. However, it is not trivial to develop a high-performance streaming speech recognition system that runs purely on mobile platforms, due to complex real-world acoustic environments and the limited computational resources of smartphones. Most existing solutions do not generalize to unseen environments and have difficulty working with streaming speech. In this paper, we design AdaStreamLite, an environment-adaptive streaming speech recognition tool for smartphones. AdaStreamLite interacts with its surroundings to capture the characteristics of the current acoustic environment and improve robustness against ambient noise in a lightweight manner. We design an environment representation extractor to model acoustic environments with compact feature vectors, and construct a representation lookup table to improve the generalization of AdaStreamLite to unseen environments. We train our system using large, publicly available speech datasets covering different languages. We conduct experiments across a wide range of real acoustic environments with different smartphones. The results show that AdaStreamLite outperforms state-of-the-art methods in terms of recognition accuracy, computational resource consumption, and robustness against unseen environments.
JoulesEye
Rishiraj Adhikary, M. Sadeh, N. Batra, Mayank Goel
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pages 1-29, published 2024-01-12. DOI: 10.1145/3631422
Abstract: Smartphones and smartwatches have contributed significantly to fitness monitoring by providing real-time statistics, thanks to accurate tracking of physiological indices such as heart rate. However, their estimation of calories burned during exercise is inaccurate and cannot be used for medical diagnosis. In this work, we present JoulesEye, a smartphone thermal-camera-based system that can accurately estimate calorie burn by monitoring respiration rate. We evaluated JoulesEye on 54 participants who performed high-intensity cycling and running. The mean absolute percentage error (MAPE) of JoulesEye was 5.8%, significantly better than the MAPE of 37.6% observed with commercial smartwatch-based methods that use heart rate alone. Finally, we show that an ultra-low-resolution thermal camera, small enough to fit inside a watch or other wearable, is sufficient for accurate calorie burn estimation. These results suggest that JoulesEye is a promising new method for accurate and reliable calorie burn estimation.
Aragorn
Harish Venugopalan, Z. Din, Trevor Carpenter, Jason Lowe-Power, Samuel T. King, Zubair Shafiq
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pages 1-31, published 2024-01-12. DOI: 10.1145/3631406
Abstract: Mobile app developers often rely on cameras to implement rich features. However, giving apps unfettered access to the mobile camera poses a privacy threat when camera frames capture sensitive information that is not needed for the app's functionality. To mitigate this threat, we present Aragorn, a novel privacy-enhancing mobile camera system that provides fine-grained control over what information can be present in camera frames before apps can access them. Aragorn automatically sanitizes camera frames by detecting regions that are essential to an app's functionality and blocking out everything else, protecting privacy while retaining app utility. Aragorn caters to a wide range of camera apps and incorporates knowledge distillation and crowdsourcing to extend robust support to previously unsupported apps. In our evaluations, with no degradation in utility, Aragorn detects credit cards with 89% accuracy and faces with 100% accuracy in the context of credit card scanning and face recognition, respectively. We show that Aragorn's implementation in the Android camera subsystem incurs an average frame-rate drop of only 0.01 frames per second. Our evaluations show that Aragorn's overhead on system performance is reasonable.
{"title":"SDE","authors":"Meng Xue, Yuyang Zeng, Shengkang Gu, Qian Zhang, Bowei Tian, Changzheng Chen","doi":"10.1145/3631438","DOIUrl":"https://doi.org/10.1145/3631438","url":null,"abstract":"Early screening for dry eye disease (DED) is crucial to identify and provide timely intervention to high-risk susceptible populations. Currently, clinical methods for diagnosing DED include the tear break-up time test, meibomian gland analysis, tear osmolarity test, and tear river height test, which require in-hospital detection. Unfortunately, there is no convenient way to screen for DED yet. In this paper, we propose SDE, a contactless, convenient, and ubiquitous DED screening system based on RF signals. To extract biomarkers for early screening of DED from RF signals, we construct frame chirps variance and extract fine-grained spontaneous blinking action. SDE is carefully designed to remove interference in RF signals and refine the characterization of biomarkers that denote the symptoms of DED. To endow SDE with the ability to adapt to new users, we develop a deep learning-based model of unsupervised domain adaptation to remove the influence of different users and environments in local and global two-level feature spaces. We conduct extensive experiments to evaluate SDE with 54 volunteers in 4 scenes. The experimental results confirm that SDE can accurately screen for DED in a new user in real environments such as eye examination rooms, clinics, offices, and homes.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":"1 2","pages":"1 - 23"},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ClearSpeech","authors":"Dong Ma, Ting Dang, Ming Ding, Rajesh Balan","doi":"10.1145/3631409","DOIUrl":"https://doi.org/10.1145/3631409","url":null,"abstract":"Wireless earbuds have been gaining increasing popularity and using them to make phone calls or issue voice commands requires the earbud microphones to pick up human speech. When the speaker is in a noisy environment, speech quality degrades significantly and requires speech enhancement (SE). In this paper, we present ClearSpeech, a novel deep-learning-based SE system designed for wireless earbuds. Specifically, by jointly using the earbud's in-ear and out-ear microphones, we devised a suite of techniques to effectively fuse the two signals and enhance the magnitude and phase of the speech spectrogram. We built an earbud prototype to evaluate ClearSpeech under various settings with data collected from 20 subjects. Our results suggest that ClearSpeech can improve the SE performance significantly compared to conventional approaches using the out-ear microphone only. We also show that ClearSpeech can process user speech in real-time on smartphones.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":"3 6","pages":"1 - 25"},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RLoc
Tianyu Zhang, Dongheng Zhang, Guanzhong Wang, Yadong Li, Yang Hu, Qibin Sun, Yan Chen
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pages 1-28, published 2024-01-12. DOI: 10.1145/3631437
Abstract: In recent years, decimeter-level accuracy in WiFi indoor localization has become attainable within controlled environments. However, existing methods struggle to remain robust in more complex indoor environments: angle-based methods suffer significant localization errors due to unreliable Angle of Arrival (AoA) estimations, and fingerprint-based methods degrade with environmental changes. In this paper, we propose RLoc, a learning-based system designed for reliable localization and tracking. The key design principle of RLoc lies in quantifying the uncertainty that arises in the AoA estimation task and then exploiting this uncertainty to enhance the reliability of localization and tracking. To this end, RLoc first manually extracts the underutilized beamwidth feature via signal processing techniques. Then, it integrates uncertainty quantification into the neural network design through a Kullback-Leibler (KL) divergence loss and ensemble techniques. Finally, these quantified uncertainties guide RLoc to optimally leverage the diversity of Access Points (APs) and the temporal continuity of AoAs. Our experiments, evaluated on two datasets gathered from commercial off-the-shelf WiFi devices, demonstrate that RLoc surpasses state-of-the-art approaches by an average of 36.27% in in-domain scenarios and 20.40% in cross-domain scenarios.
{"title":"MagDot","authors":"Dongyao Chen, Qing Luo, Xiaomeng Chen, Xinbing Wang, Chenghui Zhou","doi":"10.1145/3631423","DOIUrl":"https://doi.org/10.1145/3631423","url":null,"abstract":"Tracking the angular movement of body joints has been a critical enabler for various applications, such as virtual and augmented reality, sports monitoring, and medical rehabilitation. Despite the strong demand for accurate joint tracking, existing techniques, such as cameras, IMUs, and flex sensors, suffer from major limitations that include occlusion, cumulative error, and high cost. These issues collectively undermine the practicality of joint tracking. We introduce MagDot, a new magnetic-based joint tracking method that enables high-accuracy, drift-free, and wearable joint angle tracking. To overcome the limitations of existing techniques, MagDot employs a novel tracking scheme that compensates for various real-world impacts, achieving high tracking accuracy. We tested MagDot on eight participants with a professional motion capture system, i.e., Qualisys motion capture system with nine Arqus A12 cameras. The results indicate MagDot can accurately track major body joints. For example, MagDot can achieve tracking accuracy of 2.72°, 4.14°, and 4.61° for elbow, knee, and shoulder, respectively. With a power consumption of only 98 mW, MagDot can support one-day usage with a small battery pack.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":"8 8","pages":"1 - 25"},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139438005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enabling WiFi Sensing on New-generation WiFi Cards
E. Yi, Fusang Zhang, Jie Xiong, Kai Niu, Zhiyun Yao, Daqing Zhang
Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, pages 1-26, published 2024-01-12. DOI: 10.1145/3633807
Abstract: The last few years have witnessed the rapid development of WiFi sensing, with a large spectrum of applications enabled. However, existing works mainly leverage obsolete 802.11n WiFi cards (i.e., the Intel 5300 and Atheros AR9k series) for sensing. Meanwhile, the mainstream WiFi protocols currently in use are 802.11ac/ax, and commodity WiFi products on the market are equipped with new-generation WiFi chips such as the Broadcom BCM43794 and Qualcomm QCN5054. After conducting benchmark experiments, we find that WiFi sensing has problems working on these new cards. The new communication features (e.g., MU-MIMO) designed to facilitate data transmission negatively impact WiFi sensing. Conventional CSI base signals, such as CSI amplitude and/or the CSI phase difference between antennas, which worked well on the Intel 5300 802.11n WiFi card, may fail on the new cards. In this paper, we propose carefully designed signal processing schemes that make wireless sensing work well on these new WiFi cards. We employ two typical sensing applications, human respiration monitoring and human trajectory tracking, to demonstrate the effectiveness of the proposed schemes. We believe it is critical to keep WiFi sensing compatible with the latest WiFi protocols, and this work moves one important step towards the real-life adoption of WiFi sensing.
{"title":"TS2ACT","authors":"Kang Xia, Wenzhong Li, Shiwei Gan, Sanglu Lu","doi":"10.1145/3631445","DOIUrl":"https://doi.org/10.1145/3631445","url":null,"abstract":"Human Activity Recognition (HAR) based on embedded sensor data has become a popular research topic in ubiquitous computing, which has a wide range of practical applications in various fields such as human-computer interaction, healthcare, and motion tracking. Due to the difficulties of annotating sensing data, unsupervised and semi-supervised HAR methods are extensively studied, but their performance gap to the fully-supervised methods is notable. In this paper, we proposed a novel cross-modal co-learning approach called TS2ACT to achieve few-shot HAR. It introduces a cross-modal dataset augmentation method that uses the semantic-rich label text to search for human activity images to form an augmented dataset consisting of partially-labeled time series and fully-labeled images. Then it adopts a pre-trained CLIP image encoder to jointly train with a time series encoder using contrastive learning, where the time series and images are brought closer in feature space if they belong to the same activity class. For inference, the feature extracted from the input time series is compared with the embedding of a pre-trained CLIP text encoder using prompt learning, and the best match is output as the HAR classification results. We conducted extensive experiments on four public datasets to evaluate the performance of the proposed method. The numerical results show that TS2ACT significantly outperforms the state-of-the-art HAR methods, and it achieves performance close to or better than the fully supervised methods even using as few as 1% labeled data for model training. The source codes of TS2ACT are publicly available on GitHub1.","PeriodicalId":20553,"journal":{"name":"Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies","volume":"1 10","pages":"1 - 22"},"PeriodicalIF":0.0,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139437915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}