{"title":"BERT-Based Semantic-Aware Heterogeneous Graph Embedding Method for Enhancing App Usage Prediction Accuracy","authors":"Xi Fang;Hui Yang;Liu Shi;Yilong Wang;Li Li","doi":"10.1109/THMS.2024.3412273","DOIUrl":"10.1109/THMS.2024.3412273","url":null,"abstract":"With the widespread adoption of smartphones and mobile Internet, understanding user behavior and improving user experience are critical. This article introduces semantic-aware (SA)-BERT, a novel model that integrates spatio-temporal and semantic information to represent App usage effectively. Leveraging BERT, SA-BERT captures rich contextual information. By introducing a specific objective function to represent the cooccurrence of App-time-location paths, SA-BERT can effectively model complex App usage structures. Based on this method, we adopt the learned embedding vectors in App usage prediction tasks. We evaluate the performance of SA-BERT using a large-scale real-world dataset. As demonstrated in the numerous experimental results, our model outperformed other strategies evidently. In terms of the prediction accuracy, we achieve a performance gain of 34.9% compared with widely used the SA representation learning via graph convolutional network (SA-GCN), and 134.4% than the context-aware App usage prediction with heterogeneous graph embedding. In addition, we reduced 79.27% training time compared with SA-GCN.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"465-474"},"PeriodicalIF":3.5,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Brake Perception Response Time in On-Road and Roadside Hazards Using an Integrated Cognitive Architecture","authors":"Umair Rehman;Shi Cao;Carolyn G. Macgregor","doi":"10.1109/THMS.2024.3408841","DOIUrl":"10.1109/THMS.2024.3408841","url":null,"abstract":"In this article, we used a computational cognitive architecture called queuing network–adaptive control of thought rational–situation awareness (QN–ACTR–SA) to model and simulate the brake perception response time (BPRT) to visual roadway hazards. The model incorporates an integrated driver model to simulate human driving behavior and uses a dynamic visual sampling model to simulate how drivers allocate their attention. We validated the model by comparing its results to empirical data from human participants who encountered on-road and roadside hazards in a simulated driving environment. The results showed that BPRT was shorter for on-road hazards compared to roadside hazards and that the overall model fitness had a mean absolute percentage error of 9.4% and a root mean squared error of 0.13 s. The modeling results demonstrated that QN–ACTR–SA could effectively simulate BPRT to both on-road and roadside hazards and capture the difference between the two contrasting conditions.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"441-454"},"PeriodicalIF":3.5,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141516477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LANDER: Visual Analysis of Activity and Uncertainty in Surveillance Video","authors":"Tong Li;Guodao Sun;Baofeng Chang;Yunchao Wang;Qi Jiang;Yuanzhong Ying;Li Jiang;Haixia Wang;Ronghua Liang","doi":"10.1109/THMS.2024.3409722","DOIUrl":"10.1109/THMS.2024.3409722","url":null,"abstract":"Vision algorithms face challenges of limited visual presentation and unreliability in pedestrian activity assessment. In this article, we introduce LANDER, an interactive analysis system for visual exploration of pedestrian activity and uncertainty in surveillance videos. This visual analytics system focuses on three common categories of uncertainties in object tracking and action recognition. LANDER offers an overview visualization of activity and uncertainty, along with spatio-temporal exploration views closely associated with the scene. Expert evaluation and user study indicate that LANDER outperforms traditional video exploration in data presentation and analysis workflow. Specifically, compared to the baseline method, it excels in reducing retrieval time (\u0000<inline-formula><tex-math>$p< $</tex-math></inline-formula>\u0000 0.01), enhancing uncertainty identification (\u0000<inline-formula><tex-math>$p< $</tex-math></inline-formula>\u0000 0.05), and improving the user experience (\u0000<inline-formula><tex-math>$p< $</tex-math></inline-formula>\u0000 0.05).","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"427-440"},"PeriodicalIF":3.5,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Trajectory-based Risk Prediction on Curved Roads with Consideration of Driver Turning Behavior and Workload","authors":"Yahui Liu;Jingyuan Li;Yingbo Sun;Xuewu Ji;Chen Lv","doi":"10.1109/THMS.2024.3407333","DOIUrl":"https://doi.org/10.1109/THMS.2024.3407333","url":null,"abstract":"Accurate and robust risk prediction on curved roads can significantly reduce lane departure accidents and improve traffic safety. However, limited study has considered dynamic driver-related factors in risk prediction, resulting in poor algorithm adaptiveness to individual differences. This article presents a novel personalized risk prediction method with consideration of driver turning behavior and workload by using the predicted vehicle trajectory.First, driving simulation experiments are conducted to collect synchronized trajectory data, vehicle dynamic data, and eye movement data. The drivers are distracted by answering questions via a Bluetooth headset, leading to an increased cognitive workload. Secondly, the \u0000<italic>k</i>\u0000-means clustering algorithm is utilized to extract two turning behaviors: driving toward the inner and outer side of a curved road. The turning behavior of each trajectory is then recognized using the trajectory data. In addition, the driver workload is recognized using the vehicle dynamic features and eye movement features. Thirdly, an extra personalization index is introduced to a long short-term memory encoder–decoder trajectory prediction network. This index integrates the driver turning behavior and workload information. After introducing the personalization index, the root-mean-square errors of the proposed network are reduced by 15.6%, 23.5%, and 29.1% with prediction horizons of 2, 3, and 4 s, respectively. Fourthly, the risk potential field theory is employed for risk prediction using the predicted trajectory data. This approach implicitly incorporates the driver's personalized information into risk prediction.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"406-415"},"PeriodicalIF":3.5,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Formation Control for a Class of Human-in-the-Loop Multiagent Systems","authors":"Xiao-Xiao Zhang;Huai-Ning Wu;Jin-Liang Wang","doi":"10.1109/THMS.2024.3398631","DOIUrl":"https://doi.org/10.1109/THMS.2024.3398631","url":null,"abstract":"In this article, the distributed formation control problem for a class of human-in-the-loop (HiTL) multiagent systems (MASs) is studied. A hidden Markov jump MAS is employed to model the HiTL MAS, which integrates the human models, the MAS model, and their interactions. The HiTL MAS investigated in this article is composed of two parts: a leader without human in the control loop and a group of followers in which each follower is simultaneously controlled by a human operator and an automation. For each follower, a hidden Markov model is used for modeling the human behaviors in consideration of the random nature of human internal state (HIS) reasoning and the uncertainty from HIS observation. By means of a stochastic Lyapunov function, a necessary and sufficient condition is first developed in terms of the linear matrix inequalities (LMIs) to ensure the formation of the HiTL MAS in the mean-square sense. Then, an LMI approach to the human-assistance control design is proposed for the automations in the followers to guarantee the mean-square formation of the HiTL MAS. Finally, simulation results are presented to verify the effectiveness of the proposed methods.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"416-426"},"PeriodicalIF":3.5,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing Gramian Angular Fields and Convolution Neural Networks in Flex Sensors Glove for Human–Computer Interaction","authors":"Chana Chansri;Jakkree Srinonchat","doi":"10.1109/THMS.2024.3404101","DOIUrl":"https://doi.org/10.1109/THMS.2024.3404101","url":null,"abstract":"The current sensor systems using the human–computer interface to develop a hand gesture recognition system remain challenging. This research presents the development of hand gesture recognition with 16-DoF glove sensors combined with a convolution neural network. The flex sensors are attached to 16 pivot joints of the human hand on the glove so that each knuckle flex can be measured while holding the object. The 16-DoF point sensors collecting circuit and adjustable buffer circuit were developed in this research to work with the Arduino Nano microcontroller to record each sensor's signal. This article investigates the time-series data of the flex sensor signal into 2-D colored images, concatenating the signals into one bigger image with a Gramian angular field and then recognition through a deep convolutional neural network (DCNN). The 16-DoF glove sensors were proposed for testing with three experiments using 8 models of DCNN recognition. These were conducted on 20 hand gesture recognition, 12 hand sign recognition, and object manipulation according to shape. The experimental results indicated that the best performance for the hand grasp experiment is 99.49% with Resnet 101, the hand sign experiment is 100% with Alexnet, and the object attribute experiment is 99.77% with InceptionNet V3.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"475-483"},"PeriodicalIF":3.5,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Physics-Based Virtual Reality Haptic System Design and Evaluation by Simulating Human-Robot Collaboration","authors":"Syed T. Mubarrat;Antonio Fernandes;Suman K. Chowdhury","doi":"10.1109/THMS.2024.3407109","DOIUrl":"https://doi.org/10.1109/THMS.2024.3407109","url":null,"abstract":"Recent advancements in virtual reality (VR) technology facilitate tracking real-world objects and users' movements in the virtual environment (VE) and inspire researchers to develop a physics-based haptic system (i.e., real object haptics) instead of computer-generated haptic feedback. However, there is limited research on the efficacy of such VR systems in enhancing operators’ sensorimotor learning for tasks that require high motor and physical demands. Therefore, this study aimed to design and evaluate the efficacy of a physics-based VR system that provides users with realistic cutaneous and kinesthetic haptic feedback. We designed a physics-based VR system, named PhyVirtual, and simulated human–robot collaborative (HRC) sequential pick-and-place lifting tasks in the VE. Participants performed the same tasks in the real environment (RE) with human–human collaboration instead of human–robot collaboration. We used a custom-designed questionnaire, the NASA-TLX, and electromyography activities from biceps, middle, and anterior deltoid muscles to determine user experience, workload, and neuromuscular dynamics, respectively. Overall, the majority of responses (>65%) demonstrated that the system is easy-to-use, easy-to-learn, and effective in improving motor skill performance. While compared to tasks performed in the RE, no significant difference was observed in the overall workload for the PhyVirtual system. The electromyography data exhibited similar trends (\u0000<italic>p</i>\u0000 > 0.05; \u0000<italic>r</i>\u0000 > 0.89) for both environments. These results show that the PhyVirtual system is an effective tool to simulate safe human–robot collaboration commonly seen in many modern warehousing settings. Moreover, it can be used as a viable replacement for live sensorimotor training in a wide range of fields.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"375-384"},"PeriodicalIF":3.5,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Fast and Efficient Approach for Human Action Recovery From Corrupted 3-D Motion Capture Data Using QR Decomposition-Based Approximate SVD","authors":"M. S. Subodh Raj;Sudhish N. George","doi":"10.1109/THMS.2024.3400290","DOIUrl":"https://doi.org/10.1109/THMS.2024.3400290","url":null,"abstract":"In this article, we propose a robust algorithm for the fast recovery of human actions from corrupted 3-D motion capture (mocap) sequences. The proposed algorithm can deal with misrepresentations and incomplete representations in mocap data simultaneously. Fast convergence of the proposed algorithm is ensured by minimizing the overhead associated with time and resource utilization. To this end, we have used an approximate singular value decomposition (SVD) based on QR decomposition and \u0000<inline-formula><tex-math>$ell _{2,1}$</tex-math></inline-formula>\u0000 norm minimization as a replacement for the conventional nuclear norm-based SVD. In addition, the proposed method is braced by incorporating the spatio-temporal properties of human action in the optimization problem. For this, we have introduced pair-wise hierarchical constraint and the trajectory movement constraint in the problem formulation. Finally, the proposed method is void of the requirement of a sizeable database for training the model. The algorithm can easily be adapted to work on any form of corrupted mocap sequences. The proposed algorithm is faster by 30% on average compared with the counterparts employing similar kinds of constraints with improved performance in recovery.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 4","pages":"395-405"},"PeriodicalIF":3.5,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141725575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gesture-mmWAVE: Compact and Accurate Millimeter-Wave Radar-Based Dynamic Gesture Recognition for Embedded Devices","authors":"Biao Jin;Xiao Ma;Bojun Hu;Zhenkai Zhang;Zhuxian Lian;Biao Wang","doi":"10.1109/THMS.2024.3385124","DOIUrl":"10.1109/THMS.2024.3385124","url":null,"abstract":"Dynamic gesture recognition using millimeter-wave radar is a promising contactless mode of human–computer interaction with wide-ranging applications in various fields, such as intelligent homes, automatic driving, and sign language translation. However, the existing models have too many parameters and are unsuitable for embedded devices. To address this issue, we propose a dynamic gesture recognition method (named “Gesture-mmWAVE”) using millimeter-wave radar based on the multilevel feature fusion (MLFF) and transformer model. We first arrange each frame of the original echo collected by the frequency-modulated continuously modulated millimeter-wave radar in the Chirps × Samples format. Then, we use a 2-D fast Fourier transform to obtain the range-time map and Doppler-time map of gestures while improving the echo signal-to-noise ratio by coherent accumulation. Furthermore, we build an MLFF-transformer network for dynamic gesture recognition. The MLFF-transformer network comprises an MLFF module and a transformer module. The MLFF module employs the residual strategies to fuse the shallow, middle, and deep features and reduce the parameter size of the model using depthwise-separable convolution. The transformer module captures the global features of dynamic gestures and focuses on essential features using the multihead attention mechanism. The experimental results demonstrate that our proposed model achieves an average recognition accuracy of 99.11% on a dataset with 10% random interference. The scale of the proposed model is only 0.42M, which is 25% of that of the MobileNet V3-samll model. Thus, this method has excellent potential for application in embedded devices due to its small parameter size and high recognition accuracy.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 3","pages":"337-347"},"PeriodicalIF":3.6,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140837281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Synergistic Formal-Statistical Model for Recognizing Complex Human Activities","authors":"Nikolaos Bourbakis;Anargyros Angeleas","doi":"10.1109/THMS.2024.3382468","DOIUrl":"10.1109/THMS.2024.3382468","url":null,"abstract":"This article presents a view-independent synergistic model (formal and statistical) for efficiently recognizing complex human activities from video frames. To reduce the computational cost, the number of video frames is subsampled from 30 to 3 frames/s. SKD, a collaborative set of formal languages (\u0000<underline>S</u>\u0000OMA, \u0000<underline>K</u>\u0000INISIS, and \u0000<underline>D</u>\u0000RASIS), models simple and complex body actions and activities. SOMA language is a frame-based formal language representing body states (poses) extracted from frames. KINISIS is a formal language that uses the body poses extracted from SOMA to determine the consecutive poses (motion) that compose an activity. DRASIS language, finally, a convolution neural net, is used to classify simple activities, and an long short-term memory is used to recognize changes in activity. Experimental results using the SKD model on MSR Daily Activity three-dimensional (3-D) and UTKinect-Action3D datasets have shown that our method is among the top ones.","PeriodicalId":48916,"journal":{"name":"IEEE Transactions on Human-Machine Systems","volume":"54 3","pages":"229-237"},"PeriodicalIF":3.6,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140803562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}