{"title":"An adaptive threshold based gait authentication by incorporating quality measures","authors":"Sonia Das, Sukadev Meher, Upendra Kumar Sahoo","doi":"10.3233/aic-230121","DOIUrl":"https://doi.org/10.3233/aic-230121","url":null,"abstract":"In this paper, an adaptive threshold-based gait authentication model is proposed, which incorporates quality measures in the distance domain and maps them into the gradient domain to determine an optimal threshold for each gait sample, in contrast to the fixed threshold used by most authentication models. To assess the quality of each gait sample, a gait covariate invariant generative adversarial network (GCI-GAN) is proposed to generate normal gait (canonical condition) irrespective of covariates (carrying and viewing conditions) while preserving subject identity. In particular, GCI-GAN uses gradient-weighted class activation mapping (Grad-CAM) to obtain an attention mask from the significant components of the input features, applies a blending operation to manipulate specific regions of the input, and employs multiple losses to constrain the quality of the generated samples. We validate the approach on the CASIA-B and OU-ISIR gait datasets and show a substantial increase in authentication rate over other state-of-the-art techniques.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"54 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135167098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective training to improve DeepPilot","authors":"L. Oyuki Rojas-Perez, Jose Martinez-Carranza","doi":"10.3233/aic-230065","DOIUrl":"https://doi.org/10.3233/aic-230065","url":null,"abstract":"We present an approach to autonomous drone racing inspired by how a human pilot learns a race track. Human pilots fly around the track multiple times to familiarise themselves with it and find key points that allow them to complete the track without the risk of collision. This paper proposes a three-stage approach: exploration, navigation, and refinement. Our approach does not require prior knowledge about the race track, such as the number of gates, their positions, and their orientations. Instead, we use a trained neural pilot called DeepPilot, which returns basic flight commands from camera images in which a gate is visible, to navigate an unknown race track, and a Single Shot Detector to visually detect the gates during the exploration stage and identify points of interest. These points are then used in the navigation stage as waypoints in a flight controller to enable faster flight and navigation of the entire race track. Finally, in the refinement stage, we use the methodology developed in stages 1 and 2 to generate novel data to re-train DeepPilot, which produces more realistic manoeuvres when the drone has to cross a gate. In this sense, similar to the original work, rather than generating examples by flying a full track, we use small tracks of three gates to discover effective waypoints to be followed by the waypoint controller. This produces novel training data for DeepPilot without human intervention. By training with this new data, DeepPilot significantly improves its performance, doubling its flight speed with respect to the original version. Moreover, stage 3 required 66% less training data than the original DeepPilot without compromising DeepPilot's effectiveness in enabling a drone to fly autonomously around a racetrack.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"33 1-2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135266791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lifetime policy reuse and the importance of task capacity","authors":"David M. Bossens, Adam J. Sobey","doi":"10.3233/aic-230040","DOIUrl":"https://doi.org/10.3233/aic-230040","url":null,"abstract":"A long-standing challenge in artificial intelligence is lifelong reinforcement learning, where learners are given many tasks in sequence and must transfer knowledge between tasks while avoiding catastrophic forgetting. Policy reuse and other multi-policy reinforcement learning techniques can learn multiple tasks but may generate many policies. This paper presents two novel contributions, namely 1) Lifetime Policy Reuse, a model-agnostic policy reuse algorithm that avoids generating many policies by optimising a fixed number of near-optimal policies through a combination of policy optimisation and adaptive policy selection; and 2) the task capacity, a measure of the maximal number of tasks that a policy can accurately solve. Across two state-of-the-art base-learners, the results demonstrate the importance of Lifetime Policy Reuse and task-capacity-based pre-selection on an 18-task partially observable Pacman domain and a Cartpole domain of up to 125 tasks.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"72 5-6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135220355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DW: Detected weight for 3D object detection","authors":"Zhi Huang","doi":"10.3233/aic-230008","DOIUrl":"https://doi.org/10.3233/aic-230008","url":null,"abstract":"It is a generic paradigm to treat all samples equally in 3D object detection. Although some works focus on discriminating samples during the training of object detectors, the question of whether a sample detects its target GT (Ground Truth) during training has never been studied. In this work, we first point out that discriminating between samples that detect their target GT and samples that do not is beneficial to improving performance measured in terms of mAP (mean Average Precision). We then propose a novel approach named DW (Detected Weight). The proposed approach dynamically calculates and assigns different weights to detected and undetected samples, suppressing the former and promoting the latter. The approach is simple, computationally cheap, and can be integrated with existing weighting approaches. Further, it can be applied to almost any 3D detector, and even to 2D detectors, because it is independent of network structure. We evaluate the proposed approach with six state-of-the-art 3D detectors on two datasets. The experimental results show that the proposed approach improves mAP significantly.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135805685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-scale spatio-temporal network for skeleton-based gait recognition","authors":"Dongzhi He, Yongle Xue, Yunyu Li, Zhijie Sun, Xingmei Xiao, Jin Wang","doi":"10.3233/aic-230033","DOIUrl":"https://doi.org/10.3233/aic-230033","url":null,"abstract":"Gait has unique physiological characteristics and supports long-distance recognition, making gait recognition ideal for areas such as home security and identity detection. Methods using graph convolutional networks usually extract features in the spatial and temporal dimensions by stacking GCNs and TCNs, but different joints are interconnected at different moments, so splitting the spatial and temporal dimensions can cause the loss of gait information. Focusing on this problem, we propose a gait recognition network, Multi-scale Spatio-Temporal Gait (MST-Gait), which can learn multi-scale gait information simultaneously from the spatial and temporal dimensions. We design a multi-scale spatio-temporal groups Transformer (MSTGT) to model the correlation of intra-frame and inter-frame joints simultaneously, and a multi-scale segmentation strategy to capture the periodic and local features of the gait. To fully exploit the temporal information of gait motion, we design a fusion temporal convolution (FTC) to aggregate temporal information and motion information at different scales. Experiments on the popular CASIA-B gait dataset and the OUMVLP-Pose dataset show that our method outperforms most existing skeleton-based methods, verifying the effectiveness of the proposed modules.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135804908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual cross-domain session-based recommendation with multi-channel integration","authors":"Jinjin Zhang, Xiang Hua, Peng Zhao, Kai Kang","doi":"10.3233/aic-230084","DOIUrl":"https://doi.org/10.3233/aic-230084","url":null,"abstract":"Session-based recommendation aims at predicting the next behavior given the current interaction sequence. Recent advances demonstrate the effectiveness of dual cross-domain information for session-based recommendation. However, we find that accurately modeling session representations remains challenging due to the complexity of preference interactions across domains: existing methods model only the common features of the cross-domain while ignoring the specific and enhanced features of the dual cross-domain. Without modeling the complete features, these methods suffer from poor recommendation accuracy. Therefore, we propose an end-to-end dual cross-domain multi-channel interaction model (DCMI), which utilizes dual cross-domain session information and multiple preference interaction encoders for session-based recommendation. In DCMI, we apply a graph neural network to generate the session's global preference and local preference. Then, we design a cross-preference interaction module to capture the common, specific, and enhanced features of cross-domain sessions from the local and global preferences. Finally, we combine the multiple preferences with a bilinear fusion mechanism to characterize sessions and make recommendations. Experimental results on the Amazon dataset demonstrate the superiority of the DCMI model over state-of-the-art methods.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135805221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conflagration-YOLO: a lightweight object detection architecture for conflagration","authors":"Ning Sun, Pengfei Shen, Xiaoling Ye, Yifei Chen, Xiping Cheng, Pingping Wang, Jie Min","doi":"10.3233/aic-230094","DOIUrl":"https://doi.org/10.3233/aic-230094","url":null,"abstract":"Fire monitoring of fire-prone areas is essential, and to meet the requirements of edge deployment while balancing fire recognition accuracy and speed, we design a lightweight fire recognition network, Conflagration-YOLO. Conflagration-YOLO is constructed from depthwise separable convolutions and pays greater attention to extracting fire feature information from a three-dimensional (3D) perspective, which improves the network's feature extraction capability, achieves a balance of accuracy and speed, and reduces model parameters. In addition, a new activation function is used to improve the accuracy of fire recognition while minimizing the inference time of the network. All models are trained and validated on a custom fire dataset, and fire inference is performed on the CPU. The mean Average Precision (mAP) of the proposed model reaches 80.92%, a large advantage over Faster R-CNN. Compared with YOLOv3-Tiny, the proposed model decreases the number of parameters by 5.71 M and improves mAP by 6.67%. Compared with YOLOv4-Tiny, the number of parameters decreases by 3.54 M, mAP increases by 8.47%, and inference time decreases by 62.59 ms. Compared with YOLOv5s, the number of parameters is reduced by nearly half (4.45 M) and the inference time is reduced by 41.87 ms. Compared with YOLOX-Tiny, the number of parameters decreases by 2.5 M, mAP increases by 0.7%, and inference time decreases by 102.49 ms. Compared with YOLOv7, the number of parameters decreases significantly while the balance of accuracy and speed is maintained. Compared with YOLOv7-Tiny, the number of parameters decreases by 3.64 M, mAP increases by 0.5%, and inference time decreases by 15.65 ms. The experiments verify the superiority and effectiveness of Conflagration-YOLO compared with state-of-the-art (SOTA) network models. Furthermore, our proposed model and its dimensional variants can be applied to downstream computer vision object detection tasks in other scenarios as required.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135805222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transferring experiences in k-nearest neighbors based multiagent reinforcement learning: an application to traffic signal control","authors":"Ana Lucia C. Bazzan, Vicente N. de Almeida, Monireh Abdoos","doi":"10.3233/aic-220305","DOIUrl":"https://doi.org/10.3233/aic-220305","url":null,"abstract":"The increasing demand for mobility in our society poses various challenges to traffic engineering, computer science in general, and artificial intelligence in particular. Increasing the capacity of road networks is not always possible, so a more efficient use of the available transportation infrastructure is required. Another issue is that many problems in traffic management and control are inherently decentralized and/or require adaptation to the traffic situation. Hence, there is a close relationship to multiagent reinforcement learning. However, reinforcement learning poses the challenge that the state space is normally large and continuous, so appropriate schemes for discretizing the state space are needed. To address these issues, a multiagent system was previously proposed in which agents learn independently via an algorithm that estimates Q-values from the k-nearest neighbors. In the present paper, we extend this approach to include the transfer of experiences among the agents, especially when an agent does not have a good set of k experiences. We address traffic signal control, running experiments on a traffic network in which the traffic situation varies over time, and compare our approach to two baselines (one involving reinforcement learning and one based on fixed times). Our results show that the extended method pays off when an agent returns to an already experienced traffic situation.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135586495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifying falls using out-of-distribution detection in human activity recognition","authors":"Debaditya Roy, Vangjush Komini, Sarunas Girdzijauskas","doi":"10.3233/aic-220205","DOIUrl":"https://doi.org/10.3233/aic-220205","url":null,"abstract":"As the research community focuses on improving the reliability of deep learning, identifying out-of-distribution (OOD) data has become crucial. Detecting OOD inputs during test/prediction allows the model to account for discriminative features unknown to it. This capability increases the model's reliability, since the model then provides class predictions only for incoming data similar to the training data. Although OOD detection is well established in computer vision, it is relatively unexplored in other areas, such as time-series-based human activity recognition (HAR). Since uncertainty has been a critical driver for OOD detection in vision-based models, the same component has proven effective in time-series applications. In this work, we propose an ensemble-based temporal learning framework to address the OOD detection problem in HAR with time-series data. First, we define different types of OOD for HAR that arise from realistic scenarios. Then we apply our ensemble-based temporal learning framework, incorporating uncertainty, to detect OODs for the defined HAR workloads. This formulation also allows a novel approach to fall detection: we train our model on non-fall activities and detect falls as OOD. Our method shows state-of-the-art performance in a fall detection task using far less data. Furthermore, the ensemble framework outperformed the traditional deep-learning method (our baseline) on the OOD detection task across all the other chosen datasets.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49090518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TMTrans: texture mixed transformers for medical image segmentation","authors":"Lifang Chen, Tao Wang, Hongze Ge","doi":"10.3233/aic-230089","DOIUrl":"https://doi.org/10.3233/aic-230089","url":null,"abstract":"Accurate segmentation of skin cancer is crucial for doctors to identify and treat lesions. Researchers are increasingly using auxiliary modules with Transformers to optimize a model's ability to process global context information and reduce detail loss. Additionally, diseased skin texture differs from normal skin, and pre-processed texture images can reflect the shape and edge information of the diseased area. We propose TMTrans (Texture Mixed Transformers). We have innovatively designed a dual-axis attention mechanism (IEDA-Trans) that considers both global context and local information, as well as a multi-scale fusion (MSF) module that associates surface shape information with deep semantics. Additionally, we utilize TE (Texture Enhance) and SK (Skip connection) modules to bridge the semantic gap between encoders and decoders and enhance texture features. Our model was evaluated on multiple skin datasets, including ISIC 2016/2017/2018 and PH2, and outperformed other convolution- and Transformer-based models. Furthermore, we conducted a generalization test on the 2018 DSB dataset, which resulted in a nearly 2% improvement in the Dice index, demonstrating the effectiveness of our proposed model.","PeriodicalId":50835,"journal":{"name":"AI Communications","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44943605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}