{"title":"Deep Learning to Interpret Autism Spectrum Disorder Behind the Camera","authors":"Shi Chen;Ming Jiang;Qi Zhao","doi":"10.1109/TCDS.2024.3386656","DOIUrl":"10.1109/TCDS.2024.3386656","url":null,"abstract":"There is growing interest in understanding the visual behavioral patterns of individuals with autism spectrum disorder (ASD) based on their attentional preferences. Attention reveals the cognitive or perceptual variation in ASD and can serve as a biomarker to assist diagnosis and intervention. The development of machine learning methods for attention-based ASD screening shows promise, yet it has been limited by the need for high-precision eye trackers, the scope of stimuli, and black-box neural networks, making it impractical for real-life clinical scenarios. This study proposes an interpretable and generalizable framework for quantifying atypical attention in people with ASD. Our framework utilizes photos taken by participants with standard cameras to enable practical and flexible deployment in resource-constrained regions. With an emphasis on interpretability and trustworthiness, our method automates human-like diagnostic reasoning, associates photos with semantically plausible attention patterns, and provides clinical evidence to support ASD experts. We further evaluate models on both in-domain and out-of-domain data and demonstrate that our approach accurately classifies individuals with ASD and generalizes across different domains. The proposed method offers an innovative, reliable, and cost-effective tool to assist the diagnostic procedure, which can be an important step toward transforming clinical research in ASD screening with artificial intelligence systems. 
Our code is publicly available at <uri>https://github.com/szzexpoi/proto_asd</uri>.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 5","pages":"1803-1813"},"PeriodicalIF":5.0,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140593962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
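The abstract above describes associating photos with "semantically plausible attention patterns" in an interpretable way. A generic way to make such a match interpretable is prototype matching: score an image embedding against a small set of class prototype vectors and report which prototype fired. This is only a minimal illustrative sketch of that general idea, not the authors' actual proto_asd model; the embedding size, prototype count, and all names here are hypothetical.

```python
import numpy as np

def prototype_scores(embedding, prototypes):
    """Cosine similarity between one image embedding and per-class
    prototype vectors; the best-matching prototype can then be shown
    to a clinician as the evidence for the prediction."""
    e = embedding / np.linalg.norm(embedding)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return p @ e  # one score per prototype, in [-1, 1]

rng = np.random.default_rng(5)
protos = rng.normal(size=(6, 128))   # hypothetical attention-pattern prototypes
img = rng.normal(size=128)           # hypothetical photo embedding
scores = prototype_scores(img, protos)
print(scores.shape)  # (6,)
```

Because the scores are per-prototype similarities rather than opaque logits, each decision can be traced back to the prototype that drove it.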
{"title":"Efficient Semisupervised Object Segmentation for Long-Term Videos Using Adaptive Memory Network","authors":"Shan Zhong;Guoqiang Li;Wenhao Ying;Fuzhou Zhao;Gengsheng Xie;Shengrong Gong","doi":"10.1109/TCDS.2024.3385849","DOIUrl":"10.1109/TCDS.2024.3385849","url":null,"abstract":"Video object segmentation (VOS) uses the first annotated video mask to achieve consistent and precise segmentation in subsequent frames. Recently, memory-based methods have received significant attention owing to their substantial performance enhancements. However, these approaches rely on a fixed global memory strategy, which limits segmentation accuracy and speed on longer videos. To alleviate this limitation, we propose a novel semisupervised VOS model built on an adaptive memory network. Our proposed model adaptively extracts object features by focusing on the object area while effectively filtering out extraneous background noise. An identification mechanism is also applied to distinguish individual objects in multiobject scenarios. To further reduce storage consumption without compromising the saliency of object information, outdated features in the memory pool are compressed into salient features using a self-attention mechanism. Furthermore, we introduce a local matching module devised to refine object features by fusing contextual information from historical frames. 
Experiments demonstrate the efficiency of our approach, which substantially improves both the speed and precision of segmentation for long-term videos while maintaining comparable performance on short videos.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 5","pages":"1789-1802"},"PeriodicalIF":5.0,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140593831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
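The record above describes compressing outdated memory features into a few salient ones with self-attention so the memory pool does not grow unboundedly on long videos. The sketch below illustrates that general mechanism under stated assumptions: the query seeds (here, evenly spaced old frames standing in for learned queries), feature sizes, and function names are all hypothetical, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def compress_memory(memory, num_keep, num_salient):
    """Keep the newest `num_keep` per-frame features untouched and
    summarize the older ones into `num_salient` vectors by attending
    over them. memory: (T, C), oldest frame first."""
    old, recent = memory[:-num_keep], memory[-num_keep:]
    # Hypothetical query seeds: evenly spaced rows of the old memory
    # (a real model would learn these queries).
    idx = np.linspace(0, len(old) - 1, num_salient).astype(int)
    queries = old[idx]                                 # (S, C)
    scores = queries @ old.T / np.sqrt(old.shape[1])   # (S, T_old)
    salient = softmax(scores, axis=-1) @ old           # (S, C)
    return np.concatenate([salient, recent], axis=0)

mem = np.random.default_rng(0).normal(size=(100, 64))  # 100 frames of features
compact = compress_memory(mem, num_keep=10, num_salient=4)
print(compact.shape)  # (14, 64): 4 salient summaries + 10 recent frames
```

The memory size after compression is fixed regardless of video length, which is the property that makes long-term segmentation tractable.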
{"title":"GENet: A Generic Neural Network for Detecting Various Neurological Disorders From EEG","authors":"Md. Nurul Ahad Tawhid;Siuly Siuly;Kate Wang;Hua Wang","doi":"10.1109/TCDS.2024.3386364","DOIUrl":"10.1109/TCDS.2024.3386364","url":null,"abstract":"The global health burden of neurological disorders (NDs) is vast, and they are recognized as major causes of mortality and disability worldwide. Most existing NDs detection methods are disease-specific, which limits an algorithm's cross-disease applicability. A single diagnostic platform can save time and money over multiple diagnostic systems. There is currently no unified standard platform for diagnosing different types of NDs utilizing electroencephalogram (EEG) signal data. To address this issue, this study aims to develop a generic EEG neural Network (GENet) framework based on a convolutional neural network that can identify various NDs from EEG. The proposed framework consists of several parts: 1) preparing data using channel reduction, resampling, and segmentation for the GENet model; 2) designing and training the GENet model to extract important features for the classification task; and 3) assessing the proposed model's performance using different signal segment lengths and several training batch sizes, and cross-validating using seven different EEG datasets of six distinct NDs, namely, schizophrenia, autism spectrum disorder, epilepsy, Parkinson's disease, mild cognitive impairment, and attention-deficit/hyperactivity disorder. In addition, this study also investigates whether the proposed GENet model can identify multiple NDs from EEG. The proposed model achieved much better performance for both binary and multiclass classification compared to state-of-the-art methods. In addition, the proposed model is validated using several ablation studies and layerwise feature visualization, which confirm the consistency and efficiency of the proposed model. 
The proposed GENet model will help technologists create standard software for detecting any of these NDs from EEG.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 5","pages":"1829-1842"},"PeriodicalIF":5.0,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140593957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
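The GENet abstract lists a three-step data-preparation pipeline: channel reduction, resampling, and segmentation. A minimal numpy sketch of those three steps is shown below; the choice of interpolation-based resampling, non-overlapping windows, and every concrete number here are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def prepare_eeg(signal, keep_channels, target_len, segment_len):
    """signal: (channels, samples) raw EEG.
    1) channel reduction, 2) resampling, 3) fixed-length segmentation."""
    x = signal[list(keep_channels)]                  # 1) keep a channel subset
    t_old = np.linspace(0.0, 1.0, x.shape[1])
    t_new = np.linspace(0.0, 1.0, target_len)
    # 2) resample each channel by linear interpolation (an assumption;
    # band-limited resampling would also be a reasonable choice).
    x = np.stack([np.interp(t_new, t_old, ch) for ch in x])
    n_seg = x.shape[1] // segment_len                # 3) non-overlapping windows
    segs = x[:, :n_seg * segment_len].reshape(x.shape[0], n_seg, segment_len)
    return segs.transpose(1, 0, 2)                   # (segments, channels, len)

eeg = np.random.default_rng(1).normal(size=(19, 2560))  # 19-channel toy recording
segments = prepare_eeg(eeg, keep_channels=range(8), target_len=2000, segment_len=500)
print(segments.shape)  # (4, 8, 500)
```

Each output segment is then a fixed-size array suitable as input to a CNN classifier, which is what makes one network reusable across datasets with different recording lengths.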
{"title":"IEEE Transactions on Cognitive and Developmental Systems Publication Information","authors":"","doi":"10.1109/TCDS.2024.3373151","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3373151","url":null,"abstract":"","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 2","pages":"C2-C2"},"PeriodicalIF":5.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10491580","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140348382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editorial Special Issue on Movement Sciences in Cognitive Systems","authors":"Junpei Zhong;Ran Dong;Soichiro Ikuno;Yanan Li;Chenguang Yang","doi":"10.1109/TCDS.2024.3372274","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3372274","url":null,"abstract":"Movements play a critical role in robotic systems, with considerations that vary across systems regarding factors such as accuracy, speed, energy consumption, and the naturalness of movements of various parts of the robot mechanics. Over the past decades, the robotics community has developed computationally efficient mathematical tools for studying, simulating, and optimizing movements of articulated bodies to address these challenges.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 2","pages":"403-406"},"PeriodicalIF":5.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10491578","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140348383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Cognitive and Developmental Systems Information for Authors","authors":"","doi":"10.1109/TCDS.2024.3373155","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3373155","url":null,"abstract":"","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 2","pages":"C4-C4"},"PeriodicalIF":5.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10491285","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140348385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention Mechanism and Out-of-Distribution Data on Cross Language Image Matching for Weakly Supervised Semantic Segmentation","authors":"Chi-Chia Sun;Jing-Ming Guo;Chen-Hung Chung;Bo-Yu Chen","doi":"10.1109/TCDS.2024.3382914","DOIUrl":"10.1109/TCDS.2024.3382914","url":null,"abstract":"Fully supervised semantic segmentation requires detailed annotation of each pixel, which is time-consuming and laborious. To alleviate this burden, this article performs the semantic segmentation task using only image-level category annotations. Existing methods using image-level annotation usually use class activation maps (CAMs) to locate the target objects as a first step: by training a classifier, the presence of objects in the image can be identified effectively. However, CAMs exhibit two problems: 1) they focus excessively on specific regions of objects, capturing only the most prominent and critical areas; and 2) they easily misinterpret frequently occurring background regions, confusing foreground with background. This article introduces cross-language image matching based on out-of-distribution data and a convolutional block attention module (CLODA), adopting a dual-branch design within the cross-language image matching framework. A convolutional attention module is added to the attention branch to counteract the excessive focus on objects in the CAMs, out-of-distribution data imported on the out-of-distribution branch helps the classification network correct misinterpreted regions of focus, and cross pseudosupervision between the two branches optimizes the regions of interest learned by the attention branch. Experimental results show that the pseudomasks generated by the proposed network achieve 75.3% mean intersection over union (mIoU) on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) 2012 training set. 
The segmentation network trained with these pseudomasks reaches 72.3% and 72.1% mIoU on the validation and test sets of PASCAL VOC 2012, respectively.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 4","pages":"1604-1610"},"PeriodicalIF":5.0,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140593955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
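The weakly supervised pipeline above starts from class activation maps. The standard CAM computation (Zhou et al., 2016) is simple enough to sketch: weight the final convolutional feature maps by the classifier weights of the target class. The feature and class dimensions below are arbitrary toy values.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM: project conv features onto one class's classifier
    weights. features: (C, H, W) activations from the last conv layer;
    fc_weights: (num_classes, C) from the global-average-pool classifier."""
    w = fc_weights[class_idx]                 # (C,)
    cam = np.tensordot(w, features, axes=1)   # weighted sum over channels -> (H, W)
    cam = np.maximum(cam, 0)                  # keep only positive class evidence
    if cam.max() > 0:
        cam /= cam.max()                      # normalize to [0, 1] for thresholding
    return cam

feats = np.random.default_rng(2).random((512, 7, 7))    # toy conv activations
fc = np.random.default_rng(3).normal(size=(20, 512))    # toy 20-class classifier
cam = class_activation_map(feats, fc, class_idx=3)
print(cam.shape)  # (7, 7)
```

Thresholding such a map yields the coarse object localization that the article's two-branch design then refines; the "excess focus" problem the abstract describes is visible here as a few dominant peaks in the normalized map.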
{"title":"DatUS: Data-Driven Unsupervised Semantic Segmentation With Pretrained Self-Supervised Vision Transformer","authors":"Sonal Kumar;Arijit Sur;Rashmi Dutta Baruah","doi":"10.1109/TCDS.2024.3383952","DOIUrl":"10.1109/TCDS.2024.3383952","url":null,"abstract":"Self-supervised training schemes (STSs) continue to emerge in quick succession, each taking a step closer to a universal foundation model. In this process, unsupervised downstream tasks are recognized as one of the evaluation methods to validate the quality of visual features learned with self-supervised training. However, unsupervised dense semantic segmentation has yet to be explored as a downstream task, even though it can utilize and evaluate the quality of semantic information introduced in patch-level feature representations during self-supervised training of vision transformers. Therefore, we propose a novel data-driven framework, DatUS, to perform unsupervised dense semantic segmentation (DSS) as a downstream task. DatUS generates semantically consistent pseudosegmentation masks for an unlabeled image dataset without using visual priors or synchronized data. The experiment shows that the proposed framework achieves the highest MIoU (24.90) and average F1 score (36.3) by choosing DINOv2, and the highest pixel accuracy (62.18) by choosing DINO, as the STS on the training set of the SUIM dataset. It also outperforms state-of-the-art methods for the unsupervised DSS task with 15.02% MIoU, 21.47% pixel accuracy, and 16.06% average F1 score on the validation set of the SUIM dataset. 
It also achieves a competitive level of accuracy on the large-scale COCO dataset.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 5","pages":"1775-1788"},"PeriodicalIF":5.0,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140593839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
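DatUS builds pseudosegmentation masks from the patch-level features of a self-supervised ViT. A bare-bones stand-in for that idea is to cluster the patch embeddings and reshape the cluster labels onto the patch grid; the actual framework is considerably more elaborate, and the grid size, feature width, and cluster count below are toy assumptions.

```python
import numpy as np

def patch_pseudomask(patch_feats, grid, k, iters=10, seed=0):
    """Cluster ViT patch embeddings into k pseudo-classes with plain
    k-means, then reshape labels to the patch grid as a coarse mask.
    patch_feats: (N, D), N = grid[0] * grid[1]."""
    rng = np.random.default_rng(seed)
    centers = patch_feats[rng.choice(len(patch_feats), k, replace=False)]
    for _ in range(iters):
        # assign every patch to its nearest center
        d = ((patch_feats[:, None] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):  # recompute centers from assignments
            if (labels == j).any():
                centers[j] = patch_feats[labels == j].mean(0)
    return labels.reshape(grid)

# Toy stand-in for DINO-style patch features of a 14x14 patch grid.
feats = np.random.default_rng(4).normal(size=(14 * 14, 384))
mask = patch_pseudomask(feats, grid=(14, 14), k=5)
print(mask.shape)  # (14, 14)
```

With real self-supervised features, patches of the same object cluster together, so the coarse mask can be upsampled and refined into the pseudolabels used to train a segmentation network.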
{"title":"Deep-Reinforcement-Learning-Based Driving Policy at Intersections Utilizing Lane Graph Networks","authors":"Yuqi Liu;Qichao Zhang;Yinfeng Gao;Dongbin Zhao","doi":"10.1109/TCDS.2024.3384269","DOIUrl":"10.1109/TCDS.2024.3384269","url":null,"abstract":"Learning an efficient and safe driving strategy in a traffic-heavy intersection scenario and generalizing it to different intersections remains a challenging task for autonomous driving. This is because there are differences in the structure of roads at different intersections, and autonomous vehicles need to generalize the strategies they have learned in the training environments. This requires the autonomous vehicle to capture not only the interactions between agents but also the relationships between agents and the map effectively. To address this challenge, we present a technique that integrates the information of high-definition (HD) maps and traffic participants into vector representations, called lane graph vectorization (LGV). In order to construct a driving policy for intersection navigation, we incorporate LGV into the twin-delayed deep deterministic policy gradient (TD3) algorithm with prioritized experience replay (PER). To train and validate the proposed algorithm, we construct a gym environment for intersection navigation within the high-fidelity CARLA simulator, integrating dense interactive traffic flow and various generalization test intersection scenarios. 
Experimental results demonstrate the effectiveness of LGV for intersection navigation tasks, outperforming the state of the art in our proposed scenarios.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"16 5","pages":"1759-1774"},"PeriodicalIF":5.0,"publicationDate":"2024-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140594174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
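Lane graph vectorization (LGV) as described above turns HD-map and traffic-participant information into vector representations. A common way to vectorize a lane centerline (in the spirit of VectorNet-style map encoding) is to split the polyline into segments and encode each as a start point, end point, and an identifier; the exact feature layout used by the paper is not specified here, so the five-element layout below is an assumption.

```python
import numpy as np

def vectorize_polyline(points, lane_id):
    """Turn a lane centerline polyline into per-segment vectors
    [x_start, y_start, x_end, y_end, lane_id], ready to be consumed
    by a graph/attention encoder over map elements."""
    p = np.asarray(points, dtype=float)      # (N, 2) waypoints
    starts, ends = p[:-1], p[1:]             # consecutive point pairs
    ids = np.full((len(starts), 1), lane_id, dtype=float)
    return np.hstack([starts, ends, ids])    # (N - 1, 5)

lane = [(0, 0), (5, 0), (10, 2), (15, 5)]    # toy centerline waypoints
vecs = vectorize_polyline(lane, lane_id=7)
print(vecs.shape)  # (3, 5)
```

Because every map element becomes a fixed-width set of vectors, intersections with different road structures map to the same input format, which is what lets a policy trained this way generalize across intersections.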