{"title":"IEEE Transactions on Cognitive and Developmental Systems Information for Authors","authors":"","doi":"10.1109/TCDS.2024.3352775","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3352775","url":null,"abstract":"","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10419135","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139676399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Electroencephalography-Based Brain–Computer Interface for Emotion Regulation With Virtual Reality Neurofeedback","authors":"Kendi Li;Weichen Huang;Wei Gao;Zijing Guan;Qiyun Huang;Jin-Gang Yu;Zhu Liang Yu;Yuanqing Li","doi":"10.1109/TCDS.2024.3357547","DOIUrl":"10.1109/TCDS.2024.3357547","url":null,"abstract":"An increasing number of people fail to properly regulate their emotions for various reasons. Although brain–computer interfaces (BCIs) have shown potential in neural regulation, few effective BCI systems have been developed to assist users in emotion regulation. In this article, we propose an electroencephalography (EEG)-based BCI for emotion regulation with virtual reality (VR) neurofeedback. Specifically, music clips with positive, neutral, and negative emotions were first presented, based on which the participants were asked to regulate their emotions. The BCI system simultaneously collected the participants’ EEG signals and then assessed their emotions. Furthermore, based on the emotion recognition results, the neurofeedback was provided to participants in the form of a facial expression of a virtual pop star on a three-dimensional (3-D) virtual stage. Eighteen healthy participants achieved satisfactory performance with an average accuracy of 81.1% with neurofeedback. Additionally, the average accuracy increased significantly from 65.4% at the start to 87.6% at the end of a regulation trial (a trial corresponded to a music clip). In comparison, these participants could not significantly improve the accuracy within a regulation trial without neurofeedback. The results demonstrated the effectiveness of our system and showed that VR neurofeedback played a key role during emotion regulation.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depression Detection Using an Automatic Sleep Staging Method With an Interpretable Channel-Temporal Attention Mechanism","authors":"Jiahui Pan;Jie Liu;Jianhao Zhang;Xueli Li;Dongming Quan;Yuanqing Li","doi":"10.1109/TCDS.2024.3358022","DOIUrl":"10.1109/TCDS.2024.3358022","url":null,"abstract":"Despite previous efforts in depression detection studies, there is a scarcity of research on automatic depression detection using sleep structure, and several challenges remain: 1) how to apply sleep staging to detect depression and distinguish easily misjudged classes; and 2) how to adaptively capture attentive channel-dimensional information to enhance the interpretability of sleep staging methods. To address these challenges, an automatic sleep staging method based on a channel-temporal attention mechanism and a depression detection method based on sleep structure features are proposed. In sleep staging, a temporal attention mechanism is adopted to update the feature matrix, confidence scores are estimated for each sleep stage, the weight of each channel is adjusted based on these scores, and the final results are obtained through a temporal convolutional network. In depression detection, seven sleep structure features based on the results of sleep staging are extracted for depression detection between unipolar depressive disorder (UDD) patients, bipolar disorder (BD) patients, and healthy subjects. Experiments demonstrate the effectiveness of the proposed approaches, and the visualization of the channel attention mechanism illustrates the interpretability of our method. Additionally, this is the first attempt to employ sleep structure features to automatically detect UDD and BD in patients.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Husformer: A Multimodal Transformer for Multimodal Human State Recognition","authors":"Ruiqi Wang;Wonse Jo;Dezhong Zhao;Weizheng Wang;Arjun Gupte;Baijian Yang;Guohua Chen;Byung-Cheol Min","doi":"10.1109/TCDS.2024.3357618","DOIUrl":"10.1109/TCDS.2024.3357618","url":null,"abstract":"Human state recognition is a critical topic with pervasive and important applications in human–machine systems. Multimodal fusion, which entails integrating metrics from various data sources, has proven to be a potent method for boosting recognition performance. Although recent multimodal-based models have shown promising results, they often fall short in fully leveraging sophisticated fusion strategies essential for modeling adequate cross-modal dependencies in the fusion representation. Instead, they rely on costly and inconsistent feature crafting and alignment. To address this limitation, we propose an end-to-end multimodal transformer framework for multimodal human state recognition called Husformer. Specifically, we propose using cross-modal transformers, which inspire one modality to reinforce itself through directly attending to latent relevance revealed in other modalities, to fuse different modalities while ensuring sufficient awareness of the cross-modal interactions introduced. Subsequently, we utilize a self-attention transformer to further prioritize contextual information in the fusion representation. Extensive experiments on two human emotion corpora (DEAP and WESAD) and two cognitive load datasets [multimodal dataset for objective cognitive workload assessment on simultaneous tasks (MOCAS) and CogLoad] demonstrate that in the recognition of the human state, our Husformer outperforms both state-of-the-art multimodal baselines and the use of a single modality by a large margin, especially when dealing with raw multimodal features. We also conducted an ablation study to show the benefits of each component in Husformer. Experimental details and source code are available at https://github.com/SMARTlab-Purdue/Husformer.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PLOT: Human-Like Push-Grasping Synergy Learning in Clutter With One-Shot Target Recognition","authors":"Xiaoge Cao;Tao Lu;Liming Zheng;Yinghao Cai;Shuo Wang","doi":"10.1109/TCDS.2024.3357084","DOIUrl":"10.1109/TCDS.2024.3357084","url":null,"abstract":"In unstructured environments, robotic grasping tasks are frequently required to interactively search for and retrieve specific objects from a cluttered workspace under the condition that only partial information about the target is available, like images, text descriptions, 3-D models, etc. It is a great challenge to correctly recognize the targets with limited information and learn synergies between different action primitives to grasp the targets from densely occluding objects efficiently. In this article, we propose a novel human-like push-grasping method that can grasp unknown objects in clutter using only one target RGB with depth (RGB-D) image, called push-grasping synergy learning in clutter with one-shot target recognition (PLOT). First, we propose a target recognition (TR) method that automatically segments the objects from both the query image and the workspace image and extracts robust features of each segmented object. Through the designed feature matching criterion, the targets can be quickly located in the workspace. Second, we introduce a self-supervised target-oriented grasping system based on synergies between push and grasp actions. In this system, we propose a salient Q (SQ)-learning framework that focuses the Q-value learning on the area including targets and a coordination mechanism (CM) that selects the proper actions to search for and isolate the targets from the surrounding objects, even when the targets are invisible. Our method is inspired by the working memory mechanism of the human brain and can grasp any target object shown in an image, which gives it good generality in application. Experimental results in simulation and the real world show that our method outperformed the baselines in finding unknown target objects in cluttered environments with only one demonstrated target RGB-D image and grasped efficiently through the synergy of push and grasp actions.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kernel-Ridge-Regression-Based Randomized Network for Brain Age Classification and Estimation","authors":"Raveendra Pilli;Tripti Goel;R. Murugan;M. Tanveer;P. N. Suganthan","doi":"10.1109/TCDS.2024.3349593","DOIUrl":"10.1109/TCDS.2024.3349593","url":null,"abstract":"Accelerated brain aging and abnormalities are associated with variations in brain patterns. Effective and reliable assessment methods are required to accurately classify and estimate brain age. In this study, a brain age classification and estimation framework is proposed using structural magnetic resonance imaging (sMRI) scans, a 3-D convolutional neural network (3-D-CNN), and a kernel ridge regression-based random vector functional link (KRR-RVFL) network. We used 480 brain MRI images from the publicly available IXI database and segmented them into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) images to show age-related associations by region. Features from MRI images are extracted using 3-D-CNN and fed into the wavelet KRR-RVFL network for brain age classification and prediction. The proposed algorithm achieved high classification accuracies of 97.22%, 99.31%, and 95.83% for GM, WM, and CSF regions, respectively. Moreover, the proposed algorithm demonstrated excellent prediction accuracy with a mean absolute error (MAE) of 3.89 years, 3.64 years, and 4.49 years for GM, WM, and CSF regions, respectively, confirming that changes in WM volume are significantly associated with normal brain aging. Additionally, voxel-based morphometry (VBM) examines age-related anatomical alterations in different brain regions in GM, WM, and CSF tissue volumes.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10405861","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effect of Expressive Robot Behavior on Users’ Mental Effort: A Pupillometry Study","authors":"Marieke van Otterdijk;Bruno Laeng;Diana Saplacan Lindblom;Jim Torresen","doi":"10.1109/TCDS.2024.3352893","DOIUrl":"10.1109/TCDS.2024.3352893","url":null,"abstract":"Robots are becoming part of our social landscape. Social interaction with humans must be efficient and intuitive to understand because nonverbal cues make social interactions between humans and robots more efficient. This study measures mental effort to investigate what factors influence the intuitive understanding of expressive nonverbal robot motions. Fifty participants watched eighteen short video clips of three different robot types performing expressive behaviors while their pupil response and gaze were measured with an eye tracker. Our findings indicate that the appearance of the robot, the viewing angle, and the expression shown by the robot all influence the cognitive load, and therefore, they may influence the intuitive understanding of expressive robot behavior. Furthermore, we found differences in the fixation time for different features of the different robots. With these insights, we identified possible improvement directions for making interactions between humans and robots more efficient and intuitive.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TR-TransGAN: Temporal Recurrent Transformer Generative Adversarial Network for Longitudinal MRI Dataset Expansion","authors":"Chen-Chen Fan;Hongjun Yang;Liang Peng;Xiao-Hu Zhou;Shiqi Liu;Sheng Chen;Zeng-Guang Hou","doi":"10.1109/TCDS.2023.3345922","DOIUrl":"10.1109/TCDS.2023.3345922","url":null,"abstract":"Longitudinal magnetic resonance imaging (MRI) datasets have important implications for the study of degenerative diseases because such datasets have data from multiple points in time to track disease progression. However, longitudinal datasets are often incomplete because patients drop out unexpectedly. In previous work, we proposed an augmentation method, the temporal recurrent generative adversarial network (TR-GAN), that can complement missing session data of MRI datasets. TR-GAN uses a simple U-Net as a generator, which limits its performance. Transformers have had great success in computer vision research, and this article attempts to introduce them into longitudinal dataset completion tasks. The multihead attention mechanism in the transformer has huge memory requirements, and it is difficult to train on 3-D MRI data on graphics processing units (GPUs) with small memory. To build a memory-friendly transformer-based generator, we introduce a Hilbert transform module (HTM) to convert 3-D data to 2-D data that preserves locality fairly well. To make up for the difficulty convolutional neural network (CNN)-based models have in establishing long-range dependencies, we propose a Swin transformer-based up/down sampling (STU/STD) module that combines the Swin transformer module and CNN module to capture global and local information simultaneously. Extensive experiments show that our model can reduce mean squared error (MSE) by at least 7.16% compared to the previous state-of-the-art method.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple Instance Learning for Cheating Detection and Localization in Online Examinations","authors":"Yemeng Liu;Jing Ren;Jianshuo Xu;Xiaomei Bai;Roopdeep Kaur;Feng Xia","doi":"10.1109/TCDS.2024.3349705","DOIUrl":"10.1109/TCDS.2024.3349705","url":null,"abstract":"The spread of the Coronavirus disease-2019 epidemic has caused many courses and exams to be conducted online. The cheating behavior detection model in examination invigilation systems plays a pivotal role in guaranteeing the equality of long-distance examinations. However, cheating behavior is rare, and most researchers do not comprehensively take into account features such as head posture, gaze angle, body posture, and background information in the task of cheating behavior detection. In this article, we develop and present CHEESE, a CHEating detection framework via multiple instance learning. The framework consists of a label generator that implements weak supervision and a feature encoder to learn discriminative features. In addition, the framework combines body posture and background features extracted by 3-D convolution with eye gaze, head posture, and facial features captured by OpenFace 2.0. These features are stitched together and fed into the spatiotemporal graph module, which analyzes the spatiotemporal changes in video clips to detect cheating behaviors. Our experiments on three datasets, University of Central Florida (UCF)-Crime, ShanghaiTech, and online exam proctoring (OEP), demonstrate the effectiveness of our method compared with state-of-the-art approaches; it obtains a frame-level area under the curve (AUC) score of 87.58% on the OEP dataset.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139850467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MIMo: A Multimodal Infant Model for Studying Cognitive Development","authors":"Dominik Mattern;Pierre Schumacher;Francisco M. López;Marcel C. Raabe;Markus R. Ernst;Arthur Aubret;Jochen Triesch","doi":"10.1109/TCDS.2024.3350448","DOIUrl":"10.1109/TCDS.2024.3350448","url":null,"abstract":"Human intelligence and human consciousness emerge gradually during the process of cognitive development. Understanding this development is an essential aspect of understanding the human mind and may facilitate the construction of artificial minds with similar properties. Importantly, human cognitive development relies on embodied interactions with the physical and social environment, which is perceived via complementary sensory modalities. These interactions allow the developing mind to probe the causal structure of the world. This is in stark contrast to common machine learning approaches, e.g., for large language models, which are merely passively “digesting” large amounts of training data, but are not in control of their sensory inputs. However, computational modeling of the kind of self-determined embodied interactions that lead to human intelligence and consciousness is a formidable challenge. Here, we present Multimodal Infant Model (MIMo), an open-source multimodal infant model for studying early cognitive development through computer simulations. MIMo's body is modeled after an 18-month-old child with detailed five-fingered hands. MIMo perceives its surroundings via binocular vision, a vestibular system, proprioception, and touch perception through a full-body virtual skin, while two different actuation models allow control of its body. We describe the design and interfaces of MIMo and provide examples illustrating its use.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":null,"pages":null},"PeriodicalIF":5.0,"publicationDate":"2024-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139954426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}