ESS MS-G3D: extension and supplement shift MS-G3D network for the assessment of severe mental retardation

IF 5 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Complex & Intelligent Systems Pub Date : 2023-11-21 DOI:10.1007/s40747-023-01275-1

Quan Liu, Mincheng Cai, Dujuan Liu, Simeng Ma, Qianhong Zhang, Dan Xiang, Lihua Yao, Zhongchun Liu, Jun Yang


Citations: 0

Abstract

Automated mental retardation (MR) assessment has the potential to improve diagnostic efficiency and objectivity in clinical practice. Building on research into the abnormal behavior characteristics of patients with MR, we propose an extension and supplement shift multi-scale G3D (ESS MS-G3D) network for video-based assessment of MR. Specifically, all videos are collected from clinical diagnostic scenarios, and the skeleton sequence of the human body is extracted from the videos with an advanced pose estimation model. To address the shortcomings of existing methods for learning behavior characteristics, we present: (1) three G3D styles, which enable the network to take different input forms; (2) two G3D graphs and two extension graphs, which redefine and extend the graph structure of spatial–temporal nodes; (3) two learnable parameters, which adaptively adjust the graph structure; and (4) a shift layer, which enables the network to learn global features. Finally, we construct a three-branch model, ESS MS-STGC, which captures discriminative spatial–temporal features and explores the co-occurrence relationship between the spatial and temporal domains. Experiments on a clinical video data set show that the proposed model performs well in MR assessment and outperforms existing vision-based methods. In the two-class task, our model with the joint stream achieves the highest accuracy of 94.63% on the validation set and 89.13% on the test set. A multi-stream fusion strategy further improves these results to 96.52% and 93.22%, respectively. In the four-class task, our model obtains a Top-1 accuracy of 78.84% and a Top-2 accuracy of 91.34% on the test set. The proposed method offers a new approach to clinical mental retardation assessment.
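The abstract reports results for a multi-stream fusion strategy and for Top-1 and Top-2 accuracy, but does not say how the streams are combined. As a minimal sketch, assuming the weighted score-sum fusion that is common in skeleton-based recognition (an assumption, not the authors' confirmed method), the two quantities could be computed as follows. The function names, the stream weights, and the toy logits are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(stream_logits, weights=None):
    """Weighted sum of per-stream softmax scores for a single sample.

    stream_logits: one list of class logits per stream (for example a
    joint stream and a bone stream; the stream names are assumptions).
    weights: one float per stream; defaults to equal weights.
    """
    if weights is None:
        weights = [1.0] * len(stream_logits)
    fused = [0.0] * len(stream_logits[0])
    for w, logits in zip(weights, stream_logits):
        for c, p in enumerate(softmax(logits)):
            fused[c] += w * p
    return fused


def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for sample_scores, label in zip(scores, labels):
        ranked = sorted(range(len(sample_scores)),
                        key=lambda c: sample_scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy four-class example with two hypothetical streams that disagree.
joint_logits = [2.0, 0.0, 0.0, 0.0]
bone_logits = [0.0, 3.0, 0.0, 0.0]
fused_scores = fuse_streams([joint_logits, bone_logits])
```

In this toy case the bone stream is more confident, so the fused prediction follows it; a sample whose true class is 0 would count as a Top-2 hit but a Top-1 miss, which is why Top-2 accuracy is never lower than Top-1.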


Source journal

Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore

9.60

Self-citation rate

10.30%

Articles published

297

About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.