腹部肌电图挖掘:睡眠成人的呼吸模式

Q3 Computer Science
Gennady Chuiko, Yevhen Darnapuk, Olga Dvornik, Yaroslav Krainyk
{"title":"腹部肌电图挖掘:睡眠成人的呼吸模式","authors":"Gennady Chuiko, Yevhen Darnapuk, Olga Dvornik, Yaroslav Krainyk","doi":"10.32620/reks.2023.3.06","DOIUrl":null,"url":null,"abstract":"The article’s subject matter is the processing of abdominal EMG recordings and finding breathing patterns. The goal is to automatically classify respiratory patterns into two classes, or clusters, by two breathing patterns, regular and irregular, using machine learning (ML) methods. The object of the study was to obtain a dataset of 40 randomly picked abdominal EMG recordings (sampling rate equal to 200 Hz) borrowed from the complete dataset published by the Computational Clinical Neurophysiology Laboratory and the Clinical Data Animation Laboratory of Massachusetts General Hospital. The tasks to be solved are as follows: finding ETS (errors-trend-seasonality) model for the EMG series using the exponential smoothing method; obtaining denoised and detrended signals; obtaining the Hurst exponents for EMGs using the power-law decaying of correlograms for the denoised and detrended signals; describing the variabilities, SNR, the outlier fractions, and Hurst exponents by robust statistics, performing correlation analysis, and Principal Components Analysis (PCA); analyzing the structure of the distant matrix by a graph-based technique; obtaining the periodograms in the frequency domain using the known Wiener-Khinchin theorem; and finding the best models and methods of classification and clusterization and evaluating them within modern Machine Learning methods. The methods used are exponential smoothing, the Wiener-Khinchin theorem, the graph theory method, principal component analysis, programing within MAPLE 2020, and data processing by Weka. The authors obtained the following results: 1) wide data variability has been rated with the median absolute deviations, which is the most robust statistic in this case; 2) most of the signals (38 of 40) showed frequent outliers: from a few percent up to 24.6 % of emissions; 3) these four variables: outliers' percentage, variability, SNR, and persistency factors – form the attributes of input vectors of the subjects for further Machine Learning with Weka software; 4) Manhattan distances matrix among subjects' vectors in 4D attributes space allows imaging the data set as a weighted graph, the vertices of which are subjects; 5) the weights of the graph's edges reflect distances between any pair of them. \"Closeness centralities\" of vertices allowed us to cluster the data set on two clusters with 11 and 29 subjects, and Weka clustering algorithms confirmed this result. 6) The learning curve shows that a sufficiently small data set (from 25 subjects) might be suitable for classification purposes. Conclusions. The scientific novelty of the results obtained is as follows: 1) the Error-Trend-Seasonality model was the same for all data sets. Abdominal EMG of sleeping patients had additive errors and undamped trends without any seasonality; 2) the correlograms' decaying according to power law had been set, and Hurst exponents were in the range (of 0.776–0.887). This testifies to \"long memory\" (high persistence) of abdominal EMGs; 3) the modified Z-scores and robust statistics with the highest breakdown values were used for the EMG parameters because of many outliers; 4) breathing patterns were set using the periodograms in the frequency domain using the Wiener-Khinchin theorem; 5) the new graph-based method was successfully exploited to cluster the dataset. Parallel clustering with Weka algorithms confirmed the graph-based clustering results.","PeriodicalId":36122,"journal":{"name":"Radioelectronic and Computer Systems","volume":"48 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Abdominal electromyograms mining: breathing patterns of asleep adults\",\"authors\":\"Gennady Chuiko, Yevhen Darnapuk, Olga Dvornik, Yaroslav Krainyk\",\"doi\":\"10.32620/reks.2023.3.06\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The article’s subject matter is the processing of abdominal EMG recordings and finding breathing patterns. The goal is to automatically classify respiratory patterns into two classes, or clusters, by two breathing patterns, regular and irregular, using machine learning (ML) methods. The object of the study was to obtain a dataset of 40 randomly picked abdominal EMG recordings (sampling rate equal to 200 Hz) borrowed from the complete dataset published by the Computational Clinical Neurophysiology Laboratory and the Clinical Data Animation Laboratory of Massachusetts General Hospital. The tasks to be solved are as follows: finding ETS (errors-trend-seasonality) model for the EMG series using the exponential smoothing method; obtaining denoised and detrended signals; obtaining the Hurst exponents for EMGs using the power-law decaying of correlograms for the denoised and detrended signals; describing the variabilities, SNR, the outlier fractions, and Hurst exponents by robust statistics, performing correlation analysis, and Principal Components Analysis (PCA); analyzing the structure of the distant matrix by a graph-based technique; obtaining the periodograms in the frequency domain using the known Wiener-Khinchin theorem; and finding the best models and methods of classification and clusterization and evaluating them within modern Machine Learning methods. The methods used are exponential smoothing, the Wiener-Khinchin theorem, the graph theory method, principal component analysis, programing within MAPLE 2020, and data processing by Weka. The authors obtained the following results: 1) wide data variability has been rated with the median absolute deviations, which is the most robust statistic in this case; 2) most of the signals (38 of 40) showed frequent outliers: from a few percent up to 24.6 % of emissions; 3) these four variables: outliers' percentage, variability, SNR, and persistency factors – form the attributes of input vectors of the subjects for further Machine Learning with Weka software; 4) Manhattan distances matrix among subjects' vectors in 4D attributes space allows imaging the data set as a weighted graph, the vertices of which are subjects; 5) the weights of the graph's edges reflect distances between any pair of them. \\\"Closeness centralities\\\" of vertices allowed us to cluster the data set on two clusters with 11 and 29 subjects, and Weka clustering algorithms confirmed this result. 6) The learning curve shows that a sufficiently small data set (from 25 subjects) might be suitable for classification purposes. Conclusions. The scientific novelty of the results obtained is as follows: 1) the Error-Trend-Seasonality model was the same for all data sets. Abdominal EMG of sleeping patients had additive errors and undamped trends without any seasonality; 2) the correlograms' decaying according to power law had been set, and Hurst exponents were in the range (of 0.776–0.887). This testifies to \\\"long memory\\\" (high persistence) of abdominal EMGs; 3) the modified Z-scores and robust statistics with the highest breakdown values were used for the EMG parameters because of many outliers; 4) breathing patterns were set using the periodograms in the frequency domain using the Wiener-Khinchin theorem; 5) the new graph-based method was successfully exploited to cluster the dataset. Parallel clustering with Weka algorithms confirmed the graph-based clustering results.\",\"PeriodicalId\":36122,\"journal\":{\"name\":\"Radioelectronic and Computer Systems\",\"volume\":\"48 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-09-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Radioelectronic and Computer Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.32620/reks.2023.3.06\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radioelectronic and Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.32620/reks.2023.3.06","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 0

摘要

这篇文章的主题是处理腹部肌电图记录和发现呼吸模式。目标是使用机器学习(ML)方法,根据两种呼吸模式(规则和不规则)自动将呼吸模式分为两类或集群。本研究的目的是从麻省总医院计算临床神经生理学实验室和临床数据动画实验室出版的完整数据集中,随机抽取40个腹部肌电图记录(采样率为200hz),获取数据集。要解决的问题是:利用指数平滑法寻找肌电信号序列的误差-趋势-季节性(ETS)模型;获取去噪和去趋势信号;利用去噪和去趋势信号相关图的幂律衰减,得到肌电信号的Hurst指数;通过稳健统计描述变量、信噪比、离群分数和Hurst指数,进行相关分析和主成分分析(PCA);基于图的距离矩阵结构分析利用已知的Wiener-Khinchin定理得到频域的周期图;寻找分类和聚类的最佳模型和方法,并在现代机器学习方法中对它们进行评估。使用的方法是指数平滑、Wiener-Khinchin定理、图论方法、主成分分析、MAPLE 2020编程和Weka数据处理。作者得到了以下结果:1)广泛的数据变异性已被评为中位数绝对偏差,这是最稳健的统计在这种情况下;2)大多数信号(40个信号中的38个)显示出频繁的异常值:从百分之几到24.6%的排放量;3)这四个变量:异常值百分比、可变性、信噪比和持久性因素——形成受试者输入向量的属性,用于使用Weka软件进行进一步的机器学习;4) 4D属性空间中被试向量间的曼哈顿距离矩阵允许将数据集成像为加权图,其顶点为被试;5)图边的权值反映任意一对边之间的距离。顶点的“接近中心性”允许我们将数据集聚在两个有11个和29个主题的聚类上,Weka聚类算法证实了这一结果。6)学习曲线表明,足够小的数据集(来自25个主题)可能适合分类目的。结论。所得结果的科学新颖性如下:1)误差-趋势-季节性模型对所有数据集都是相同的。睡眠患者腹部肌电图存在叠加性误差和无衰减趋势,无任何季节性;2)相关图按幂律衰减,Hurst指数在0.776 ~ 0.887范围内。这证明了腹部肌电图的“长记忆”(高持久性);3)由于异常值较多,肌电参数采用改进的z分数和击穿值最高的稳健统计量;4)利用Wiener-Khinchin定理在频域上对呼吸模式进行周期图设置;5)成功利用基于图的聚类方法对数据集进行聚类。Weka算法的并行聚类验证了基于图的聚类结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Abdominal electromyograms mining: breathing patterns of asleep adults
The article’s subject matter is the processing of abdominal EMG recordings and finding breathing patterns. The goal is to automatically classify respiratory patterns into two classes, or clusters, by two breathing patterns, regular and irregular, using machine learning (ML) methods. The object of the study was to obtain a dataset of 40 randomly picked abdominal EMG recordings (sampling rate equal to 200 Hz) borrowed from the complete dataset published by the Computational Clinical Neurophysiology Laboratory and the Clinical Data Animation Laboratory of Massachusetts General Hospital. The tasks to be solved are as follows: finding ETS (errors-trend-seasonality) model for the EMG series using the exponential smoothing method; obtaining denoised and detrended signals; obtaining the Hurst exponents for EMGs using the power-law decaying of correlograms for the denoised and detrended signals; describing the variabilities, SNR, the outlier fractions, and Hurst exponents by robust statistics, performing correlation analysis, and Principal Components Analysis (PCA); analyzing the structure of the distant matrix by a graph-based technique; obtaining the periodograms in the frequency domain using the known Wiener-Khinchin theorem; and finding the best models and methods of classification and clusterization and evaluating them within modern Machine Learning methods. The methods used are exponential smoothing, the Wiener-Khinchin theorem, the graph theory method, principal component analysis, programing within MAPLE 2020, and data processing by Weka. The authors obtained the following results: 1) wide data variability has been rated with the median absolute deviations, which is the most robust statistic in this case; 2) most of the signals (38 of 40) showed frequent outliers: from a few percent up to 24.6 % of emissions; 3) these four variables: outliers' percentage, variability, SNR, and persistency factors – form the attributes of input vectors of the subjects for further Machine Learning with Weka software; 4) Manhattan distances matrix among subjects' vectors in 4D attributes space allows imaging the data set as a weighted graph, the vertices of which are subjects; 5) the weights of the graph's edges reflect distances between any pair of them. "Closeness centralities" of vertices allowed us to cluster the data set on two clusters with 11 and 29 subjects, and Weka clustering algorithms confirmed this result. 6) The learning curve shows that a sufficiently small data set (from 25 subjects) might be suitable for classification purposes. Conclusions. The scientific novelty of the results obtained is as follows: 1) the Error-Trend-Seasonality model was the same for all data sets. Abdominal EMG of sleeping patients had additive errors and undamped trends without any seasonality; 2) the correlograms' decaying according to power law had been set, and Hurst exponents were in the range (of 0.776–0.887). This testifies to "long memory" (high persistence) of abdominal EMGs; 3) the modified Z-scores and robust statistics with the highest breakdown values were used for the EMG parameters because of many outliers; 4) breathing patterns were set using the periodograms in the frequency domain using the Wiener-Khinchin theorem; 5) the new graph-based method was successfully exploited to cluster the dataset. Parallel clustering with Weka algorithms confirmed the graph-based clustering results.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Radioelectronic and Computer Systems
Radioelectronic and Computer Systems Computer Science-Computer Graphics and Computer-Aided Design
CiteScore
3.60
自引率
0.00%
发文量
50
审稿时长
2 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信