Hyeonseok Kim, Justin Luo, Shannon Chu, C. Cannard, Sven Hoffmann, M. Miyakoshi
Frontiers in Signal Processing, published 2023-04-03. DOI: 10.3389/frsip.2023.1064138 (https://doi.org/10.3389/frsip.2023.1064138)
Citation count: 2
ICA’s bug: How ghost ICs emerge from effective rank deficiency caused by EEG electrode interpolation and incorrect re-referencing
Abstract
Independent component analysis (ICA) has been widely used for electroencephalography (EEG) analyses. However, ICA performance relies on several crucial assumptions about the data. Here, we focus on the granularity of data rank, i.e., the number of linearly independent EEG channels. When the data are rank-full (i.e., all channels are independent), ICA produces as many independent components (ICs) as the number of input channels (rank-full decomposition). However, when the input data are rank-deficient, as is the case with bridged or interpolated electrodes, ICA produces the same number of ICs as the data rank (forced rank deficiency decomposition), introducing undesired ghost ICs and indicating a bug in ICA. We demonstrated that the ghost ICs have white noise properties, in both time and frequency domains, while maintaining surprisingly typical scalp topographies, and can therefore be easily missed by EEG researchers and affect findings in unknown ways. This problem occurs when the minimum eigenvalue λ_min of the input data is smaller than a certain threshold, leading to matrix inversion failure as if the rank-deficient inversion was forced, even if the data rank is cleanly deficient by one. We defined this problem as the effective rank deficiency. Using sound file mixing simulations, we first demonstrated the effective rank deficiency problem and determined that the critical threshold for λ_min is 10⁻⁷ in the given situation. Second, we used empirical EEG data to show how two preprocessing stages, re-referencing to average without including the initial reference and non-linear electrode interpolation, caused this forced rank deficiency problem. Finally, we showed that the effective rank deficiency problem can be solved by using the identified threshold (λ_min = 10⁻⁷) and the correct re-referencing procedure described herein. The former ensures the achievement of effective rank-full decomposition by properly reducing the input data rank, and the latter allows avoidance of a widely practiced incorrect re-referencing approach. Based on the current literature, we discuss the ambiguous status of the initial reference electrode when re-referencing. We have made our data and code available to facilitate the implementation of our recommendations by the EEG community.
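The rank argument in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code. It assumes that the minimum eigenvalue refers to the smallest eigenvalue of the channel covariance matrix, and that "including the initial reference" means appending a zero-filled channel before averaging and dropping it afterwards; the abstract does not spell out either detail. All function names are hypothetical, and random data stand in for EEG.

```python
import numpy as np

# Threshold reported in the abstract for the authors' simulations;
# applying it to covariance eigenvalues of unit-variance data is an assumption.
LAMBDA_MIN_THRESHOLD = 1e-7


def effective_rank(data, threshold=LAMBDA_MIN_THRESHOLD):
    """Count covariance eigenvalues above `threshold` (data is channels x samples)."""
    eigenvalues = np.linalg.eigvalsh(np.cov(data))
    return int(np.sum(eigenvalues > threshold))


def average_reference_naive(data):
    """Subtract the mean over recorded channels only, ignoring the initial reference."""
    return data - data.mean(axis=0, keepdims=True)


def average_reference_with_initial(data):
    """Append a zero channel for the initial reference, re-reference, then drop it."""
    zero_channel = np.zeros((1, data.shape[1]))
    augmented = np.vstack([data, zero_channel])
    rereferenced = augmented - augmented.mean(axis=0, keepdims=True)
    return rereferenced[:-1]


rng = np.random.default_rng(0)
n_channels, n_samples = 8, 5000
data = rng.standard_normal((n_channels, n_samples))  # synthetic stand-in for EEG

rank_raw = effective_rank(data)
rank_naive = effective_rank(average_reference_naive(data))
rank_fixed = effective_rank(average_reference_with_initial(data))
print(rank_raw, rank_naive, rank_fixed)
```

With these 8 synthetic channels, the naive average reference loses one rank (8 becomes 7), while the variant that includes the initial reference keeps all 8. The 10⁻⁷ threshold was identified for the authors' simulations, so data on a different amplitude scale may call for a different value.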