Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data

IF 2.9 | Zone 2, Psychology | Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Psychometrika Pub Date : 2023-09-01 Epub Date: 2023-06-02 DOI:10.1007/s11336-023-09918-5

Xiuli Du, Xiaohu Jiang, Jinguan Lin

Psychometrika, vol. 88, no. 3, pp. 975-1001. Journal Article.

Citations: 0

Abstract


Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data.


With the rapid development of big data and medical technology, multi-source functional block-wise missing data have become increasingly common in medical care, so there is an urgent need for efficient dimension reduction methods that extract the information relevant to classification from such data. However, most existing methods for classification problems consider high-dimensional data as covariates. In this paper, we propose a novel multinomial imputed-factor logistic regression model with multi-source functional block-wise missing data as covariates. Our main contribution is to establish two multinomial factor regression models that use the imputed multi-source functional principal component scores and the imputed canonical scores as covariates, respectively, where the missing factors are imputed by both the conditional mean imputation and the multiple block-wise imputation approaches. Specifically, univariate FPCA is first carried out on the observable data of each data source to obtain the univariate principal component scores and the eigenfunctions. Then, the block-wise missing univariate principal component scores, rather than the block-wise missing functional data themselves, are imputed by the conditional mean imputation method and the multiple block-wise imputation method, respectively. Next, based on the imputed univariate factors, the multi-source principal component scores are constructed from the relationship between the multi-source and the univariate principal component scores; at the same time, the canonical scores are obtained by multiple-set canonical correlation analysis. Finally, the multinomial imputed-factor logistic regression model is established with the multi-source principal component scores or the canonical scores as factors. Numerical simulations and a real data analysis on ADNI data show that the proposed method works well.
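The conditional mean imputation step named in the abstract can be illustrated with a minimal sketch. It assumes the stacked univariate principal component scores are jointly Gaussian, so that a missing block is filled with its expectation given the observed blocks. The function names, the complete-case moment estimator, and the toy numbers are hypothetical; the abstract does not specify the authors' estimator, so this should not be read as their implementation.

```python
import numpy as np


def complete_case_moments(scores):
    """Estimate mean and covariance from rows with no missing entries.

    Assumption: enough fully observed subjects exist for a usable estimate.
    """
    scores = np.asarray(scores, dtype=float)
    complete = scores[~np.isnan(scores).any(axis=1)]
    return complete.mean(axis=0), np.cov(complete, rowvar=False)


def conditional_mean_impute(scores, mean, cov):
    """Fill NaN entries of each row with E[missing | observed].

    Uses the Gaussian conditional mean
        mu_m + S_mo S_oo^{-1} (z_o - mu_o),
    where m / o index the missing / observed coordinates of the row.
    """
    scores = np.asarray(scores, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    filled = scores.copy()
    for i, row in enumerate(scores):
        miss = np.isnan(row)
        if not miss.any():
            continue  # nothing to impute for this subject
        obs = ~miss
        if not obs.any():
            filled[i] = mean  # no information: fall back to the mean
            continue
        s_oo = cov[np.ix_(obs, obs)]
        s_mo = cov[np.ix_(miss, obs)]
        resid = row[obs] - mean[obs]
        filled[i, miss] = mean[miss] + s_mo @ np.linalg.solve(s_oo, resid)
    return filled


# Toy check with known moments (hypothetical values): two scores with
# correlation 0.5, the second one missing for this subject.
mean = np.zeros(2)
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
demo = conditional_mean_impute(np.array([[2.0, np.nan]]), mean, cov)
print(demo)  # the imputed second entry is 0.5 * 2.0 = 1.0
```

In practice the moments would have to be estimated, for example with `complete_case_moments` on the subjects observed in every source, and the imputed scores would then feed the multi-source score construction and the multinomial logistic regression described above.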


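Source journal

Psychometrika (Mathematics: Mathematics, Interdisciplinary Applications)

CiteScore: 4.40

Self-citation rate: 10.00%

Number of articles published:

Review time: >12 weeks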

About the journal: The journal Psychometrika is devoted to the advancement of theory and methodology for behavioral data in psychology, education, and the social and behavioral sciences generally. Its coverage is offered in two sections: Theory and Methods (T&M), and Application Reviews and Case Studies (ARCS). T&M articles present original research and reviews on the development of quantitative models, statistical methods, and mathematical techniques for evaluating data from psychology, the social and behavioral sciences, and related fields. Application Reviews can be integrative, drawing together disparate methodologies for applications, or comparative and evaluative, discussing the advantages and disadvantages of one or more methodologies in applications. Case Studies highlight methodology that deepens understanding of substantive phenomena through more informative data analysis or more elegant data description.