Statistics and Its Interface: Latest Articles (Page 2)

Estimating extreme value index by subsampling for massive datasets with heavy-tailed distributions

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-07-19 DOI: 10.4310/22-sii749

Yongxin Li, Liujun Chen, Deyuan Li, Hansheng Wang

Citations: 0

A random projection method for large-scale community detection

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/22-sii752

Haobo Qi, Hansheng Wang, Xuening Zhu

Citations: 0

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/22-sii770

Zhou Lan

Citations: 0

Robust and covariance-assisted tensor response regression

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/sii.2024.v17.n2.a10

Ning Wang, Xin Zhang

Citations: 0

Bayesian tensor-on-tensor regression with efficient computation

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/23-sii786

Kunbo Wang, Yanxun Xu

Abstract: We propose a Bayesian tensor-on-tensor regression approach to predict a multidimensional array (tensor) of arbitrary dimensions from another tensor of arbitrary dimensions, building upon the Tucker decomposition of the regression coefficient tensor. Traditional tensor regression methods making use of the Tucker decomposition either assume the dimension of the core tensor to be known or estimate it via cross-validation or some model selection criteria. However, no existing method can simultaneously estimate the model dimension (the dimension of the core tensor) and other model parameters. To fill this gap, we develop an efficient Markov Chain Monte Carlo (MCMC) algorithm to estimate both the model dimension and parameters for posterior inference. Besides the MCMC sampler, we also develop an ultra-fast optimization-based computing algorithm wherein the maximum a posteriori estimators for parameters are computed, and the model dimension is optimized via a simulated annealing algorithm. The proposed Bayesian framework provides a natural way for uncertainty quantification. Through extensive simulation studies, we evaluate the proposed Bayesian tensor-on-tensor regression model and show its superior performance compared to alternative methods. We also demonstrate its practical effectiveness by applying it to two real-world datasets, including facial imaging data and 3D motion data.
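To make the role of the Tucker decomposition in the abstract above concrete, here is a minimal NumPy sketch of a coefficient tensor assembled from a small core tensor and one factor matrix per mode, then contracted with a tensor predictor. It is an illustration only, not the authors' sampler or optimizer: the shapes, the ranks, and the helper names `tucker_coefficient` and `predict` are assumptions chosen for the example, and the core dimension is fixed by hand rather than estimated.

```python
import numpy as np

def tucker_coefficient(core, factors):
    """Assemble a 4-way coefficient tensor from a core tensor and four
    factor matrices (one mode product per mode of the core)."""
    U1, U2, U3, U4 = factors
    return np.einsum("abcd,ia,jb,kc,ld->ijkl", core, U1, U2, U3, U4)

def predict(X, B):
    """Contract an (n, p1, p2) predictor array with a (p1, p2, q1, q2)
    coefficient tensor, giving an (n, q1, q2) array of fitted responses."""
    return np.einsum("nij,ijkl->nkl", X, B)

rng = np.random.default_rng(0)
p1, p2, q1, q2 = 5, 4, 3, 2          # hypothetical predictor/response sizes
ranks = (2, 2, 2, 1)                 # hypothetical core dimension, fixed here
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((d, r))
           for d, r in zip((p1, p2, q1, q2), ranks)]

B = tucker_coefficient(core, factors)
X = rng.standard_normal((10, p1, p2))
Y_hat = predict(X, B)
print(B.shape, Y_hat.shape)          # (5, 4, 3, 2) (10, 3, 2)

# The Tucker form needs far fewer free numbers than the full tensor.
n_full = B.size
n_tucker = core.size + sum(U.size for U in factors)
print(n_full, n_tucker)              # 120 34
```

The core dimension (`ranks` in this sketch) is the quantity the abstract calls the model dimension; in the paper it is estimated jointly with the other parameters, whereas the sketch simply fixes it.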

Citations: 0

Density-convoluted tensor support vector machines

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/23-sii796

Boxiang Wang, Le Zhou, Jian Yang, Qing Mai

Citations: 0

Multi-way overlapping clustering by Bayesian tensor decomposition

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/23-sii790

Zhuofan Wang, Fangting Zhou, Kejun He, Yang Ni

Abstract: The development of modern sequencing technologies provides great opportunities to measure gene expression of multiple tissues from different individuals. The three-way variation across genes, tissues, and individuals makes statistical inference a challenging task. In this paper, we propose a Bayesian multi-way clustering approach to cluster genes, tissues, and individuals simultaneously. The proposed model adaptively trichotomizes the observed data into three latent categories and uses a Bayesian hierarchical construction to further decompose the latent variables into lower-dimensional features, which can be interpreted as overlapping clusters. With a Bayesian nonparametric prior, i.e., the Indian buffet process, our method determines the cluster number automatically. The utility of our approach is demonstrated through simulation studies and an application to the Genotype-Tissue Expression (GTEx) RNA-seq data. The clustering result reveals some interesting findings about depression-related genes in the human brain, which are also consistent with biological domain knowledge. The detailed algorithm and some numerical results are available in the online Supplementary Material at https://intlpress.com/site/pub/files/supp/sii/2024/0017/0002/sii-2024-0017-0002-s001.pdf.
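The abstract above relies on the Indian buffet process (IBP) to let the number of overlapping clusters be determined by the model. As a rough illustration of why that works, the sketch below draws a binary membership matrix from the standard IBP "restaurant" construction, in which the number of columns (features) is itself random. This is a generic textbook sampler, not the authors' hierarchical model; the function name `sample_ibp` and the values of `n_rows` and `alpha` are assumptions made for the example.

```python
import numpy as np

def sample_ibp(n_rows, alpha, rng):
    """Draw a binary matrix Z from the Indian buffet process.

    Row i (1-based) takes each existing feature k with probability
    counts[k] / i, then opens Poisson(alpha / i) brand-new features.
    """
    counts = []   # counts[k] = number of rows that currently own feature k
    rows = []     # per-row 0/1 lists; later rows are longer than earlier ones
    for i in range(1, n_rows + 1):
        row = [int(rng.random() < m / i) for m in counts]
        for k, z in enumerate(row):
            counts[k] += z
        n_new = rng.poisson(alpha / i)
        row.extend([1] * n_new)
        counts.extend([1] * n_new)
        rows.append(row)

    # Pad the ragged rows with zeros to form a rectangular matrix.
    Z = np.zeros((n_rows, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z

rng = np.random.default_rng(1)
Z = sample_ibp(n_rows=20, alpha=2.0, rng=rng)
print(Z.shape)  # the number of columns varies from draw to draw
```

Because a row may own several features at once, the columns of `Z` behave like overlapping clusters, and a larger `alpha` tends to produce more of them.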

Citations: 0

Community detection in temporal citation network via a tensor-based approach

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/22-sii751

Tianchen Gao, Rui Pan, Junfei Zhang, Hansheng Wang

Citations: 0

Bayesian methods in tensor analysis

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/23-sii802

Shi Yiyao, Shen Weining

Citations: 0

Learning conditional dependence graph for concepts via matrix normal graphical model

IF 0.8, Zone 4 (Mathematics)

Statistics and Its Interface Pub Date : 2024-02-01 DOI: 10.4310/23-sii784

Jizheng Lai, Jianxin Yin

Abstract: Conditional dependence relationships for random vectors are extensively studied and broadly applied. But it is not very clear how to construct the dependence graph for unstructured data like concept words or phrases in a text corpus, where the variables (concepts) are not jointly observed under an i.i.d. assumption. Using global embedding methods such as GloVe, we obtain 'structured' representation vectors for concepts. Then we assume that all the concept vectors jointly follow a matrix normal distribution with sparse precision matrices. With the observation of the word-word co-occurrence matrix and the GloVe construction procedure, we can test this assumption empirically. The asymptotic distribution of the test statistic is derived. Another advantage of this matrix-normal distributional assumption is that the linearly additive property in word analogy tasks is natural and straightforward. Different from knowledge graph methods, the conditional dependence graph describes the conditional dependence structure between concepts given all other concepts, which means that the concepts (nodes) linked by edges cannot be separated by other concepts. It represents an essential semantic relationship. There is no need to enumerate all related pairs as head and tail elements of a triplet, as in the knowledge graph regime, and the only relation type in this graph is the conditional dependence between concepts. A penalized matrix normal graphical model (MNGM) is then employed to learn the conditional dependence graph for both the concepts and the embedding 'dimensions'. Since the concept words are nodes in our graph with huge dimensions, we employ the MDMC optimization method to speed up the glasso algorithm. Also, the algorithm is adaptive to incremental accumulation of new concepts in the text corpus. On the other hand, we propose a sentence granularity bootstrap to get 'independent' repeats of samples to enhance the penalized MNGM algorithm. We call the proposed method Matrix-GloVe. In simulation studies, we check that the graph learned by Matrix-GloVe is more suitable for Graph Convolutional Networks (GCN) than a correlation graph, i.e., a graph determined from the k-NN method. We employ the proposed method in two scenarios from real data. The first scenario is concept graph learning for concepts in a textbook corpus. Under this scenario, two tasks are studied. One compares the vectors produced by GloVe and by the word2vec methods CBOW and Skip-Gram, with the vectors then used by the penalized MNGM. The other task is link prediction among the concepts. On both tasks, Matrix-GloVe performs better. In the second scenario, Matrix-GloVe is applied to a downstream method, i.e., GCN. For node classification tasks on the BBC and BBCSport datasets, both GCN with Matrix-GloVe and GCN with Matrix-GloVe plus Deepwalk outperform GCN with k-NN.
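The matrix normal assumption in the abstract above can be illustrated with a short NumPy sketch that draws one matrix-valued sample from given row and column covariances and checks the implied Kronecker covariance structure. This is a generic construction, not the Matrix-GloVe procedure: the helper name `sample_matrix_normal` and the small covariance matrices `U` and `V` are assumptions made for the example, and no sparsity penalty or graph estimation is attempted.

```python
import numpy as np

def sample_matrix_normal(mean, row_cov, col_cov, rng):
    """Draw X ~ MN(mean, row_cov, col_cov) as mean + A Z B^T, where
    A A^T = row_cov, B B^T = col_cov and Z has i.i.d. N(0, 1) entries."""
    A = np.linalg.cholesky(row_cov)
    B = np.linalg.cholesky(col_cov)
    Z = rng.standard_normal(mean.shape)
    return mean + A @ Z @ B.T

# Hypothetical covariances: 3 "concepts" (rows) by 2 embedding dimensions (columns).
U = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
V = np.array([[1.0, 0.4],
              [0.4, 2.0]])
M = np.zeros((3, 2))

rng = np.random.default_rng(0)
X = sample_matrix_normal(M, U, V, rng)
print(X.shape)  # (3, 2)

# With column-major vec, vec(A Z B^T) = (B kron A) vec(Z), so the covariance
# of vec(X) is (B kron A)(B kron A)^T, which equals V kron U.
A = np.linalg.cholesky(U)
B = np.linalg.cholesky(V)
K = np.kron(B, A)
print(np.allclose(K @ K.T, np.kron(V, U)))  # True
```

In this parameterization, zeros in the inverse of `U` correspond to pairs of rows that are conditionally independent given the rest, which is the structure a penalized matrix normal graphical model aims to recover.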

Citations: 0