Network-Constrained Eigen-Single-Cell Profile Estimation for Uncovering Crucial Immunogene Regulatory Systems in Human Bone Marrow
Authors: Heewon Park, Satoru Miyano
Journal: Journal of Computational Biology, pp. 1158-1178 (JCR category: Biochemical Research Methods, Q4; impact factor 1.4)
DOI: 10.1089/cmb.2024.0539 (https://doi.org/10.1089/cmb.2024.0539)
Published: 2024-11-01 (Epub 2024-09-06)
Publication type: Journal Article
Open access: no
Citations: 0
Abstract
We characterize young and aged cell lines, both healthy and AML (acute myeloid leukemia), with the goal of identifying the key markers associated with AML progression. To characterize the age-related phenotypes in AML cell lines, we consider eigenCell analysis, which captures the primary expression-level patterns across the cell lines. However, earlier eigenGene and eigenCell analyses were based on linear combinations of all features and were therefore disturbed by noise features. Moreover, a fully dense loading matrix makes the results of eigenCell analysis difficult to interpret. To address these challenges, we develop a novel computational approach, termed network-constrained eigenCells profile estimation, which employs a sparse learning strategy. The proposed method estimates eigenCells using not only the lasso but also network-constrained penalization. The network-constrained penalty enables us to select neighboring genes simultaneously, so that hub genes and their regulator/target genes are readily selected as crucial markers for eigenCell estimation. In this way, our method incorporates insights from network biology into sparse loading estimation. The resulting sparse eigenCell profiles show expression levels only for critical markers, which allows us to identify the key markers associated with a specific phenotype. Monte Carlo simulations demonstrate the efficacy of our method in reconstructing the sparse structure of eigenCell profiles. We applied our approach to uncover the regulatory system of immunogenes in young/aged healthy and AML cell lines. The markers identified for the age-related phenotype in both healthy and AML cell lines are strongly supported by previous studies. Specifically, our findings, together with the existing literature, indicate that activity within the CD79A subnetwork, and in particular its diminished activity, could be pivotal in elucidating the mechanism driving AML progression. We expect the proposed method to be a useful tool for characterizing disease-related subsets of cell lines, encompassing phenotypes and clones.
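The abstract does not state the objective function, so the following is only a minimal sketch of how a lasso penalty and a network-constrained penalty can be combined in sparse loading estimation. It is not the authors' implementation. It assumes a rank-one formulation, minimizing ||X - u v^T||_F^2 + lam1 * ||v||_1 + lam2 * v^T L v with L the normalized graph Laplacian of a gene network, and it solves for the loading v by proximal gradient steps, a technique chosen here for brevity. The function names, the orientation of X (rows as samples, columns as genes), the toy data, and the parameters lam1 and lam2 are all hypothetical.

```python
import numpy as np


def normalized_laplacian(adj):
    """Normalized graph Laplacian; isolated nodes get an all-zero row and column."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.diag((deg > 0).astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (the lasso shrinkage step)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def sparse_network_loading(X, adj, lam1=1.0, lam2=1.0, n_outer=50, n_inner=100):
    """Assumed objective (hypothetical, not taken from the paper):
        ||X - u v^T||_F^2 + lam1 * ||v||_1 + lam2 * v^T L v,  with ||u|| = 1.
    X is (samples x genes); adj is a symmetric (genes x genes) adjacency matrix.
    """
    lap = normalized_laplacian(adj)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    u, v = U[:, 0], s[0] * Vt[0]                    # dense starting point from the SVD
    A = np.eye(X.shape[1]) + lam2 * lap             # quadratic term in v when ||u|| = 1
    step = 1.0 / (2.0 * np.linalg.eigvalsh(A)[-1])  # 1 / Lipschitz constant of the gradient
    for _ in range(n_outer):
        b = X.T @ u
        for _ in range(n_inner):
            grad = 2.0 * (A @ v - b)                # gradient of the smooth part
            v = soft_threshold(v - step * grad, step * lam1)
        u_new = X @ v
        norm = np.linalg.norm(u_new)
        if norm == 0.0:                             # every loading was shrunk to zero
            break
        u = u_new / norm
    return u, v


# Toy data (fabricated for illustration only): genes 0-3 form a chain in the
# network and carry the signal; the remaining genes are isolated noise features.
rng = np.random.default_rng(0)
n_samples, n_genes = 40, 12
v_true = np.zeros(n_genes)
v_true[:4] = 5.0
u_true = rng.normal(size=n_samples)
u_true /= np.linalg.norm(u_true)
X = np.outer(u_true, v_true) + 0.05 * rng.normal(size=(n_samples, n_genes))
adj = np.zeros((n_genes, n_genes))
for i in range(3):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

u_hat, v_hat = sparse_network_loading(X, adj, lam1=1.0, lam2=1.0)
print("genes with nonzero loading:", np.flatnonzero(v_hat))
```

In this sketch the lasso term sets loadings of noise features exactly to zero, while the Laplacian term penalizes differences between the (degree-scaled) loadings of connected genes, which is one way neighboring genes can be selected together. Setting lam1 = lam2 = 0 leaves the ordinary dense SVD loading unchanged.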
About the journal:
Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics.
Journal of Computational Biology coverage includes:
-Genomics
-Mathematical modeling and simulation
-Distributed and parallel biological computing
-Designing biological databases
-Pattern matching and pattern detection
-Linking disparate databases and data
-New tools for computational biology
-Relational and object-oriented database technology for bioinformatics
-Biological expert system design and use
-Reasoning by analogy, hypothesis formation, and testing by machine
-Management of biological databases