IEEE/ACM Transactions on Computational Biology and Bioinformatics: Latest Articles (Page 9)

MOTHER-DB: A Database for Sharing Nonhuman Ovarian Histology Images

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-12 DOI: 10.1109/TCBB.2024.3426999

Suzanne W. Dietrich;Wenli Ma;Yian Ding;Karen H. Watanabe;Mary B. Zelinski;James P. Sluka

Abstract: The goal of the Multispecies Ovary Tissue Histology Electronic Repository (MOTHER) project is to establish a collection of nonhuman ovary histology images for multiple species as a resource for researchers and educators. An important component of sharing scientific data is the inclusion of the contextual metadata that describes the data. MOTHER extends the Ecological Metadata Language (EML) for documenting research data, leveraging its data provenance and usage license with the inclusion of metadata for ovary histology images. The design of the MOTHER metadata includes information on the donor animal, including reproductive cycle status, the slide and its preparation. MOTHER also extends the ezEML tool, called ezEML+MOTHER, for the specification of the metadata. The design of the MOTHER database (MOTHER-DB) captures the metadata about the histology images, providing a searchable resource for discovering relevant images. MOTHER also defines a curation process for the ingestion of a collection of images and its metadata, verifying the validity of the metadata before its inclusion in the MOTHER collection. A Web search provides the ability to identify relevant images based on various characteristics in the metadata itself, such as genus and species, using filters.

Vol. 21, No. 6, pp. 2598-2603

Citations: 0

Bi-SeqCNN: A Novel Light-Weight Bi-Directional CNN Architecture for Protein Function Prediction

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-11 DOI: 10.1109/TCBB.2024.3426491

Vikash Kumar;Akshay Deepak;Ashish Ranjan;Aravind Prakash

Abstract: Deep learning approaches, such as convolution neural networks (CNNs) and deep recurrent neural networks (RNNs), have been the backbone for predicting protein function, with promising state-of-the-art (SOTA) results. RNNs, with their built-in ability to (i) focus on past information, (ii) collect both short- and long-range dependency information, and (iii) process sequences bi-directionally, offer a strong sequential processing mechanism. CNNs, however, are confined to focusing on short-term information from both the past and the future, although they offer parallelism. Therefore, a novel bi-directional CNN that strictly complies with the sequential processing mechanism of RNNs is introduced and is used for developing a protein function prediction framework, Bi-SeqCNN. This is a sub-sequence-based framework. Further, Bi-SeqCNN+ is an ensemble approach to better the prediction results. To our knowledge, this is the first time bi-directional CNNs are employed for general temporal data analysis and not just for protein sequences. The proposed architecture produces improvements up to +5.5% over contemporary SOTA methods on three benchmark protein sequence datasets. Moreover, it is substantially lighter and attains these results with (0.50–0.70 times) fewer parameters than the SOTA methods.

Vol. 21, No. 6, pp. 1922-1933

Citations: 0

SCRN: Single-Cell Gene Regulatory Network Identification in Alzheimer's Disease

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-08 DOI: 10.1109/TCBB.2024.3424400

Wentao Zhu;Zhiqiang Du;Ziang Xu;Defu Yang;Minghan Chen;Qianqian Song

Abstract: Alzheimer's disease (AD) is the most common neurodegenerative disease, and it consumes considerable medical resources with an increasing number of patients every year. Mounting evidence shows that the regulatory disruptions altering the intrinsic activity of genes in brain cells contribute to AD pathogenesis. To gain insights into the underlying gene regulation in AD, we proposed a graph learning method, Single-Cell based Regulatory Network (SCRN), to identify the regulatory mechanisms based on single-cell data. SCRN implements the γ-decaying heuristic link prediction based on graph neural networks and can identify reliable gene regulatory networks using locally closed subgraphs. In this work, we first performed UMAP dimension reduction analysis on single-cell RNA sequencing (scRNA-seq) data of AD and normal samples. Then we used SCRN to construct the gene regulatory network based on three well-recognized AD genes (APOE, CX3CR1, and P2RY12). Enrichment analysis of the regulatory network revealed significant pathways including NGF signaling, ERBB2 signaling, and hemostasis. These findings demonstrate the feasibility of using SCRN to uncover potential biomarkers and therapeutic targets related to AD.

Vol. 21, No. 6, pp. 1886-1896

Citations: 0

Improved Fuzzy Cognitive Maps for Gene Regulatory Networks Inference Based on Time Series Data

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-04 DOI: 10.1109/TCBB.2024.3423383

Marzieh Emadi;Farsad Zamani Boroujeni;Jamshid Pirgazi

Abstract: Microarray data provide lots of information regarding gene expression levels. Due to the large amount of such data, their analysis requires sufficient computational methods for identifying and analyzing gene regulation networks; however, researchers in this field are faced with numerous challenges, such as having to consider too many genes while, at the same time, having only a limited number of samples and noisy data. In this paper, a hybrid method based on fuzzy cognitive maps and compressed sensing is used to identify interactions between genes. For this purpose, in inference of the gene regulation network, Ensemble Kalman filtered compressed sensing is used to learn the fuzzy cognitive map. Using the Ensemble Kalman filter and compressed sensing, the fuzzy cognitive map will be robust against noise. The proposed algorithm is evaluated using several metrics and compared with several well-known methods such as LASSOFCM, KFRegular, and CMI2NI. The experimental results show that the proposed method outperforms methods proposed in recent years in terms of SSmean, Data Error and accuracy.

Vol. 21, No. 6, pp. 1816-1829

Citations: 0
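
For the fuzzy cognitive map entry above, the following is a minimal Python sketch of the standard FCM state update, in which each gene's next activation is a squashed weighted sum of the current activations. It is not the authors' method: the Ensemble Kalman filtered compressed-sensing step that learns the weights is not reproduced here, and the names `fcm_step`, `simulate`, and `steepness`, as well as the choice of a sigmoid transfer function, are assumptions made for illustration.

```python
import math


def sigmoid(x, steepness=1.0):
    """Logistic transfer function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def fcm_step(state, weights, steepness=1.0):
    """Advance a fuzzy cognitive map by one time step.

    state[j]      : current activation of concept (gene) j
    weights[j][i] : assumed influence of concept j on concept i
    """
    n = len(state)
    if any(len(row) != n for row in weights) or len(weights) != n:
        raise ValueError("weights must be an n x n matrix matching state")
    return [
        sigmoid(sum(weights[j][i] * state[j] for j in range(n)), steepness)
        for i in range(n)
    ]


def simulate(state, weights, steps, steepness=1.0):
    """Return the trajectory [state_0, state_1, ..., state_steps]."""
    trajectory = [list(state)]
    for _ in range(steps):
        trajectory.append(fcm_step(trajectory[-1], weights, steepness))
    return trajectory


if __name__ == "__main__":
    # Hypothetical 2-gene network: gene 0 activates gene 1, gene 1 represses gene 0.
    w = [[0.0, 0.8], [-0.6, 0.0]]
    for t, s in enumerate(simulate([1.0, 0.0], w, steps=3)):
        print(t, [round(v, 3) for v in s])
```

Inferring a gene regulatory network then amounts to estimating `weights` from observed expression time series, which is the part the paper addresses with its hybrid learning scheme.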

Prediction of Potential miRNA-Disease Associations Based on a Masked Graph Autoencoder

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-02 DOI: 10.1109/TCBB.2024.3421924

Hailin Feng;Chenchen Ke;Quan Zou;Zhechen Zhu;Tongcun Liu

Abstract: Biomedical evidence has demonstrated the relevance of microRNA (miRNA) dysregulation in complex human diseases, and determining the relationship between miRNAs and diseases can aid in the early detection and prevention of diseases. Traditional biological experimental methods have the disadvantages of high cost and low efficiency, which are well compensated by computational methods. However, many computational methods have the challenge of excessively focusing on the neighbor relationship, ignoring the structural information of the graph, and belittling the redundant information of the graph structure. This study proposed a computational model based on a graph-masking autoencoder named MGAEMDA. MGAEMDA is an asymmetric framework in which the encoder maps partially observed graphs into latent representations. The decoder reconstructs the masked structural information based on the edge and node levels and combines it with linear matrices to obtain the result. The empirical results on the two datasets reveal that the MGAEMDA model performs better than its counterparts. We also demonstrated the predictive performance of MGAEMDA using a case study of four diseases, and all the top 30 predicted miRNAs were validated in the database, providing further evidence of the excellent performance of the model.

Vol. 21, No. 6, pp. 1874-1885

Citations: 0

Graph Convolutional Network With Self-Supervised Learning for Brain Disease Classification

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-02 DOI: 10.1109/TCBB.2024.3422152

Guangyu Wang;Ying Chu;Qianqian Wang;Limei Zhang;Lishan Qiao;Mingxia Liu

Abstract: Brain functional network (BFN) analysis has become a popular method for identifying neurological diseases at their early stages and revealing sensitive biomarkers related to these diseases. Due to the fact that BFN is a graph with complex structure, graph convolutional networks (GCNs) can be naturally used in the identification of BFN, and can generally achieve an encouraging performance if given large amounts of training data. In practice, however, it is very difficult to obtain sufficient brain functional data, especially from subjects with brain disorders. As a result, GCNs usually fail to learn a reliable feature representation from limited BFNs, leading to overfitting issues. In this paper, we propose an improved GCN method to classify brain diseases by introducing a self-supervised learning (SSL) module for assisting the graph feature representation. We conduct experiments to classify subjects with mild cognitive impairment (MCI) and autism spectrum disorder (ASD) respectively from normal controls (NCs). Experimental results on two benchmark databases demonstrate that our proposed scheme tends to obtain higher classification accuracy than the baseline methods.

Vol. 21, No. 6, pp. 1830-1841

Citations: 0

Optimal Structured Matrix Approximation for Robustness to Incomplete Biosequence Data

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-01 DOI: 10.1109/TCBB.2024.3420903

Chris Salahub;Jeffrey Uhlmann

Citations: 0

Ense-i6mA: Identification of DNA N6-Methyladenine Sites Using XGB-RFE Feature Selection and Ensemble Machine Learning

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-07-01 DOI: 10.1109/TCBB.2024.3421228

Xueqiang Fan;Bing Lin;Jun Hu;Zhongyi Guo

Abstract: DNA N6-methyladenine (6mA) is an important epigenetic modification that plays a vital role in various cellular processes. Accurate identification of the 6mA sites is fundamental to elucidate the biological functions and mechanisms of modification. However, experimental methods for detecting 6mA sites are high-priced and time-consuming. In this study, we propose a novel computational method, called Ense-i6mA, to predict 6mA sites. Firstly, five encoding schemes, i.e., one-hot encoding, gcContent, Z-Curve, K-mer nucleotide frequency, and K-mer nucleotide frequency with gap, are employed to extract DNA sequence features. Secondly, eXtreme gradient boosting coupled with recursive feature elimination is applied to remove noisy features for avoiding over-fitting, reducing computing time and complexity. Then, the best subset of features is fed into base-classifiers composed of Extra Trees, eXtreme Gradient Boosting, Light Gradient Boosting Machine, and Support Vector Machine. Finally, to minimize generalization errors, the prediction probabilities of the base-classifiers are aggregated by averaging for inferring the final 6mA sites results. We conduct experiments on two species, i.e., Arabidopsis thaliana and Drosophila melanogaster, to compare the performance of Ense-i6mA against the recent 6mA sites prediction methods. The experimental results demonstrate that the proposed Ense-i6mA achieves area under the receiver operating characteristic curve values of 0.967 and 0.968, accuracies of 91.4% and 92.0%, and Matthews correlation coefficient values of 0.829 and 0.842 on two benchmark datasets, respectively, and outperforms several existing state-of-the-art methods.

Vol. 21, No. 6, pp. 1842-1854

Citations: 0
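
Two steps named in the Ense-i6mA abstract above, K-mer nucleotide frequency encoding and aggregation of base-classifier probabilities by averaging, can be sketched in plain Python as follows. This is an illustrative sketch only: the function names `kmer_frequencies` and `soft_vote`, the 0.5 decision threshold, and the handling of ambiguous bases are assumptions, and the XGB-RFE feature selection and the actual base classifiers are not reproduced.

```python
from itertools import product


def kmer_frequencies(seq, k=2):
    """Encode a DNA sequence as the relative frequency of each of the 4**k k-mers.

    Windows containing characters other than A, C, G, T are skipped
    (an assumption; the paper's handling of ambiguous bases is not stated).
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k].upper()
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def soft_vote(prob_lists, threshold=0.5):
    """Average per-sample probabilities across classifiers, then threshold.

    prob_lists[c][s] is the probability that classifier c assigns to
    sample s being a 6mA site.
    """
    if not prob_lists:
        raise ValueError("at least one classifier is required")
    n = len(prob_lists)
    averaged = [sum(column) / n for column in zip(*prob_lists)]
    labels = [int(p >= threshold) for p in averaged]
    return averaged, labels


if __name__ == "__main__":
    features = kmer_frequencies("ACGTACGTAA", k=2)
    print(len(features), round(sum(features), 6))
    print(soft_vote([[0.75, 0.25], [0.25, 0.25]]))
```

Averaging probabilities rather than hard labels lets a confident classifier outweigh uncertain ones, which is one common motivation for this style of ensemble.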

Haplotype Frequency Inference From Pooled Genetic Data With a Latent Multinomial Model

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-06-28 DOI: 10.1109/TCBB.2024.3420430

Yong See Foo;Jennifer Flegg

Abstract: In genetic association studies, haplotype data provide more refined information than data about separate genetic markers. However, large-scale studies that genotype hundreds to thousands of individuals may only provide results of pooled data. Methods for inferring haplotype frequencies from pooled genetic data that scale well with pool size rely on a normal approximation, which we observe to produce unreliable inference when applied to real data. We illustrate cases where the approximation fails, due to the normal covariance matrix being near-singular. As an alternative to approximate methods, in this paper we propose two exact methods to infer haplotype frequencies from pooled genetic data based on a latent multinomial model, where the pooled results are considered integer combinations of latent, unobserved haplotype counts. One of our methods, latent count sampling via Markov bases, achieves approximately linear runtime with respect to pool size. Our exact methods produce more accurate inference over existing approximate methods for synthetic data and for haplotype data from the 1000 Genomes Project. We also demonstrate how our methods can be applied to time-series of pooled genetic data, as a proof of concept of how our methods are relevant to more complex hierarchical settings, such as spatiotemporal models.

Vol. 21, No. 6, pp. 1864-1873

Citations: 0
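
The haplotype abstract above describes pooled results as integer combinations of latent haplotype counts. The sketch below illustrates that relationship for biallelic markers and shows, by brute-force enumeration, that several latent count vectors can explain the same pooled observation. It is an assumption-laden toy: haplotypes are encoded as 0/1 tuples, the function names are hypothetical, and the enumeration is only feasible for tiny pools. It is not the paper's Markov-basis sampler.

```python
from itertools import product


def pooled_allele_counts(haplotype_counts, n_markers):
    """Sum latent haplotype counts into per-marker counts of the '1' allele."""
    totals = [0] * n_markers
    for haplotype, count in haplotype_counts.items():
        if len(haplotype) != n_markers:
            raise ValueError("haplotype length must equal n_markers")
        for m, allele in enumerate(haplotype):
            totals[m] += allele * count
    return totals


def compositions(total, parts):
    """Yield every tuple of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def compatible_latent_counts(observed, pool_size):
    """Enumerate all latent haplotype count vectors matching the pooled data."""
    n_markers = len(observed)
    haplotypes = list(product((0, 1), repeat=n_markers))
    matches = []
    for counts in compositions(pool_size, len(haplotypes)):
        latent = dict(zip(haplotypes, counts))
        if pooled_allele_counts(latent, n_markers) == list(observed):
            matches.append(latent)
    return matches


if __name__ == "__main__":
    # A pool of 2 haplotypes in which each of 2 markers shows one '1' allele.
    for latent in compatible_latent_counts([1, 1], pool_size=2):
        print({h: c for h, c in latent.items() if c})
```

In this toy case the pooled data cannot distinguish one `(0, 0)` plus one `(1, 1)` haplotype from one `(0, 1)` plus one `(1, 0)`, which is why inference has to account for the unobserved counts rather than read frequencies off directly.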

Tropical Density Estimation of Phylogenetic Trees

IF 3.6, Zone 3 (Biology)

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-06-28 DOI: 10.1109/TCBB.2024.3420815

Ruriko Yoshida;David Barnhill;Keiji Miura;Daniel Howe

Abstract: Much evidence from biological theory and empirical data indicates that gene trees, i.e., phylogenetic trees reconstructed from different genes (loci), do not have to have exactly the same tree topologies. Such incongruence between gene trees might be caused by some “unusual” evolutionary events, such as meiotic sexual recombination in eukaryotes or horizontal transfers of genetic material in prokaryotes. However, most of the gene trees are constrained by the tree topology of the underlying species tree, that is, the phylogenetic tree depicting the evolutionary history of the set of species under consideration. In order to discover “outlying” gene trees which do not follow the “main distribution(s)” of trees, we propose to apply the “tropical metric” with the max-plus algebra from tropical geometry to a non-parametric estimation of gene trees over the space of phylogenetic trees. In this research we apply the “tropical metric,” a well-defined metric over the space of phylogenetic trees under the max-plus algebra, to non-parametric estimation of gene trees distribution over the tree space. The kernel density estimator (KDE) is one of the most popular non-parametric estimators of a distribution from a given sample, and we propose an analogue of the classical KDE in the setting of tropical geometry with the tropical metric which measures the length of an intrinsic geodesic between trees over the tree space. We estimate the probability of an observed tree by empirical frequencies of nearby trees, with the level of influence determined by the tropical metric. Then, with simulated data generated from the multispecies coalescent model, we show that the non-parametric estimation of the gene tree distribution using the tropical metric performs better than one using the Billera-Holmes-Vogtmann (BHV) metric developed by Weyenberg et al. in terms of computational times and accuracy. We then apply it to Apicomplexa data.

Vol. 21, No. 6, pp. 1855-1863

Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10577088

Citations: 0
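
To make the tropical-metric idea in the abstract above concrete, here is a minimal Python sketch. The distance function uses the standard tropical metric, d(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), on equal-length vectors (for example, vectors of pairwise leaf-to-leaf distances, which is an assumed input representation). The density score is only a rough analogue of a KDE: the Gaussian-shaped kernel, the `bandwidth` parameter, and the lack of a normalizing constant are assumptions for illustration, not the estimator defined in the paper.

```python
import math


def tropical_distance(u, v):
    """Tropical metric: max_i(u_i - v_i) - min_i(u_i - v_i)."""
    if len(u) != len(v) or not u:
        raise ValueError("vectors must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_kde_score(x, sample, bandwidth=1.0):
    """Unnormalized kernel score of x: the mean kernel weight of the sample points.

    Nearby trees (small tropical distance) contribute weights close to 1,
    distant trees contribute weights close to 0.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    if not sample:
        raise ValueError("sample must not be empty")
    weights = [
        math.exp(-((tropical_distance(x, t) / bandwidth) ** 2)) for t in sample
    ]
    return sum(weights) / len(weights)


if __name__ == "__main__":
    # Hypothetical trees encoded as vectors of pairwise distances.
    trees = [[2.0, 4.0, 4.0], [2.2, 4.0, 4.0], [1.8, 4.1, 4.1], [4.0, 4.0, 1.0]]
    for t in trees:
        print(t, round(tropical_kde_score(t, trees, bandwidth=0.5), 3))
```

A tree with a low score relative to the others is a candidate "outlying" gene tree in the sense of the abstract. Note that the metric is unchanged when a constant is added to every coordinate of a vector, so vectors differing only by such a shift are at distance zero.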