Quantitative Biology: Latest Articles

Bioinformatics and biomedical informatics with ChatGPT: Year one review.
IF 0.6, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-12-01 Epub Date: 2024-06-27 DOI: 10.1002/qub2.67
Jinge Wang, Zien Cheng, Qiuming Yao, Li Liu, Dong Xu, Gangqing Hu
Abstract: The year 2023 marked a significant surge in the exploration of applying large language model chatbots, notably Chat Generative Pre-trained Transformer (ChatGPT), across various disciplines. We surveyed the application of ChatGPT in bioinformatics and biomedical informatics throughout the year, covering omics, genetics, biomedical text mining, drug discovery, biomedical image understanding, bioinformatics programming, and bioinformatics education. Our survey delineates the current strengths and limitations of this chatbot in bioinformatics and offers insights into potential avenues for future developments.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11446534/pdf/
Citations: 0
A comprehensive evaluation of large language models in mining gene relations and pathway knowledge.
IF 0.6, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-12-01 Epub Date: 2024-06-21 DOI: 10.1002/qub2.57
Muhammad Azam, Yibo Chen, Micheal Olaolu Arowolo, Haowang Liu, Mihail Popescu, Dong Xu
Abstract: Understanding complex biological pathways, including gene-gene interactions and gene regulatory networks, is critical for exploring disease mechanisms and drug development. Manual literature curation of biological pathways cannot keep up with the exponential growth of new discoveries in the literature. Large-scale language models (LLMs) trained on extensive text corpora contain rich biological information, and they can be mined as a biological knowledge graph. This study assesses 21 LLMs, including both application programming interface (API)-based models and open-source models, in their capacity to retrieve biological knowledge. The evaluation focuses on predicting gene regulatory relations (activation, inhibition, and phosphorylation) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway components. Results indicated a significant disparity in model performance. API-based models GPT-4 and Claude-Pro showed superior performance, with F1 scores of 0.4448 and 0.4386 for the gene regulatory relation prediction, and Jaccard similarity indices of 0.2778 and 0.2657 for the KEGG pathway prediction, respectively. Open-source models lagged behind their API-based counterparts; among them, Falcon-180b and llama2-7b had the highest F1 scores in gene regulatory relations, 0.2787 and 0.1923, respectively. For KEGG pathway recognition, the Jaccard similarity index was 0.2237 for Falcon-180b and 0.2207 for llama2-7b. Our study suggests that LLMs are informative in gene network analysis and pathway mapping, but their effectiveness varies, necessitating careful model selection. This work also provides a case study and insight into using LLMs as knowledge graphs. Our code is publicly available on GitHub (Muh-aza).
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11446478/pdf/
Citations: 0
Foundation models for bioinformatics
IF 0.6, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-07-24 DOI: 10.1002/qub2.69
Ziyu Chen, Lin Wei, Ge Gao
Abstract: Transformer‐based foundation models such as ChatGPT have revolutionized our daily life and affected many fields, including bioinformatics. In this perspective, we first discuss the direct application of textual foundation models to bioinformatics tasks, focusing on how to make the most of canonical large language models and mitigate their inherent flaws. We then review the transformer‐based, bioinformatics‐tailored foundation models for both sequence and non‐sequence data. In particular, we envision future development directions as well as challenges for bioinformatics foundation models.
Citations: 0
A penalized integrative deep neural network for variable selection among multiple omics datasets
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-06-07 DOI: 10.1002/qub2.51
Yang Li, Xiaonan Ren, Haochen Yu, Tao Sun, Shuangge Ma
Abstract: Deep learning has been increasingly popular in omics data analysis. Recent works incorporating variable selection into deep learning have greatly enhanced the model’s interpretability. However, because deep learning requires a large sample size, the existing methods may yield uncertain findings when the dataset has a small sample size, as is common in omics data analysis. With the explosion and availability of omics data from multiple populations/studies, the existing methods naively pool them into one dataset to enhance the sample size while ignoring that variable structures can differ across datasets, which might lead to inaccurate variable selection results. We propose a penalized integrative deep neural network (PIN) to simultaneously select important variables from multiple datasets. PIN directly aggregates multiple datasets as input and considers both homogeneity and heterogeneity situations among multiple datasets in an integrative analysis framework. Results from extensive simulation studies and applications of PIN to gene expression datasets from elders with different cognitive statuses or ovarian cancer patients at different stages demonstrate that PIN outperforms existing methods with considerably improved performance among multiple datasets. The source code is freely available on GitHub (rucliyang/PINFunc). We speculate that the proposed PIN method will promote the identification of disease‐related important variables based on multiple studies/datasets from diverse origins.
Citations: 0
Comprehensive cross cancer analyses reveal mutational signature cancer specificity
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-06-05 DOI: 10.1002/qub2.49
Rui Xin, Limin Jiang, Hui Yu, Fengyao Yan, Jijun Tang, Yan Guo
Abstract: Mutational signatures refer to distinct patterns of DNA mutations that occur in a specific context or under certain conditions. They are a powerful tool for describing cancer etiology. We conducted a study to show cancer heterogeneity and cancer specificity from the aspect of mutational signatures through collinearity analysis and machine learning techniques. Through thorough training and independent validation, our results show that while the majority of the mutational signatures are distinct, similarities between certain mutational signature pairs can be observed through both mutation patterns and mutational signature abundance. This observation can potentially help determine the etiology of yet elusive mutational signatures. Further analysis using machine learning approaches demonstrated moderate mutational signature cancer specificity. Among all cancer types, skin cancer demonstrated the strongest mutational signature specificity.
Citations: 0
SimHOEPI: A resampling simulator for generating single nucleotide polymorphism data with a high‐order epistasis model
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-04-16 DOI: 10.1002/qub2.42
Yahan Li, Xinrui Cai, J. Shang, Yuanyuan Zhang, Jinxing Liu
Abstract: Epistasis is a ubiquitous phenomenon in genetics, and is considered to be one of the main factors in current efforts to unveil missing heritability of complex diseases. Simulation data is crucial for evaluating epistasis detection tools in genome‐wide association studies (GWAS). Existing simulators normally suffer from two limitations: absence of support for high‐order epistasis models containing multiple single nucleotide polymorphisms (SNPs), and inability to generate simulation SNP data independently. In this study, we proposed a simulator, SimHOEPI, which is capable of calculating penetrance tables of high‐order epistasis models depending on either prevalence or heritability, and uses a resampling strategy to generate simulation data independently. Highlights of SimHOEPI are the preservation of realistic minor allele frequencies in sampling data, the accurate calculation and embedding of high‐order epistasis models, and acceptable simulation time. A series of experiments were carried out to verify these properties from different aspects. Experimental results show that SimHOEPI can generate simulation SNP data independently with high‐order epistasis models, implying that it might be an alternative simulator for GWAS.
Citations: 0
Functional predictability of universal gene circuits in diverse microbial hosts
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-04-14 DOI: 10.1002/qub2.41
Chenrui Qin, Tong Xu, Xuejin Zhao, Yeqing Zong, Haoqian M. Zhang, Chunbo Lou, Ouyang Qi, Long Qian
Abstract: Although the principles of synthetic biology were initially established in model bacteria, microbial producers, extremophiles and gut microbes have now emerged as valuable prokaryotic chassis for biological engineering. Extending the host range in which designed circuits can function reliably and predictably presents a major challenge for the concept of synthetic biology to materialize. In this work, we systematically characterized the cross‐species universality of two transcriptional regulatory modules (the T7 RNA polymerase activator module and the repressors module) in three non‐model microbes. We found striking linear relationships in circuit activities among different organisms for both modules. Parametrized model fitting revealed host non‐specific parameters defining the universality of both modules. Lastly, a genetic NOT gate and a band‐pass filter circuit were constructed from these modules and tested in non‐model organisms. Combined models employing host non‐specific parameters were successful in quantitatively predicting circuit behaviors, underscoring the potential of universal biological parts and predictive modeling in synthetic bioengineering.
Citations: 0
A clinical trial termination prediction model based on denoising autoencoder and deep survival regression
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-04-12 DOI: 10.1002/qub2.43
Huamei Qi, Wenhui Yang, Wenqin Zou, Yuxuan Hu
Abstract: Effective clinical trials are necessary for understanding medical advances, but early termination of trials can result in unnecessary waste of resources. Survival models can be used to predict survival probabilities in such trials. However, survival data from clinical trials are sparse, and DeepSurv cannot accurately capture their effective features, making the models weak in generalization and decreasing their prediction accuracy. In this paper, we propose a survival prediction model for clinical trial completion based on the combination of denoising autoencoder (DAE) and DeepSurv models. The DAE is used to obtain a robust representation of features by breaking the loop of raw features after autoencoder training, and then the robust features are provided to DeepSurv as input for training. The clinical trial dataset for training the model was obtained from the ClinicalTrials.gov dataset. A study of clinical trial completion in pregnant women was conducted in response to the fact that many current clinical trials exclude pregnant women. The experimental results showed that the denoising autoencoder and deep survival regression (DAE‐DSR) model was able to extract meaningful and robust features for survival analysis; the C‐index values on the training and test datasets were 0.74 and 0.75, respectively. Compared with the Cox proportional hazards model and the DeepSurv model, the survival analysis curves obtained with the DAE‐DSR model had more prominent features, and the model was more robust and performed better in actual prediction.
Citations: 0
A feature extraction framework for discovering pan‐cancer driver genes based on multi‐omics data
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-04-05 DOI: 10.1002/qub2.40
Xiaomeng Xue, Feng Li, J. Shang, Lingyun Dai, Daohui Ge, Qianqian Ren
Abstract: The identification of tumor driver genes facilitates accurate cancer diagnosis and treatment, playing a key role in precision oncology, along with gene signaling, regulation, and their interaction with protein complexes. To tackle the challenge of distinguishing driver genes from a large number of genomic data, we construct a feature extraction framework for discovering pan‐cancer driver genes based on multi‐omics data (mutations, gene expression, copy number variants, and DNA methylation) combined with protein–protein interaction (PPI) networks. Using a network propagation algorithm, we mine functional information among nodes in the PPI network, focusing on genes with weak node information to represent specific cancer information. From these functional features, we extract distribution features of pan‐cancer data, pan‐cancer TOPSIS features of functional features using the ideal solution method, and SetExpan features of pan‐cancer data from the gene functional features, a method to rank pan‐cancer data based on the average inverse rank. These features represent the common message of pan‐cancer. Finally, we use the LightGBM classification algorithm for gene prediction. Experimental results show that our method outperforms existing methods in terms of the area under the precision‐recall curve (AUPRC) and demonstrates better performance across different PPI networks. This indicates our framework’s effectiveness in predicting potential cancer genes, offering valuable insights for the diagnosis and treatment of tumors.
Citations: 0
GCARDTI: Drug–target interaction prediction based on a hybrid mechanism in drug SELFIES
IF 3.1, Zone 4 (Biology)
Quantitative Biology Pub Date : 2024-04-01 DOI: 10.1002/qub2.39
Yinfei Feng, Yuanyuan Zhang, Zengqian Deng, Mimi Xiong
Abstract: The prediction of the interaction between a drug and a target is the most critical issue in the fields of drug development and repurposing. However, there are still two challenges in current deep learning research: (i) the structural information of drug molecules is not fully explored in most drug target studies, and the previous drug SMILES does not correspond well to effective drug molecules and (ii) exploration of the potential relationship between drugs and targets is in need of improvement. In this work, we use a new and better representation of the effective molecular graph structure, SELFIES. We propose a hybrid mechanism framework based on convolutional neural network and graph attention network to capture multi‐view feature information of drug and target molecular structures, and we aim to enhance the ability to capture interaction sites between a drug and a target. In this study, our experiments using two different datasets show that the GCARDTI model outperforms a variety of different model algorithms on different metrics. We also demonstrate the accuracy of our model through two case studies.
Citations: 0