Computational and Structural Biotechnology Journal: Latest Articles

Substituted 1,4-naphthoquinones for potential anticancer therapeutics: In vitro cytotoxic effects and QSAR-guided design of new analogs.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-25 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.040
Veda Prachayasittikul, Prasit Mandi, Ratchanok Pingaew, Supaluk Prachayasittikul, Somsak Ruchirawat, Virapong Prachayasittikul
Abstract: 1,4-Naphthoquinone is a promising pharmacophore in drug discovery due to its unique redox-reactive nature and wide-ranging bioactivities. Herein, a series of 1,4-naphthoquinones (1-14) were investigated for their anticancer activities against four cancer cell lines (i.e., HepG2, HuCCA-1, A549, and MOLT-3). Compound 11 was found to be the most potent and selective anticancer agent against all tested cell lines (IC50 = 0.15-1.55 μM, selectivity index = 4.14-43.57). QSAR modelling was performed to elucidate key structural features influencing activities against the four cancer cell lines. Four QSAR models were successfully constructed using the multiple linear regression (MLR) algorithm, providing good predictive performance (R: training set = 0.8928-0.9664; testing set = 0.7824-0.9157; RMSE: training set = 0.1755-0.2600; testing set = 0.2726-0.3748). The QSAR models suggested that the potent anticancer activities of these naphthoquinones were mainly influenced by polarizability (MATS3p and BELp8), van der Waals volume (GATS5v, GATS6v, and Mor16v), mass (G1m), electronegativity (E1e), and dipole moment (Dipole and EEig15d), as well as ring complexity (RCI) and shape of the compound (SHP2). The models were further applied to guide the design and predict the activities of an additional set of 248 structurally modified compounds, in which those with promising predicted activities were highlighted for potential further development. Additionally, pharmacokinetic profiles and possible binding modes towards potential biological targets of the compounds were virtually assessed. Structure-activity relationship analysis was also conducted to highlight key structural features beneficial for further successful design of the related naphthoquinones.
Volume 27, pages 3492-3509. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12345971/pdf/
Citations: 0
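Note on the entry above: the abstract reports model quality as a correlation coefficient (R) and a root-mean-square error (RMSE) for training and testing sets. The following is a minimal sketch of how these two metrics are commonly computed, assuming R denotes the Pearson correlation between observed and predicted activities. The activity values are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical activity values (e.g. pIC50), for illustration only.
observed = [5.0, 5.5, 6.0, 6.5, 7.0]
predicted = [5.1, 5.4, 6.2, 6.4, 6.9]

print(f"R = {pearson_r(observed, predicted):.4f}")
print(f"RMSE = {rmse(observed, predicted):.4f}")
```

A higher R and a lower RMSE on the testing set than on the training set would be unusual; the ranges quoted in the abstract show the more typical pattern of slightly weaker testing-set performance.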
Decoding PDI diversity: Insights into structure, domains, and functionality in sorghum.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-25 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.035
Carla F López-Gómez, Marc T Morris, Karen Massel, Millicent Smith, Peter Crisp, Gerhard Schenk, Ian D Godwin
Abstract: Proteins play indispensable roles in cellular function, acting as both structural components and catalysts for essential biological processes. Their proper folding into three-dimensional structures is critical for functionality. To ensure correct folding, proteins interact with chaperones and folding catalysts such as Protein Disulfide Isomerases (PDIs), which assist in the formation and rearrangement of disulfide bonds that stabilize proteins by linking cysteine residues. PDIs are part of the thioredoxin (TRX) superfamily and are characterized by a conserved CXXC motif that contributes to their redox potential. They exhibit isomerase and oxidoreductase activities that enable them to rearrange and form new disulfide bonds. PDI family members in sorghum (SbPDI) present a broad and largely unexplored diversity in domain order, structure, and architecture between or even within species. To shed light on this diversity, we identified and characterized PDI family members in sorghum in silico to explore their domain architecture, three-dimensional structure and functionality.
Volume 27, pages 3328-3336. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12337123/pdf/
Citations: 0
Integrative machine learning approach for forecasting lung cancer chemosensitivity: From algorithm to cell line validation.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-24 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.043
Jinghong Chen, Yonglin Yi, Chunqian Yang, Haoxuan Ying, Jian Zhang, Anqi Lin, Ting Wei, Peng Luo
Abstract:
Background: Chemotherapy remains the primary treatment modality for patients with lung cancer; however, substantial inter-patient variability exists in responses to chemotherapeutic agents. Therefore, predicting individual responses is critical for optimizing treatment outcomes and improving patient prognosis.
Methods: This study developed a model to predict chemotherapy response in lung cancer patients by integrating multi-omics and clinical data from the Genomics of Drug Sensitivity in Cancer database, employing 45 machine learning algorithms. Data from the Gene Expression Omnibus database were utilized to validate the model. The impact of key genes on chemotherapy response was assessed in cell lines.
Results: A model combining random forest and support vector machine algorithms exhibited superior performance in both the training and validation sets. Furthermore, patients in the sensitive group demonstrated longer overall survival compared to those in the resistant group. TMED4 and DYNLRB1 genes were identified as pivotal features in the model and exhibited higher expression levels in the chemotherapy-resistant group. siRNA-mediated knockdown of gene expression enhanced the chemosensitivity of lung cancer cell lines to chemotherapeutic agents.
Conclusions: This study successfully developed a high-performance machine learning model for predicting chemotherapy response in lung cancer and elucidated a strong correlation between TMED4 and DYNLRB1 gene expression and chemotherapy resistance. We further provide a user-friendly web server (available at https://smuonco.shinyapps.io/LC-DrugPortal/) to enable clinical utilization of our model, promoting personalized chemotherapy selection for lung cancer patients.
Volume 27, pages 3307-3318. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12329548/pdf/
Citations: 0
Machine learning identification of molecular targets for medulloblastoma subgroups using microarray gene fingerprint analysis.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-24 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.033
Alicia Reveles-Espinoza, Ulises Villela, Edgar Hernandez-Martinez, Isaac Chairez, Sergio Juárez-Méndez, J Casanova-Moreno, Ma Del Pilar Eguía-Aguilar, Luis Figueroa-Yáñez, Adriana Vallejo-Cardona, Iván Salgado
Abstract: The study introduces a structured methodology for the identification of molecular targets that accurately classify medulloblastoma subgroups: WNT, SHH, Group 3 (G3) and Group 4 (G4). An artificial neural network (ANN) model trained on microarray gene expression data determined minimal gene combinations for each subgroup. The classification achieved an average accuracy of 96%, demonstrating the effectiveness of the proposed approach. Feature selection using the Kruskal-Wallis and χ² tests revealed statistically relevant genes contributing to subgroup discrimination. Reverse transcription followed by digital Polymerase Chain Reaction (dPCR) measured the expression levels of a subset of these genes in tumor samples, validating the computational predictions with experimental evidence. The integration of machine learning and molecular quantification provides a reproducible framework for medulloblastoma subgroup classification supported by both statistical and experimental consistency.
Volume 27, pages 3481-3491. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12345876/pdf/
Citations: 0
Developing foundations for biomedical knowledgebases from literature using large language models - A systematic assessment.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-24 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.042
Chen Miao, Zhenghao Zhang, Jiamin Chen, Daniel Rebibo, Haoran Wu, Sin-Hang Fung, Alfred Sze-Lok Cheng, Stephen Kwok-Wing Tsui, Sanju Sinha, Qin Cao, Kevin Y Yip
Abstract: While large language models (LLMs) have shown promising capabilities in biomedical applications, measuring their reliability in knowledge extraction remains a challenge. We developed a benchmark to compare LLMs in 11 literature knowledge extraction tasks that are foundational to automatic knowledgebase development, with or without task-specific examples supplied. We found large variation across the LLMs' performance, depending on the level of technical specialization, difficulty of tasks, scattering of original information, and format and terminology standardization requirements. We also found that asking the LLMs to provide the source text behind their answers is useful for overcoming some key challenges, but that specifying this requirement in the prompt is difficult.
Volume 27, pages 3299-3306. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12329539/pdf/
Citations: 0
Stoichiometric insights into SARS-CoV-2 spike-ACE2 binding across variants.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-24 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.034
Ishola Abeeb Akinwumi, Sneha Bheemireddy, Laurent Chaloin, Serge Perez, Hamed Khakzad, Bernard Maigret, Yasaman Karami
Abstract: The SARS-CoV-2 spike protein binds to the angiotensin-converting enzyme 2 (ACE2) receptor to mediate viral entry, with mutations in different variants influencing binding affinity and conformational dynamics. Using large-scale molecular dynamics simulations, we analyzed the Spike-ACE2 complex in the wild-type (WT), Beta, and Delta variants. Our findings reveal significant conformational rearrangements at the interface in Beta and Delta compared to WT, leading to distinct interaction networks and changes in complex stability. Binding free energy analysis further highlights variant-specific differences in ACE2 affinity, with alternative binding modes emerging over the simulation. The results enhance our understanding of spike-ACE2 stoichiometry across variants, providing implications for viral infectivity and therapeutic targeting.
Volume 27, pages 3285-3291. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12329073/pdf/
Citations: 0
Deciphering the proteome of Escherichia coli K-12: Integrating transcriptomics and machine learning to annotate hypothetical proteins.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-24 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.036
Sagarika Chakraborty, Zachary Ardern, Habibu Aliyu, Anne-Kristin Kaster
Abstract: Omics technologies have led to the discovery of a vast number of proteins that are expressed but have no functional annotation - so-called hypothetical proteins (HPs). Even in the best-studied model organism Escherichia coli K-12, over 2% of the proteome remains uncharacterized. This knowledge gap becomes even worse when looking at microbial dark matter. However, knowing the functions of proteins is crucial for elucidating cellular and metabolic processes and harnessing biotechnological potentials. Here, we employed machine learning to decipher the transcriptional regulatory network of E. coli K-12, as well as other in silico tools to assign functions to uncharacterized HPs. We further provide experimental validation of in silico predicted functions for three HP-encoding genes (yhdN, yeaC and ydgH) as proof of concept, by analyzing growth patterns of deletion mutants compared to the wild type, as well as their transcriptional responses to specific conditions. This study demonstrates that the use of Big Omics Data in combination with Artificial Intelligence and experimental controls is a powerful approach to illuminate functional dark matter.
Volume 27, pages 3565-3578. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12356324/pdf/
Citations: 0
AttnW2V-Enhancer: Leveraging attention and Word2Vec for enhanced enhancer prediction.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-23 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.008
Mobeen Ur Rehman, Zeeshan Abbas, Farman Ullah, Irfan Hussain
Abstract: Accurate identification of enhancer regions in DNA sequences is essential for understanding gene regulation and its role in diverse biological processes. Enhancers are regulatory elements that influence gene expression, but their detection remains challenging due to the complexity and variability of genomic sequences. In this study, we propose AttnW2V-Enhancer, a novel model that combines Word2Vec-based sequence encoding, convolutional neural networks (CNN), and attention mechanisms to address this challenge. By leveraging Word2Vec embeddings, our model captures biologically meaningful patterns and offers a more efficient and interpretable representation than traditional methods such as one-hot encoding and physicochemical descriptors. We evaluate AttnW2V-Enhancer on an independent test set, where it achieves superior performance with an accuracy of 81.75%, sensitivity of 83.50%, specificity of 80.00%, and a Matthews Correlation Coefficient (MCC) of 0.635, outperforming existing models. Additionally, we demonstrate the effectiveness of the attention mechanism in enhancing feature learning by dynamically focusing on the most relevant sequence regions. These results confirm that integrating Word2Vec encoding with CNNs and attention mechanisms provides a powerful and interpretable framework for enhancer prediction, offering valuable insights into the identification of regulatory sequences. The source code and implementation are publicly available at: https://github.com/Rehman1995/AttnW2V-Enhancer.
Volume 27, pages 3275-3284. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12329123/pdf/
Citations: 0
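Note on the entry above: the reported accuracy, sensitivity, specificity and MCC are linked through the confusion matrix. The sketch below shows the standard MCC formula and checks the reported figures for mutual consistency. The test-set size is an assumption made here for illustration (200 enhancers and 200 non-enhancers); the abstract does not state it.

```python
import math


def confusion_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Recover (TP, FP, TN, FN) counts from rates and class sizes."""
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    return tp, n_neg - tn, tn, n_pos - tp


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# ASSUMPTION: a balanced test set of 200 positives and 200 negatives.
tp, fp, tn, fn = confusion_from_rates(0.835, 0.80, n_pos=200, n_neg=200)
accuracy = (tp + tn) / (tp + fp + tn + fn)

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"accuracy = {accuracy:.4f}")
print(f"MCC = {mcc(tp, fp, tn, fn):.3f}")
```

Under that assumption the counts are TP = 167, FP = 40, TN = 160 and FN = 33, which gives an accuracy of 0.8175 and an MCC that rounds to 0.635, in agreement with the values quoted in the abstract.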
PCLDA: An interpretable cell annotation tool for single-cell RNA-sequencing data based on simple statistical methods.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-23 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.019
Kailun Bai, Belaid Moa, Xiaojian Shao, Xuekui Zhang
Abstract: Single-cell RNA sequencing (scRNA-seq) enables high-resolution analysis of cellular heterogeneity, yet accurate and consistent cell-type annotation remains a crucial challenge. Numerous automated tools exist, but their complex modeling assumptions can hinder reliability across varied datasets and protocols. We propose PCLDA, a pipeline composed of three modules: t-test-based gene screening, principal component analysis (PCA) and linear discriminant analysis (LDA), all built on simple statistical methods. An ablation study shows that each module in PCLDA contributes significantly to performance and robustness, with two novel enhancements in the second module yielding substantial gains. Despite these additions, the model retains its original assumptions, computational efficiency, and interpretability. Benchmarking against nine state-of-the-art methods across 22 public scRNA-seq datasets and 35 distinct evaluation scenarios, PCLDA consistently achieves top-tier accuracy under both intra-dataset (cross-validation) and inter-dataset (cross-platform) conditions. Notably, when reference and query data are generated via different protocols, PCLDA remains stable and often outperforms more complex machine-learning approaches. Furthermore, PCLDA offers strong interpretability, attributed to the linear nature of its PCA and LDA modules. The final decision boundaries are linear combinations of the original gene expression values, directly reflecting the contribution of each gene to the classification. Top-weighted genes identified by PCLDA better capture biologically meaningful signals in enrichment analyses than those selected via marginal screening alone, offering deeper functional insights into cell-type specificity. In conclusion, our work underscores the utility of carefully enhanced simple statistical methods for single-cell annotation. PCLDA's simplicity, interpretability, and consistently high performance make it a practical, reliable alternative to more complex annotation pipelines. Code is available on GitHub: https://github.com/kellen8hao/PCLDA.
Volume 27, pages 3264-3274. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12329077/pdf/
Citations: 0
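Note on the entry above: the first PCLDA module is described as t-test-based gene screening. The sketch below illustrates one plausible form of such a screen and is not the authors' implementation: it assumes a one-vs-rest Welch t-statistic per gene and keeps the top-ranked genes for each cell type. The function names, the `top_k` parameter, and the toy expression values are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se else 0.0


def screen_genes(expr, labels, top_k):
    """Keep the top_k genes per cell type, ranked by |t| (one-vs-rest).

    expr:   dict mapping gene name -> list of expression values, one per cell
    labels: list of cell-type labels, one per cell
    """
    selected = set()
    for cell_type in sorted(set(labels)):
        scores = {}
        for gene, values in expr.items():
            inside = [v for v, lab in zip(values, labels) if lab == cell_type]
            outside = [v for v, lab in zip(values, labels) if lab != cell_type]
            scores[gene] = abs(welch_t(inside, outside))
        ranked = sorted(scores, key=scores.get, reverse=True)
        selected.update(ranked[:top_k])
    return sorted(selected)


# Toy data (hypothetical): two marker-like genes and one uninformative gene.
expr = {
    "CD3E": [5.0, 5.2, 4.9, 0.1, 0.2, 0.0],
    "MS4A1": [0.1, 0.0, 0.2, 4.8, 5.1, 5.0],
    "ACTB": [3.0, 3.1, 2.9, 3.0, 3.1, 2.9],
}
labels = ["T", "T", "T", "B", "B", "B"]

print(screen_genes(expr, labels, top_k=2))
```

In a full pipeline of the kind the abstract describes, the screened genes would then be passed to PCA and LDA; those steps are omitted here to keep the sketch dependency-free.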
Improving prediction accuracy in chimeric proteins with windowed multiple sequence alignment.
IF 4.1, Zone 2 (Biology)
Computational and structural biotechnology journal Pub Date : 2025-07-23 eCollection Date: 2025-01-01 DOI: 10.1016/j.csbj.2025.07.039
Sanketh Vedula, Alex M Bronstein, Ailie Marx
Abstract: A key step in protein structure prediction involves the detection of co-evolving pairs of residues, a signal for spatial proximity. This information is gleaned from multiple sequence alignment and underscores AlphaFold's structure prediction for almost every known protein. A simple means of creating proteins beyond those found in nature is to fuse together two known proteins or protein parts. Here we demonstrate that structured peptides are predicted with significantly reduced accuracy when added to the terminal ends of scaffold proteins. Appending the multiple sequence alignment for the individual peptide tags to that of the scaffold protein often restores prediction accuracy. This work suggests that this windowed multiple sequence alignment approach can be a useful tool for predicting the structure of fused, chimeric proteins.
Volume 27, pages 3292-3298. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12328686/pdf/
Citations: 0
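Note on the entry above: the abstract describes appending the alignment of a peptide tag to the alignment of the scaffold protein. The sketch below shows one way such a combined alignment could be assembled, assuming a C-terminal tag and gap-padding of each part's homologs over the region they do not cover. This is an interpretation for illustration, not the authors' code; the function name and toy sequences are hypothetical.

```python
def windowed_msa(scaffold_msa, tag_msa, gap="-"):
    """Combine per-part alignments for a scaffold + C-terminal tag chimera.

    Each input is a list of equal-length aligned sequences whose first row
    is the query sequence for that part. Homologs of one part are padded
    with gaps across the other part, so every output row has full length.
    """
    scaffold_len = len(scaffold_msa[0])
    tag_len = len(tag_msa[0])

    rows = [scaffold_msa[0] + tag_msa[0]]  # full chimeric query
    rows += [seq + gap * tag_len for seq in scaffold_msa[1:]]
    rows += [gap * scaffold_len + seq for seq in tag_msa[1:]]
    return rows


# Toy alignments (hypothetical sequences).
scaffold = ["MKT", "MRT", "M-T"]
tag = ["GS", "GA"]

for row in windowed_msa(scaffold, tag):
    print(row)
```

The padding keeps the co-evolution signal for each part confined to its own window of columns, which matches the intuition in the abstract that each fused part retains the alignment depth it would have on its own.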