{"title":"<i>ColorI-DT</i>: An open-source tool for the quantitative evaluation of differences in microscopy color images.","authors":"Filippo Piccinini, Michele Tritto, Jae-Chul Pyun, Misu Lee, Bongseop Kwak, Bosung Ku, Nicola Normanno, Gastone Castellani","doi":"10.1016/j.csbj.2025.06.019","DOIUrl":"https://doi.org/10.1016/j.csbj.2025.06.019","url":null,"abstract":"<p><p>In several fields, quantitatively comparing color images is crucial. For instance, this is important in histopathology, where different microscopes/cameras are typically used to visualize patient samples, causing significant color variation. No ground-truth metric exists for estimating differences between pairs of color images. A range of possible solutions is available, but no existing open-source tool allows clinicians and researchers to apply these metrics to microscopy images through intuitive, easy-to-use software. In this work, we developed <i>Color Image Difference Tool</i> (<i>ColorI-DT</i>), an open-source tool for measuring quantitative differences between color images of the same subject acquired under different settings. Thanks to a user-friendly graphical user interface, it allows the selection of a pair of color images and a metric from a list of available options, and produces an output 2D pixel-wise color difference matrix between corresponding pixels in the input images. The metrics currently implemented are: (<i>1</i>) Euclidean <math><mrow><mi>Δ</mi> <mi>E</mi></mrow> </math> ; (<i>2</i>) International Commission on Illumination (CIE) 76 (Luv); (<i>3</i>) CIE76 (Lab); (<i>4</i>) CIE94; (<i>5</i>) CIE00; (<i>6</i>) Colour Measurement Committee (CMC). To demonstrate how to use the tool, microscopy images with a predominant color in the red, green, or blue channel were used. In particular, we checked which of the 6 metrics displays the most predictable and linear behavior in the case of controlled primary color alterations. 
For more pronounced color adjustments, a qualitative comparison would likely be sufficient for analyzing color differences, as a quantitative tool may become unreliable due to the inherent limitations of the implemented metrics.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2526-2536"},"PeriodicalIF":4.4,"publicationDate":"2025-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12197881/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144505007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications of machine learning-assisted extracellular vesicles analysis technology in tumor diagnosis.","authors":"Liang Xu, Jing Li, Wei Gong","doi":"10.1016/j.csbj.2025.06.014","DOIUrl":"10.1016/j.csbj.2025.06.014","url":null,"abstract":"<p><p>Precision medicine for tumors represents a pivotal focus in contemporary medical research. Nonetheless, the diversity of tumor types and the complexity of their pathogenesis present significant challenges in the diagnostic process. Extracellular vesicles (EVs), as a category of nanoparticles, carry a wealth of biological information and play a crucial role in tumor initiation and progression, thereby offering novel approaches for early tumor diagnosis. In recent years, machine learning (ML) technology has gained momentum in the medical field. It uses various algorithms to analyze input data, identify potential patterns and trends, develop predictive models, and generate high-precision predictions for unknown data, demonstrating its clinical potential in disease diagnosis. This review provides a comprehensive summary of advancements in ML-based EV analysis technology for auxiliary tumor diagnosis, including early diagnosis, classification, stage recognition, and molecular diagnosis, and discusses their advantages in clinical applications. 
Additionally, the article anticipates future development trends in the field, aiming to serve as a reference for researchers engaged in ML-assisted liquid biopsy for tumor diagnosis.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2460-2472"},"PeriodicalIF":4.4,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12180947/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144474176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new approach combining a whole-slide foundation model and gradient boosting for predicting BRAF mutation status in dermatopathology.","authors":"Mohamed Albahri, Daniel Sauter, Felix Nensa, Georg Lodde, Elisabeth Livingstone, Dirk Schadendorf, Markus Kukuk","doi":"10.1016/j.csbj.2025.06.017","DOIUrl":"10.1016/j.csbj.2025.06.017","url":null,"abstract":"<p><p>Determining the mutation status of proto-oncogene B-Rapidly Accelerated Fibrosarcoma (BRAF) is crucial in melanoma for guiding targeted therapies and improving patient outcomes. While genetic testing has become more accessible, histopathological examination remains central to routine diagnostics, and an image-based strategy could further streamline the associated time and cost. In this study, we propose a new machine learning framework that integrates a large-scale, pretrained foundation model (Prov-GigaPath) with a gradient-boosting classifier (XGBoost) to predict BRAF-V600 mutation status directly from histopathological slides. Our approach was trained and cross-validated on the Skin Cutaneous Melanoma (SKCM) dataset from The Cancer Genome Atlas (TCGA; 275 slides), where the fine-tuned Prov-GigaPath model alone achieved an average Area Under the Curve (AUC) of 0.653 during cross-validation. An additional test on 68 slides from the University Hospital Essen (UHE), Germany, yielded an AUC of 0.697 (95 % CI: 0.553-0.821). Incorporating XGBoost significantly improved performance, reaching an AUC of 0.824 (SD=0.043) during cross-validation and 0.772 (95 % CI: 0.650-0.886) on the independent set, representing a new state of the art for image-only BRAF mutation prediction in melanoma. By employing a weakly supervised, data-efficient pipeline, this method reduces the need for extensive annotations and costly molecular assays. 
While these results are not intended to replace genetic testing at this stage, they mark a new milestone in predicting BRAF mutation status solely from histopathological slides, a concept not yet fully established in prior research, and underscore the potential for seamlessly integrating automated, AI-driven decision-support tools into diagnostic workflows, thereby expediting personalized therapy decisions and advancing precision oncology.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2503-2514"},"PeriodicalIF":4.4,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12182775/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144474175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational vaccine development against protozoa.","authors":"Omar Hashim, Isabelle Dimier-Poisson","doi":"10.1016/j.csbj.2025.06.011","DOIUrl":"10.1016/j.csbj.2025.06.011","url":null,"abstract":"<p><p>Protozoan parasites remain a major global health and economic burden, particularly in low- and middle-income countries. Conventional strategies such as chemotherapy and vector control face growing limitations due to resistance, toxicity, and implementation challenges. Vaccination represents a sustainable solution, but the complexity of protozoan life cycles and antigenic diversity has hindered vaccine development. Computational vaccinology offers innovative tools to overcome these barriers, combining immuno-informatics, reverse vaccinology, and artificial intelligence to accelerate the identification of immunogenic epitopes and streamline vaccine design. This review explores the current landscape of computational vaccine development against protozoa, highlighting advances in epitope prediction, population-specific vaccine design, and digital twin technologies. Applications include multivalent vaccines targeting conserved antigens across species, personalized formulations based on host immunogenetics, and the emerging use of protozoan vectors in cancer immunotherapy. Despite these promising avenues, significant challenges remain, particularly the need for robust experimental validation, improved delivery systems for short peptides, and greater acceptance of in silico methods by the broader scientific community. We argue that integrating computational tools with experimental immunology, high-throughput genomics, and translational research will be the key to developing safe, effective, and broadly accessible vaccines against protozoan infections. 
This convergence of disciplines has the potential to not only address neglected tropical diseases but also to establish new paradigms in precision vaccinology and immunotherapy.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2386-2393"},"PeriodicalIF":4.4,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12172979/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144316014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating and evaluating potential antigen binding sites for monoclonal anti-HER2 antibodies: The LightDock approach.","authors":"Emil Stefańczyk, Agata Mitura, Marta Utratna, Magdalena Staniszewska","doi":"10.1016/j.csbj.2025.06.001","DOIUrl":"10.1016/j.csbj.2025.06.001","url":null,"abstract":"<p><p>Monoclonal antibodies targeting HER2, a receptor overexpressed in certain cancer cells, have greatly improved the treatment of HER2-positive cancers. In addition, anti-HER2 antibodies play a critical role in diagnostic applications, enabling accurate detection of HER2 expression levels. Advancing antibody-based therapies and diagnostic tools requires a thorough understanding of binding interactions, which remains challenging because of the complex structure and flexibility of antibodies, particularly within their complementarity-determining regions. In this study, we used LightDock, a molecular docking tool that simulates protein-protein interactions and can incorporate flexibility, allowing the <i>in silico</i> analysis of flexible proteins such as antibodies. Using LightDock, we investigated interaction sites between anti-HER2 antibodies recently developed by our group and their specific antigen, the HER2 protein. Despite the high variability in the obtained results, a statistics-based approach identified two recurring HER2 regions as potential binding sites and functionally relevant areas in receptor biology. This variability in predicted docking interfaces reflects the inherent complexity of antibody-antigen interactions. This structure-based docking approach provides a cost-effective method to analyze antibody-protein interactions and offers preliminary insight into possible epitopes targeted by the novel anti-HER2 antibodies. However, our data indicate that, at this stage, further validation using experimental techniques would help refine and increase the accuracy of the results obtained <i>in silico</i>. 
This report highlights the value of computational docking in antibody-protein interaction studies, demonstrating significant potential with present and upcoming advancements in computer-based approaches.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2515-2525"},"PeriodicalIF":4.4,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12182776/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144474153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling the therapeutic dynamics of acupuncture and moxibustion: a systems biology approach to treatment optimization.","authors":"Quan Gan, Qi-Wei Ge, Chuanxia Liu, Zhaoman Zhong, Jiaying Wu, Lei Shi, Jin Xu, Chen Li","doi":"10.1016/j.csbj.2025.05.053","DOIUrl":"10.1016/j.csbj.2025.05.053","url":null,"abstract":"<p><p>A key obstacle in advancing acupuncture and moxibustion treatment (AMT) lies in the absence of effective methodologies capable of modeling the body's dynamic physiological changes and predicting treatment outcomes with quantitative precision. Colored Petri nets (CPNs), which have shown significant utility in simulating complex biological systems, offer a promising foundation for modeling AMT due to their capacity to represent hierarchical structures and dynamic behaviors. However, current modeling approaches struggle to address the inherent concurrency and complexity characteristic of AMT processes. To address this, we propose a novel token-guided transition control based on CPNs theory, enabling precise and efficient simulation of AMT systems. Furthermore, we develop a multicriteria evaluation method to quantitatively assess and compare the therapeutic efficacy of various AMT protocols, providing a structured approach for evidence-based decision-making. We validate our proposed model through simulation studies based on clinical cases of Meniere's disease. The simulation results closely align with actual clinical data, supporting the model's reliability and applicability. Finally, randomized simulation experiments have led to the identification of three new AMT strategies with promising therapeutic potential, highlighting the model's capacity to support treatment optimization and clinical innovation. This study introduces a comprehensive framework for dynamic modeling, visual representation, and quantitative evaluation of AMT systems. 
By offering a systematic and predictive approach to AMT analysis, the proposed method not only enhances understanding of treatment mechanisms but also contributes to the standardization of clinical practice.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2434-2442"},"PeriodicalIF":4.4,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12174566/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144324617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-resolution accurate mass- mass spectrometry based- untargeted metabolomics: Reproducibility and detection power across data-dependent acquisition, data-independent acquisition, and AcquireX.","authors":"Hanane El Boudlali, Laura Lehmicke, Uta Ceglarek","doi":"10.1016/j.csbj.2025.05.046","DOIUrl":"10.1016/j.csbj.2025.05.046","url":null,"abstract":"<p><p>Untargeted metabolomics aims at unbiased metabolic profiling and biomarker discovery but requires methods with high sensitivity and reproducibility. Here, we compare three acquisition modes, namely Data-Dependent Acquisition (DDA), Data-Independent Acquisition (DIA), and AcquireX, to evaluate performance and reproducibility in detecting low-abundance metabolites in a complex matrix. A system suitability test (SST) based on 14 eicosanoid standards was implemented to evaluate the suitability of our instrumental setup prior to conducting untargeted metabolomics analyses and to monitor long-term system performance. Bovine liver total lipid extract (TLE) was spiked with decreasing levels (10-0.01 ng/mL) of the eicosanoid standard mix (StdMix) to compare the detection power of each mode. Reproducibility was evaluated over three independent measurements, spaced one week apart. Chromatographic separation was performed on a C18-Kinetex Core-Shell column and HRAM-MS/MS data were acquired using an Orbitrap Exploris 480. DIA detected and identified the highest number of metabolic features (averaging 1036 over three measurements), followed by DDA (18 % fewer) and AcquireX (37 % fewer). Moreover, DIA demonstrated superior reproducibility, with a coefficient of variation of 10 % across detected compounds over three measurements, compared to 17 % for DDA and 15 % for AcquireX. DIA further exhibited better compound identification consistency, with 61 % overlap between two days, compared to DDA (43 %) and AcquireX (50 %). 
DIA reproduced fragmentation spectra patterns with high consistency, contributing to higher reproducibility in compound identification. DIA showed the best detection power for all spiked eicosanoids at 10 and 1 ng/mL in the TLE matrix. At low spiking levels, 0.1 and 0.01 ng/mL, a general cut-off was observed for the three acquisition modes. None of the assessed acquisition modes was able to detect and/or identify eicosanoids at physiologically relevant concentrations, explaining their frequent omission in routine untargeted analyses.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2412-2423"},"PeriodicalIF":4.4,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12173630/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144316015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Random splicing assisted deep learning for breast cancer cell line classification via Raman spectroscopy.","authors":"Yiheng Liu, Junfeng Liu, Jiayi Wan, Hongke Hao, Guangxing Liu, Xia Huang","doi":"10.1016/j.csbj.2025.05.051","DOIUrl":"10.1016/j.csbj.2025.05.051","url":null,"abstract":"<p><p>Raman spectroscopy extracts rich biochemical information on a single cell, demonstrating significant potential for precise cancer identification. While machine learning enhances spectral analysis efficiency, conventional models remain constrained by data volume. Here, we developed Random Splicing-Convolutional Neural Network (RS-CNN), a deep learning framework that addresses data scarcity through spectral concatenation. By randomly splicing Raman spectra from the same cell line, RS-CNN enhances distinctive spectral features while simultaneously expanding dataset size and improving signal quality. Validation across six breast cancer cell lines demonstrated RS-CNN's superiority over five benchmark models (SVM, LDA, PCA-SVM, PCA-LDA, CNN). With 450 spectra per cell line, RS-CNN achieved 98.63 % classification accuracy compared to conventional models' accuracies of around 85 %. Under data-limited conditions (100 spectra/line), RS-CNN maintained 91.47 % accuracy, outperforming CNN's 70.83 %. The RS-CNN's generalizability was further validated by an independently acquired dataset, achieving at least 94 % classification accuracy. SHAP analysis suggested the spectral region around 980 cm⁻¹ was significant for cancer diagnosis, while the 1158-1160 cm⁻¹ and 1603-1607 cm⁻¹ regions were particularly valuable for distinguishing between cancer subtypes. 
These findings establish RS-CNN as a robust analytical model for clinical Raman diagnostics, particularly valuable in applications requiring high accuracy with limited samples.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2288-2297"},"PeriodicalIF":4.4,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12162052/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144282755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comprehensive assessment of AlphaFold's predictions of secondary structure and solvent accessibility at the amino acid-level in eukaryotic, bacterial and archaeal proteins.","authors":"Jing Yu, Bi Zhao, Lukasz Kurgan","doi":"10.1016/j.csbj.2025.05.047","DOIUrl":"10.1016/j.csbj.2025.05.047","url":null,"abstract":"<p><p>Numerous sequence-based predictors of the amino acid (AA)-level solvent accessibility (SA) and secondary structure (SS) of proteins have been developed. We empirically investigated whether these two key characteristics of AA-level structure can be accurately predicted from putative structures generated by the popular AlphaFold2. We compared AlphaFold2's results against several representative SS and SA predictors on a large test dataset that covers five distinct taxonomic groups (animals, plants, fungi, bacteria, and archaea). We used a broad collection of metrics that evaluate predictions of the numeric and binary (buried vs. solvent exposed) SA and the 3-state SS at both AA- and SS-region levels. We found that AlphaFold2 generated very accurate results, with high average Q<sub>3</sub> accuracy of 0.928 for the SS prediction and high Pearson Correlation Coefficient (PCC) of 0.815 between its putative and native SA values. AlphaFold2 significantly and consistently outperforms the considered predictors of SA and SS across the five taxonomic groups and both AA and region level evaluations. Moreover, we demonstrated that AlphaFold2 nearly perfectly reconstructs distributions of the sizes and numbers of the SS regions. We also showed that AlphaFold2 substantially improves over the SS and SA predictors when tested on a low sequence similarity test dataset, although its results and results of two other predictors suffer a modest drop in the quality of predicting SS regions. 
Altogether, our results suggest that AlphaFold2 makes very accurate predictions of SS and SA, which can be easily extracted from 200+ million pre-computed AF2's structure predictions in AlphaFoldDB.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2443-2449"},"PeriodicalIF":4.4,"publicationDate":"2025-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12173809/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144324616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ToxiPep: Peptide toxicity prediction via fusion of context-aware representation and atomic-level graph.","authors":"Jiahui Guan, Peilin Xie, Dian Meng, Lantian Yao, Dan Yu, Ying-Chih Chiang, Tzong-Yi Lee, Junwen Wang","doi":"10.1016/j.csbj.2025.05.039","DOIUrl":"10.1016/j.csbj.2025.05.039","url":null,"abstract":"<p><p>Peptide-based therapeutics have emerged as a promising avenue in drug development, offering high biocompatibility, specificity, and efficacy. However, the potential toxicity of peptides remains a significant challenge, necessitating the development of robust toxicity prediction methods. In this study, we introduce ToxiPep, a novel dual-model framework for peptide toxicity prediction that integrates sequence-based contextual information with atomic-level structural features. This framework combines BiGRU and Transformer to capture local and global sequence dependencies while leveraging multi-scale CNNs to extract refined structural features from molecular graphs derived from peptide SMILES representations. A cross-attention mechanism aligns and fuses these two feature modalities, enabling the model to capture intricate relationships between sequence and structural information. ToxiPep outperforms several state-of-the-art tools, including ToxinPred2, CSM-Toxin, PepNet, and ToxinPred3, on both internal and independent test sets. Additionally, interpretability analyses reveal that ToxiPep identifies key amino acids along with their structural features, providing insights into the molecular mechanisms of peptide toxicity. To facilitate broader accessibility, we have also developed a web server for convenient user access. 
Overall, this framework has the potential to accelerate the identification of safer therapeutic peptides, offering new opportunities for peptide-based drug development in precision medicine.</p>","PeriodicalId":10715,"journal":{"name":"Computational and structural biotechnology journal","volume":"27 ","pages":"2347-2358"},"PeriodicalIF":4.4,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12171765/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144316016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}