Jose María Belmonte, Miguel Blanquer, Gregorio Bernabé, Fernando Jiménez, José Manuel García
{"title":"Survival risk prediction in hematopoietic stem cell transplantation for multiple myeloma.","authors":"Jose María Belmonte, Miguel Blanquer, Gregorio Bernabé, Fernando Jiménez, José Manuel García","doi":"10.1515/jib-2024-0053","DOIUrl":"https://doi.org/10.1515/jib-2024-0053","url":null,"abstract":"<p><p>This paper investigates the application of <i>Survival Analysis</i> (SA) techniques to forecast outcomes after <i>autologous Hematopoietic Stem Cell Transplantation</i> (aHSCT) for <i>Multiple Myeloma</i> (MM). By leveraging six SA models, we examine their predictive capabilities, measured through the <i>Concordance Index</i> (C-index) metric. Beyond evaluating model performance, we analyze feature importance using permutation and SHAP methods, highlighting key clinical factors such as treatment history, disease stage, and prior disease progression or relapse as critical predictors of survival. The findings suggest that while all models performed well based on the C-index, a detailed examination revealed variations in how each model processed data. Specifically, the Coxnet and Random Survival Forest models exhibited a more thorough use of clinical variables, whereas the gradient boosting models appeared to rely on a narrower range of features, potentially limiting their ability to differentiate between patients with comparable profiles. Risk predictions categorized patients into low, moderate, and high-risk levels. For lower-risk patients, the procedure showed positive outcomes, while higher-risk individuals were predicted to have limited survival benefits, so alternative treatments are recommended for them. 
Lastly, we propose future research to expand these models into time-to-event estimations, offering additional support for decision-making by predicting patient life expectancy post-transplant, considering their pre-transplant clinical attributes.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vladislav V Shilenok, Irina V Shilenok, Vladislav O Soldatov, Yuriy L Orlov, Ksenia A Kobzeva, Alexey V Deykin, Olga Yu Bushueva
{"title":"Bioinformatic analysis of the regulatory potential of tagging SNPs provides evidence of the involvement of genes encoding the heat-resistant obscure (Hero) proteins in the pathogenesis of cardiovascular diseases.","authors":"Vladislav V Shilenok, Irina V Shilenok, Vladislav O Soldatov, Yuriy L Orlov, Ksenia A Kobzeva, Alexey V Deykin, Olga Yu Bushueva","doi":"10.1515/jib-2024-0043","DOIUrl":"10.1515/jib-2024-0043","url":null,"abstract":"<p><p>Although multiple aspects of molecular pathology underlying cardiovascular diseases (CVDs) have been revealed, the complete picture has yet to be elucidated. In this respect, annotation of the novel links between genes and atherosclerosis is of great importance for cardiovascular medicine. In line with our previous research, we aimed to analyze the contribution to cardiovascular predisposition of the genes encoding Hero-proteins, polypeptides with chaperone activity. The following bioinformatic sources were used to annotate data regarding the cardiovascular contribution of Hero-proteins and their genes: SNPinfo Web Server, The Cardiovascular Disease Knowledge Portal, GTEx Portal, HaploReg, rSNPBase, RegulomeDB, atSNP, Gene Ontology, QTLbase, and the Blood eQTL browser. Almost all analyzed genes were characterized by a very high regulatory potential of tag SNPs (except <i>BEX3</i>). Multiple substantial impacts of the analyzed SNPs on histone modifications, eQTL effects on CVD-related genes, and binding to transcription factors involved in biological processes pathogenetically significant for CVDs have been discovered. 
Here we provide <i>in silico</i> evidence of the involvement of genes <i>C9orf16 (BBLN)</i>, <i>C11orf58</i>, <i>SERBP1</i>, <i>SERF2</i>, and <i>C19orf53</i> in CVDs and their risk factors (high blood pressure, dyslipidemia, obesity, arrhythmias, etc.), thus revealing Hero-proteins as putative actors in the pathobiology of the heart and vessels.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327200/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Danuta Schüler, Matthias Lange, Thomas Altmann, Maria Cuacos, Daniel Arend, John Charles D'Auria, Anne Fiebig, Jochen Kumlehn, Kerstin Neumann, Michael Melzer, Elena Rey-Mazón, Hardy Rolletschek, Uwe Scholz, Evelin Willner, Jochen C Reif
{"title":"Data management in balance - a decade of balancing pragmatism, sustainability and innovation at plant research center IPK Gatersleben.","authors":"Danuta Schüler, Matthias Lange, Thomas Altmann, Maria Cuacos, Daniel Arend, John Charles D'Auria, Anne Fiebig, Jochen Kumlehn, Kerstin Neumann, Michael Melzer, Elena Rey-Mazón, Hardy Rolletschek, Uwe Scholz, Evelin Willner, Jochen C Reif","doi":"10.1515/jib-2025-0012","DOIUrl":"10.1515/jib-2025-0012","url":null,"abstract":"<p><p>The Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben is a leading international plant science institute specializing in biodiversity and crop plant performance research. Over the last decade, all phases of the research data lifecycle were implemented as a continuous process in conjunction with information technology, standardization, and sustainable research data management (RDM) processes. Under the leadership of a team of data stewards, a research data infrastructure, process landscape, capacity building, and governance structures were successfully established. As a result, a generic research data infrastructure was created to serve the principles of good scientific practice, archiving research data in an accessible and sustainable manner, even before the FAIR criteria were formulated. In this paper, we discuss success stories as well as pitfalls and summarize the experiences from 15 years of operating a central RDM infrastructure. We present measures for agile requirements engineering, technical and organizational implementation, governance, training, and roll-out. We show the benefits of a participatory approach across all departments, personnel roles, and researcher profiles through pilot working groups and data management champions. 
As a result, an ambidextrous approach to data management was implemented, referring to the ability to efficiently combine operational needs, support daily tasks in compliance with the FAIR criteria, while remaining open to adopting technical innovations in an agile manner.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327199/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144210141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alba Nogueira-Rodríguez, Daniel Glez-Peña, Cristina P Vieira, Jorge Vieira, Hugo López-Fernández
{"title":"Towards a more accurate and reliable evaluation of machine learning protein-protein interaction prediction model performance in the presence of unavoidable dataset biases.","authors":"Alba Nogueira-Rodríguez, Daniel Glez-Peña, Cristina P Vieira, Jorge Vieira, Hugo López-Fernández","doi":"10.1515/jib-2024-0054","DOIUrl":"https://doi.org/10.1515/jib-2024-0054","url":null,"abstract":"<p><p>The characterization of protein-protein interactions (PPIs) is fundamental to understanding cellular functions. Although machine learning methods for this task have historically reported prediction accuracies up to 95 %, including those only using raw protein sequences, it has been highlighted that this could be overestimated due to the use of random splits and metrics that do not take into account potential biases in the datasets. Here, we propose a per-protein utility metric, pp_MCC, able to show a drop in the performance in both random and unseen-protein split scenarios. We tested ML models based on sequence embeddings. The pp_MCC metric reveals a reduced performance even in a random split, reaching levels similar to those shown by the raw MCC metric computed over an unseen protein split, and drops even further when the pp_MCC is used in an unseen protein split scenario. Thus, the metric is able to give a more realistic performance estimation while still allowing the use of random splits, which could be interesting for more protein-centric studies. 
Given the low adjusted performance obtained, there seems to be room for improvement when using only primary sequence information, suggesting the need to include complementary protein data, accompanied by the use of the pp_MCC metric.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143754930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Daniel Pérez-Rodríguez, Roberto C Agís-Balboa, Hugo López-Fernández
{"title":"Fcodes update: a kinship encoding framework with F-Tree GUI & LLM inference.","authors":"Daniel Pérez-Rodríguez, Roberto C Agís-Balboa, Hugo López-Fernández","doi":"10.1515/jib-2024-0046","DOIUrl":"https://doi.org/10.1515/jib-2024-0046","url":null,"abstract":"<p><p>Family structures play a crucial role in personal development, social dynamics, and mental health. Traditional systems for encoding genealogical data, such as Ahnentafel and the Register System, offer methods to document lineage but face limitations, particularly in accommodating horizontal relationships or handling changes in family datasets. Modern computational systems like LINKAGE and PED, while powerful for genetic analysis, lack human readability and are challenging to apply in fields where unstructured, narrative data is common, such as sociology or psychiatry. This paper aims to bridge this gap by enhancing Fcodes, a flexible and intuitive algorithm for encoding kinship relationships that is suited for both manual and computational use. Building on our previous work, we present improvements to the Fcodes core algorithm and command-line interface (CLI), as well as the development of F-Tree, a new graphical user interface (GUI) to streamline the encoding process. Additionally, we introduce a method for estimating the coefficient of inbreeding using Fcodes and explore the application of artificial intelligence, namely large language models (LLMs), to automatically infer family relationships from narrative text. 
These advancements highlight the potential of Fcodes in a wide range of research contexts, from social studies to genetics and mental health research.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143744460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
João Capela, João Cheixo, Dick de Ridder, Oscar Dias, Miguel Rocha
{"title":"Predicting precursors of plant specialized metabolites using DeepMol automated machine learning.","authors":"João Capela, João Cheixo, Dick de Ridder, Oscar Dias, Miguel Rocha","doi":"10.1515/jib-2024-0050","DOIUrl":"https://doi.org/10.1515/jib-2024-0050","url":null,"abstract":"<p><p>Plants produce specialized metabolites, which play critical roles in defending against biotic and abiotic stresses. Due to their chemical diversity and bioactivity, these compounds have significant economic implications, particularly in the pharmaceutical and agrotechnology sectors. Despite their importance, the biosynthetic pathways of these metabolites remain largely unresolved. Automating the prediction of their precursors, derived from primary metabolism, is essential for accelerating pathway discovery. Using DeepMol's automated machine learning engine, we found that regularized linear classifiers offer optimal, accurate, and interpretable models for this task, outperforming state-of-the-art models while providing chemical insights into their predictions. The pipeline and models are available at the repository: https://github.com/jcapels/SMPrecursorPredictor.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143658772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zahra Mosalanejad, Seyed Nooreddin Faraji, Mohammad Reza Rahbar, Ahmad Gholami
{"title":"Designing an optimized theta-defensin peptide for HIV therapy using in-silico approaches.","authors":"Zahra Mosalanejad, Seyed Nooreddin Faraji, Mohammad Reza Rahbar, Ahmad Gholami","doi":"10.1515/jib-2023-0053","DOIUrl":"10.1515/jib-2023-0053","url":null,"abstract":"<p><p>Glycoprotein 41 (gp41) of human immunodeficiency virus (HIV), located on the virus's external surface, forms six-helix bundles that facilitate viral entry into the host cell. Theta defensins, cyclic peptides, inhibit the formation of these bundles by binding to the gp41 CHR region. RC101, a synthetic analog of theta-defensin molecules, exhibits activity against various HIV subtypes. Molecular docking of the CHR and RC101 was performed using the MDockPeP and Hawdock servers. The type of bonds and the essential amino acids in binding were identified using AlphaFold3, CHIMERA, RING, and CYTOSCAPE. Mutable amino acids within the peptide were determined using CUPSAT and Duet. Thirty-two new peptides were designed, and their interaction with the CHR of gp41 was analyzed. The physicochemical properties, toxicity, allergenicity, and antigenicity of the peptides were also investigated. Most of the designed peptides exhibited higher binding affinities to the target compared to RC101; notably, peptides 1 and 4 had the highest binding affinity and demonstrated a greater percentage of interactions with critical amino acids of CHR. Peptides A and E displayed the best physicochemical properties among the designed peptides. 
The designed peptides may present a new generation of anti-HIV drugs, which may reduce the likelihood of drug resistance.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12327201/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143651943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Irvan Faizal, Darrian Chandra, Tarwadi, Sabar Pambudi, Astutiati Nurhasanah, Rizky Priambodo, Muhammad Yusuf
{"title":"Immunoinformatics-guided design of a multiepitope peptide vaccine targeting the receptor-binding domain of SARS-CoV-2 spike glycoprotein: insights from Indonesian samples.","authors":"Irvan Faizal, Darrian Chandra, Tarwadi, Sabar Pambudi, Astutiati Nurhasanah, Rizky Priambodo, Muhammad Yusuf","doi":"10.1515/jib-2024-0025","DOIUrl":"https://doi.org/10.1515/jib-2024-0025","url":null,"abstract":"<p><p>The emergence of new variants of SARS-CoV-2, including Alpha, Beta, Gamma, Delta, Omicron variants, and XBB sub-variants, contributes to the number of coronavirus cases worldwide. SARS-CoV-2 is a positive-sense RNA virus with a genome of 29.9 kb that encodes four structural proteins: spike glycoprotein (S), envelope glycoprotein (E), membrane glycoprotein (M), and nucleocapsid glycoprotein (N). These proteins are vital for viral activity, with the S protein facilitating attachment and membrane fusion through the receptor-binding domain (RBD) located in the S1 subunit. The RBD recognizes and binds to the human angiotensin-converting enzyme 2 (ACE-2) protein. An immunoinformatics-aided design of a peptide-based multiepitope vaccine candidate targeting the RBD glycoprotein was constructed from SARS-CoV-2 sequence data from various regions of Indonesia (Jakarta, West Java, and Bali). The results show that the RBD region of the isolate with accession ID EPI_ISL_15982641 from West Java had the highest antigenicity of 0.5904. This isolate is non-toxic and non-allergenic and shows a total of 18 LBL epitopes, 72 CTL epitopes, and 98 HTL epitopes. The epitopes with the best overall binding affinity were GCHNKCAY for MHC-I and GGCVFSYVGCHNKCAYWV for MHC-II, which showed binding affinities of -13.6 and -15.5 kcal/mol, respectively. 
Therefore, this study aims to design an epitope vaccine candidate based on samples from Indonesia that has good characteristics and may have the potential to stimulate an immune response against SARS-CoV-2.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142973272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammad Javad Bazyari, Seyed Hamid Aghaee-Bakhtiari
{"title":"MiRNA target enrichment analysis of co-expression network modules reveals important miRNAs and their roles in breast cancer progression.","authors":"Mohammad Javad Bazyari, Seyed Hamid Aghaee-Bakhtiari","doi":"10.1515/jib-2022-0036","DOIUrl":"10.1515/jib-2022-0036","url":null,"abstract":"<p><p>Breast cancer has the highest incidence among cancers and is the fifth leading cause of cancer death. Progression is one of the important features of breast cancer, which makes it a life-threatening cancer. MicroRNAs are small RNA molecules that have pivotal roles in the regulation of gene expression, and they control different properties of breast cancer such as progression. Recently, systems biology has offered novel approaches to study complicated biological systems like miRNAs and to find their regulatory roles. One of these approaches is weighted co-expression network analysis, in which genes with similar expression patterns are considered as a single module. Because the genes in one module have similar expression, it is reasonable to assume that the same regulatory elements, such as miRNAs, control their expression. Herein, we use WGCNA to find important modules related to breast cancer progression and use a hypergeometric test to perform miRNA target enrichment analysis and find important miRNAs. Also, we use negative correlation between miRNA expression and modules as a second filter to ensure that the right candidate miRNAs are chosen for the important modules. 
We found hsa-mir-23b, hsa-let-7b and hsa-mir-30a are important miRNAs in breast cancer and also investigated their roles in breast cancer progression.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11698623/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142883636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sania Riaz, Fatima Haider, Rizwan- Ur-Rehman, Aqsa Zafar
{"title":"Exploring the therapeutic potential of <i>Asparagus africanus</i> in polycystic ovarian syndrome: a computational analysis.","authors":"Sania Riaz, Fatima Haider, Rizwan- Ur-Rehman, Aqsa Zafar","doi":"10.1515/jib-2024-0019","DOIUrl":"10.1515/jib-2024-0019","url":null,"abstract":"<p><p>Polycystic ovarian syndrome (PCOS) is a multifaceted condition characterized by ovarian abnormalities, metabolic disorders, anovulation, and hormonal imbalances. In response to the growing demand for treatments with fewer side effects, the exploration of herbal-origin drugs has gained prominence. <i>Asparagus africanus</i>, a traditional medicinal plant that exhibits anti-inflammatory, antioxidant, and anti-androgenic properties, may offer a cure for PCOS. The plant's rich biochemical profile prompted its exploration as a potential source for drug development. The aim of this study is to investigate the potential therapeutic efficacy of <i>A. africanus</i> in the management of PCOS through molecular docking studies with Luteinizing Hormone Receptor and Follicle-Stimulating Hormone Receptor proteins. The identified compounds underwent molecular docking against key proteins associated with PCOS, namely Luteinizing Hormone Receptor and Follicle-Stimulating Hormone Receptor. The results underscored the lead compound's superiority, demonstrating favorable pharmacokinetics, ADME characteristics, and strong molecular binding without any observed toxicity in comparison to the standard drug. This study, by leveraging natural compounds sourced from <i>A. africanus</i>, provides valuable insights and advances towards developing more effective and safer treatments for PCOS. 
The findings contribute to the evolving landscape of PCOS therapeutics, emphasizing the potential of herbal-origin drugs in mitigating the complexities of this syndrome.</p>","PeriodicalId":53625,"journal":{"name":"Journal of Integrative Bioinformatics","volume":" ","pages":""},"PeriodicalIF":1.5,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11698622/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142808595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}