{"title":"A novel lossless encoding algorithm for data compression-genomics data as an exemplar.","authors":"Anas Al-Okaily, Abdelghani Tbakhi","doi":"10.3389/fbinf.2024.1489704","DOIUrl":"10.3389/fbinf.2024.1489704","url":null,"abstract":"<p><p>Data compression is a challenging and increasingly important problem. As the amount of data generated daily continues to increase, efficient transmission and storage have never been more critical. In this study, a novel encoding algorithm is proposed, motivated by the compression of DNA data and associated characteristics. The proposed algorithm follows a divide-and-conquer approach by scanning the whole genome, classifying subsequences based on similarities in their content, and binning similar subsequences together. The data in each bin are then compressed independently. This approach differs from the currently known approaches: entropy, dictionary, predictive, or transform-based methods. Proof-of-concept performance was evaluated using a benchmark dataset with seventeen genomes ranging in size from kilobytes to gigabytes. The results showed a considerable improvement in the compression of each genome, saving several megabytes compared to state-of-the-art tools. 
Moreover, the algorithm can be applied to the compression of other data types, mainly text, numbers, images, audio, and video, which are being generated daily in unprecedented, massive volumes.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1489704"},"PeriodicalIF":2.8,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11799261/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143366862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Innovative CDR grafting and computational methods for PD-1 specific nanobody design.","authors":"Jagadeeswara Reddy Devasani, Girijasankar Guntuku, Nalini Panatula, Murali Krishna Kumar Muthyala, Mary Sulakshana Palla, Teruna J Siahaan","doi":"10.3389/fbinf.2024.1488331","DOIUrl":"10.3389/fbinf.2024.1488331","url":null,"abstract":"<p><strong>Introduction: </strong>The development of nanobodies targeting Programmed Cell Death Protein-1 (PD-1) offers a promising approach in cancer immunotherapy. This study aims to design and characterize a PD-1-specific nanobody using an integrated computational and experimental approach.</p><p><strong>Methods: </strong>An <i>in silico</i> design strategy was employed, involving Complementarity-Determining Region (CDR) grafting to construct the nanobody sequence. The three-dimensional structure of the nanobody was predicted using AlphaFold2, and molecular docking simulations via ClusPro were conducted to evaluate binding interactions with PD-1. Physicochemical properties, including stability and solubility, were analyzed using web-based tools, while molecular dynamics (MD) simulations assessed stability under physiological conditions. The nanobody was produced and purified using Ni-NTA chromatography, and experimental validation was performed through Western blotting, ELISA, and dot blot analysis.</p><p><strong>Results: </strong>Computational findings demonstrated favorable binding interactions, stability, and physicochemical properties of the nanobody. Experimental results confirmed the nanobody's specific binding affinity to PD-1, with ELISA and dot blot analyses providing evidence of robust interaction.</p><p><strong>Discussion: </strong>This study highlights the potential of combining computational and experimental approaches for engineering nanobodies. 
The engineered PD-1 nanobody exhibits promising characteristics, making it a strong candidate for further testing in cancer immunotherapy applications.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1488331"},"PeriodicalIF":2.8,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11782559/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143082446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing a ceRNA-based lncRNA-miRNA-mRNA regulatory network to uncover roles in skeletal muscle development.","authors":"Wang Wenlun, Yu Chaohang, Huang Yan, Li Wenbin, Zhou Nanqing, Hu Qianmin, Wu Shengcai, Yuan Qing, Yu Shirui, Zhang Feng, Zhu Lingyun","doi":"10.3389/fbinf.2024.1494717","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1494717","url":null,"abstract":"<p><p>The precise role of lncRNAs in skeletal muscle development and atrophy remains elusive. We conducted a bioinformatic analysis of 26 GEO datasets from mouse studies, encompassing embryonic development, postnatal growth, regeneration, cell proliferation, and differentiation, using R and relevant packages (limma, among others). LncRNA-miRNA relationships were predicted using miRcode and lncBaseV2, with miRNA-mRNA pairs identified via miRcode, miRDB, and Targetscan7. Based on the ceRNA theory, we constructed and visualized the lncRNA-miRNA-mRNA regulatory network using ggalluvial among other R packages. GO, Reactome, KEGG, and GSEA explored interactions in muscle development and regeneration. We identified five candidate lncRNAs (Xist, Gas5, Pvt1, Airn, and Meg3) as potential mediators in these processes and microgravity-induced muscle wasting. Additionally, we created a detailed lncRNA-miRNA-mRNA regulatory network, including interactions such as lncRNA Xist/miR-126/IRS1, lncRNA Xist/miR-486-5p/GAB2, lncRNA Pvt1/miR-148/RAB34, and lncRNA Gas5/miR-455-5p/SOCS3. Significant signaling pathway changes (PI3K/Akt, MAPK, NF-κB, cell cycle, AMPK, Hippo, and cAMP) were observed during muscle development, regeneration, and atrophy. Despite bioinformatics challenges, our research underscores the significant roles of lncRNAs in muscle protein synthesis, degradation, cell proliferation, differentiation, function, and metabolism under both normal and microgravity conditions. 
This study offers new insights into the molecular mechanisms governing skeletal muscle development and regeneration.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1494717"},"PeriodicalIF":2.8,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11774864/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143070043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"hsa-miR-548d-3p: a potential microRNA to target nucleocapsid and/or capsid genes in multiple members of the Flaviviridae family.","authors":"H W Cayatineto, S T Hakim","doi":"10.3389/fbinf.2024.1487292","DOIUrl":"10.3389/fbinf.2024.1487292","url":null,"abstract":"<p><strong>Introduction: </strong>Flaviviridae comprise a group of enveloped, positive-stranded RNA viruses that are mainly transmitted through mosquito or tick bites and/or contaminated blood, blood products, or other body secretions. These viruses cause diseases ranging from mild to severe and are considered important human pathogens. MicroRNAs (miRNAs) are non-coding molecules involved in growth, development, cell proliferation, protein synthesis, apoptosis, and pathogenesis. These small molecules are even being used as gene suppressors in antiviral therapeutics, inhibiting viral replication. In the current study, we used bioinformatic tools to predict a possible miRNA sequence that could be complementary to the nucleocapsid (NP) and/or capsid (CP) gene of the Flaviviridae family and provide an inhibitory solution.</p><p><strong>Methods: </strong>Bioinformatics is a field of science that includes tremendous computational analysis, algorithms, and sequence alignments. 
To predict the right alignments between miRNA and viral mRNA genomes, we used computational databases such as miRBase, NCBI, and the Basic Local Alignment Search Tool for nucleotides (BLASTn).</p><p><strong>Results: </strong>Of the 2,600 mature miRNAs, hsa-miR-548d-3p revealed complementary sequences with the flavivirus capsid gene and bovine viral diarrhea virus (BVDV) capsid gene and was selected as a possible candidate to inhibit flaviviruses.</p><p><strong>Conclusion: </strong>Although more detailed <i>in vitro</i> and <i>in vivo</i> studies are required to test the possible inhibitory effects of hsa-miR-548d-3p against flaviviruses, this computational study may be a first step toward developing a novel therapeutic for lethal viruses within the Flaviviridae family using the suggested candidate miRNAs.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1487292"},"PeriodicalIF":2.8,"publicationDate":"2025-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11772435/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143060951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of reproducible liver cancer specific ligand-receptor signaling in blood.","authors":"Aram Safrastyan, Damian Wollny","doi":"10.3389/fbinf.2024.1332782","DOIUrl":"10.3389/fbinf.2024.1332782","url":null,"abstract":"<p><p>Cell-cell communication mediated by ligand-receptor interactions (LRI) is critical to coordinating diverse biological processes in homeostasis and disease. Lately, our understanding of these processes has greatly expanded through the inference of cellular communication, utilizing RNA extracted from bulk tissue or individual cells. Considering the challenge of obtaining tissue biopsies for these approaches, we considered the potential of studying cell-free RNA obtained from blood. To test the feasibility of this approach, we used the BulkSignalR algorithm across 295 cell-free RNA samples and compared the LRI profiles across multiple cancer types and healthy donors. Interestingly, we detected specific and reproducible LRIs particularly in the blood of liver cancer patients compared to healthy donors. We found an increase in the magnitude of hepatocyte interactions, notably hepatocyte autocrine interactions in liver cancer patients. Additionally, a robust panel of 30 liver cancer-specific LRIs presents a bridge linking liver cancer pathogenesis to discernible blood markers. 
In summary, our approach shows the plausibility of detecting liver LRIs in blood and builds upon the biological understanding of cell-free transcriptomes.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1332782"},"PeriodicalIF":2.8,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11754192/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial: Multi-omics approaches in the study of human disease mechanisms.","authors":"Dapeng Wang, Giuseppe Agapito","doi":"10.3389/fbinf.2024.1546680","DOIUrl":"10.3389/fbinf.2024.1546680","url":null,"abstract":"","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1546680"},"PeriodicalIF":2.8,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11747011/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143017425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<i>In silico</i> identification of chilli genome encoded MicroRNAs targeting the 16S rRNA and <i>secA</i> genes of \"<i>Candidatus</i> phytoplasma trifolii<i>\"</i>.","authors":"Vineeta Pandey, Aarshi Srivastava, Ramwant Gupta, Haitham E M Zaki, Muhammad Shafiq Shahid, Rajarshi K Gaur","doi":"10.3389/fbinf.2024.1493712","DOIUrl":"10.3389/fbinf.2024.1493712","url":null,"abstract":"<p><p>Phytoplasma, a potentially hazardous pathogen associated with witches' broom, is an economically harmful disease-producing bacterium that damages chilli cultivation. Phytoplasma-infected plants display various symptoms that indicate significant disruptions in normal plant physiology and behaviour. Diseases caused by phytoplasma are widespread and have a major economic impact on crop quality and yield. This work focuses on identifying and examining chilli microRNAs (miRNAs) as potential targets against the 16S rRNA and <i>secA</i> gene of \"<i>Candidatus</i> Phytoplasma trifolii\" (\"<i>Ca</i>. P. trifolii\") through plant miRNA prediction algorithms. Mature chilli miRNAs (CA-miRNAs) were collected and used to hybridise the 16S rRNA and <i>secA</i> genes. A total of four common CA-miRNAs were picked according to genetic consensus. Three algorithms applied in the present study suggested that the physiologically relevant, top-ranked miR169b_2 has a possibly specific site at nucleotide position 1,006 for targeting the '<i>Ca</i>. P. trifolii' 16S rRNA gene. The circos algorithm was then utilised to create the miRNA-mRNA regulatory network. The free energy between the miRNA:mRNA duplex was also computed, and the best value of -17.46 kcal/mol was obtained for CA-miR166c_2. Currently, there are no suitable commercial '<i>Ca</i>. P. trifolii'-resistant chilli crops. As a result, the expected biological data provide useful evidence for developing '<i>Ca</i>. P. 
trifolii'-resistant chilli plants.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1493712"},"PeriodicalIF":2.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11743513/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143017424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRONAME: a user-friendly pipeline to process long-read nanopore metabarcoding data by generating high-quality consensus sequences.","authors":"Benjamin Dubois, Mathieu Delitte, Salomé Lengrand, Claude Bragard, Anne Legrève, Frédéric Debode","doi":"10.3389/fbinf.2024.1483255","DOIUrl":"https://doi.org/10.3389/fbinf.2024.1483255","url":null,"abstract":"<p><strong>Background: </strong>The study of sample taxonomic composition has evolved from direct observations and labor-intensive morphological studies to different DNA sequencing methodologies. Most of these studies leverage the metabarcoding approach, which involves the amplification of a small taxonomically-informative portion of the genome and its subsequent high-throughput sequencing. Recent advances in sequencing technology brought by Oxford Nanopore Technologies have revolutionized the field, enabling portability, affordable cost and long-read sequencing, therefore leading to a significant increase in taxonomic resolution. However, Nanopore sequencing data exhibit a particular profile, with a higher error rate compared with Illumina sequencing, and existing bioinformatics pipelines for the analysis of such data are scarce and often insufficient, requiring specialized tools to accurately process long-read sequences.</p><p><strong>Results: </strong>We present PRONAME (PROcessing NAnopore MEtabarcoding data), an open-source, user-friendly pipeline optimized for processing raw Nanopore sequencing data. PRONAME includes precompiled databases for complete 16S sequences (Silva138 and Greengenes2) and a newly developed and curated database dedicated to bacterial 16S-ITS-23S operon sequences. The user can also provide a custom database if desired, therefore enabling the analysis of metabarcoding data for any domain of life. 
The pipeline significantly improves sequence accuracy, implementing innovative error-correction strategies and taking advantage of the new sequencing chemistry to produce high-quality duplex reads. Evaluations using a mock community have shown that PRONAME delivers consensus sequences demonstrating at least 99.5% accuracy with standard settings (and up to 99.7%), making it a robust tool for genomic analysis of complex multi-species communities.</p><p><strong>Conclusion: </strong>PRONAME meets the challenges of long-read Nanopore data processing, offering greater accuracy and versatility than existing pipelines. By integrating Nanopore-specific quality filtering, clustering and error correction, PRONAME produces high-precision consensus sequences. This brings the accuracy of Nanopore sequencing close to that of Illumina sequencing, while taking advantage of the benefits of long-read technologies.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1483255"},"PeriodicalIF":2.8,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11695402/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142933996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Completing a molecular timetree of primates.","authors":"Jack M Craig, S Blair Hedges, Sudhir Kumar","doi":"10.3389/fbinf.2024.1495417","DOIUrl":"10.3389/fbinf.2024.1495417","url":null,"abstract":"<p><p>Primates, consisting of apes, monkeys, tarsiers, and lemurs, are among the most charismatic and well-studied animals on Earth, yet there is no taxonomically complete molecular timetree for the group. Combining the latest large-scale genomic primate phylogeny of 205 recognized species with the 400-species literature consensus tree available from TimeTree.org yields a phylogeny of just 405 primates, with 50 species still missing despite having molecular sequence data in the NCBI GenBank. In this study, we assemble a timetree of 455 primates, incorporating every species for which molecular data are available. We use a synthetic approach consisting of a literature review for published timetrees, <i>de novo</i> dating of untimed trees, and assembly of timetrees from novel alignments. The resulting near-complete molecular timetree of primates allows testing of two long-standing alternate hypotheses for the origins of primate biodiversity: whether species richness arises at a constant rate, in which case older clades have more species, or whether some clades exhibit faster rates of speciation than others, in which case, these fast clades would be more species-rich. 
Consistent with other large-scale macroevolutionary analyses, we found that the speciation rate is similar across the primate tree of life, albeit with some variation in smaller clades.</p>","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1495417"},"PeriodicalIF":2.8,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11683086/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142908053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unlocking the future of complex human diseases prediction: multi-omics risk score breakthrough.","authors":"Benson R Kidenya, Gerald Mboowa","doi":"10.3389/fbinf.2024.1510352","DOIUrl":"10.3389/fbinf.2024.1510352","url":null,"abstract":"","PeriodicalId":73066,"journal":{"name":"Frontiers in bioinformatics","volume":"4 ","pages":"1510352"},"PeriodicalIF":2.8,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11682975/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142908057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}