Database: The Journal of Biological Databases and Curation: Latest Articles

GeniePool 2.0: advancing variant analysis through CHM13-T2T, AlphaMissense, gnomAD V4 integration, and variant co-occurrence queries.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-27 DOI: 10.1093/database/baae130
Grisha Weintraub, Noam Hadar, Ehud Gudes, Shlomi Dolev, Ohad S Birk
Abstract: Originally developed to meet the challenges of genomic data deluge, GeniePool emerged as a pioneering platform, enabling efficient storage, accessibility, and analysis of vast genomic datasets, enabled due to its data lake architecture. Building on this foundation, GeniePool 2.0 advances genomic analysis through the integration of cutting-edge variant databases, such as CHM13-T2T, AlphaMissense, and gnomAD V4, coupled with the capability for variant co-occurrence queries. This evolution offers an unprecedented level of granularity and scope in genomic analyses, from enhancing our understanding of variant pathogenicity and phenotypic associations to facilitating research collaborations. The introduction of CHM13-T2T provides a more accurate reference for human genetic variation, AlphaMissense enriches the platform with protein-level impact predictions of missense mutations, and gnomAD V4 offers a comprehensive view of human genetic diversity. Additionally, the innovative feature for variant co-occurrence analysis is pivotal for exploring the combined effects of genetic variations, advancing our comprehension of compound heterozygosity, epistasis, and polygenic risk factors in disease pathogenesis. GeniePool 2.0 is a comprehensive and scalable platform, which aims to enhance genomic data analysis and contribute to genomic research, potentially supporting new discoveries and clinical innovations.
Database URL: https://GeniePool.link
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11673193/pdf/
Citations: 0
AneRBC dataset: a benchmark dataset for computer-aided anemia diagnosis using RBC images.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-25 DOI: 10.1093/database/baae120
Muhammad Shahzad, Syed Hamad Shirazi, Muhammad Yaqoob, Zakir Khan, Assad Rasheed, Israr Ahmed Sheikh, Asad Hayat, Huiyu Zhou
Abstract: Visual analysis of peripheral blood smear slides using medical image analysis is required to diagnose red blood cell (RBC) morphological deformities caused by anemia. The absence of a complete anaemic RBC dataset has hindered the training and testing of deep convolutional neural networks (CNNs) for computer-aided analysis of RBC morphology. We introduce a benchmark RBC image dataset named Anemic RBC (AneRBC) to overcome this problem. This dataset is divided into two versions: AneRBC-I and AneRBC-II. AneRBC-I contains 1000 microscopic images, including 500 healthy and 500 anaemic images with 1224 × 960 pixel resolution, along with manually generated ground truth of each image. Each image contains approximately 1550 RBC elements, including normocytes, microcytes, macrocytes, elliptocytes, and target cells, resulting in a total of approximately 1 550 000 RBC elements. The dataset also includes each image's complete blood count and morphology reports to validate the CNN model results with clinical data. Under the supervision of a team of expert pathologists, the annotation, labeling, and ground truth for each image were generated. Due to the high resolution, each image was divided into 12 subimages with ground truth and incorporated into AneRBC-II. AneRBC-II comprises a total of 12 000 images, comprising 6000 original and 6000 anaemic RBC images. Four state-of-the-art CNN models were applied for segmentation and classification to validate the proposed dataset.
Database URL: https://data.mendeley.com/preview/hms3sjzt7f?a=4d0ba42a-cc6f-4777-adc4-2552e80db22b
Citations: 0
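The relationship between the two AneRBC versions (1000 images of 1224 × 960 pixels, each split into 12 subimages, giving 12 000 images) can be illustrated with a small tiling sketch. The abstract does not say how the 12 subimages are laid out, so the 4 × 3 grid and the helper name `tile_boxes` below are assumptions made purely for illustration.

```python
def tile_boxes(width, height, cols, rows):
    """Return (left, top, right, bottom) pixel boxes for a cols x rows grid."""
    if width % cols or height % rows:
        raise ValueError("image size must divide evenly into the grid")
    tile_w, tile_h = width // cols, height // rows
    # Row-major order: left to right, then top to bottom
    return [
        (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
        for r in range(rows)
        for c in range(cols)
    ]

# Assumed layout: 4 columns x 3 rows (the source only states "12 subimages")
boxes = tile_boxes(1224, 960, cols=4, rows=3)
total_subimages = 1000 * len(boxes)
```

Under this assumed grid each tile is 306 × 320 pixels; any other layout with 12 evenly sized tiles would give the same total image count.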
MiCK: a database of gut microbial genes linked with chemoresistance in cancer patients.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-21 DOI: 10.1093/database/baae124
Muhammad Shahzaib, Muhammad Muaz, Muhammad Hasnain Zubair, Masood Ur Rehman Kayani
Abstract: Cancer remains a global health challenge, with significant morbidity and mortality rates. In 2020, cancer caused nearly 10 million deaths, making it the second leading cause of death worldwide. The emergence of chemoresistance has become a major hurdle in successfully treating cancer patients. Recently, human gut microbes have been recognized for their role in modulating drug efficacy through their metabolites, ultimately leading to chemoresistance. The currently available databases are limited to knowledge regarding the interactions between gut microbiome and drugs. However, a database containing the human gut microbial gene sequences, and their effect on the efficacy of chemotherapy for cancer patients has not yet been developed. To address this challenge, we present the Microbial Chemoresistance Knowledgebase (MiCK), a comprehensive database that catalogs microbial gene sequences associated with chemoresistance. MiCK contains 1.6 million sequences of 29 gene types linked to chemoresistance and drug metabolism, curated manually from recent literature and sequence databases. The database can support downstream analysis as it provides a user-friendly web interface for sequence search and download functionalities. MiCK aims to facilitate the understanding and mitigation of chemoresistance in cancers by serving as a valuable resource for researchers.
Database URL: https://microbialchemreskb.com/
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11662283/pdf/
Citations: 0
JTIS: enhancing biomedical document-level relation extraction through joint training with intermediate steps.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-19 DOI: 10.1093/database/baae125
Jiru Li, Dinghao Pan, Zhihao Yang, Yuanyuan Sun, Hongfei Lin, Jian Wang
Abstract: Biomedical Relation Extraction (RE) is central to Biomedical Natural Language Processing and is crucial for various downstream applications. Existing RE challenges in the field of biology have primarily focused on intra-sentential analysis. However, with the rapid increase in the volume of literature and the complexity of relationships between biomedical entities, it often becomes necessary to consider multiple sentences to fully extract the relationship between a pair of entities. Current methods often fail to fully capture the complex semantic structures of information in documents, thereby affecting extraction accuracy. Therefore, unlike traditional RE methods that rely on sentence-level analysis and heuristic rules, our method focuses on extracting entity relationships from biomedical literature titles and abstracts and classifying relations that are novel findings. In our method, a multitask training approach is employed for fine-tuning a Pre-trained Language Model in the field of biology. Based on a broad spectrum of carefully designed tasks, our multitask method not only extracts relations of better quality due to more effective supervision but also achieves a more accurate classification of whether the entity pairs are novel findings. Moreover, by applying a model ensemble method, we further enhance our model's performance. The extensive experiments demonstrate that our method achieves significant performance improvements, i.e. surpassing the existing baseline by 3.94% in RE and 3.27% in Triplet Novel Typing in F1 score on BioRED, confirming its effectiveness in handling complex biomedical literature RE tasks.
Database URL: https://codalab.lisn.upsaclay.fr/competitions/13377#learn_the_details-dataset
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11658465/pdf/
Citations: 0
scEccDNAdb: an integrated single-cell eccDNA resource for human and mouse.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-18 DOI: 10.1093/database/baae126
Wenqing Wang, Xinyu Zhao, Tianyu Ma, Tengwei Zhong, Junnuo Zheng, Zhiyun Guo
Abstract: Extrachromosomal circular DNA (eccDNA), an extrachromosomal circular structured DNA, is extensively found in eukaryotes. Investigating eccDNA at the single-cell level is crucial for understanding cellular heterogeneity, evolution, development, and specific cellular functions. However, high-throughput identification methods for single-cell eccDNA are complex, and the lack of mature, widely applicable technologies has resulted in limited resources. To address this gap, we built scEccDNAdb, a database based on single-cell whole-genome sequencing data. It contains 3 195 464 single-cell eccDNA entries from human and mouse samples, with annotations including oncogenes, typical enhancers, super-enhancers, CCCTC-binding factor-binding sites, single nucleotide polymorphisms, chromatin accessibility, expression quantitative trait loci, transcription factor binding sites, motifs, and structural variants. Additionally, it provides nine online analysis and visualization tools, which enable the creation of publication-quality figures through user-uploaded files. Overall, scEccDNAdb is a comprehensive database for analyzing single-cell eccDNA data across diverse cell types, tissues, and species.
Database URL: https://lcbb.swjtu.edu.cn/scEccDNAdb/
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11654243/pdf/
Citations: 0
AthRiboNC: an Arabidopsis database for ncRNAs with coding potential revealed from ribosome profiling.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-17 DOI: 10.1093/database/baae123
Yi Shen, Liya Liu, Enyan Liu, Sida Li, Yuriy Orlov, Vladimir Ivanisenko, Ming Chen
Abstract: Non-coding RNAs (ncRNAs) are traditionally considered incapable of encoding proteins, but new evidence suggests that small open reading frames (sORFs) within ncRNAs can actually encode biologically functional small peptides. Despite growing recognition of their importance, a systematic exploration of plant ncRNAs with coding potential has remained largely uncharted territory, especially in the context of their translational activities. By collecting and analyzing Ribo-Seq data from 226 Arabidopsis thaliana samples, we have integrated extensive information on Arabidopsis ncRNAs with coding potential and developed the AthRiboNC database, a novel and dedicated database that consolidates extensive information on ncRNAs with coding potential in Arabidopsis. AthRiboNC covers detailed information on 2743 long non-coding RNAs, 255 microRNAs, and 1871 circular RNA in Arabidopsis, along with 40 162 ORFs identified from these ncRNAs. The database also constructs co-expression networks for ncRNAs with coding potential, revealing correlations and potential biological function interpretations. With a commitment to accessibility and ease-of-use, AthRiboNC features a clear and intuitive interface. We hope that AthRiboNC will serve as a valuable resource for exploring the coding potential of plant ncRNAs.
Database URL: https://bis.zju.edu.cn/athribonc
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11651143/pdf/
Citations: 0
Probe my Pathway (PmP): a portal to explore the chemical coverage of the human Reactome.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-05 DOI: 10.1093/database/baae116
Haejin Angela Kwak, Lihua Liu, Matthieu Schapira
Abstract: Deciphering pathway-phenotype associations is critical for a system-wide understanding of cells and the chemistry of life. An approach to reach this goal is to systematically modulate pathways pharmacologically. The targeted and controlled regulation of an increasing number of proteins is becoming possible, thanks to the growing list of chemical probes and chemogenomic compounds available to cell biologists, but no resource is available that directly maps these chemical tools on cellular pathways. To fill this gap, we developed Probe my Pathway (PmP), a database where high-quality chemical probes and well-characterized sets of chemogenomic compounds are mapped on all the human pathways of the Reactome database. The web interface allows users to browse the data via icicle charts or search the data for compounds, proteins, or pathways. Chemists can rapidly find pathways with low chemical coverage or explore the structural chemistry of ligands targeting specific cellular machineries. Cell biologists can look for chemical probes targeting different proteins in the same pathway or find which pathways are targeted by chemical probes of interest. PmP is updated annually and will grow with the expanding chemical tool kit produced by Target 2035 and other efforts.
Database URL: https://apps.thesgc.org/pmp/
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630241/pdf/
Citations: 0
Toward clearer recognition and easier usefulness: development of a cross-lingual atherosclerotic cerebrovascular disease ontology.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-12-05 DOI: 10.1093/database/baae117
Hetong Ma, Liu Shen, Jiayang Wang, Shilong Wang, Min Wang, Meng Wang, Zixiao Li, Jiao Li
Abstract: Atherosclerotic cerebrovascular disease could result in a great number of deaths and disabilities. However, it did not acquire enough attention. Less information, statistics, or data on the disease has been revealed. Thus, no systematic concept datasets were released to help clinicians clarify the scope, assist research, and offer maximized value. This study aimed to develop a cross-lingual atherosclerotic cerebrovascular disease ontology; describe the workflow, schema, hierarchical structure, and the highlighted content; design a brand-new rehabilitation ontology; implement the ontology evaluation; and illustrate the application scenarios in real-world scenarios. We implemented nine steps based on the Ontology Development 101 methodologies combined with expert opinions. The ontology included collection and specification of clinical requirements, background investigation and knowledge acquisition, ontology selection and reuse, scope identification, schema definition, concept extraction, concept extension, ontology verification, and ontology evaluation. We evaluated the proposed ontology in the literature classification task. The current ontology included 10 top-level classes, respectively, clinical manifestation, comorbidity, complication, diagnosis, model of atherosclerotic cerebrovascular disease, pathogenesis, prevention, rehabilitation, risk factor, and treatment. There are 1715 concepts in the 11-level ontology, covering 4588 Chinese terms, 6617 English terms, and 972 definitions. The ontology could be applied in real-world scenarios such as information retrieval, new expression discovery, named entity recognition, and knowledge fusion, and the use case proved that it could offer satisfying support to related medical scenarios. The ontology was proven to be useful in text classification tasks, and the weight-F1 score could reach >80% combined with the pretrained model. The proposed ontology provided a clear set of cross-lingual concepts and terms with an explicit hierarchical structure, helping scientific researchers to quickly retrieve relevant medical literature, assisting data scientists to efficiently identify relevant contents in electronic health records, and providing a clear domain framework for academic reference.
Database URL: https://bioportal.bioontology.org/ontologies/ACVD_ONTOLOGY
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630243/pdf/
Citations: 0
The text2term tool to map free-text descriptions of biomedical terms to ontologies.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-11-28 DOI: 10.1093/database/baae119
Rafael S Gonçalves, Jason Payne, Amelia Tan, Carmen Benitez, Jamie Haddock, Robert Gentleman
Abstract: There is an ongoing need for scalable tools to aid researchers in both retrospective and prospective standardization of discrete entity types (such as disease names, cell types, or chemicals) that are used in metadata associated with biomedical data. When metadata are not well-structured or precise, the associated data are harder to find and are often burdensome to reuse, analyze, or integrate with other datasets due to the upfront curation effort required to make the data usable, typically through retrospective standardization and cleaning of the (meta)data. With the goal of facilitating the task of standardizing metadata (either in bulk or in a one-by-one fashion, e.g. to support autocompletion of biomedical entities in forms), we have developed an open-source tool called text2term that maps free-text descriptions of biomedical entities to controlled terms in ontologies. The tool is highly configurable and can be used in multiple ways that cater to different users and expertise levels: it is available on Python Package Index and can be used programmatically as any Python package; it can also be used via a command-line interface or via our hosted, graphical user interface-based web application or by deploying a local instance of our interactive application using Docker.
Database URL: https://pypi.org/project/text2term
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604108/pdf/
Citations: 0
Genome-wide identification of SSR markers from coding regions for endangered Argania spinosa L. skeels and construction of SSR database: AsSSRdb.
IF 3.4, Zone 4 (Biology)
Database: The Journal of Biological Databases and Curation Pub Date : 2024-11-27 DOI: 10.1093/database/baae118
Karim Rabeh, Najoua Mghazli, Fatima Gaboun, Abdelkarim Filali-Maltouf, Laila Sbabou, Bouchra Belkadi
Abstract: Microsatellites [simple sequence repeats (SSRs)] are one of the most widely used sources of genetic markers, particularly prevalent in plants. Despite their importance in various applications, a comprehensive genome-wide identification of coding sequence (CDS)-associated SSR markers in the Argania spinosa L. genome has yet to be conducted. In this study, 66 280 CDSs containing 5351 SSRs within 4535 A. spinosa L. CDSs were identified. Among these, tri-nucleotide motifs (58.96%) were the most common, followed by hexa-nucleotide (15.71%) and di-nucleotide motifs (13.32%). The predominant SSR motif in the tri-nucleotide category was AAG (24.4%), while AG (94.1%) was the most abundant among di-nucleotide repeats. Furthermore, the extracted CDSs containing SSRs were subjected to functional annotation; 3396 CDSs (74.88%) exhibited homology with known proteins, 3341 CDSs (73.7%) were assigned Gene Ontology terms, 1004 CDSs were annotated with Enzyme Commission numbers, and 832 (18.3%) were annotated with KEGG pathways. A total of 3475 primer pairs were designed, out of which 3264 were successfully validated in silico against the A. spinosa L. genome, with 99.6% representing high-resolution markers yielding no more than three products. Additionally, the SSR markers demonstrated a low rate of transferability through in-silico verification in two species within the Sapotaceae family. Furthermore, we developed an online database, the "Argania spinosa L. SSR database: https://as-fmmdb.shinyapps.io/asssrdb/" (AsSSRdb) to provide access to the CDS-associated SSRs identified in this study. Overall, this research provides valuable marker resources for DNA fingerprinting, genetic studies, and molecular breeding in argan and related species.
Database URL: https://as-fmmdb.shinyapps.io/asssrdb/
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11602033/pdf/
Citations: 0
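For readers unfamiliar with how di-, tri-, and hexa-nucleotide SSR motifs such as AG and AAG are located in a coding sequence, the following is a minimal regex-based sketch of perfect-repeat detection. It is not the pipeline used by the AsSSRdb authors, which the abstract does not describe; the function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for illustration.

```python
import re

# Minimum tandem copies per motif length. These thresholds are an assumption
# for illustration; the abstract does not state the criteria actually used.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # Capture a k-base motif, then require at least n-1 further exact copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. AAA, AGAG)
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Toy sequence containing one tri-nucleotide and one di-nucleotide repeat
example = "ATGC" + "AAG" * 6 + "CTTGA" + "AG" * 7 + "TTC"
hits = find_ssrs(example)
```

The sketch reports only perfect repeats with 0-based start positions; imperfect or compound microsatellites, which dedicated SSR tools typically handle, are outside its scope.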