Latest articles from Database: The Journal of Biological Databases and Curation

DisGeNet: a disease-centric interaction database among diseases and various associated genes.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2025-01-11 DOI: 10.1093/database/baae122
Yaxuan Hu, Xingli Guo, Yao Yun, Liang Lu, Xiaotai Huang, Songwei Jia
Abstract: The pathogenesis of complex diseases is intricately linked to various genes and network medicine has enhanced understanding of diseases. However, most network-based approaches ignore interactions mediated by noncoding RNAs (ncRNAs) and most databases only focus on the association between genes and diseases. Based on the mentioned questions, we have developed DisGeNet, a database that focuses not only on the disease-associated genes but also on the interactions among genes. Here, the associations between diseases and various genes, as well as the interactions among these genes are integrated into a disease-centric network. As a result, there are a total of 502 688 interactions/associations involving 6697 diseases, 5780 lncRNAs (long noncoding RNAs), 16 135 protein-coding genes, and 2610 microRNAs stored in DisGeNet. These interactions/associations can be categorized as protein-protein, lncRNA-disease, microRNA-gene, microRNA-disease, gene-disease, and microRNA-lncRNA. Furthermore, as users input name/ID of diseases/genes for search, the interactions/associations about the search content can be browsed as a list or viewed in a local network-view.
Database URL: https://disgenet.cn/
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724190/pdf/
Citations: 0
HoloFood Data Portal: holo-omic datasets for analysing host-microbiota interactions in animal production.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2025-01-11 DOI: 10.1093/database/baae112
Alexander B Rogers, Varsha Kale, Germana Baldi, Antton Alberdi, M Thomas P Gilbert, Dipayan Gupta, Morten T Limborg, Sen Li, Thomas Payne, Bent Petersen, Jacob A Rasmussen, Lorna Richardson, Robert D Finn
Abstract: The HoloFood project used a hologenomic approach to understand the impact of host-microbiota interactions on salmon and chicken production by analysing multiomic data, phenotypic characteristics, and associated metadata in response to novel feeds. The project's raw data, derived analyses, and metadata are deposited in public, open archives (BioSamples, European Nucleotide Archive, MetaboLights, and MGnify), so making use of these diverse data types may require access to multiple resources. This is especially complex where analysis pipelines produce derived outputs such as functional profiles or genome catalogues. The HoloFood Data Portal is a web resource that simplifies access to the project datasets. For example, users can conveniently access multiomic datasets derived from the same individual or retrieve host phenotypic data with a linked gut microbiome sample. Project-specific metagenome-assembled genome and viral catalogues are also provided, linking to broader datasets in MGnify. The portal stores only data necessary to provide these relationships, with possible linking to the underlying repositories. The portal showcases a model approach for how future multiomics datasets can be made available.
Database URL: https://www.holofooddata.org
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724189/pdf/
Citations: 0
GeniePool 2.0: advancing variant analysis through CHM13-T2T, AlphaMissense, gnomAD V4 integration, and variant co-occurrence queries.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-27 DOI: 10.1093/database/baae130
Grisha Weintraub, Noam Hadar, Ehud Gudes, Shlomi Dolev, Ohad S Birk
Abstract: Originally developed to meet the challenges of genomic data deluge, GeniePool emerged as a pioneering platform, enabling efficient storage, accessibility, and analysis of vast genomic datasets, enabled due to its data lake architecture. Building on this foundation, GeniePool 2.0 advances genomic analysis through the integration of cutting-edge variant databases, such as CHM13-T2T, AlphaMissense, and gnomAD V4, coupled with the capability for variant co-occurrence queries. This evolution offers an unprecedented level of granularity and scope in genomic analyses, from enhancing our understanding of variant pathogenicity and phenotypic associations to facilitating research collaborations. The introduction of CHM13-T2T provides a more accurate reference for human genetic variation, AlphaMissense enriches the platform with protein-level impact predictions of missense mutations, and gnomAD V4 offers a comprehensive view of human genetic diversity. Additionally, the innovative feature for variant co-occurrence analysis is pivotal for exploring the combined effects of genetic variations, advancing our comprehension of compound heterozygosity, epistasis, and polygenic risk factors in disease pathogenesis. GeniePool 2.0 is a comprehensive and scalable platform, which aims to enhance genomic data analysis and contribute to genomic research, potentially supporting new discoveries and clinical innovations.
Database URL: https://GeniePool.link
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11673193/pdf/
Citations: 0
AneRBC dataset: a benchmark dataset for computer-aided anemia diagnosis using RBC images.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-25 DOI: 10.1093/database/baae120
Muhammad Shahzad, Syed Hamad Shirazi, Muhammad Yaqoob, Zakir Khan, Assad Rasheed, Israr Ahmed Sheikh, Asad Hayat, Huiyu Zhou
Abstract: Visual analysis of peripheral blood smear slides using medical image analysis is required to diagnose red blood cell (RBC) morphological deformities caused by anemia. The absence of a complete anaemic RBC dataset has hindered the training and testing of deep convolutional neural networks (CNNs) for computer-aided analysis of RBC morphology. We introduce a benchmark RBC image dataset named Anemic RBC (AneRBC) to overcome this problem. This dataset is divided into two versions: AneRBC-I and AneRBC-II. AneRBC-I contains 1000 microscopic images, including 500 healthy and 500 anaemic images with 1224 × 960 pixel resolution, along with manually generated ground truth of each image. Each image contains approximately 1550 RBC elements, including normocytes, microcytes, macrocytes, elliptocytes, and target cells, resulting in a total of approximately 1 550 000 RBC elements. The dataset also includes each image's complete blood count and morphology reports to validate the CNN model results with clinical data. Under the supervision of a team of expert pathologists, the annotation, labeling, and ground truth for each image were generated. Due to the high resolution, each image was divided into 12 subimages with ground truth and incorporated into AneRBC-II. AneRBC-II comprises a total of 12 000 images, comprising 6000 original and 6000 anaemic RBC images. Four state-of-the-art CNN models were applied for segmentation and classification to validate the proposed dataset.
Database URL: https://data.mendeley.com/preview/hms3sjzt7f?a=4d0ba42a-cc6f-4777-adc4-2552e80db22b
Citations: 0
MiCK: a database of gut microbial genes linked with chemoresistance in cancer patients.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-21 DOI: 10.1093/database/baae124
Muhammad Shahzaib, Muhammad Muaz, Muhammad Hasnain Zubair, Masood Ur Rehman Kayani
Abstract: Cancer remains a global health challenge, with significant morbidity and mortality rates. In 2020, cancer caused nearly 10 million deaths, making it the second leading cause of death worldwide. The emergence of chemoresistance has become a major hurdle in successfully treating cancer patients. Recently, human gut microbes have been recognized for their role in modulating drug efficacy through their metabolites, ultimately leading to chemoresistance. The currently available databases are limited to knowledge regarding the interactions between gut microbiome and drugs. However, a database containing the human gut microbial gene sequences, and their effect on the efficacy of chemotherapy for cancer patients has not yet been developed. To address this challenge, we present the Microbial Chemoresistance Knowledgebase (MiCK), a comprehensive database that catalogs microbial gene sequences associated with chemoresistance. MiCK contains 1.6 million sequences of 29 gene types linked to chemoresistance and drug metabolism, curated manually from recent literature and sequence databases. The database can support downstream analysis as it provides a user-friendly web interface for sequence search and download functionalities. MiCK aims to facilitate the understanding and mitigation of chemoresistance in cancers by serving as a valuable resource for researchers.
Database URL: https://microbialchemreskb.com/
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11662283/pdf/
Citations: 0
JTIS: enhancing biomedical document-level relation extraction through joint training with intermediate steps.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-19 DOI: 10.1093/database/baae125
Jiru Li, Dinghao Pan, Zhihao Yang, Yuanyuan Sun, Hongfei Lin, Jian Wang
Abstract: Biomedical Relation Extraction (RE) is central to Biomedical Natural Language Processing and is crucial for various downstream applications. Existing RE challenges in the field of biology have primarily focused on intra-sentential analysis. However, with the rapid increase in the volume of literature and the complexity of relationships between biomedical entities, it often becomes necessary to consider multiple sentences to fully extract the relationship between a pair of entities. Current methods often fail to fully capture the complex semantic structures of information in documents, thereby affecting extraction accuracy. Therefore, unlike traditional RE methods that rely on sentence-level analysis and heuristic rules, our method focuses on extracting entity relationships from biomedical literature titles and abstracts and classifying relations that are novel findings. In our method, a multitask training approach is employed for fine-tuning a Pre-trained Language Model in the field of biology. Based on a broad spectrum of carefully designed tasks, our multitask method not only extracts relations of better quality due to more effective supervision but also achieves a more accurate classification of whether the entity pairs are novel findings. Moreover, by applying a model ensemble method, we further enhance our model's performance. The extensive experiments demonstrate that our method achieves significant performance improvements, i.e. surpassing the existing baseline by 3.94% in RE and 3.27% in Triplet Novel Typing in F1 score on BioRED, confirming its effectiveness in handling complex biomedical literature RE tasks.
Database URL: https://codalab.lisn.upsaclay.fr/competitions/13377#learn_the_details-dataset
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11658465/pdf/
Citations: 0
scEccDNAdb: an integrated single-cell eccDNA resource for human and mouse.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-18 DOI: 10.1093/database/baae126
Wenqing Wang, Xinyu Zhao, Tianyu Ma, Tengwei Zhong, Junnuo Zheng, Zhiyun Guo
Abstract: Extrachromosomal circular DNA (eccDNA), an extrachromosomal circular structured DNA, is extensively found in eukaryotes. Investigating eccDNA at the single-cell level is crucial for understanding cellular heterogeneity, evolution, development, and specific cellular functions. However, high-throughput identification methods for single-cell eccDNA are complex, and the lack of mature, widely applicable technologies has resulted in limited resources. To address this gap, we built scEccDNAdb, a database based on single-cell whole-genome sequencing data. It contains 3 195 464 single-cell eccDNA entries from human and mouse samples, with annotations including oncogenes, typical enhancers, super-enhancers, CCCTC-binding factor-binding sites, single nucleotide polymorphisms, chromatin accessibility, expression quantitative trait loci, transcription factor binding sites, motifs, and structural variants. Additionally, it provides nine online analysis and visualization tools, which enable the creation of publication-quality figures through user-uploaded files. Overall, scEccDNAdb is a comprehensive database for analyzing single-cell eccDNA data across diverse cell types, tissues, and species.
Database URL: https://lcbb.swjtu.edu.cn/scEccDNAdb/
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11654243/pdf/
Citations: 0
AthRiboNC: an Arabidopsis database for ncRNAs with coding potential revealed from ribosome profiling.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-17 DOI: 10.1093/database/baae123
Yi Shen, Liya Liu, Enyan Liu, Sida Li, Yuriy Orlov, Vladimir Ivanisenko, Ming Chen
Abstract: Non-coding RNAs (ncRNAs) are traditionally considered incapable of encoding proteins, but new evidence suggests that small open reading frames (sORFs) within ncRNAs can actually encode biologically functional small peptides. Despite growing recognition of their importance, a systematic exploration of plant ncRNAs with coding potential has remained largely uncharted territory, especially in the context of their translational activities. By collecting and analyzing Ribo-Seq data from 226 Arabidopsis thaliana samples, we have integrated extensive information on Arabidopsis ncRNAs with coding potential and developed the AthRiboNC database, a novel and dedicated database that consolidates extensive information on ncRNAs with coding potential in Arabidopsis. AthRiboNC covers detailed information on 2743 long non-coding RNAs, 255 microRNAs, and 1871 circular RNAs in Arabidopsis, along with 40 162 ORFs identified from these ncRNAs. The database also constructs co-expression networks for ncRNAs with coding potential, revealing correlations and potential biological function interpretations. With a commitment to accessibility and ease-of-use, AthRiboNC features a clear and intuitive interface. We hope that AthRiboNC will serve as a valuable resource for exploring the coding potential of plant ncRNAs.
Database URL: https://bis.zju.edu.cn/athribonc
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11651143/pdf/
Citations: 0
Probe my Pathway (PmP): a portal to explore the chemical coverage of the human Reactome.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-05 DOI: 10.1093/database/baae116
Haejin Angela Kwak, Lihua Liu, Matthieu Schapira
Abstract: Deciphering pathway-phenotype associations is critical for a system-wide understanding of cells and the chemistry of life. An approach to reach this goal is to systematically modulate pathways pharmacologically. The targeted and controlled regulation of an increasing number of proteins is becoming possible, thanks to the growing list of chemical probes and chemogenomic compounds available to cell biologists, but no resource is available that directly maps these chemical tools on cellular pathways. To fill this gap, we developed Probe my Pathway (PmP), a database where high-quality chemical probes and well-characterized sets of chemogenomic compounds are mapped on all the human pathways of the Reactome database. The web interface allows users to browse the data via icicle charts or search the data for compounds, proteins, or pathways. Chemists can rapidly find pathways with low chemical coverage or explore the structural chemistry of ligands targeting specific cellular machineries. Cell biologists can look for chemical probes targeting different proteins in the same pathway or find which pathways are targeted by chemical probes of interest. PmP is updated annually and will grow with the expanding chemical tool kit produced by Target 2035 and other efforts.
Database URL: https://apps.thesgc.org/pmp/
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630241/pdf/
Citations: 0
Toward clearer recognition and easier usefulness: development of a cross-lingual atherosclerotic cerebrovascular disease ontology.
IF 3.4, Zone 4, Biology
Database: The Journal of Biological Databases and Curation Pub Date: 2024-12-05 DOI: 10.1093/database/baae117
Hetong Ma, Liu Shen, Jiayang Wang, Shilong Wang, Min Wang, Meng Wang, Zixiao Li, Jiao Li
Abstract: Atherosclerotic cerebrovascular disease could result in a great number of deaths and disabilities. However, it did not acquire enough attention. Less information, statistics, or data on the disease has been revealed. Thus, no systematic concept datasets were released to help clinicians clarify the scope, assist research, and offer maximized value. This study aimed to develop a cross-lingual atherosclerotic cerebrovascular disease ontology; describe the workflow, schema, hierarchical structure, and the highlighted content; design a brand-new rehabilitation ontology; implement the ontology evaluation; and illustrate the application scenarios in real-world scenarios. We implemented nine steps based on the Ontology Development 101 methodologies combined with expert opinions. The ontology included collection and specification of clinical requirements, background investigation and knowledge acquisition, ontology selection and reuse, scope identification, schema definition, concept extraction, concept extension, ontology verification, and ontology evaluation. We evaluated the proposed ontology in the literature classification task. The current ontology included 10 top-level classes, respectively, clinical manifestation, comorbidity, complication, diagnosis, model of atherosclerotic cerebrovascular disease, pathogenesis, prevention, rehabilitation, risk factor, and treatment. There are 1715 concepts in the 11-level ontology, covering 4588 Chinese terms, 6617 English terms, and 972 definitions. The ontology could be applied in real-world scenarios such as information retrieval, new expression discovery, named entity recognition, and knowledge fusion, and the use case proved that it could offer satisfying support to related medical scenarios. The ontology was proven to be useful in text classification tasks, and the weight-F1 score could reach >80% combined with the pretrained model. The proposed ontology provided a clear set of cross-lingual concepts and terms with an explicit hierarchical structure, helping scientific researchers to quickly retrieve relevant medical literature, assisting data scientists to efficiently identify relevant contents in electronic health records, and providing a clear domain framework for academic reference.
Database URL: https://bioportal.bioontology.org/ontologies/ACVD_ONTOLOGY
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11630243/pdf/
Citations: 0