Data and Text Mining in Bioinformatics: Latest Articles

Pathway-based classification of brain activities for Alzheimer's disease analysis
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512093
Jongan Lee, Younghoon Kim, Y. Jeong, D. Na, Kwang-H. Lee, Doheon Lee
Abstract: The advent of resting-state (RS) functional magnetic resonance imaging (fMRI) technology has made it possible to classify Alzheimer's disease (AD) states based on quantitative activity indices of brain regions. Current connectivity-based classification techniques suffer from limited reproducibility because they need prior knowledge of discriminative brain regions and because AD progression is intrinsically heterogeneous. Similar challenges have already been addressed in the molecular bioinformatics community, which has achieved higher and more reproducible classification accuracy and identified interpretable markers by incorporating molecular pathway information into classification. We apply a similar strategy to the RS-fMRI-based AD classification problem. After collecting various functional brain pathways from the literature, we quantified which pathways show significantly different activity levels between AD patients and healthy subjects. Moreover, pathways that discriminate between AD patients and healthy subjects may facilitate the interpretation of functional alterations in the course of AD progression.
Citations: 2
Exploring the effectiveness of medical entity recognition for clinical information retrieval
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512091
J. Cogley, N. Stokes, J. Carthy
Abstract: The growth of medical and clinical textual datasets has fostered research interest in methods for storing, retrieving, and extracting pertinent data. In recent years, shared tasks and more comprehensive data sharing agreements have brought further growth in research spanning Natural Language Processing (NLP) and Information Retrieval (IR) to aid healthcare. NLP applications such as Medical Entity Recognition (MER) are frequently motivated by the goal of improving IR system performance. In this paper, we investigate the application of MER to a clinical retrieval system in the context of shared tasks in the respective areas. Namely, we aim to add structure to previously unstructured clinical reports and query sets. We evaluate the performance of MER on the query set, highlighting issues in constructing queries in a clinical setting. We further evaluate the performance of structured queries on a retrieval dataset. We find that while structuring queries improves performance on complex queries that contain many term dependencies, there is a larger issue of linguistic variation in clinical texts that must also be addressed.
Citations: 7
BoDBES: a boosted dictionary-based biomedical entity spotter
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512098
Min Song, Wook-Shin Han, Hwanjo Yu
Abstract: To measure the impact of different sources on the performance of entity extraction, we used three data sources: 1) GENIA, 2) MeSH Tree, and 3) UMLS. Performance is measured by F1. In the performance comparison among three approaches on the GENIA+MeSH dictionary, BoDBES is slightly better than SPED on all three datasets, whereas the context-only option shows the worst performance.
Citations: 3
BSML: bio-synergy modeling language for multi-component and multi-target analysis
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512097
W. Hwang, Jaejoon Choi, J. Jung, Doheon Lee
Abstract: Multi-compound drugs are considered the most promising solution to overcome the limited efficacy and off-target effects of drugs. However, identifying promising multiple compounds by experimental tests requires prohibitive cost and a large number of tests. Systems biology-based approaches are regarded as one of the most promising strategies, and predicting drug responses in biological systems is one of the aims of systems biology. We developed the Bio-Synergy Modeling Language (BSML) for modeling biological systems, which are multi-scale systems. BSML contains context information that covers spatial scales, temporal scales, and condition information such as disease. We applied BSML to generate a type 2 diabetes (T2D) model, which involves malfunctions of numerous organs such as the pancreas, liver, and muscle. We automatically extracted 12,522 T2D-related rules from public databases. We simulated responses of single drugs and combination drugs on the T2D model using Petri nets. The results of our simulation show candidate T2D drugs and how combination drugs could act on whole-body scales. We expect that our work will provide insight for identifying promising combination drugs and their mechanisms on whole-body scales.
Citations: 3
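The BSML abstract above mentions simulating drug responses with Petri nets but gives no detail. As a rough illustration only, the following sketch shows the basic token-firing rule of a Petri net. The place and transition names (drug_a, drug_b, pathway_blocked, combined_inhibition) are hypothetical and are not taken from BSML or its T2D model.

```python
from typing import Dict, List, NamedTuple


class Transition(NamedTuple):
    name: str
    inputs: Dict[str, int]   # place -> tokens consumed when firing
    outputs: Dict[str, int]  # place -> tokens produced when firing


def enabled(marking: Dict[str, int], t: Transition) -> bool:
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n for place, n in t.inputs.items())


def fire(marking: Dict[str, int], t: Transition) -> Dict[str, int]:
    """Return the new marking after firing t; the input marking is not modified."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name!r} is not enabled")
    new = dict(marking)
    for place, n in t.inputs.items():
        new[place] = new.get(place, 0) - n
    for place, n in t.outputs.items():
        new[place] = new.get(place, 0) + n
    return new


def simulate(marking: Dict[str, int], transitions: List[Transition],
             max_steps: int = 100) -> Dict[str, int]:
    """Fire the first enabled transition repeatedly until none is enabled."""
    current = dict(marking)
    for _ in range(max_steps):
        ready = [t for t in transitions if enabled(current, t)]
        if not ready:
            break
        current = fire(current, ready[0])
    return current


# Hypothetical example: a pathway is blocked only when both compounds are present.
combo = Transition("combined_inhibition",
                   inputs={"drug_a": 1, "drug_b": 1},
                   outputs={"pathway_blocked": 1})
final = simulate({"drug_a": 2, "drug_b": 1}, [combo])
print(final)
```

A transition with two input places is one simple way to express a synergy that needs both compounds; how the actual model encodes its rules is not stated in the abstract.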
Breast and prostate cancer expression similarity analysis by iterative SVM based ensemble gene selection
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512099
Darius Coelho, Lee Sael
Abstract: Epidemiologic and phenotypic evidence indicates that breast and prostate cancers have high pathological similarity. Analyzing pathological similarities between cancers can be beneficial in several respects, such as enabling knowledge transfer between cancer studies. To learn about the similarity between breast and prostate cancer pathology, we investigate common genes that are affected by the two carcinomas. Gene expression data extracted from RNA-seq experiments, provided through the TCGA consortium, is used for gene selection. Gene selection was performed using an iterative SVM-based ensemble feature selection approach. Iterative SVM-based gene selection methods allow correlated gene expressions to be considered simultaneously, and the ensemble approach stabilizes the selection. As a result of the analysis, two genes, Transglutaminase 4 (TGM4) and complement component 4A (C4A), were selected as commonly altered genes. Direct relationships of the two genes to the two cancers are not confirmed. However, TGM4 is known to be associated with adenocarcinomas and C4A with ovarian cancer. This provides evidence that they may be pathologically important genes for the two cancers.
Citations: 3
Efficient local ligand-binding site search using landmark MDS
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512092
Sungchul Kim, Lee Sael, Hwanjo Yu
Abstract: In this work, we propose a new local binding site search system, called Fast Patch-Surfer, which extends previous work, Patch-Surfer. Patch-Surfer efficiently retrieves the top-k similar proteins based on a new representation of proteins that captures features of their local ligand-binding sites and on a newly defined distance function. However, further speed-up is needed because, in practical settings of computing dissimilarity between proteins, multiple users may access the database simultaneously. We address this need in local ligand-binding site search by exploiting landmark multidimensional scaling (MDS), an efficient version of MDS that is widely used for representing high-dimensional datasets. According to the results, our method reduces search time by up to 99% and retrieves almost 80% of the exact top-k results.
Citations: 4
Combining dictionaries and ontologies for drug name recognition in biomedical texts
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512100
Daniel Sánchez-Cisneros, Paloma Martínez, Isabel Segura-Bedmar
Abstract: Two approaches have commonly been used for recognizing drug name entities in biomedical texts: machine learning-based approaches and approaches based on domain-specific resources. In this work we focus on the second by combining (1) a dictionary-based approach that collects terms from different pharmacological data sources such as DrugBank, MeSH, RxNorm and the ATC index; and (2) an ontology-based approach that maps each text unit of a source text to one or more domain-specific concepts, providing rich semantic knowledge of domain name entities using the MetaMap and Mgrep analyzers. The aim is to take advantage of the best of each resource used. The combined system obtains an F1 measure of 0.667 under exact-matching span evaluation.
Citations: 14
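The drug name recognition abstract above refers to dictionary-based matching and to F1 under exact-matching span evaluation. The sketch below illustrates both ideas in their simplest form. It is not the authors' system: the function names, the longest-match-first rule, and the tiny dictionary are assumptions made for illustration, and no real DrugBank, MeSH, RxNorm or ATC data is used.

```python
import re
from typing import Iterable, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def spot_terms(text: str, dictionary: Set[str]) -> List[Span]:
    """Find dictionary terms in text, case-insensitively, on word boundaries.

    Longer terms are tried first, and a match is kept only if it does not
    overlap a span that was already accepted (an assumed tie-breaking rule).
    """
    spans: List[Span] = []
    for term in sorted(dictionary, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            if all(m.end() <= s or m.start() >= e for s, e in spans):
                spans.append((m.start(), m.end()))
    return sorted(spans)


def exact_span_f1(predicted: Iterable[Span], gold: Iterable[Span]) -> float:
    """F1 where a prediction counts only if both offsets match a gold span."""
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy data, for illustration only.
text = "Aspirin and acetylsalicylic acid interact with warfarin."
toy_dictionary = {"aspirin", "warfarin", "acid", "acetylsalicylic acid"}
found = spot_terms(text, toy_dictionary)
print([text[s:e] for s, e in found])
```

Under exact-span evaluation a prediction that covers only part of a gold mention earns no credit, which is why the longest-match rule matters for multi-word names.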
Bayesian variable selection for linear regression in high dimensional microarray data
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512094
Wellington Cabrera, C. Ordonez, D. S. Matusevich, V. Baladandayuthapani
Abstract: Variable selection is a fundamental problem in Bayesian statistics whose solution requires exploring a combinatorial search space. We study the solution of variable selection with a well-known MCMC method, which requires thousands of iterations. We present several algorithmic optimizations that accelerate the MCMC method so that it works efficiently inside a database system. Our optimizations include sufficient statistics, variable preselection, hash tables, and calls to a linear algebra library. We present experiments with very high dimensional microarray data sets to predict cancer survival time. We discuss encouraging findings, identifying specific genes likely to predict survival time for brain cancer patients. We also show that our DBMS-based algorithm is orders of magnitude faster than the R statistical package. Our work shows that a DBMS is a promising platform for analyzing microarray data.
Citations: 4
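The Bayesian variable selection abstract above lists sufficient statistics and hash tables among its optimizations without describing them. One plausible reading, sketched here purely as an assumption, is that a single pass over the data accumulates the cross-product matrix of all variables and the response, so that any candidate subset of variables can later be scored from sub-blocks of that matrix, with scores cached by subset. The function names, the no-intercept least-squares score, and the toy rows are all hypothetical; the MCMC sampler itself is not shown.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence

Matrix = List[List[float]]


def sufficient_stats(rows: Sequence[Sequence[float]]) -> Matrix:
    """One pass over rows z = (x_1, ..., x_d, y); returns Q = sum of z z^T."""
    width = len(rows[0])
    q = [[0.0] * width for _ in range(width)]
    for z in rows:
        for i in range(width):
            for j in range(width):
                q[i][j] += z[i] * z[j]
    return q


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting; assumes a is non-singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss_for_subset(q: Matrix, subset: Iterable[int],
                   cache: Dict[FrozenSet[int], float]) -> float:
    """Residual sum of squares of least squares on the chosen columns.

    Uses only sub-blocks of Q, so the raw data is never rescanned; results
    are memoized in a dict keyed by the subset (an assumed use of hashing).
    """
    key = frozenset(subset)
    if key not in cache:
        idx = sorted(key)
        y = len(q) - 1  # the response is stored as the last column
        xtx = [[q[i][j] for j in idx] for i in idx]
        xty = [q[i][y] for i in idx]
        beta = solve(xtx, xty)
        cache[key] = q[y][y] - sum(b * v for b, v in zip(beta, xty))
    return cache[key]


# Toy rows (x1, x2, y) with y = 2 * x1 exactly.
rows = [(1.0, 0.0, 2.0), (2.0, 1.0, 4.0), (3.0, 0.0, 6.0), (4.0, 1.0, 8.0)]
q = sufficient_stats(rows)
cache: Dict[FrozenSet[int], float] = {}
print(rss_for_subset(q, {0}, cache), rss_for_subset(q, {1}, cache))
```

Because a sampler revisits the same subsets many times over thousands of iterations, scoring from precomputed statistics and caching by subset avoids repeated scans of the data.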
Translating a trillion points of data into therapies, diagnostics, and new insights into disease
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512102
A. Butte
Abstract: There is an urgent need to translate genome-era discoveries into clinical utility, but the difficulties in making bench-to-bedside translations have been well described. The nascent field of translational bioinformatics may help. Dr. Butte's lab at Stanford builds and applies tools that convert more than a trillion points of molecular, clinical, and epidemiological data, measured by researchers and clinicians over the past decade, into diagnostics, therapeutics, and new insights into disease. Dr. Butte, a bioinformatician and pediatric endocrinologist, will highlight his lab's work on using publicly available molecular measurements to find new uses for drugs, including drug repositioning for inflammatory bowel disease, discovering new treatable inflammatory mechanisms of disease in type 2 diabetes, and the evaluation of patients presenting with whole genomes sequenced.
Citations: 1
Refining health outcomes of interest using formal concept analysis and semantic query expansion
Data and Text Mining in Bioinformatics Pub Date : 2013-11-01 DOI: 10.1145/2512089.2512095
Olivier Curé, H. Maurer, N. Shah, P. LePendu
Abstract: Clinicians and researchers using Electronic Health Records (EHRs) often search for, extract, and analyze groups of patients by defining a Health Outcome of Interest (HOI), which may include a set of diseases, conditions, signs, or symptoms. In our work on pharmacovigilance using clinical notes, for example, we use a method that operates over many (potentially hundreds of) ontologies at once, expands the input query, and increases the search space over clinical text as well as structured data. This method requires specifying an initial set of seed concepts, based on concept unique identifiers from the UMLS Metathesaurus. In some cases, such as progressive multifocal leukoencephalopathy, the seed query is easy to specify, but in other cases, such as chronic obstructive pulmonary disease, the task is more subtle and requires labor-intensive manual work. The challenge in defining an HOI arises because medical and health terminologies are numerous and complex. We have developed a method that combines Semantic Query Expansion, to leverage the hierarchical structure of ontologies, with Formal Concept Analysis, to organize, reason over, and prune discovered concepts efficiently across a large number of ontologies. Together, they assist the user, through a RESTful API and a web-based graphical user interface, in defining a seed query and in refining the expanded search space that it encompasses. In this context, end-user interactions mainly consist of accepting or rejecting system propositions and can be stopped whenever the user wishes. We use this approach for text-mining clinical notes from EHRs, but it is equally applicable to cohort-building tools in general. A preliminary evaluation of this work on the i2b2 Obesity NLP reference set shows sensitivity and specificity that slightly improve on existing results for this gold standard. The experiment also shows that our semi-automatic approach provides fast processing times (on the order of milliseconds to a few seconds) for generating several thousand potential terms. The most promising aspect of this approach is the discovery of potentially positive results from false negative concepts discovered by our method. In future work, we aim to conduct a user-driven evaluation of the web interface, analyze physicians' acceptance and rejection decisions in several practical scenarios, and use active learning over past query refinements to improve future queries.
Citations: 3
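The health outcomes abstract above describes expanding a set of seed concepts through the hierarchy of ontologies while the user accepts or rejects proposed concepts. The sketch below shows only the hierarchical expansion step with a simple rejection filter. It is an illustration under stated assumptions: the function name, the child-lookup dictionary, and the concept labels are hypothetical, the toy hierarchy is not taken from any real ontology, and Formal Concept Analysis is not shown.

```python
from collections import deque
from typing import Dict, FrozenSet, Iterable, List, Set


def expand_query(seeds: Iterable[str],
                 children: Dict[str, List[str]],
                 rejected: FrozenSet[str] = frozenset()) -> Set[str]:
    """Breadth-first expansion of seed concepts to all of their descendants.

    A rejected concept is skipped and not expanded, so its descendants are
    pruned as well unless they are reachable through another accepted parent.
    """
    expanded: Set[str] = set()
    queue = deque(s for s in seeds if s not in rejected)
    while queue:
        concept = queue.popleft()
        if concept in expanded:
            continue
        expanded.add(concept)
        for child in children.get(concept, []):
            if child not in rejected and child not in expanded:
                queue.append(child)
    return expanded


# Toy hierarchy with made-up labels, for illustration only.
toy_children = {
    "outcome_root": ["condition_a", "condition_b"],
    "condition_a": ["condition_a1", "condition_a2"],
}
print(sorted(expand_query(["outcome_root"], toy_children)))
print(sorted(expand_query(["outcome_root"], toy_children,
                          rejected=frozenset({"condition_a"}))))
```

Pruning a whole subtree on one rejection keeps the number of user decisions small, which fits the abstract's description of interactions that the user can stop at any time.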