The DIEGO Lab Graph Based Gene Normalization System

R. Sullivan, Robert Leaman, Graciela Gonzalez
Published in: 2011 10th International Conference on Machine Learning and Applications and Workshops
Publication date: 2011-12-18
DOI: 10.1109/ICMLA.2011.140 (https://doi.org/10.1109/ICMLA.2011.140)
Citations: 6

Abstract

Gene entity normalization, the mapping of a gene mention in free text to a unique identifier, is one of the primary subtasks in the biomedical information extraction pipeline. It poses many challenges, notably the high ambiguity of gene names and the many-to-many relationship between gene names and identifiers. Drawing inspiration from recent work in word sense disambiguation, this paper presents a gene entity normalization system based on entity relationship graphs. The system builds a concept graph from the candidate entities and their relationships within a full-text document, then applies a node ranking algorithm to score and rank each candidate entity. The system is a prototype intended to demonstrate one specific approach to gene normalization, and its results reflect that. Nevertheless, it shows that the relationship-graph approach, which has a theoretical grounding, can potentially be useful for gene normalization and possibly for the normalization of various biomedical entities.
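The graph-and-ranking idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the node ranking algorithm or the source of the entity relationships, so everything concrete here is an assumption: PageRank (computed by plain power iteration) stands in for the ranking step, the identifiers such as `GENE:A1` are hypothetical, and the edge list is an invented set of "known relationships" between candidate entities. It is not the authors' implementation.

```python
from collections import defaultdict


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over an undirected graph.

    Assumption: PageRank is used only as a stand-in for the unnamed
    node ranking algorithm mentioned in the abstract.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by isolated nodes is redistributed uniformly.
        dangling = sum(score[v] for v in nodes if not neighbors[v])
        updated = {}
        for v in nodes:
            incoming = sum(score[u] / len(neighbors[u]) for u in neighbors[v])
            updated[v] = (1 - damping) / n + damping * (incoming + dangling / n)
        score = updated
    return score


def normalize(candidates, edges):
    """For each mention, choose the candidate identifier with the highest score."""
    nodes = sorted({c for ids in candidates.values() for c in ids})
    score = pagerank(nodes, edges)
    return {
        mention: max(ids, key=lambda c: score[c])
        for mention, ids in candidates.items()
    }


# Hypothetical document: each mention maps to one or more candidate identifiers.
candidates = {
    "p53": ["GENE:A1", "GENE:A2"],
    "MDM2": ["GENE:B1"],
    "CDKN1A": ["GENE:C1", "GENE:C2"],
}

# Hypothetical relationships between candidates of *different* mentions.
edges = [
    ("GENE:A1", "GENE:B1"),
    ("GENE:A1", "GENE:C1"),
    ("GENE:B1", "GENE:C1"),
]

result = normalize(candidates, edges)
print(result)
```

In this toy graph, the candidates that are mutually related reinforce one another and outrank the isolated alternatives, which is the intuition borrowed from graph-based word sense disambiguation. Candidates for the same mention are deliberately left unconnected so that competing readings do not vote for each other.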