TILD:利用文献数据中的标题信息识别癌症相关基因的策略

Jeongwoo Kim, H. Kim, Yunku Yeu, Mincheol Shin, Sanghyun Park
{"title":"TILD:利用文献数据中的标题信息识别癌症相关基因的策略","authors":"Jeongwoo Kim, H. Kim, Yunku Yeu, Mincheol Shin, Sanghyun Park","doi":"10.1145/2665970.2665992","DOIUrl":null,"url":null,"abstract":"After genome project in 1990s, researches which are involved with gene have been progressed. These studies unearthed that gene is cause of disease, and relations between gene and disease are important. In this reason, we proposed a strategy called TILD that identifies cancer-related genes using title information in literature data. To implement our method, we selected cancer-specific literature data from the online database. We then extracted genes using text mining. In the next step, we classified into two kinds for extracted genes using title information. If genes are located in title, then they are classified as hub genes. In the contrast, if genes are located in body, then they are classified as sub genes which are connected with hub genes. We iterated the processes for each paper to construct the cancer-specific local gene network. In the last step, we constructed global cancer-specific gene network by integrating all local gene network, and calculated a score for each gene based on analysis of the global gene network. We assumed that genes in title have meaningful relations with cancer, and other genes in the body are related with the title genes. For validation, we compared with other methods for the top 20 genes inferred by each approach. Our approach found more cancer-related genes than comparable methods.","PeriodicalId":143937,"journal":{"name":"Data and Text Mining in Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2014-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"TILD: A Strategy to Identify Cancer-related Genes Using Title Information in Literature Data\",\"authors\":\"Jeongwoo Kim, H. Kim, Yunku Yeu, Mincheol Shin, Sanghyun Park\",\"doi\":\"10.1145/2665970.2665992\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"After genome project in 1990s, researches which are involved with gene have been progressed. These studies unearthed that gene is cause of disease, and relations between gene and disease are important. In this reason, we proposed a strategy called TILD that identifies cancer-related genes using title information in literature data. To implement our method, we selected cancer-specific literature data from the online database. We then extracted genes using text mining. In the next step, we classified into two kinds for extracted genes using title information. If genes are located in title, then they are classified as hub genes. In the contrast, if genes are located in body, then they are classified as sub genes which are connected with hub genes. We iterated the processes for each paper to construct the cancer-specific local gene network. In the last step, we constructed global cancer-specific gene network by integrating all local gene network, and calculated a score for each gene based on analysis of the global gene network. We assumed that genes in title have meaningful relations with cancer, and other genes in the body are related with the title genes. For validation, we compared with other methods for the top 20 genes inferred by each approach. Our approach found more cancer-related genes than comparable methods.\",\"PeriodicalId\":143937,\"journal\":{\"name\":\"Data and Text Mining in Bioinformatics\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Data and Text Mining in Bioinformatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2665970.2665992\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data and Text Mining in Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2665970.2665992","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

20世纪90年代基因组计划后,涉及基因的研究有了新的进展。这些研究揭示了基因是疾病的原因,基因与疾病之间的关系是重要的。因此,我们提出了一种名为TILD的策略,利用文献数据中的标题信息识别癌症相关基因。为了实现我们的方法,我们从在线数据库中选择了癌症特异性文献数据。然后我们使用文本挖掘提取基因。在接下来的步骤中,我们使用标题信息将提取的基因分为两类。如果基因位于标题中,则将其分类为枢纽基因。相反,如果基因位于体内,则将其归类为亚基因,亚基因与枢纽基因相连。我们为每篇论文重复了构建癌症特异性局部基因网络的过程。最后一步,我们通过整合所有局部基因网络构建全球癌症特异性基因网络,并在分析全球基因网络的基础上计算每个基因的得分。我们假设标题中的基因与癌症有意义的关系,而体内的其他基因也与标题基因有关。为了验证,我们将每种方法推断的前20个基因与其他方法进行了比较。我们的方法比同类方法发现了更多的癌症相关基因。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
TILD: A Strategy to Identify Cancer-related Genes Using Title Information in Literature Data
After genome project in 1990s, researches which are involved with gene have been progressed. These studies unearthed that gene is cause of disease, and relations between gene and disease are important. In this reason, we proposed a strategy called TILD that identifies cancer-related genes using title information in literature data. To implement our method, we selected cancer-specific literature data from the online database. We then extracted genes using text mining. In the next step, we classified into two kinds for extracted genes using title information. If genes are located in title, then they are classified as hub genes. In the contrast, if genes are located in body, then they are classified as sub genes which are connected with hub genes. We iterated the processes for each paper to construct the cancer-specific local gene network. In the last step, we constructed global cancer-specific gene network by integrating all local gene network, and calculated a score for each gene based on analysis of the global gene network. We assumed that genes in title have meaningful relations with cancer, and other genes in the body are related with the title genes. For validation, we compared with other methods for the top 20 genes inferred by each approach. Our approach found more cancer-related genes than comparable methods.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信