Mining Proteome Research Reports: A Bird's Eye View.

IF 4 Q2 BIOCHEMISTRY & MOLECULAR BIOLOGY
Proteomes Pub Date : 2021-06-10 DOI:10.3390/proteomes9020029
Jagajjit Sahu
Citations: 1

Abstract

Data have grown so complex that scientists in every field face a persistent data-management challenge. Modern analytical approaches, aided by freely available tools and programming languages, have made it easier to access both the broader context of a domain and the specific works reported in it. This article attempts a systematic text-mining analysis of all reports on the proteome available in PubMed. The work combines scientometrics with information extraction to describe publication trends, frequent keywords, bioconcepts and, most importantly, a gene-gene co-occurrence network. Of the 33,028 PMIDs collected initially, the distribution of 24,350 articles across 28 Medical Subject Headings (MeSH) was analyzed and plotted. Keyword link network and density visualizations were produced for the 1000 most frequent MeSH keywords. Using PubTator, 322,026 bioconcepts were extracted in 10 classes (such as Gene, Disease and CellLine). Co-occurrence networks were constructed for PMID-bioconcept and bioconcept-bioconcept associations. In the gene-gene co-occurrence subnetwork, 11,100 unique genes participated, with mTOR and AKT showing the highest number of connections (64). The gene p53 was the most prominent in the network by both degree and weighted degree centrality, at 425 and 1414, respectively. The study thus combines bibliometrics and scientific data mining methods for a whole-scale analysis of the available literature on the proteome.
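The gene-gene co-occurrence network and the two centrality measures named in the abstract can be illustrated with a minimal sketch. It assumes that two genes co-occur when they are annotated in the same article, that degree is the number of distinct partners of a gene, and that weighted degree is the sum of its co-occurrence counts; the abstract does not state the tooling or exact definitions used. The function names and the toy PMIDs and gene annotations below are hypothetical and are not data from the paper.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(genes_by_pmid):
    """Count, for each unordered gene pair, how many articles mention both genes."""
    edges = Counter()
    for genes in genes_by_pmid.values():
        # Deduplicate and sort so each pair gets one canonical (a, b) key per article.
        for a, b in combinations(sorted(set(genes)), 2):
            edges[(a, b)] += 1
    return edges


def degree_measures(edges):
    """Return (degree, weighted_degree) dictionaries keyed by gene."""
    degree = defaultdict(int)
    weighted = defaultdict(int)
    for (a, b), weight in edges.items():
        for gene in (a, b):
            degree[gene] += 1         # one more distinct partner
            weighted[gene] += weight  # add the co-occurrence count of this edge
    return dict(degree), dict(weighted)


# Toy input (hypothetical): article identifier -> genes annotated in that article.
toy_annotations = {
    "pmid_1": ["TP53", "MTOR", "AKT1"],
    "pmid_2": ["TP53", "MTOR"],
    "pmid_3": ["TP53", "EGFR"],
}

edges = cooccurrence_edges(toy_annotations)
degree, weighted = degree_measures(edges)

# Rank genes by degree, breaking ties by weighted degree.
ranking = sorted(degree, key=lambda g: (degree[g], weighted[g]), reverse=True)
print(ranking[0], degree[ranking[0]], weighted[ranking[0]])
```

In this toy network TP53 ranks first because it co-occurs with three distinct genes, and its weighted degree is higher still because one of those pairs appears in two articles. The real analysis would feed PubTator gene annotations for the collected PMIDs into the same kind of pair counting.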

Source journal
Proteomes (Biochemistry, Genetics and Molecular Biology: Clinical Biochemistry)
CiteScore: 6.50
Self-citation rate: 3.00%
Articles published: 37
Review time: 11 weeks
Journal description: Proteomes (ISSN 2227-7382) is an open access, peer-reviewed journal on all aspects of proteome science. Proteomes covers the multidisciplinary topics of structural and functional biology, protein chemistry, cell biology, and the methodology used for protein analysis, including mass spectrometry, protein arrays, bioinformatics, HTS assays, etc. Its aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible; therefore, there is no restriction on the length of papers. Scope: whole proteome analysis of any organism; disease/pharmaceutical studies; comparative proteomics; protein-ligand/protein interactions; structure/functional proteomics; gene expression; methodology; bioinformatics; applications of proteomics.