Essential proteins discovery based on dominance relationship and neighborhood similarity centrality.

IF 4.7 | Zone 3, Medicine | JCR Q1, Medical Informatics
Health Information Science and Systems Pub Date : 2023-11-16 eCollection Date: 2023-12-01 DOI:10.1007/s13755-023-00252-9
Gaoshi Li, Xinlong Luo, Zhipeng Hu, Jingli Wu, Wei Peng, Jiafei Liu, Xiaoshu Zhu
Health Information Science and Systems, 11(1): 55. Journal Article.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654316/pdf/
Link: https://doi.org/10.1007/s13755-023-00252-9
Citations: 0

Essential proteins discovery based on dominance relationship and neighborhood similarity centrality.

Essential proteins play a vital role in the development and reproduction of cells, and identifying them helps explain the basic requirements for cell survival. Because biological experiments for discovering essential proteins are time-consuming, costly, and inefficient, computational methods have gained increasing attention. Early computational methods identified essential proteins mainly through centrality measures on protein-protein interaction (PPI) networks, but the many false positives in PPI networks limit their identification rate. In this study, a purified PPI network is first introduced to reduce the impact of these false positives. Second, by analyzing the similarity between a protein and its neighbors in the PPI network, a new centrality called neighborhood similarity centrality (NSC) is proposed. Third, a subcellular localization score and an ortholog score are calculated for each protein from subcellular localization data and orthologous data, respectively. Fourth, an analysis of a large number of multi-feature fusion methods reveals a special relationship among features, called the dominance relationship, and a novel model based on this relationship is proposed. Finally, NSC, the subcellular localization score, and the ortholog score are fused by the dominance relationship model to give a new method called NSO. To verify its performance, NSO is compared with seven representative methods (ION, NCCO, E_POC, SON, JDC, PeC, WDC) on yeast datasets. The experimental results show that NSO achieves a higher identification rate than the other methods.
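
The abstract names the ingredients of NSO but gives neither the NSC formula nor the definition of the dominance relationship. As a rough illustration of the general idea only, the sketch below makes two loudly stated assumptions: Jaccard similarity of closed neighborhoods stands in for the "similarity between a protein and its neighbors", and Pareto dominance over the three scores stands in for the "dominance relationship". The function names, the toy network, and the localization and ortholog scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple

Graph = Dict[str, Set[str]]  # protein -> set of interacting proteins


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighborhood_similarity(graph: Graph) -> Dict[str, float]:
    """ASSUMPTION: score each protein by summing, over its neighbors, the
    Jaccard similarity of their closed neighborhoods. This is a stand-in,
    not the NSC formula from the paper."""
    closed = {u: nbrs | {u} for u, nbrs in graph.items()}
    return {
        u: sum(jaccard(closed[u], closed[v]) for v in nbrs)
        for u, nbrs in graph.items()
    }


def dominates(x: Tuple[float, ...], y: Tuple[float, ...]) -> bool:
    """ASSUMPTION: Pareto dominance, i.e. x is at least as good as y on
    every feature and strictly better on at least one."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def dominance_count(features: Dict[str, Tuple[float, ...]]) -> Dict[str, int]:
    """Count how many other proteins each protein dominates."""
    return {
        p: sum(dominates(fp, fq) for q, fq in features.items() if q != p)
        for p, fp in features.items()
    }


# Toy PPI network and made-up localization / ortholog scores.
ppi: Graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
loc_score = {"A": 0.9, "B": 0.4, "C": 0.8, "D": 0.1}
orth_score = {"A": 0.5, "B": 0.5, "C": 1.0, "D": 0.0}

nsc = neighborhood_similarity(ppi)
features = {p: (nsc[p], loc_score[p], orth_score[p]) for p in ppi}
ranking = sorted(dominance_count(features).items(), key=lambda kv: -kv[1])
print(ranking)
```

Counting dominated proteins is only one simple way to turn a dominance relation into a ranking; the paper's fusion model may weight or combine the features quite differently.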

Source journal
CiteScore: 11.30
Self-citation rate: 5.00%
Articles published: 30
About the journal: Health Information Science and Systems is a multidisciplinary journal that integrates artificial intelligence, computer science, and information technology with health science and services. It embraces information science research on the modeling, design, development, integration, and management of health information systems, smart health, artificial intelligence in medicine, computer-aided diagnosis, and medical expert systems. The scope includes: (i) smart health, artificial intelligence in medicine, computer-aided diagnosis, medical image processing, and medical expert systems; (ii) medical big data and medical, health, and biomedical information resources such as patient medical records, devices and equipment, and software and tools to capture, store, retrieve, process, analyze, and optimize the use of information in the health domain; (iii) data management, data mining, and knowledge discovery, all of which play a key role in decision making, management of public health, examination of standards, and privacy and security issues; (iv) development of new architectures and applications for health information systems.