Evaluation of the False Discovery Rate in Library-Free Search by DIA-NN Using In Vitro Human Proteome
Kongxin Gu, Masanaga Kenko, Koji Ogawa, Naoki Goshima, Takeshi Masuda, Shingo Ito, Sumio Ohtsuki
Journal of Proteome Research, pp. 3874-3883. Published 2025-08-01 (Epub 2025-07-18). DOI: 10.1021/acs.jproteome.5c00036 (https://doi.org/10.1021/acs.jproteome.5c00036)
Recently, deep-learning-based in silico spectral libraries have gained increasing attention. Several data-independent acquisition (DIA) software tools have integrated this feature, known as a library-free search, thereby making DIA analysis more accessible. However, controlling the false discovery rate (FDR) is challenging owing to the vast amount of peptide information in in silico libraries. In this study, we introduced a stringent method to evaluate FDR control in DIA software. Recombinant proteins were synthesized from full-length human cDNA libraries and analyzed using liquid chromatography-mass spectrometry and DIA software. The results were compared with the known protein sequences to calculate the FDR. Notably, we compared the identification performance of DIA-NN versions 1.8.1, 1.9.2, and 2.1.0. Versions 1.9.2 and 2.1.0 identified more peptides than version 1.8.1 while using a more conservative identification approach, which significantly improved FDR control. Across the synthesized recombinant protein mixtures, the average FDR at the precursor level was 0.538% for version 1.8.1, 0.389% for version 1.9.2, and 0.385% for version 2.1.0; at the protein level, the FDRs were 2.85%, 1.81%, and 1.81%, respectively. Collectively, our data set provides valuable insights for comparing FDR control across DIA software and can aid bioinformaticians in enhancing their tools.
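The abstract does not spell out how identifications were matched against the known sequences, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes that an identified peptide counts as a false discovery when it occurs in none of the known recombinant protein sequences in the mixture. The function name `empirical_fdr` and the toy sequences are hypothetical and are not taken from the study or from DIA-NN.

```python
def empirical_fdr(identified_peptides, known_protein_sequences):
    """Return the fraction of identified peptides that match no known protein.

    Assumption (not from the paper): a peptide is a true identification if it
    appears as an exact substring of at least one known protein sequence.
    """
    if not identified_peptides:
        return 0.0
    false_hits = [
        pep
        for pep in identified_peptides
        if not any(pep in seq for seq in known_protein_sequences)
    ]
    return len(false_hits) / len(identified_peptides)


# Toy data, invented for illustration only.
known = ["MKTAYIAKQRQISFVKSHFSRQ", "GSHMLEDPVRAAGTK"]
peptides = ["TAYIAK", "QISFVK", "LEDPVR", "AAAAAK"]

fdr = empirical_fdr(peptides, known)
print(f"Empirical FDR: {fdr:.1%}")
```

Here three of the four toy peptides occur in a known sequence, so one in four identifications is counted as false. A real evaluation would also have to decide how to treat isobaric residues, contaminants, and protein-level grouping, none of which this sketch addresses.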
About the journal:
Journal of Proteome Research publishes content encompassing all aspects of global protein analysis and function, including the dynamic aspects of genomics, spatio-temporal proteomics, metabonomics and metabolomics, clinical and agricultural proteomics, as well as advances in methodology including bioinformatics. The theme and emphasis are on a multidisciplinary approach to the life sciences through the synergy between the different types of "omics".