Enhancing late postmortem interval prediction: a pilot study integrating proteomics and machine learning to distinguish human bone remains over 15 years.

IF 4.3; Zone 2 (Biology); JCR Q1 (Biology)
Camila Garcés-Parra, Pablo Saldivia, Mauricio Hernández, Elena Uribe, Juan Román, Marcela Torrejón, José L Gutiérrez, Guillermo Cabrera-Vives, María de Los Ángeles García-Robles, William Aguilar, Miguel Soto, Estefanía Tarifeño-Saldivia
Biological Research, 57(1), article 75. Published 2024-10-24. DOI: 10.1186/s40659-024-00552-8
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11515459/pdf/

Abstract

Background: Accurately determining the postmortem interval (PMI) remains a significant challenge in forensic science, especially for intervals greater than 5 years (late PMI). Traditional methods often fail because soft tissues are extensively degraded, so estimation must rely on examination of bone material. The precision of PMI estimates declines with time: for intervals between 1 and 5 years, accuracy drops to about 50%. This study aims to address the problem by identifying key protein biomarkers through proteomics and machine learning, with the goal of improving PMI estimation for intervals exceeding 15 years.

Methods: Proteomic analysis was performed by LC-MS/MS on skeletal remains, specifically tibia and rib samples. Proteins were identified using two search strategies: a tryptic-specific search and a semitryptic search, the latter being particularly useful when proteins have degraded naturally. A Random Forest algorithm was used to model protein abundance data and predict PMI. A screening process combining importance scores and SHAP values was used to identify the most informative proteins for model training and accuracy.
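The distinction between tryptic and semitryptic peptides can be illustrated with a minimal sketch. It assumes the usual trypsin rule (cleavage C-terminal to K or R, with no cut before P) and treats protein termini as valid ends. The function names and the example sequence are hypothetical and are not taken from the study's search pipeline.

```python
def is_tryptic_site(protein: str, pos: int) -> bool:
    """Return True if a trypsin cut is expected between protein[pos-1] and protein[pos].

    The protein's own termini (pos == 0 or pos == len(protein)) count as valid ends.
    """
    if pos == 0 or pos == len(protein):
        return True
    # Assumed rule: trypsin cleaves after K or R, except when the next residue is P.
    return protein[pos - 1] in "KR" and protein[pos] != "P"


def classify_peptide(protein: str, peptide: str) -> str:
    """Classify a peptide by how many of its termini match trypsin specificity."""
    start = protein.find(peptide)  # first occurrence only, for simplicity
    if start < 0:
        return "not found"
    end = start + len(peptide)
    n_term_ok = is_tryptic_site(protein, start)
    c_term_ok = is_tryptic_site(protein, end)
    if n_term_ok and c_term_ok:
        return "tryptic"
    if n_term_ok or c_term_ok:
        return "semitryptic"
    return "nonspecific"


# Hypothetical sequence, chosen only to exercise each case.
protein = "MKTAYIAKQRQISFVK"
for pep in ("TAYIAK", "AYIAK", "AYIA"):
    print(pep, classify_peptide(protein, pep))
```

A peptide such as `AYIAK` has a tryptic C-terminus but an N-terminus produced by some other break in the chain, which is the kind of fragment a tryptic-specific search would miss and a semitryptic search can still match.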

Results: A minimal set of three biomarkers (K1C13, PGS1, and CO3A1) was identified, significantly improving the prediction accuracy between PMIs of 15 and 20 years. The model, based on protein abundance data from semitryptic peptides in tibia samples, achieved sustained 100% accuracy across 100 iterations. In contrast, unsupervised methods such as PCA and MCA did not yield comparable results. Additionally, semitryptic peptides outperformed tryptic peptides, particularly in tibia proteomes, suggesting their potential reliability in late PMI prediction.
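One way to read "accuracy across 100 iterations" is as accuracy measured over repeated random train/test splits; the abstract does not spell out the protocol, so that reading is an assumption. The sketch below shows the evaluation loop only. It swaps in a nearest-centroid classifier for the Random Forest to stay dependency-free, and it runs on synthetic values for three hypothetical markers, not on study data.

```python
import random
from statistics import mean


def make_toy_data(rng, n_per_class=6):
    """Synthetic abundances for three hypothetical markers; NOT study data."""
    data = []
    for label, centre in (("15y", 5.0), ("20y", 10.0)):
        for _ in range(n_per_class):
            features = [centre + rng.uniform(-1.0, 1.0) for _ in range(3)]
            data.append((features, label))
    return data


def stratified_split(data, rng, n_test_per_class=2):
    """Hold out the same number of samples from each class."""
    train, test = [], []
    for label in sorted({lab for _, lab in data}):
        group = [row for row in data if row[1] == label]
        rng.shuffle(group)
        test.extend(group[:n_test_per_class])
        train.extend(group[n_test_per_class:])
    return train, test


def fit_centroids(train):
    """Stand-in model: one mean feature vector per class (not a Random Forest)."""
    centroids = {}
    for label in {lab for _, lab in train}:
        rows = [feat for feat, lab in train if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(features, centre))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def repeated_accuracy(data, n_iter=100, seed=0):
    """Accuracy on the held-out samples for each of n_iter random splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iter):
        train, test = stratified_split(data, rng)
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, feat) == lab for feat, lab in test)
        scores.append(correct / len(test))
    return scores


data = make_toy_data(random.Random(42))
scores = repeated_accuracy(data)
print(f"iterations: {len(scores)}, mean: {mean(scores):.2f}, min: {min(scores):.2f}")
```

The toy classes are separated by construction, so the scores printed here say nothing about real samples. The point is the shape of the check: reporting the minimum as well as the mean across splits shows whether performance is sustained or only good on average.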

Conclusions: Despite limitations such as sample size and PMI range, this study demonstrates the feasibility of combining proteomics and machine learning for accurate late PMI predictions. Future research should focus on broader PMI ranges and various bone types to further refine and standardize forensic proteomic methodologies for PMI estimation.

Source journal: Biological Research (Biology)
CiteScore: 10.10
Self-citation rate: 0.00%
Articles published: 33
Review time: >12 weeks
About the journal: Biological Research is an open access, peer-reviewed journal that encompasses diverse fields of experimental biology, such as biochemistry, bioinformatics, biotechnology, cell biology, cancer, chemical biology, developmental biology, evolutionary biology, genetics, genomics, immunology, marine biology, microbiology, molecular biology, neuroscience, plant biology, physiology, stem cell research, structural biology and systems biology.