DeepForest-HTP: A novel deep forest approach for predicting antihypertensive peptides

IF 4.9 | Zone 2 (Medicine) | JCR Q1, Computer Science, Interdisciplinary Applications
Qiyuan Bai, Hao Chen, Wenshuo Li, Lei Li, Junhao Li, Zhen Gao, Yuan Li, Xuhua Li, Bing Song
Computer Methods and Programs in Biomedicine, vol. 258, Article 108514. Published 2024-11-12. Journal article.
DOI: 10.1016/j.cmpb.2024.108514
https://www.sciencedirect.com/science/article/pii/S0169260724005078
Citations: 0

Abstract

Hypertension is a major preventable risk factor for cardiovascular disease, affecting over 1.5 billion adults worldwide. Antihypertensive peptides (AHTPs) have gained attention as a natural therapeutic option with minimal side effects. This study proposes a Deep Forest-based machine learning framework for AHTP prediction, leveraging a multi-granularity cascade structure to enhance classification accuracy. We integrated data from BIOPEP-UWM and three previously used datasets, totaling 2000 peptide sequences, and introduced novel feature extraction methods to build a comprehensive dataset for model training.
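The cascade idea mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy, uses amino acid composition as a placeholder descriptor (the paper's own feature extraction methods are not described here), omits the multi-grained scanning stage, and runs on synthetic peptides generated purely for illustration. The names `composition`, `fit_cascade`, and `predict_proba_cascade` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(peptide):
    """Fraction of each of the 20 standard amino acids (placeholder descriptor)."""
    peptide = peptide.upper()
    return [peptide.count(a) / len(peptide) for a in AMINO_ACIDS]


def fit_cascade(X, y, n_levels=2, seed=0):
    """Fit a fixed-depth cascade of forests and return the fitted levels."""
    levels, augmented = [], X
    for level in range(n_levels):
        forests = [
            RandomForestClassifier(n_estimators=50, random_state=seed + level),
            ExtraTreesClassifier(n_estimators=50, random_state=seed + level),
        ]
        # Out-of-fold class probabilities become extra features for the next
        # level, which limits overfitting compared with in-sample predictions.
        class_vectors = [
            cross_val_predict(f, augmented, y, cv=3, method="predict_proba")
            for f in forests
        ]
        for f in forests:
            f.fit(augmented, y)
        levels.append(forests)
        # Each level sees the original features plus the previous class vectors.
        augmented = np.hstack([X] + class_vectors)
    return levels


def predict_proba_cascade(levels, X):
    """Pass samples through every level; average the last level's forests."""
    augmented = X
    for forests in levels:
        class_vectors = [f.predict_proba(augmented) for f in forests]
        augmented = np.hstack([X] + class_vectors)
    return np.mean(class_vectors, axis=0)


# Synthetic toy data (NOT real AHTPs): two classes drawn from disjoint alphabets.
rng = np.random.default_rng(0)


def random_peptide(alphabet, length=8):
    return "".join(rng.choice(list(alphabet), size=length))


peptides = [random_peptide("PLVIFWY") for _ in range(30)] + [
    random_peptide("DEKRSTNQ") for _ in range(30)
]
X = np.array([composition(p) for p in peptides])
y = np.array([1] * 30 + [0] * 30)

levels = fit_cascade(X, y)
proba = predict_proba_cascade(levels, X)  # column 1 = probability of class 1
print(proba[:3])
```

A full Deep Forest typically grows the cascade until validation performance stops improving rather than fixing `n_levels`; the fixed depth here only keeps the sketch short.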
This study represents the first application of Deep Forest for AHTP identification, demonstrating substantial classification performance advantages over traditional methods (e.g., SVM, CNN, and XGBoost) as well as recent mainstream prediction models (Ensemble-AHTPpred, CNN-SVM Ensemble, and mAHTPred). Requiring no complex manual feature engineering, the model adapts flexibly to various data needs, offering a novel perspective for efficient AHTP prediction and promising utility in hypertension management.
On the benchmark dataset, the model achieved high accuracy, sensitivity, and AUC, providing a robust tool for identifying safe and effective AHTPs. However, future efforts should incorporate larger and more diverse independent validation datasets to further improve the model and enhance its generalizability. Additionally, the model's predictive accuracy relies on known AHTP targets and sequence features, potentially limiting its ability to detect AHTPs with uncharacterized or atypical properties.
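For readers unfamiliar with the three reported metrics, the following sketch shows how accuracy, sensitivity, and AUC are commonly computed for a binary classifier. The labels and scores are made-up illustrative values, not results from the paper, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """True-positive rate: TP / (TP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only: 1 = AHTP, 0 = non-AHTP.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred), auc(y_true, scores))
```

Accuracy and sensitivity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three are usually reported together.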
Source journal
Computer Methods and Programs in Biomedicine (Engineering and Technology: Biomedical Engineering)
CiteScore: 12.30
Self-citation rate: 6.60%
Articles published: 601
Review time: 135 days
Journal description: To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine. Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.