Machine-Learning-Based Identification of Key Feature RNA-Signature Linked to Diagnosis of Hepatocellular Carcinoma
Marwa Matboli, Gouda I. Diab, Maha Saad, Abdelrahman Khaled, Marian Roushdy, Marwa Ali, Hind A. ELsawi, Ibrahim H. Aboughaleb
Journal of Clinical and Experimental Hepatology, published 2024-06-14. DOI: 10.1016/j.jceh.2024.101456
https://www.sciencedirect.com/science/article/pii/S0973688324001130
Abstract
Background
Hepatocellular carcinoma (HCC) is the third leading cause of malignancy-related mortality worldwide. Early and accurate detection of HCC is crucial for prognosis, treatment efficacy, and patient survival. We aimed to develop a machine-learning diagnostic model for HCC that combines differentially expressed RNA signatures with laboratory parameters.
Methods
We used five classifiers to predict HCC: k-nearest neighbors (KNN), random forest (RF), support vector machine (SVM), LightGBM (LGBM), and a deep neural network (DNN). The classifiers were trained on 187 samples and tested on 80 samples. The model included 22 features: age, sex, smoking, cirrhosis, non-cirrhosis, albumin, ALT, AST, bilirubin (total and direct), INR, AFP, HBV Ag, HCV Abs, RQmiR-1298, RQmiR-1262, RQmiR-106b-3p, RQmRNARAB11A, RQSTAT1, RQmRNAATG12, RQLnc-WRAP53, and RQLncRNA-RP11-513I15.6.
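The train/test protocol described here can be illustrated with a minimal sketch. It assumes a simple random split into 187 training and 80 test rows and shows only one of the five classifier families (KNN), written from scratch with the standard library. The data are synthetic stand-ins, and the helper names (`make_synthetic`, `knn_predict`, `accuracy`) and the choice of k = 5 are hypothetical; none of this reproduces the study's cohort, preprocessing, or tuned models.

```python
import random
from collections import Counter

N_TRAIN, N_TEST, N_FEATURES = 187, 80, 22  # sizes reported in the abstract


def make_synthetic(n, n_features, rng):
    """Hypothetical stand-in data: two Gaussian classes, NOT the study cohort."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = HCC, 0 = control (assumed coding)
        features = [rng.gauss(2.0 * label, 1.0) for _ in range(n_features)]
        rows.append((features, label))
    return rows


def knn_predict(train, x, k=5):
    """Majority vote among the k training rows nearest to x (squared Euclidean)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, test, k=5):
    """Fraction of held-out rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = make_synthetic(N_TRAIN + N_TEST, N_FEATURES, rng)
rng.shuffle(data)
train, test = data[:N_TRAIN], data[N_TRAIN:]
print(f"train={len(train)} test={len(test)} accuracy={accuracy(train, test):.4f}")
```

The test rows are never used when choosing neighbors, which is the point of the held-out evaluation; with real clinical features, scaling each feature before computing distances would likely matter, since KNN is sensitive to feature units.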
Results
LGBM achieved the highest accuracy in predicting HCC (98.75%), surpassing Random Forest (96.25%), DNN (91.25%), SVM (88.75%), and KNN (87.50%).
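To put these percentages in terms of samples, the short check below converts each reported accuracy into a count of correctly classified test samples. It assumes accuracy was computed as the plain fraction of correct predictions over all 80 test samples, which the abstract does not state explicitly.

```python
N_TEST = 80  # test-set size reported in the Methods

reported_accuracy = {  # percentages reported in the Results
    "LGBM": 98.75,
    "Random Forest": 96.25,
    "DNN": 91.25,
    "SVM": 88.75,  # support vector machine classifier (SVM/SVC)
    "KNN": 87.50,
}

# Convert each percentage to a whole number of correct predictions.
correct = {name: round(pct / 100 * N_TEST) for name, pct in reported_accuracy.items()}

for name, hits in correct.items():
    # Round trip: hits / N_TEST should reproduce the reported percentage.
    assert abs(hits / N_TEST * 100 - reported_accuracy[name]) < 1e-9
    print(f"{name}: {hits}/{N_TEST} correct, {N_TEST - hits} misclassified")
```

Under that assumption, every reported figure corresponds to a whole number of samples, and the gap between the best and worst model is nine test samples.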
Conclusion
Our machine-learning model, which combines expression data for the RAB11A/STAT1/ATG12/miR-1262/miR-1298/miR-106b-3p/lncRNA-RP11-513I15.6/lncRNA-WRAP53 signature with clinical data, represents a potential novel diagnostic model for HCC.