Is AI-Based Hepatocellular Carcinoma Prediction Ready for Prime Time?

Xinrui Jin, Vincent Wai-Sun Wong, Terry Cheuk-Fung Yip
Liver International, vol. 45, no. 4. Published 2025-03-14. DOI: 10.1111/liv.16165

Abstract

In 2022, 865 300 people were diagnosed with liver cancer, and 757 900 people died of liver cancer, making it the sixth most common cancer and the third leading cause of cancer death globally [1]. Although high-quality randomised controlled trials are lacking, observational studies have consistently shown that hepatocellular carcinoma (HCC) surveillance by means of biannual abdominal ultrasonography with or without serum alpha-fetoprotein testing in high-risk groups can detect early cancer, increase the chance of instituting curative-intent treatments and reduce cancer deaths [2]. Therefore, current guidelines support HCC surveillance in patients with cirrhosis or patients with chronic hepatitis B (CHB) beyond a certain age [3].

However, such recommendations do not capture the numerous modifiable and non-modifiable risk factors of HCC or the fact that HCC can develop in the absence of cirrhosis. Taking CHB as an example, demographics (age and sex), family history of HCC, host genetics, virologic factors (viral load, genotypes and variants), environmental exposures (e.g., alcohol and aflatoxin) and comorbidities (e.g., diabetes) are well-established risk factors of HCC. Conversely, contemporary antiviral therapies with entecavir or tenofovir can reverse cirrhosis and reduce the risk of HCC. Against this background, researchers have derived and validated a number of HCC risk scores, largely for CHB but also some for other chronic liver diseases [4]. When the incidence of HCC exceeds a certain threshold (often taken as 1% per year), HCC surveillance is deemed cost-effective and should be offered. Our group previously demonstrated that patients receiving antiviral therapies for CHB might have their HCC incidence reduced to a level below the threshold at which guidelines recommend surveillance, and that this can be effectively predicted by the PAGE-B and modified PAGE-B scores [5].
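The 1% per year threshold mentioned above can be made concrete with a small calculation. The following is a minimal sketch, assuming a crude incidence computed as events per person-year of follow-up; the function names and the cohort figures are hypothetical and are not taken from any cited study.

```python
def annual_incidence_percent(events: int, person_years: float) -> float:
    """Crude HCC incidence expressed as percent per person-year of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100.0 * events / person_years


def exceeds_surveillance_threshold(events: int, person_years: float,
                                   threshold_percent: float = 1.0) -> bool:
    """True when crude incidence exceeds the threshold (1% per year by default)."""
    return annual_incidence_percent(events, person_years) > threshold_percent


if __name__ == "__main__":
    # Hypothetical cohorts, for illustration only.
    for label, events, person_years in [("group A", 6, 2000.0),
                                        ("group B", 45, 3000.0)]:
        rate = annual_incidence_percent(events, person_years)
        flag = exceeds_surveillance_threshold(events, person_years)
        print(f"{label}: {rate:.2f}% per year, above threshold: {flag}")
```

A crude rate ignores censoring and competing risks, so it is only a first approximation of the incidence estimates used in cost-effectiveness analyses.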

Most existing HCC risk scores were derived using traditional regression formulas with baseline variables. They are thus limited by superficial handling of complex interactions across parameters. Moreover, a one-off assessment at baseline deviates from routine clinical practice, where healthcare providers see patients repeatedly over time and adjust their evaluation of HCC risk. This is an area where artificial intelligence (AI) holds promise. In this issue, Ha, Lee and colleagues present a new AI model for HCC prediction in patients on antiviral therapy for CHB [6].

Ha, Lee and colleagues conducted a retrospective, multicentre study of patients with CHB who had received 5 years of continuous entecavir (ETV) or tenofovir (TFV) therapy. The derivation cohort of 5908 patients from one medical centre was used for model training and internal validation, while an independent cohort of 562 patients from a different medical centre was used to externally validate the model's performance. For model development, the authors used 36 variables, including values at baseline, values at 5 years of therapy and changes over the 5-year period. Five popular machine learning (ML) methods, namely adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), logistic regression (LR) and random forest, together with four ensemble models (AdaBoost + LR, AdaBoost + random forest, LR + random forest and AdaBoost + LR + random forest), were trained and compared to find the best prediction model. The ensemble model combining LR with random forest outperformed all other single and ensemble models, achieving the highest area under the receiver-operating characteristic curve (AUC) of 0.811. Through ablation studies, the authors highlighted the importance of the initially included key factors in enhancing model performance, including the presence of cirrhosis at baseline as well as absolute and relative changes in laboratory values and Child-Pugh score. As a result, the final ensemble model retained all 36 variables and was named the ‘Machine Learning Algorithm for Prediction of Liver Cancer after 5 Years of Antiviral Therapy’ (MAPL-5) model, which classifies patients into low- and high-risk groups. External validation reaffirmed the robust performance of the MAPL-5 model, yielding high discriminatory power (AUC 0.862). The ML algorithms underlying MAPL-5 can process non-linear relationships and integrate diverse data, enhancing its predictive accuracy compared with traditional models (PPACS, CAGE-B, SAGE-B, AASL, CU-HCC, GAG-HCC, PAGE-B, modified PAGE-B and REACH-B) and allowing clinicians to create personalised surveillance programmes (Table 1).
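To illustrate what combining two models and comparing them by AUC involves, here is a minimal sketch in plain Python. It assumes the two models' predicted probabilities are simply averaged (soft voting); the text above does not state how Ha and colleagues combined LR and random forest, so this is an illustrative assumption, and all names and toy numbers are hypothetical.

```python
from typing import List, Sequence


def average_probabilities(p_a: Sequence[float], p_b: Sequence[float]) -> List[float]:
    """Soft-voting ensemble: mean of two models' predicted probabilities."""
    if len(p_a) != len(p_b):
        raise ValueError("probability vectors must have the same length")
    return [(a + b) / 2.0 for a, b in zip(p_a, p_b)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, for illustration only: 1 = developed HCC, 0 = did not.
    labels = [0, 0, 1, 0, 1, 1, 0, 0]
    p_lr = [0.10, 0.30, 0.55, 0.20, 0.40, 0.80, 0.45, 0.05]  # hypothetical LR output
    p_rf = [0.15, 0.10, 0.35, 0.50, 0.70, 0.60, 0.20, 0.10]  # hypothetical RF output
    p_ens = average_probabilities(p_lr, p_rf)
    for name, scores in [("LR", p_lr), ("RF", p_rf), ("LR + RF", p_ens)]:
        print(f"{name}: AUC = {roc_auc(labels, scores):.3f}")
```

The pairwise AUC definition is quadratic in cohort size, which is fine for a sketch; rank-based implementations are preferable for cohorts of several thousand patients.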

Meanwhile, several other studies have also introduced novel models to predict HCC in patients with chronic viral hepatitis, using different ML algorithms [7-10]. All these models showed good discriminatory performance, with AUCs or c-indices over 0.8, similar to that of MAPL-5. Among them, the HCC ridge score (HCC-RS), the artificial intelligence-safe score (AI-Safe-C) and the Prediction of Liver cancer using Artificial intelligence-driven model for Network—hepatitis B (PLAN-B) each applied a single model, while the ML-based HCC prediction model and MAPL-5 applied ensemble models [10]. Ensemble models can improve generalisation performance and reduce overfitting [11].

Additionally, the MAPL-5 and PLAN-B models predicted HCC risk beyond the initial 5 years of therapy (up to 10 years), whereas the AI-Safe-C score predicted HCC risk within the first 5 years of therapy. Although longer-term models show promise in providing prognostic insights, the MAPL-5 model has only been developed and validated in a relatively small Korean population. In contrast, other studies have applied their models to different Asian or Western populations [8, 9, 12]. To achieve generalisability and establish an optimal cut-off value, it is necessary to validate the models in larger, more ethnically diverse populations.

The variables commonly used in HCC risk prediction include host factors, cirrhosis and related laboratory values, viral activity and comorbidities. In general, both traditional statistical models and AI-based models incorporate fewer than 10 variables, focusing on a limited set of essential and easily accessible predictors such as age, sex and the presence of cirrhosis, alongside a few additional clinically relevant variables [13]. In contrast, HCC-RS incorporated an extensive range of variables using baseline data (36 variables and 20 variables across two models) [7]. In addition to the abovementioned risk factors, the HCC-RS model included other metabolic biomarkers (e.g., fasting glucose, haemoglobin A1c and total cholesterol), comorbidities (e.g., cardiovascular disease, cancers and chronic kidney disease) and medications (e.g., angiotensin-converting enzyme inhibitor/angiotensin II receptor blocker, statins and metformin). Moreover, longitudinal data should be used to increase the accuracy of risk prediction. The MAPL-5 model captured pre-treatment and on-treatment information as well as their changes, yielding better performance than existing HCC risk scores.
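The idea of combining pre-treatment values, on-treatment values and their changes can be sketched as a simple feature-building step. The example below is an assumption about how such features might be assembled, not the MAPL-5 preprocessing itself; the function name, the variable names and the values are hypothetical.

```python
from typing import Dict, Optional


def change_features(baseline: Dict[str, float],
                    year5: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Build baseline, year-5, absolute-change and relative-change features
    for every laboratory variable measured at both time points."""
    features: Dict[str, Optional[float]] = {}
    for name, base in baseline.items():
        if name not in year5:
            continue  # skip variables without a follow-up measurement
        follow = year5[name]
        features[f"{name}_baseline"] = base
        features[f"{name}_year5"] = follow
        features[f"{name}_abs_change"] = follow - base
        # Relative change is undefined when the baseline value is zero.
        features[f"{name}_rel_change"] = (follow - base) / base if base != 0 else None
    return features


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    baseline = {"platelet": 100.0, "albumin": 3.6}
    year5 = {"platelet": 150.0, "albumin": 4.2}
    for key, value in change_features(baseline, year5).items():
        print(key, value)
```

Each laboratory variable therefore contributes up to four model inputs, which is one way a model can reach several dozen variables from a modest set of routine tests.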

The current HCC risk models for patients with CHB can be further improved. Genomic variables and non-viral factors such as comorbidities and concurrent medications were not included in the MAPL-5 model. That said, few studies on predictive models for HCC risk have considered these additional factors, and findings regarding their added value for model performance have been inconclusive [7, 13, 14]. Overall, simpler models are more practical for resource-limited settings, whereas models with a wider range of factors allow for more comprehensive risk prediction by considering the patient's full clinical profile. The choice between these models depends on clinical scenarios, data availability and resources at hand.

Current HCC surveillance recommendations rely on population estimates, for example, the annual HCC risk in cirrhotic patients with CHB and the associated cost-effectiveness. Risk stratification tools have been considered a promising way to move from a population-based to an individual-level approach using relevant patient characteristics. Meanwhile, tailoring the interval and modalities of HCC surveillance to individual HCC risk profiles has also been discussed as a reasonable advancement from the universal recommendation of biannual abdominal ultrasonography with or without serum alpha-fetoprotein [15]. Liver societies have taken the initiative to incorporate an intermediate-to-high PAGE-B score as a surveillance criterion for non-cirrhotic patients with CHB [3, 16]. However, questions remain about whether the accuracy currently achieved by traditional HCC risk scores is adequate to guide surveillance decisions, particularly in determining when surveillance can be safely omitted. Also, current clinical guidelines do not recommend routine calculation of HCC risk scores during monitoring, limiting their widespread use in practice. Consequently, uncertainties persist about the optimal interval for recalculating HCC risk scores and the reliability of repeated risk assessments. These issues highlight the need for further research to enhance the precision and applicability of risk-based HCC surveillance strategies.

AI-based HCC surveillance coupled with automated risk score calculation can potentially transform surveillance approaches. However, a key challenge for widespread implementation is the lack of standardised data sharing in healthcare systems. Effective training of AI algorithms demands substantial high-quality data, and the absence of smooth data exchange hinders both AI model performance and the evaluation of the models' predictive accuracy across different healthcare settings and patient groups.

Additionally, there are concerns regarding the potential for bias in AI algorithms, which need to be carefully addressed before widespread use. Healthcare providers may also question the liability risk of using AI tools [17]. To allow for safe implementation, developers of AI tools need to provide comprehensive user guidance, including the model assumptions and target populations, similar to a boxed warning for prescription medications. Regular post-deployment monitoring should be performed as part of effective risk management to allow for the early detection of systematic errors in the AI tools. In light of these concerns, addressing data sharing challenges, ensuring algorithm transparency and fairness and conducting rigorous validation studies are essential steps towards utilising the full potential of AI in HCC surveillance and ultimately improving patient outcomes.
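One simple form that post-deployment monitoring could take is a periodic comparison of observed HCC cases with the number the model expected. The sketch below assumes an observed-to-expected (O/E) ratio check; the function names, the 25% tolerance and the toy data are hypothetical choices for illustration, not a recommendation from the literature cited here.

```python
from typing import Sequence


def observed_expected_ratio(predicted_risks: Sequence[float],
                            outcomes: Sequence[int]) -> float:
    """Ratio of observed events to the sum of predicted risks (O/E).
    A value near 1 suggests the model is calibrated on average."""
    if len(predicted_risks) != len(outcomes):
        raise ValueError("inputs must have the same length")
    expected = sum(predicted_risks)
    if expected <= 0:
        raise ValueError("sum of predicted risks must be positive")
    return sum(outcomes) / expected


def needs_review(ratio: float, tolerance: float = 0.25) -> bool:
    """Flag the model for review when O/E drifts beyond a chosen tolerance.
    The default tolerance is an arbitrary illustrative value."""
    return abs(ratio - 1.0) > tolerance


if __name__ == "__main__":
    # Toy monitoring window, for illustration only.
    risks = [0.05, 0.10, 0.02, 0.30, 0.08, 0.15]
    outcomes = [0, 0, 0, 1, 0, 1]
    ratio = observed_expected_ratio(risks, outcomes)
    print(f"O/E = {ratio:.2f}, review needed: {needs_review(ratio)}")
```

An average O/E ratio can mask miscalibration within subgroups, so a check of this kind would typically be repeated by risk stratum and patient group.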

All three authors contributed to the literature review and drafting of the manuscript. They approved the final version of the manuscript.

Vincent Wong has served as an advisory board member or consultant for AbbVie, AstraZeneca, Boehringer Ingelheim, Echosens, Gilead Sciences, Intercept, Inventiva, Merck, Novo Nordisk, Pfizer, Sagimet Biosciences, TARGET PharmaSolutions and Visirna, and as a speaker for Abbott, AbbVie, Echosens, Gilead Sciences, Novo Nordisk and Unilab. He has received a research grant from Gilead Sciences and is a co-founder of Illuminatio Medical Technology. Terry Yip has served as an advisory committee member and a speaker for Gilead Sciences. Xinrui Jin declares that she has no competing interests.
