Machine learning web application for predicting varicose veins utilizing global prevalence data
Yury Rusinovich, Volha Rusinovich, Markus Doss
Phlebology, pp. 528-535. Published 2025-08-01 (Epub 2025-01-29). Journal Article.
DOI: 10.1177/02683555251318154 (https://doi.org/10.1177/02683555251318154)
Abstract
Aim: This study aimed to develop a web-based machine learning (ML) model to predict the lifetime likelihood of developing varicose veins using global disease prevalence data.
Methods: We used data from a systematic review registered under PROSPERO (CRD42021279513), which included 81 studies on varicose vein prevalence across various geographic regions. The data used to build the ML model comprised disease prevalence (%) as the outcome and the following predictors: mean age, gender distribution (%), mean body mass index (BMI) of the study cohort, and the mean gravity field of the study region (mGal). The gravity field represents variations in Earth's underground mass distribution that influence blood and fluid redistribution in the human body, affecting disease prevalence. After the outcome and predictors were standardized, the model was trained using neural network regression implemented with the TensorFlow.js library and deployed as a web-based ML application.
Results: Training was stopped after 406 epochs, when the validation loss (mean squared error) had reached 0.9 and showed no further improvement. The test loss was 0.49 and the mean absolute error (MAE) was 0.56 in standardized units, corresponding to a difference of up to 6.7% between the predicted and true disease probabilities (calculated as MAE × σ = 0.56 × 11.9 = 6.7, where σ is the standard deviation of the mean disease prevalence). The likelihood of developing varicose veins, as predicted by the model, showed the strongest correlation with age (0.78), followed by gravity anomaly (0.30), BMI (0.27), and gender (0.15).
Conclusion: This study summarizes research on the prevalence of varicose veins by developing a web-based ML model to predict an individual's likelihood of developing the disease. Using data reported in the literature, the ML algorithm provides a non-discriminatory predictive baseline, offering a valuable tool for future investigations into disease epidemiology.
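The standardization step and the conversion of the MAE back to prevalence units can be illustrated with a short sketch. It is written in plain Python rather than the TensorFlow.js used for the study's application, and it is not the authors' code: the function names and the sample prevalence values are hypothetical, and the use of the population standard deviation is an assumption, since the abstract does not say which form was used. Only the figures 0.56 and 11.9 come from the abstract.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores together with the mean and SD used to compute them."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD; the abstract does not specify
    return [(v - mu) / sigma for v in values], mu, sigma


def unstandardize(z, mu, sigma):
    """Map a standardized value back to the original units."""
    return z * sigma + mu


def mae_in_original_units(mae_standardized, sigma):
    """An error measured in SD units, rescaled to the outcome's units."""
    return mae_standardized * sigma


# Hypothetical prevalence values (%), for illustration only; not study data.
prevalence = [10.0, 20.0, 30.0, 40.0]
z_scores, mu, sigma = standardize(prevalence)

# Figures reported in the abstract: MAE = 0.56 (standardized), sigma = 11.9.
reported_error = mae_in_original_units(0.56, 11.9)
print(round(reported_error, 1))
```

Because the outcome was standardized before training, the model's errors are expressed in standard deviations of prevalence, which is why multiplying by σ recovers an error in percentage terms.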