{"title":"Efficient relaxation scheme for the SIR and related compartmental models","authors":"Vo Anh Khoa , Pham Minh Quan , Ja’Niyah Allen , Kbenesh W. Blayneh","doi":"10.1016/j.jocs.2024.102478","DOIUrl":"10.1016/j.jocs.2024.102478","url":null,"abstract":"<div><div>In this paper, we introduce a novel numerical approach for approximating the Susceptible–Infectious–Recovered (SIR) model in epidemiology. Our method enhances the existing linearization procedure by incorporating a suitable relaxation term to tackle the transcendental equation of nonlinear type. Developed within the continuous framework, our relaxation method is explicit and easy to implement, relying on a sequence of linear differential equations. This approach yields accurate approximations in both discrete and analytical forms. Through rigorous analysis, we prove that, with an appropriate choice of the relaxation parameter, our numerical scheme is non-negativity-preserving; moreover, it is strongly convergent to the true solution. We also extend the applicability of our relaxation method to handle some variations of the traditional SIR model. Finally, we present numerical examples using simulated data to demonstrate the effectiveness of our proposed method.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102478"},"PeriodicalIF":3.1,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An eXplainable machine learning framework for predicting the impact of pesticide exposure in lung cancer prognosis","authors":"Nitha V.R., Vinod Chandra S.S.","doi":"10.1016/j.jocs.2024.102476","DOIUrl":"10.1016/j.jocs.2024.102476","url":null,"abstract":"<div><div>Lung cancer, the second most prevalent and lethal cancer, is caused by aberrant and uncontrolled cell division in the lungs. Once lung cancer spreads to surrounding tissues or organs, the likelihood of recovery declines; hence, early illness detection is vital. Machine learning has shown significant potential in several healthcare applications. Examining various factors and trends in the data, the machine learning model can predict lung cancer menace by pinpointing those more susceptible to the illness. Among the various causes of lung cancer, pesticide is a major contributor. ‘Pesticide’ refers to any chemical used in agriculture to manage pests like weeds and insects. Numerous health hazards, including the possibility of developing cancer, have been linked to exposure to specific pesticides. Our objective is to obtain the trust of medical professionals and patients depending on how interpretable machine learning models are in healthcare. This paper deals with implementing the proposed study by utilizing a public dataset from a Thai case study to predict the risk of lung cancer caused by pesticide exposure. Since the dataset was highly imbalanced, a hybrid normalization technique was utilized, combining the Synthetic Minority Oversampling Technique (SMOTE) and Edited Nearest Neighbor (ENN). We applied a two-stage feature selection technique combined with Extra Tree Classifier and Principal Component Analysis. An eXplainable XGBoost Classifier is developed to predict lung cancer risk based on pesticide exposure. The robustness of the model is reflected in the results, with accuracy, sensitivity, and F1-Score as 99.00%, 98.87%, and 98.57%, respectively. 
Two public datasets were used to test how well the model generalizes, and it performed well on both. It achieved accuracy, sensitivity, and F1-Score of 99.00%, 99.00%, and 99.33%, respectively, on the ‘Lung Cancer Prediction’ dataset. Trained and tested on the ‘survey lung cancer’ dataset, it obtained accuracy, sensitivity, and F1-Score of 99.00%, 99.00%, and 99.00%, respectively. The proposed model outperformed existing state-of-the-art methodologies on these quality metrics. The model is explained using SHapley Additive exPlanations (SHAP), an eXplainable Artificial Intelligence (XAI) technique, thereby identifying the most relevant features contributing to lung cancer risk.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102476"},"PeriodicalIF":3.1,"publicationDate":"2024-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"perms: Likelihood-free estimation of marginal likelihoods for binary response data in Python and R","authors":"Dennis Christensen , Per August Jarval Moen","doi":"10.1016/j.jocs.2024.102467","DOIUrl":"10.1016/j.jocs.2024.102467","url":null,"abstract":"<div><div>In Bayesian statistics, the marginal likelihood (ML) is the key ingredient needed for model comparison and model averaging. Unfortunately, estimating MLs accurately is notoriously difficult, especially for models where posterior simulation is not possible. Recently, the idea of permutation counting was introduced, which provides an estimator which can accurately estimate MLs of models for exchangeable binary responses. Such data arise in a multitude of statistical problems, including binary classification, bioassay and sensitivity testing. Permutation counting is entirely likelihood-free and works for any model from which a random sample can be generated, including nonparametric models. Here we present <span>perms</span>, a package implementing permutation counting. Following optimisation efforts, <span>perms</span> is computationally efficient and can handle large data problems. It is available as both an R package and a Python library. A broad gallery of examples illustrating its usage is provided, which includes both standard parametric binary classification and novel applications of nonparametric models, such as changepoint analysis. 
We also cover the details of the implementation of <span>perms</span> and illustrate its computational speed via a simple simulation study.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102467"},"PeriodicalIF":3.1,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-the-fly mathematical formulation for estimating people flow from elevator load data in smart building virtual sensing platforms","authors":"Koichi Kondo , Ryosuke Ohori , Kiyotaka Matsue , Hiroyuki Aizu","doi":"10.1016/j.jocs.2024.102488","DOIUrl":"10.1016/j.jocs.2024.102488","url":null,"abstract":"<div><div>This paper considers a new approach for people flow estimation in buildings from elevator trip records and corresponding load data, and the resulting model is used on the virtual sensing platform we have developed. People flow data can be used to improve elevator performance through optimal car assignments to hall calls by a group controller and are useful for estimating occupant distributions as heat loads allowing for optimized air-conditioning control to realize energy savings. Available data from an elevator controller is insufficient for exact people flow estimation and therefore this problem becomes under-defined. Our virtual sensing platform adopts equation-based modeling and optimization-based parameter estimation, which estimates application-related parameters from available sensor data, allowing for over- or under-defined situations among sensory information, but better mathematical formulation is essential for accurate parameter estimation on this virtual sensing platform. Accordingly, we propose a new method to define an elevator trip-wise mathematical formulation by modifying pre-defined base equations or defining additional equations. The key idea is that each elevator trip has different features, including sparsity, that are useful for improving accuracy and can be successfully formulated as simultaneous equations that our virtual sensing platform accepts. 
The procedure for defining a mathematical formulation is invoked after trip data are obtained, and we refer to this procedure as “on-the-fly mathematical formulation.” The formulated trip-wise equations are combined as simultaneous equations for estimating people flow over a given period on the virtual sensing platform by mathematical optimization.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102488"},"PeriodicalIF":3.1,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning aided surrogate modeling of the epidemiological models","authors":"Emel Kurul , Huseyin Tunc , Murat Sari , Nuran Guzel","doi":"10.1016/j.jocs.2024.102470","DOIUrl":"10.1016/j.jocs.2024.102470","url":null,"abstract":"<div><div>The study of disease spread often relies on compartmental models based on nonlinear differential equations, which typically require computationally intensive numerical algorithms, especially for parameter estimation. This paper introduces a deep neural network-based surrogate modeling (DNN-SM) approach, engineered to accurately replicate the behavior of epidemiological models while significantly reducing computational demands. This approach adeptly handles the complexities inherent in nonlinear models and optimizes parameter estimation efficiency. We demonstrate the efficacy of the DNN-SM through its application to various disease models, including the Susceptible–Infected–Recovered (SIR), Susceptible–Exposed–Infected–Recovered (SEIR), and the more complex Susceptible–Exposed–Presymptomatic–Asymptomatic–Symptomatic–Reported (SEPADR) models. The results reveal that our DNN-SM not only forecasts solution trajectories with high accuracy but also operates approximately ten times faster than traditional ODE solvers for forward problems. By comparing the parameter estimation results of the DNN-SM and ODE solvers, we show that the DNN-SM produces highly accurate results at a much lower computational cost. The DNN-SM has been validated using both short-term and long-term COVID-19 data from several European countries. 
The results demonstrate that the DNN-SM provides accurate trajectories with significantly lower computational cost compared to traditional numerical methods.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102470"},"PeriodicalIF":3.1,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"POD-Galerkin reduced order model coupled with neural networks to solve flow in porous media","authors":"C. Allery, C. Béghein, C. Dubot, F. Dubot","doi":"10.1016/j.jocs.2024.102471","DOIUrl":"10.1016/j.jocs.2024.102471","url":null,"abstract":"<div><div>This paper deals with the numerical modeling of flow around and through a porous obstacle by a reduced order model (ROM) obtained by Galerkin projection of the Navier–Stokes equations onto a Proper Orthogonal Decomposition (POD) reduced basis. In the few existing works dealing with model reduction techniques applied to flows in porous media, flows were described by Darcy’s law and the non linear Forchheimer term was neglected. This last term cannot be expressed in reduced form during the Galerkin projection phase. Indeed, at each new time step, the norm of the velocity needs to be recalculated and projected, which significantly increases the computational cost, rendering the reduced model inefficient. To overcome this difficulty, we propose to model the projected Forchheimer term with artificial neural networks. Moreover in order to build a stable ROM, the influence of unresolved modes and pressure variations are also modeled using a neural network. Instead of separately modeling each term, these terms were combined into a single term, which was modeled using the multilayer perceptron method (MLP). The validation of this approach was carried out for laminar flow past a porous obstacle in an unconfined channel. The proposed ROM coupled with MLP approach is able to accurately predict the dynamics of the flow while the standard ROM yields wrong results. Moreover, the ROM MLP method improves the prediction of flow for Reynolds numbers that are not included in the sampling and for times longer than sampling times. In the final part of the paper, the ROM MLP method was compared with purely data driven methods. 
It was shown that the MLP method is superior to the purely data driven methods.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102471"},"PeriodicalIF":3.1,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative evaluation of sparse and minimal data point cloud registration: A study on Tibiofemoral Bones","authors":"Dennis A. Christie , Rene Fluit , Guillaume Durandau , Massimo Sartori , Nico Verdonschot","doi":"10.1016/j.jocs.2024.102463","DOIUrl":"10.1016/j.jocs.2024.102463","url":null,"abstract":"<div><div>Accurate bone registration is a crucial step in Computer-assisted Orthopaedic Surgery (CAOS) to estimate the relationship between a preoperative patient’s bone model and the actual position during surgery. An A-mode ultrasound and motion capture system is a promising new non-invasive technique to determine the bone’s 3D pose. The main challenge with such a system is the sparsity of the measurement; it could trap the optimization, which minimizes the registration error, in local minima. In this paper, we aim to find a registration algorithm that can provide sufficient accuracy for surgical navigation. Several registration algorithms were compared using Monte Carlo simulations. The number of points and placement sensitivity were also investigated while keeping the practical aspects of the system in mind. With 15 points, Unscented Kalman Filter (UKF)-based registration with a 6D similarity vector proved superior to the other examined algorithms in minimizing the transformation error. In terms of balancing the accuracy and the equipment availability, the simulation showed that points needed to be dispersedly placed; 15 points were sufficient to register the femur, but 20 points were required to register the tibia. 
Beyond these numbers, the registration error hardly improved, so we will base our number of sensors on them.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102463"},"PeriodicalIF":3.1,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing multi-omics data classification with relative expression analysis and decision trees","authors":"Marcin Czajkowski, Krzysztof Jurczuk, Marek Kretowski","doi":"10.1016/j.jocs.2024.102460","DOIUrl":"10.1016/j.jocs.2024.102460","url":null,"abstract":"<div><div>This study introduces the Relative Multi-test Classification Tree (RMCT), a novel classification method tailored for multi-omics data analysis. The RMCT method combines the interpretative power of decision trees with the analytical precision of Relative eXpression Analysis (RXA) to address the complex task of examining biomedical data derived from diverse high-throughput technologies. The proposed RMCT approach discerns patterns within and across omics layers, yielding an accurate and interpretable classifier. In each internal node of RMCT, we create a multi-test, a group of Top-Scoring-Pair tests that capture the ordering relationships among features from various omics. Multi-tests are optimized for maximal reduction of Gini impurity while ensuring consistency in decision-making. We address computational challenges through advanced GPU parallelization, markedly improving RMCT’s time performance. Through experimental validation on diverse multi-omics datasets, RMCT has demonstrated superior performance compared to traditional tree-based solutions, particularly in terms of accuracy and clarity of predictions. This method effectively reveals intricate interactions and relationships within multi-omics data, marking it as a useful addition to bioinformatics and biomedicine. This work represents a thorough extension of our preliminary research, which was initially presented at the twenty-third edition of the International Conference on Computational Science (ICCS). 
It expands the initial concept of integrating decision trees with RXA for multi-omics data classification, deepening the analytical methodologies, further optimizing the GPU computing, and broadening the experimental validation.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102460"},"PeriodicalIF":3.1,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
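The RMCT abstract above mentions Top-Scoring-Pair tests scored by reduction of Gini impurity. The following is a minimal sketch of that scoring idea for a single pair test, not the authors' implementation: the function names, the toy data, and the exhaustive pair search are assumptions made for illustration, and the grouping of several tests into a multi-test and the GPU parallelization are not modeled.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def tsp_split(rows, labels, i, j):
    """Split samples by the ordering test row[i] < row[j] (a pair test)."""
    left = [y for x, y in zip(rows, labels) if x[i] < x[j]]
    right = [y for x, y in zip(rows, labels) if not x[i] < x[j]]
    return left, right


def gini_reduction(rows, labels, i, j):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    left, right = tsp_split(rows, labels, i, j)
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def best_pair(rows, labels):
    """Exhaustively pick the feature pair with the largest Gini reduction."""
    n_features = len(rows[0])
    pairs = [(i, j) for i in range(n_features) for j in range(i + 1, n_features)]
    return max(pairs, key=lambda p: gini_reduction(rows, labels, *p))


# Hypothetical toy data: 4 samples, 3 features, 2 classes.
rows = [[1.0, 2.0, 0.5], [0.8, 3.1, 0.9], [2.5, 1.0, 0.7], [3.0, 0.4, 2.0]]
labels = ["A", "A", "B", "B"]
print(best_pair(rows, labels), gini_reduction(rows, labels, 0, 1))
```

Because a pair test depends only on the relative order of two features, it is unaffected by any monotone rescaling applied to a whole sample.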
{"title":"Identifying influential nodes in complex networks through the k-shell index and neighborhood information","authors":"Shima Esfandiari, Mohammad Reza Moosavi","doi":"10.1016/j.jocs.2024.102473","DOIUrl":"10.1016/j.jocs.2024.102473","url":null,"abstract":"<div><div>Identifying influential nodes is crucial in network science for controlling diseases, sharing information, and viral marketing. Current methods for finding vital spreaders have problems with accuracy, resolution, or time complexity. To address these limitations, this paper presents a hybrid approach called the Bubble Method (BM). First, the BM assumes a bubble with a radius of two surrounding each node. Then, it extracts various attributes from inside and near the surface of the bubble. These attributes are the k-shell index, k-shell diversity, and the distances of nodes within the bubble from the central node. We compared our method to 12 recent ones, including the Hybrid Global Structure model (HGSM) and Generalized Degree Decomposition (GDD), using the Susceptible–Infectious–Recovered (SIR) model to test its effectiveness. The results show the BM outperforms other methods in terms of accuracy, correctness, and resolution. Its low computational complexity renders it highly suitable for analyzing large-scale networks.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102473"},"PeriodicalIF":3.1,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699234","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
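Two ingredients named in the Bubble Method abstract above, the k-shell index and the nodes within distance two of a central node, can be sketched as follows. This is only an illustration using the standard k-shell peeling procedure and a breadth-first search; the function names and the toy graph are assumptions, and the way the paper combines these attributes (including k-shell diversity) into a ranking score is not reproduced.

```python
from collections import deque


def k_shell(adj):
    """Return {node: k-shell index} for an undirected graph given as adjacency sets."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes whose remaining degree is at most k belong to shell k.
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue  # already removed via an earlier entry
            remaining.discard(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return shell


def bubble(adj, center, radius=2):
    """Return {node: distance} for all nodes within `radius` hops of `center`."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue  # do not expand beyond the bubble surface
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
shells = k_shell(adj)
inside = bubble(adj, "a")
print({v: (d, shells[v]) for v, d in inside.items()})
```

In this toy graph the triangle nodes fall in shell 2 and the tail nodes in shell 1, and node e lies outside the radius-two bubble around a.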
{"title":"EVADyR: A new dynamic resampling algorithm for auto-tuning noisy High Performance Computing systems","authors":"Sophie Robert-Hayek , Soraya Zertal , Philippe Couvée","doi":"10.1016/j.jocs.2024.102468","DOIUrl":"10.1016/j.jocs.2024.102468","url":null,"abstract":"<div><div>Black-box auto-tuning methods have been proven to be efficient for tuning configurable computer hardware, including those encountered within the High Performance Computing (HPC) ecosystem. However, because of the shared nature of HPC clusters and the complexity of the software and hardware stacks, the measurement of the performance function can be tainted by noise during the tuning process, which can reduce and sometimes prevent the benefit of the tuning approach. A usual choice for performing the tuning in spite of this interference is to add a resampling step at each iteration to reduce uncertainty, but this approach can be time-consuming and must be done carefully. In this paper, we propose a new resampling and filtering algorithm called EVADyR (Efficient Value Aware Dynamic Resampling). Compared to the state of the art, it finds a better exploration versus exploitation trade-off by resampling only promising configurations and increases the level of confidence around the suggested solution as the tuning process advances. This algorithm was able to efficiently tune two I/O accelerators highly sensitive to interference, in two different scenarios. Compared to Standard Error Dynamic Resampling (SEDR), a state-of-the-art noise reduction strategy, we show that EVADyR is able to reduce the distance to the optimum by 93.5% and 24.7% for the two I/O accelerators respectively, and to shorten the experiment duration by 45.8% and 58.1%, because fewer iterations are needed to reach the optimum found. 
Our results prove the importance of using noise reduction strategies whenever tuning systems running in production.</div></div>","PeriodicalId":48907,"journal":{"name":"Journal of Computational Science","volume":"84 ","pages":"Article 102468"},"PeriodicalIF":3.1,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142699235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
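The EVADyR abstract above refers to adding a resampling step to reduce measurement uncertainty. As a rough sketch of the general idea behind standard-error-based resampling (not EVADyR itself, and not necessarily SEDR as defined in the paper), one can keep re-measuring a configuration until the standard error of the mean falls below a threshold. The function name, parameters, and the simulated noisy measurement are all assumptions.

```python
import random
import statistics


def resample_until_confident(measure, threshold, min_samples=3, max_samples=50):
    """Re-measure until the standard error of the mean is <= threshold.

    `measure` is a zero-argument callable returning one noisy observation.
    Returns (mean of the samples, number of samples taken).
    """
    samples = [measure() for _ in range(min_samples)]
    while len(samples) < max_samples:
        sem = statistics.stdev(samples) / len(samples) ** 0.5
        if sem <= threshold:
            break
        samples.append(measure())
    return statistics.mean(samples), len(samples)


# Hypothetical noisy performance measurement: true value 10.0, Gaussian noise.
rng = random.Random(0)
estimate, n_used = resample_until_confident(
    lambda: rng.gauss(10.0, 1.0), threshold=0.2
)
print(f"estimate={estimate:.3f} after {n_used} measurements")
```

The cost of such a rule is the extra measurements spent on every configuration; according to the abstract, EVADyR resamples only promising configurations to improve this trade-off.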