AlgorithmsPub Date : 2025-06-01Epub Date: 2025-06-05DOI: 10.3390/a18060346
Hongbin Zhang, McKaylee Robertson, Sarah L Braunstein, David B Hanna, Uriel R Felsen, Levi Waldron, Denis Nash
{"title":"Inferring the Timing of Antiretroviral Therapy by Zero-Inflated Random Change Point Models Using Longitudinal Data Subject to Left-Censoring.","authors":"Hongbin Zhang, McKaylee Robertson, Sarah L Braunstein, David B Hanna, Uriel R Felsen, Levi Waldron, Denis Nash","doi":"10.3390/a18060346","DOIUrl":"10.3390/a18060346","url":null,"abstract":"<p><p>We propose a new random change point model that utilizes routinely recorded individual-level HIV viral load data to estimate the timing of antiretroviral therapy (ART) initiation in people living with HIV. The change point distribution is assumed to follow a zero-inflated exponential distribution for the longitudinal data, which is also subject to left-censoring, and the underlying data-generating mechanism is a nonlinear mixed-effects model. We extend the Stochastic EM (StEM) algorithm by combining a Gibbs sampler with a Metropolis-Hastings sampling. We apply the method to real HIV data to infer the timing of ART initiation since diagnosis. Additionally, we conduct simulation studies to assess the performance of our proposed method.</p>","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"18 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13021155/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147572044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AlgorithmsPub Date : 2025-06-01Epub Date: 2025-06-18DOI: 10.3390/a18060368
Emir Veledar, Lili Zhou, Omar Veledar, Hannah Gardener, Carolina M Gutierrez, Jose G Romano, Tatjana Rundek
{"title":"Synthesizing Explainability Across Multiple ML Models for Structured Data.","authors":"Emir Veledar, Lili Zhou, Omar Veledar, Hannah Gardener, Carolina M Gutierrez, Jose G Romano, Tatjana Rundek","doi":"10.3390/a18060368","DOIUrl":"10.3390/a18060368","url":null,"abstract":"<p><p>Explainable Machine Learning (XML) in high-stakes domains demands reproducible methods to aggregate feature importance across multiple models applied to the same structured dataset. We propose the Weighted Importance Score and Frequency Count (WISFC) framework, which combines importance magnitude and consistency by aggregating ranked outputs from diverse explainers. WISFC assigns a weighted score to each feature based on its rank and frequency across model-explainer pairs, providing a robust ensemble feature-importance ranking. Unlike simple consensus voting or ranking heuristics that are insufficient for capturing complex relationships among different explainer outputs, WISFC offers a more principled approach to reconciling and aggregating this information. By aggregating many \"weak signals\" from brute-force modeling runs, WISFC can surface a stronger consensus on which variables matter most. The framework is designed to be reproducible and generalizable, capable of taking important outputs from any set of machine-learning models and producing an aggregated ranking highlighting consistently important features. This approach acknowledges that any single model is a simplification of complex, multidimensional phenomena; using multiple diverse models, each optimized from a different perspective, WISFC systematically captures different facets of the problem space to create a more structured and comprehensive view. 
As a consequence, this study offers a useful strategy for researchers and practitioners who seek innovative ways of exploring complex systems, not by discovering entirely new variables but by introducing a novel mindset for systematically combining multiple modeling perspectives.</p>","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"18 6","pages":""},"PeriodicalIF":2.1,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12885564/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146155614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
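The abstract does not spell out WISFC's exact weighting, so the sketch below assumes reciprocal-rank weights plus a top-k frequency count as one plausible instantiation of "weighted score and frequency" aggregation over ranked explainer outputs.

```python
from collections import defaultdict

def wisfc(rankings, top_k=3):
    """Aggregate ranked feature lists from several model-explainer pairs.
    Each feature earns a rank-discounted weight (earlier ranks weigh more)
    plus a count of how often it lands in the top k."""
    score = defaultdict(float)
    freq = defaultdict(int)
    for ranking in rankings:
        for pos, feat in enumerate(ranking):
            score[feat] += 1.0 / (pos + 1)   # weight decays with rank
            if pos < top_k:
                freq[feat] += 1
    # order by (top-k frequency, weighted score), both descending
    return sorted(score, key=lambda f: (freq[f], score[f]), reverse=True)

rankings = [
    ["age", "bmi", "sbp", "smoker"],   # e.g. SHAP ranking from model A
    ["bmi", "age", "smoker", "sbp"],   # e.g. permutation importance from model B
    ["age", "sbp", "bmi", "smoker"],
]
consensus = wisfc(rankings)   # "age" leads: top-3 everywhere, twice ranked first
```

Ties on frequency fall back to the weighted score, which is the "magnitude plus consistency" combination the abstract describes.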
{"title":"Finding Multiple Optimal Solutions to an Integer Linear Program by Random Perturbations of Its Objective Function.","authors":"Noah Schulhof, Pattara Sukprasert, Eytan Ruppin, Samir Khuller, Alejandro A Schäffer","doi":"10.3390/a18030140","DOIUrl":"10.3390/a18030140","url":null,"abstract":"<p><p>Integer linear programs (ILPs) and mixed integer programs (MIPs) often have multiple distinct optimal solutions, yet the widely used Gurobi optimization solver returns certain solutions at disproportionately high frequencies. This behavior is disadvantageous, as, in fields such as biomedicine, the identification and analysis of distinct optima yields valuable domain-specific insights that inform future research directions. In the present work, we introduce MORSE (Multiple Optima via Random Sampling and careful choice of the parameter Epsilon), a randomized, parallelizable algorithm to efficiently generate multiple optima for ILPs. MORSE maps multiplicative perturbations to the coefficients in an instance's objective function, generating a modified instance that retains an optimum of the original problem. We formalize and prove the above claim in some practical conditions. Furthermore, we prove that for 0/1 selection problems, MORSE finds each distinct optimum with equal probability. We evaluate MORSE using two measures; the number of distinct optima found in <math><mi>r</mi></math> independent runs, and the diversity of the list (with repetitions) of solutions by average pairwise Hamming distance and Shannon entropy. 
Using these metrics, we provide empirical results demonstrating that MORSE outperforms the Gurobi method and unweighted variations of the MORSE method on a set of 20 Mixed Integer Programming Library (MIPLIB) instances and on a combinatorial optimization problem in cancer genomics.</p>","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"18 3","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11970949/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143794351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
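The core MORSE idea — multiplicatively perturb the objective coefficients so that repeated solves sample distinct tied optima instead of one canonical solution — can be sketched on a toy 0/1 problem. A brute-force solver stands in for Gurobi here, and the epsilon value and instance are illustrative.

```python
import itertools
import random

def solve_01(c, feasible):
    """Brute-force 0/1 maximization over an explicit feasible set."""
    return max(feasible, key=lambda x: sum(ci * xi for ci, xi in zip(c, x)))

def morse_style_sample(c, feasible, eps, rng):
    """Multiplicatively perturb each objective coefficient by a factor in
    (1-eps, 1+eps); for small enough eps, an optimum of the perturbed
    instance is still an optimum of the original instance."""
    c_pert = [ci * rng.uniform(1 - eps, 1 + eps) for ci in c]
    return solve_01(c_pert, feasible)

# toy instance: choose exactly 2 of 4 items, all with equal value 1,
# so there are 6 tied optimal solutions
c = [1, 1, 1, 1]
feasible = [x for x in itertools.product((0, 1), repeat=4) if sum(x) == 2]
rng = random.Random(0)
found = {morse_style_sample(c, feasible, eps=0.01, rng=rng) for _ in range(200)}
# repeated perturbed solves collect several distinct tied optima
```

An unperturbed solver would return the same tied optimum every run; the perturbation breaks the ties randomly, which is what spreads probability mass across the optima.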
AlgorithmsPub Date : 2025-02-01Epub Date: 2025-01-24DOI: 10.3390/a18020062
Yunge Wang, Lingling Zhang, Tong Si, Graham Bishop, Haijun Gong
{"title":"Anomaly Detection in High-Dimensional Time Series Data with Scaled Bregman Divergence.","authors":"Yunge Wang, Lingling Zhang, Tong Si, Graham Bishop, Haijun Gong","doi":"10.3390/a18020062","DOIUrl":"10.3390/a18020062","url":null,"abstract":"<p><p>The purpose of anomaly detection is to identify special data points or patterns that significantly deviate from the expected or typical behavior of the majority of the data, and it has a wide range of applications across various domains. Most existing statistical and machine learning-based anomaly detection algorithms face challenges when applied to high-dimensional data. For instance, the unconstrained least-squares importance fitting (uLSIF) method, a state-of-the-art anomaly detection approach, encounters the unboundedness problem under certain conditions. In this study, we propose a scaled Bregman divergence-based anomaly detection algorithm using both least absolute deviation and least-squares loss for parameter learning. This new algorithm effectively addresses the unboundedness problem, making it particularly suitable for high-dimensional data. The proposed technique was evaluated on both synthetic and real-world high-dimensional time series datasets, demonstrating its effectiveness in detecting anomalies. 
Its performance was also compared to other density ratio estimation-based anomaly detection methods.</p>","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"18 2","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11790285/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
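The abstract's scaled-Bregman estimator is not specified in enough detail to reproduce here; the sketch below illustrates only the underlying density-ratio idea that this family of detectors (including uLSIF) builds on — scoring a point by the ratio of kernel density estimates under a test window versus a reference window. The bandwidth and data are illustrative.

```python
import math

def kde(x, sample, bandwidth):
    """Gaussian kernel density estimate at point x."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm

def anomaly_score(x, reference_window, test_window, bandwidth=0.5, floor=1e-300):
    """Density-ratio style score: how much more likely x is under the test
    window than under the reference window. Large values flag points that
    are typical of the test window but rare in the reference window."""
    return kde(x, test_window, bandwidth) / max(kde(x, reference_window, bandwidth), floor)

reference_window = [0.0, 0.1, -0.2, 0.05, -0.1]   # normal regime
test_window = [0.0, 0.1, 3.0, 3.1, 2.9]           # contains a shifted cluster
shifted = anomaly_score(3.0, reference_window, test_window)   # very large
typical = anomaly_score(0.0, reference_window, test_window)   # below 1
```

Estimating the ratio directly (rather than dividing two density estimates, as done here for brevity) is precisely what uLSIF and the paper's scaled-Bregman approach improve on, since plug-in ratios like this one can blow up in high dimensions.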
AlgorithmsPub Date : 2024-01-10DOI: 10.3390/a17010029
Pornrawee Tatit, Kiki Adhinugraha, David Taniar
{"title":"Navigating the Maps: Euclidean vs. Road Network Distances in Spatial Queries","authors":"Pornrawee Tatit, Kiki Adhinugraha, David Taniar","doi":"10.3390/a17010029","DOIUrl":"https://doi.org/10.3390/a17010029","url":null,"abstract":"Using spatial data in mobile applications has grown significantly, thereby empowering users to explore locations, navigate unfamiliar areas, find transportation routes, employ geomarketing strategies, and model environmental factors. Spatial databases are pivotal in efficiently storing, retrieving, and manipulating spatial data to fulfill users’ needs. Two fundamental spatial query types, k-nearest neighbors (kNN) and range search, enable users to access specific points of interest (POIs) based on their location, which are measured by actual road distance. However, retrieving the nearest POIs using actual road distance can be computationally intensive due to the need to find the shortest distance. Using straight-line measurements could expedite the process but might compromise accuracy. Consequently, this study aims to evaluate the accuracy of the Euclidean distance method in POIs retrieval by comparing it with the road network distance method. The primary focus is determining whether the trade-off between computational time and accuracy is justified, thus employing the Open Source Routing Machine (OSRM) for distance extraction. The assessment encompasses diverse scenarios and analyses factors influencing the accuracy of the Euclidean distance method. The methodology employs a quantitative approach, thereby categorizing query points based on density and analyzing them using kNN and range query methods. Accuracy in the Euclidean distance method is evaluated against the road network distance method. The results demonstrate peak accuracy for kNN queries at k=1, thus exceeding 85% across classes but declining as k increases. 
Range queries show varied accuracy based on POI density, with higher-density classes exhibiting earlier accuracy increases. Notably, datasets with fewer POIs exhibit unexpectedly higher accuracy, thereby providing valuable insights into spatial query processing.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"81 5","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139440524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
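The Euclidean-versus-road-network trade-off can be illustrated on a toy grid city, with Manhattan distance standing in for road-network distance (the paper uses OSRM on real networks). Accuracy, as in the abstract, is the overlap between the cheap Euclidean kNN answer and the "true" network-distance answer.

```python
import math

def knn(query, points, k, dist):
    """k nearest points under an arbitrary distance function."""
    return sorted(points, key=lambda p: dist(query, p))[:k]

def euclidean(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def manhattan(a, b):
    """Stand-in for road-network distance in a grid-like city."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def knn_accuracy(query, pois, k):
    """Fraction of the 'true' (network-distance) k nearest POIs that the
    cheaper Euclidean query also returns."""
    truth = set(knn(query, pois, k, manhattan))
    approx = set(knn(query, pois, k, euclidean))
    return len(truth & approx) / k

pois = [(2, 2), (0, 3), (4, 1), (1, 4)]
acc_k1 = knn_accuracy((0, 0), pois, k=1)   # the two metrics disagree at k=1
acc_k2 = knn_accuracy((0, 0), pois, k=2)   # but return the same set at k=2
```

From (0, 0), the diagonal POI (2, 2) is Euclidean-nearest but (0, 3) is network-nearest, so the k=1 answers differ while the k=2 answer sets coincide — a small instance of the density- and k-dependent accuracy the study measures.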
AlgorithmsPub Date : 2024-01-10DOI: 10.3390/a17010028
Y. Fan, Meng Wang
{"title":"Specification Mining Based on the Ordering Points to Identify the Clustering Structure Clustering Algorithm and Model Checking","authors":"Y. Fan, Meng Wang","doi":"10.3390/a17010028","DOIUrl":"https://doi.org/10.3390/a17010028","url":null,"abstract":"Software specifications are of great importance to improve the quality of software. To automatically mine specifications from software systems, some specification mining approaches based on finite-state automatons have been proposed. However, these approaches are inaccurate when dealing with large-scale systems. In order to improve the accuracy of mined specifications, we propose a specification mining approach based on the ordering points to identify the clustering structure clustering algorithm and model checking. In the approach, the neural network model is first used to produce the feature values of states in the traces of the program. Then, according to the feature values, finite-state automatons are generated based on the ordering points to identify the clustering structure clustering algorithm. Further, the finite-state automaton with the highest F-measure is selected. To improve the quality of the finite-state automatons, we refine it based on model checking. The proposed approach was implemented in a tool named MCLSM and experiments, including 13 target classes, were conducted to evaluate its effectiveness. 
The experimental results show that the average F-measure of finite-state automatons generated by our method reaches 92.19%, which is higher than most related tools.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"3 5","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139439250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
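OPTICS itself is involved, so the sketch below covers only the downstream step the abstract describes: building a finite-state automaton from program traces once each concrete state has been assigned a cluster label. A crude bucketing function stands in for the OPTICS assignment, and the traces are invented.

```python
def build_automaton(traces, label):
    """Build a finite-state automaton from program traces, merging
    concrete states that receive the same cluster label."""
    transitions = set()
    initial, accepting = set(), set()
    for trace in traces:
        states = [label(s) for s in trace]
        initial.add(states[0])
        accepting.add(states[-1])
        transitions.update(zip(states, states[1:]))
    return initial, transitions, accepting

# toy traces of 1-D feature values; coarse rounding stands in for the
# OPTICS cluster assignment used in the paper
traces = [[0.1, 0.9, 1.8], [0.2, 1.1, 1.9], [0.15, 1.95]]
label = lambda v: round(v)   # clusters 0, 1, 2
initial, transitions, accepting = build_automaton(traces, label)
```

How states are grouped directly determines the mined automaton — the third trace skips the middle cluster, adding a 0→2 transition — which is why the clustering quality (and the subsequent model-checking refinement) matters.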
AlgorithmsPub Date : 2024-01-10DOI: 10.3390/a17010027
V. Sakalauskas, D. Kriksciuniene
{"title":"Personalized Advertising in E-Commerce: Using Clickstream Data to Target High-Value Customers","authors":"V. Sakalauskas, D. Kriksciuniene","doi":"10.3390/a17010027","DOIUrl":"https://doi.org/10.3390/a17010027","url":null,"abstract":"The growing popularity of e-commerce has prompted researchers to take a greater interest in deeper understanding online shopping behavior, consumer interest patterns, and the effectiveness of advertising campaigns. This paper presents a fresh approach for targeting high-value e-shop clients by utilizing clickstream data. We propose the new algorithm to measure customer engagement and recognizing high-value customers. Clickstream data is employed in the algorithm to compute a Customer Merit (CM) index that measures the customer’s level of engagement and anticipates their purchase intent. The CM index is evaluated dynamically by the algorithm, examining the customer’s activity level, efficiency in selecting items, and time spent in browsing. It combines tracking customers browsing and purchasing behaviors with other relevant factors: time spent on the website and frequency of visits to e-shops. This strategy proves highly beneficial for e-commerce enterprises, enabling them to pinpoint potential buyers and design targeted advertising campaigns exclusively for high-value customers of e-shops. It allows not only boosts e-shop sales but also minimizes advertising expenses effectively. The proposed method was tested on actual clickstream data from two e-commerce websites and showed that the personalized advertising campaign outperformed the non-personalized campaign in terms of click-through and conversion rate. In general, the findings suggest, that personalized advertising scenarios can be a useful tool for boosting e-commerce sales and reduce advertising cost. 
By utilizing clickstream data and adopting a targeted approach, e-commerce businesses can attract and retain high-value customers, leading to higher revenue and profitability.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"91 8","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139440304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
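The abstract lists the CM index's ingredients but not its formula, so the sketch below is a hypothetical CM index combining the four factors it names — activity level, selection efficiency, dwell time, and visit frequency — with made-up weights.

```python
def customer_merit(events, session_seconds, visits_per_week):
    """Hypothetical Customer Merit (CM) index: activity level, selection
    efficiency (carts per click), dwell time, and visit frequency.
    The weights are invented for illustration, not taken from the paper."""
    clicks = events.count("click")
    carts = events.count("add_to_cart")
    activity = clicks + 2 * carts
    efficiency = carts / clicks if clicks else 0.0
    dwell = min(session_seconds / 600.0, 1.0)   # saturate at 10 minutes
    return 0.1 * activity + 5.0 * efficiency + 2.0 * dwell + 0.5 * visits_per_week

browser = customer_merit(["click"] * 20, 120, 1)                     # many aimless clicks
buyer = customer_merit(["click"] * 6 + ["add_to_cart"] * 2, 300, 4)  # efficient, frequent
```

Weighting efficiency and frequency above raw click volume is what lets an index like this separate a frequent, decisive buyer from a heavy browser, which is the targeting distinction the paper exploits.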
AlgorithmsPub Date : 2024-01-09DOI: 10.3390/a17010026
Amr A. Abd El-Mageed, A. Al-Hamadi, Samy Bakheet, Asmaa H. Abd El-Rahiem
{"title":"Hybrid Sparrow Search-Exponential Distribution Optimization with Differential Evolution for Parameter Prediction of Solar Photovoltaic Models","authors":"Amr A. Abd El-Mageed, A. Al-Hamadi, Samy Bakheet, Asmaa H. Abd El-Rahiem","doi":"10.3390/a17010026","DOIUrl":"https://doi.org/10.3390/a17010026","url":null,"abstract":"It is difficult to determine unknown solar cell and photovoltaic (PV) module parameters owing to the nonlinearity of the characteristic current–voltage (I-V) curve. Despite this, precise parameter estimation is necessary due to the substantial effect parameters have on the efficacy of the PV system with respect to current and energy results. The problem’s characteristics make the handling of algorithms susceptible to local optima and resource-intensive processing. To effectively extract PV model parameter values, an improved hybrid Sparrow Search Algorithm (SSA) with Exponential Distribution Optimization (EDO) based on the Differential Evolution (DE) technique and the bound-constraint modification procedure, called ISSAEDO, is presented in this article. The hybrid strategy utilizes EDO to improve global exploration and SSA to effectively explore the solution space, while DE facilitates local search to improve parameter estimations. The proposed method is compared to standard optimization methods using solar PV system data to demonstrate its effectiveness and speed in obtaining PV model parameters such as the single diode model (SDM) and the double diode model (DDM). 
The results indicate that the hybrid technique is a viable instrument for enhancing solar PV system design and performance analysis because it can predict PV model parameters accurately.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"44 20","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139442474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
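The Differential Evolution ingredient of the hybrid can be sketched on its own: a minimal DE/rand/1/bin loop with the clamp-to-bounds constraint handling the abstract mentions, fitting a toy linear I-V model rather than the paper's diode models. All hyperparameters are illustrative.

```python
import random

def differential_evolution(f, bounds, pop_size=20, f_scale=0.7, cr=0.9,
                           generations=100, seed=1):
    """Minimal DE/rand/1/bin minimizer: mutate with a scaled difference of
    two random members, crossover per dimension, keep the trial if better."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            trial = []
            for d in range(dim):
                if rng.random() < cr:
                    v = a[d] + f_scale * (b[d] - c[d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)   # bound-constraint handling
                else:
                    v = pop[i][d]
                trial.append(v)
            if f(trial) < f(pop[i]):          # greedy selection
                pop[i] = trial
    return min(pop, key=f)

# toy objective: recover (a, b) of i = a*v + b from noise-free samples
data = [(v, 1.5 * v + 0.2) for v in range(5)]
sse = lambda p: sum((i - (p[0] * v + p[1])) ** 2 for v, i in data)
best = differential_evolution(sse, [(0, 5), (-1, 1)])
```

In the paper this local-search role is combined with SSA and EDO for exploration; on this smooth two-parameter problem plain DE alone already converges close to (1.5, 0.2).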
AlgorithmsPub Date : 2024-01-07DOI: 10.3390/a17010025
C. Panagiotakis
{"title":"Particle Swarm Optimization-Based Unconstrained Polygonal Fitting of 2D Shapes","authors":"C. Panagiotakis","doi":"10.3390/a17010025","DOIUrl":"https://doi.org/10.3390/a17010025","url":null,"abstract":"In this paper, we present a general version of polygonal fitting problem called Unconstrained Polygonal Fitting (UPF). Our goal is to represent a given 2D shape S with an N-vertex polygonal curve P with a known number of vertices, so that the Intersection over Union (IoU) metric between S and P is maximized without any assumption or prior knowledge of the object structure and the location of the N-vertices of P that can be placed anywhere in the 2D space. The search space of the UPF problem is a superset of the classical polygonal approximation (PA) problem, where the vertices are constrained to belong in the boundary of the given 2D shape. Therefore, the resulting solutions of the UPF may better approximate the given curve than the solutions of the PA problem. For a given number of vertices N, a Particle Swarm Optimization (PSO) method is used to maximize the IoU metric, which yields almost optimal solutions. Furthermore, the proposed method has also been implemented under the equal area principle so that the total area covered by P is equal to the area of the original 2D shape to measure how this constraint affects IoU metric. 
The quantitative results obtained on more than 2800 2D shapes included in two standard datasets quantify the performance of the proposed methods and illustrate that their solutions outperform baselines from the literature.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"29 5","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139448704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
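A dependency-free sketch of the UPF setup: PSO over unconstrained vertex coordinates, maximizing a crude IoU computed on a pixel grid rather than the exact IoU the paper uses. The swarm parameters, grid resolution, and target disk are all illustrative.

```python
import random

def point_in_poly(x, y, poly):
    """Ray-casting (even-odd) point-in-polygon test."""
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside

def grid_iou(poly, shape_mask, grid):
    """Rasterized IoU between a polygon and a shape given as a grid mask."""
    inter = union = 0
    for (x, y), in_shape in zip(grid, shape_mask):
        in_poly = point_in_poly(x, y, poly)
        inter += in_shape and in_poly
        union += in_shape or in_poly
    return inter / union if union else 0.0

def pso_fit_polygon(shape_mask, grid, n_vertices=4, swarm=15, iters=40, seed=2):
    """PSO over 2*N unconstrained vertex coordinates, maximizing grid IoU."""
    rng = random.Random(seed)
    dim = 2 * n_vertices
    decode = lambda p: [(p[2 * i], p[2 * i + 1]) for i in range(n_vertices)]
    pos = [[rng.uniform(0, 1) for _ in range(dim)] for _ in range(swarm)]
    vel = [[0.0] * dim for _ in range(swarm)]
    pbest = [p[:] for p in pos]
    pscore = [grid_iou(decode(p), shape_mask, grid) for p in pos]
    g = max(range(swarm), key=lambda i: pscore[i])
    gbest, gscore = pbest[g][:], pscore[g]
    for _ in range(iters):
        for i in range(swarm):
            for d in range(dim):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            s = grid_iou(decode(pos[i]), shape_mask, grid)
            if s > pscore[i]:
                pbest[i], pscore[i] = pos[i][:], s
                if s > gscore:
                    gbest, gscore = pos[i][:], s
    return decode(gbest), gscore

# target shape S: a disk rasterized on a coarse 21x21 grid
grid = [(i / 20, j / 20) for i in range(21) for j in range(21)]
mask = [(x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.16 for x, y in grid]
poly, iou = pso_fit_polygon(mask, grid)
```

Because the vertices are free particles rather than boundary samples, the swarm can place them inside or outside S, which is exactly the enlarged UPF search space the abstract contrasts with classical polygonal approximation.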
AlgorithmsPub Date : 2024-01-03DOI: 10.3390/a17010021
Ștefan-Andrei Ionescu, Camelia Delcea, Nora Chirita, I. Nica
{"title":"Exploring the Use of Artificial Intelligence in Agent-Based Modeling Applications: A Bibliometric Study","authors":"Ștefan-Andrei Ionescu, Camelia Delcea, Nora Chirita, I. Nica","doi":"10.3390/a17010021","DOIUrl":"https://doi.org/10.3390/a17010021","url":null,"abstract":"This research provides a comprehensive analysis of the dynamic interplay between agent-based modeling (ABM) and artificial intelligence (AI) through a meticulous bibliometric study. This study reveals a substantial increase in scholarly interest, particularly post-2006, peaking in 2021 and 2022, indicating a contemporary surge in research on the synergy between AI and ABM. Temporal trends and fluctuations prompt questions about influencing factors, potentially linked to technological advancements or shifts in research focus. The sustained increase in citations per document per year underscores the field’s impact, with the 2021 peak suggesting cumulative influence. Reference Publication Year Spectroscopy (RPYS) reveals historical patterns, and the recent decline prompts exploration into shifts in research focus. Lotka’s law is reflected in the author’s contributions, supported by Pareto analysis. Journal diversity signals extensive exploration of AI applications in ABM. Identifying impactful journals and clustering them per Bradford’s Law provides insights for researchers. Global scientific production dominance and regional collaboration maps emphasize the worldwide landscape. 
Despite acknowledging limitations, such as citation lag and interdisciplinary challenges, our study offers a global perspective with implications for future research and as a resource in the evolving AI and ABM landscape.","PeriodicalId":7636,"journal":{"name":"Algorithms","volume":"38 3","pages":""},"PeriodicalIF":2.3,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139452010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}