{"title":"Enhanced MRI brain tumor detection and classification via topological data analysis and low-rank tensor decomposition","authors":"Serena Grazia De Benedictis , Grazia Gargano , Gaetano Settembre","doi":"10.1016/j.jcmds.2024.100103","DOIUrl":"10.1016/j.jcmds.2024.100103","url":null,"abstract":"<div><div>The advent of artificial intelligence in medical imaging has paved the way for significant advancements in the diagnosis of brain tumors. This study presents a novel ensemble approach that uses magnetic resonance imaging (MRI) to identify and categorize common brain tumor types, such as pituitary, meningioma, and glioma. The proposed workflow is composed of a two-fold approach: first, it employs non-trivial image enhancement techniques in data preprocessing, low-rank Tucker decomposition for dimensionality reduction, and machine learning (ML) classifiers to detect and predict the type of brain tumor. Second, persistent homology (PH), a topological data analysis (TDA) technique, is exploited to extract potential critical areas in MRI scans. When paired with the ML classifier output, this additional information can help domain experts to identify areas of interest that might contain tumor signatures, improving the interpretability of ML predictions. When compared to automated diagnoses, this transparency adds another level of confidence and is essential for clinical acceptance. The performance of the system was quantitatively evaluated on a well-known MRI dataset, with an overall classification accuracy of 97.28% using an extremely randomized trees model. The promising results show that the integration of TDA, ML, and low-rank approximation methods is a successful approach for brain tumor identification and categorization, providing a solid foundation for further study and clinical application.</div></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"13 ","pages":"Article 100103"},"PeriodicalIF":0.0,"publicationDate":"2024-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142421856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
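The low-rank Tucker decomposition named in this abstract can be sketched with a truncated higher-order SVD (HOSVD): each mode of the tensor is unfolded, the leading left singular vectors give a factor matrix, and the core tensor is the original projected onto those factors. The shapes, ranks, and function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tucker_hosvd(T, ranks):
    """Truncated HOSVD: factors from leading left singular vectors per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    # Project T onto the factor matrices, one mode at a time, to get the core.
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

rng = np.random.default_rng(0)
scans = rng.standard_normal((32, 32, 8))   # toy stand-in for an MRI volume
core, factors = tucker_hosvd(scans, (8, 8, 4))
print(core.shape)  # compressed core: (8, 8, 4)
```

The compressed core (and/or the factor matrices) would then feed the ML classifiers in place of the raw voxels.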
{"title":"Artifact removal from ECG signals using online recursive independent component analysis","authors":"K. Gunasekaran , V.D. Ambeth Kumar , Mary Judith A.","doi":"10.1016/j.jcmds.2024.100102","DOIUrl":"10.1016/j.jcmds.2024.100102","url":null,"abstract":"<div><div>The diagnosis of cardiac abnormalities and monitoring of heart health heavily rely on Electrocardiogram (ECG) signals. Unfortunately, these signals frequently encounter interference from diverse artifacts, impeding precise interpretation and analysis. To overcome this challenge, we suggest a novel method for real-time artifact removal from ECG signals through the utilization of Online Recursive Independent Component Analysis (ORICA). Our study outlines a systematic preprocessing pipeline, adaptively estimating the mixing matrix and demixing matrix of the ICA model while streaming data is processed. Additionally, we explore the selection of appropriate ICA components and the use of relevant feature extraction techniques to enhance the quality of extracted cardiac signals. This research presents a promising solution for removing artifacts from ECG signals in real-time, paving the way for improved cardiac diagnostics and monitoring systems. Comparative analyses demonstrate significant improvements in the accuracy of subsequent ECG analysis and interpretation following the application of our ORICA-based preprocessing.</div></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"13 ","pages":"Article 100102"},"PeriodicalIF":0.0,"publicationDate":"2024-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142421855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging feed-forward neural networks to enhance the hybrid block derivative methods for system of second-order ordinary differential equations","authors":"Sabastine Emmanuel , Saratha Sathasivam , Muideen O. Ogunniran","doi":"10.1016/j.jcmds.2024.100101","DOIUrl":"10.1016/j.jcmds.2024.100101","url":null,"abstract":"<div><div>This study introduces an innovative method combining discrete hybrid block techniques and artificial intelligence to enhance the solution of second-order Ordinary Differential Equations (ODEs). By integrating feed-forward neural networks (FFNN) into the hybrid block derivative method (HBDM), the modified approach shows improved accuracy and efficiency compared to traditional methods. Through comprehensive comparisons with exact and existing solutions, the study demonstrates the effectiveness of the proposed approach. The evaluation, utilizing root mean square error (RMSE), confirms its superior performance, robustness, and applicability in diverse scenarios. This research sets a new standard for solving complex ODE systems, offering promising avenues for future research and practical implementations.</div></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"13 ","pages":"Article 100101"},"PeriodicalIF":0.0,"publicationDate":"2024-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On resolution coresets for constrained clustering","authors":"Maximilian Fiedler, Peter Gritzmann, Fabian Klemm","doi":"10.1016/j.jcmds.2024.100100","DOIUrl":"10.1016/j.jcmds.2024.100100","url":null,"abstract":"<div><p>Specific data compression techniques, formalized by the concept of coresets, proved to be powerful for many optimization problems. In fact, while tightly controlling the approximation error, coresets may lead to significant speed up of the computations and hence allow algorithms to be extended to much larger problem sizes. The present paper deals with a weight-balanced clustering problem, and is specifically motivated by an application in materials science where a voxel-based image is to be processed into a diagram representation. Here, the class of desired coresets is naturally confined to those which can be viewed as lowering the resolution of the input data. While one might expect that such resolution coresets are inferior to unrestricted coresets, we prove bounds for resolution coresets which improve known bounds in the relevant dimensions and also lead to significantly faster algorithms in practice.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100100"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000117/pdfft?md5=119df73da5369d09083c391d94764956&pid=1-s2.0-S2772415824000117-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142150297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
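The "lowering the resolution" intuition behind a resolution coreset can be illustrated by aggregating voxel blocks into weighted representative points: each block contributes one weighted centroid, and total weight is preserved. This minimal sketch conveys only the intuition; the bound-carrying construction of the paper is not reproduced, and all names here are illustrative.

```python
import numpy as np

def resolution_coreset(weights, b):
    """Aggregate a 2D weight grid into b x b block centroids with summed weights."""
    n, m = weights.shape
    pts, w = [], []
    for i in range(0, n, b):
        for j in range(0, m, b):
            block = weights[i:i+b, j:j+b]
            total = block.sum()
            if total > 0:
                ys, xs = np.mgrid[i:i+block.shape[0], j:j+block.shape[1]]
                # the weighted centroid stands in for all voxels of the block
                pts.append((float((ys * block).sum() / total),
                            float((xs * block).sum() / total)))
                w.append(float(total))
    return np.array(pts), np.array(w)

img = np.ones((8, 8))                      # toy voxel image, uniform weights
pts, w = resolution_coreset(img, 4)
print(len(pts), w.sum())  # 4 representative points; total weight 64 preserved
```

Clustering the few weighted points instead of all voxels is what yields the speed-up, at the cost of a controlled approximation error.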
{"title":"Fast empirical scenarios","authors":"Michael Multerer , Paul Schneider , Rohan Sen","doi":"10.1016/j.jcmds.2024.100099","DOIUrl":"10.1016/j.jcmds.2024.100099","url":null,"abstract":"<div><p>We seek to extract a small number of representative scenarios from large panel data that are consistent with sample moments. Among two novel algorithms, the first identifies scenarios that have not been observed before, and comes with a scenario-based representation of covariance matrices. The second proposal selects important data points from states of the world that have already realized, and are consistent with higher-order sample moment information. Both algorithms are efficient to compute and lend themselves to consistent scenario-based modeling and multi-dimensional numerical integration that can be used for interpretable decision-making under uncertainty. Extensive numerical benchmarking studies and an application in portfolio optimization favor the proposed algorithms.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100099"},"PeriodicalIF":0.0,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000105/pdfft?md5=701519346db6f93b6f348d8512c143fa&pid=1-s2.0-S2772415824000105-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142150298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating data complexity and drift through a multiscale generalized impurity approach","authors":"Diogo Costa , Eugénio M. Rocha , Nelson Ferreira","doi":"10.1016/j.jcmds.2024.100098","DOIUrl":"10.1016/j.jcmds.2024.100098","url":null,"abstract":"<div><p>The quality of machine learning solutions, and of classifier models in general, depends largely on the performance of the chosen algorithm, and on the intrinsic characteristics of the input data. Although work has been extensive on the former of these aspects, the latter has received comparably less attention. In this paper, we introduce the Multiscale Impurity Complexity Analysis (MICA) algorithm for the quantification of class separability and decision-boundary complexity of datasets. MICA is both model- and dimensionality-independent and can provide a measure of separability based on regional impurity values. This makes MICA sensitive to both global and local data conditions. We show MICA to be capable of properly describing class separability in a comprehensive set of both synthetic and real datasets, and we compare it against other state-of-the-art methods. After establishing the robustness of the proposed method, alternative applications are discussed, including a streaming-data variant of MICA (MICA-S), that can be repurposed into a model-independent method for concept drift detection.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100098"},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000099/pdfft?md5=54b719dae828872e98af24740cf27e23&pid=1-s2.0-S2772415824000099-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142076295","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
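A single-scale toy version of the "regional impurity" idea can be written in a few lines: grid the feature space, compute the Gini impurity of each occupied cell, and average the impurities weighted by occupancy. MICA itself aggregates such measures across multiple scales and is considerably more elaborate; the function below is only an assumed illustration of the building block.

```python
import numpy as np

def regional_gini(X, y, bins):
    """Occupancy-weighted mean Gini impurity over a 2D grid of cells."""
    edges = [np.linspace(X[:, d].min(), X[:, d].max() + 1e-9, bins + 1)
             for d in range(2)]
    # encode the (row, col) cell of each point as a single integer key
    ix = np.digitize(X[:, 0], edges[0]) * (bins + 2) + np.digitize(X[:, 1], edges[1])
    total, score = len(y), 0.0
    for cell in np.unique(ix):
        labels = y[ix == cell]
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        score += (len(labels) / total) * (1.0 - np.sum(p ** 2))  # Gini impurity
    return score

rng = np.random.default_rng(1)
A = rng.normal(0, 0.3, (100, 2))
B = rng.normal(3, 0.3, (100, 2))          # well separated from A
X = np.vstack([A, B])
y = np.array([0] * 100 + [1] * 100)
print(regional_gini(X, y, bins=4))        # near 0 for separable classes
```

Overlapping classes would push the score toward 0.5 (for two balanced classes), which is what makes an occupancy-weighted impurity usable as a separability measure.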
{"title":"Structured stochastic curve fitting without gradient calculation","authors":"Jixin Chen","doi":"10.1016/j.jcmds.2024.100097","DOIUrl":"10.1016/j.jcmds.2024.100097","url":null,"abstract":"<div><p>Optimization of parameters and hyperparameters is a general process for any data analysis. Because not all models are mathematically well-behaved, stochastic optimization can be useful in many analyses by randomly choosing parameters in each optimization iteration. Many such algorithms have been reported and applied in chemistry data analysis, but the one reported here is noteworthy: a naïve algorithm searches each parameter sequentially and randomly within its bounds, then picks the best candidate for the next iteration. Thus, one can ignore irrational solutions of the model itself, or its gradient in parameter space, and continue the optimization.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100097"},"PeriodicalIF":0.0,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000087/pdfft?md5=d29b0c976e4cd3877c7a001f5d45fd9a&pid=1-s2.0-S2772415824000087-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141841097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
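The naïve search the abstract describes is easy to sketch: each iteration perturbs one parameter at a time, drawing uniformly within its bounds, and keeps a candidate only if it improves the loss. No gradient is ever evaluated, so the model can be a black box. The iteration counts and sampling schedule below are guesses, not the paper's settings.

```python
import random

def stochastic_fit(loss, bounds, iters=2000, samples=8, seed=0):
    """Sequential random coordinate search within bounds; gradient-free."""
    rng = random.Random(seed)
    params = [rng.uniform(lo, hi) for lo, hi in bounds]
    best = loss(params)
    for _ in range(iters):
        for i, (lo, hi) in enumerate(bounds):      # one parameter at a time
            for _ in range(samples):
                trial = list(params)
                trial[i] = rng.uniform(lo, hi)     # random draw within bounds
                f = loss(trial)
                if f < best:                       # keep only improvements
                    best, params = f, trial
    return params, best

# Fit y = a*x + b to noiseless data; the optimum is a=2, b=1.
data = [(x, 2 * x + 1) for x in range(10)]
loss = lambda p: sum((p[0] * x + p[1] - y) ** 2 for x, y in data)
params, best = stochastic_fit(loss, [(-5, 5), (-5, 5)])
print(params, best)
```

Because rejected candidates are simply discarded, a model that returns nonsense (or raises) for some parameter values can be handled by assigning those trials an infinite loss, which is the "ignore irrational solutions" property the author highlights.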
{"title":"A DEIM-CUR factorization with iterative SVDs","authors":"Perfect Y. Gidisu, Michiel E. Hochstenbach","doi":"10.1016/j.jcmds.2024.100095","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100095","url":null,"abstract":"<div><p>A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of <em>one-round sampling</em> and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by iteratively invoking it in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. Thus, we modify <span><math><mi>A</mi></math></span> after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments, providing a comparison of one-round sampling and iterative subselection techniques, and demonstrating the improved approximation quality associated with using the latter.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"12 ","pages":"Article 100095"},"PeriodicalIF":0.0,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000063/pdfft?md5=16d9fd47f077d52851c28e4d876eb3c0&pid=1-s2.0-S2772415824000063-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
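The one-round DEIM baseline that the paper improves on is short enough to show in full: greedily pick one interpolation index per singular vector, taking the entry of largest residual after interpolating at the indices chosen so far. Applying it to the left and right singular vectors gives the rows and columns of a CUR factorization. The iterative-rounds refinement of the paper is not reproduced here.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM: one interpolation index per column of U."""
    n, k = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, k):
        # interpolate u_j at the chosen indices, then take the largest residual
        c = np.linalg.solve(U[np.ix_(idx, range(j))], U[idx, j])
        r = U[:, j] - U[:, :j] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return idx

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 8)) @ rng.standard_normal((8, 30))  # exact rank 8
U, _, Vt = np.linalg.svd(A, full_matrices=False)
rows = deim_indices(U[:, :8])          # row indices from left singular vectors
cols = deim_indices(Vt.T[:, :8])       # column indices from right ones
C, R = A[:, cols], A[rows, :]
X = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)   # middle factor: A ~ C X R
print(np.linalg.norm(A - C @ X @ R) / np.linalg.norm(A))
```

For an exactly rank-8 matrix the DEIM-selected rows and columns span the row and column spaces, so the CUR reconstruction is exact up to rounding; for noisy data the selection quality is what the iterative subselection strategies target.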
{"title":"Bayesian sparsity and class sparsity priors for dictionary learning and coding","authors":"A. Bocchinfuso , D. Calvetti, E. Somersalo","doi":"10.1016/j.jcmds.2024.100094","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100094","url":null,"abstract":"<div><p>Dictionary learning methods continue to gain popularity for the solution of challenging inverse problems. In the dictionary learning approach, the computational forward model is replaced by a large dictionary of possible outcomes, and the problem is to identify the dictionary entries that best match the data, akin to traditional query matching in search engines. Sparse coding techniques are used to guarantee that the dictionary matching identifies only few of the dictionary entries, and dictionary compression methods are used to reduce the complexity of the matching problem. In this article, we propose a workflow to facilitate the dictionary matching process. First, the full dictionary is divided into subdictionaries that are separately compressed. The error introduced by the dictionary compression is handled in the Bayesian framework as a modeling error. Furthermore, we propose a new Bayesian data-driven group sparsity coding method to help identify subdictionaries that are not relevant for the dictionary matching. After discarding irrelevant subdictionaries, the dictionary matching is addressed as a deflated problem using sparse coding. The compression and deflation steps can lead to substantial decreases of the computational complexity. The effectiveness of compensating for the dictionary compression error and using the novel group sparsity promotion to deflate the original dictionary is illustrated by applying the methodology to real-world problems: glitch detection in the LIGO experiment and hyperspectral remote sensing.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"11 ","pages":"Article 100094"},"PeriodicalIF":0.0,"publicationDate":"2024-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000051/pdfft?md5=87116ca1a8ef189c30f80b5ed4b567bd&pid=1-s2.0-S2772415824000051-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140321133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
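The "identify only few of the dictionary entries" step is classical sparse coding; a standard greedy instance is orthogonal matching pursuit (OMP), shown below as a stand-in for the matching step. The Bayesian group-sparsity and compression-error machinery of the paper is not reproduced; dictionary sizes and the chosen atoms are arbitrary for illustration.

```python
import numpy as np

def omp(D, x, k):
    """Greedy selection of k atoms of D (unit-norm columns) that match x."""
    residual, support = x.copy(), []
    for _ in range(k):
        # pick the atom most correlated with the current residual
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # re-fit coefficients on the whole support, then update the residual
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return support, coef

rng = np.random.default_rng(4)
D = rng.standard_normal((256, 512))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
true = [10, 50, 200]
x = D[:, true] @ np.array([3.0, -2.0, 2.0])    # a 3-sparse synthetic signal
support, coef = omp(D, x, 3)
print(sorted(support))  # recovers the generating atoms
```

Dividing the dictionary into subdictionaries and discarding irrelevant groups first, as the paper proposes, shrinks the matrix `D` that a solver like this has to scan.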
{"title":"Simulation of Erlang and negative binomial distributions using the generalized Lambert W function","authors":"C.Y. Chew , G. Teng , Y.S. Lai","doi":"10.1016/j.jcmds.2024.100092","DOIUrl":"https://doi.org/10.1016/j.jcmds.2024.100092","url":null,"abstract":"<div><p>We present a simulation method for generating random variables from Erlang and negative binomial distributions using the generalized Lambert <span><math><mi>W</mi></math></span> function. The generalized Lambert <span><math><mi>W</mi></math></span> function is utilized to solve the quantile functions of these distributions, allowing for efficient and accurate generation of random variables. The simulation procedure is based on Halley’s method and is demonstrated through the generation of 100,000 random variables for each distribution. The results show close agreement with the theoretical mean and variance values, indicating the effectiveness of the proposed method. This approach offers a valuable tool for generating random variables from Erlang and negative binomial distributions in various applications.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"10 ","pages":"Article 100092"},"PeriodicalIF":0.0,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415824000038/pdfft?md5=106597da03409e1369af24276ca25af6&pid=1-s2.0-S2772415824000038-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139737745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
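The Halley-based inversion named in the last abstract can be illustrated with a plain Halley iteration on the Erlang CDF (the generalized Lambert W formulation itself is not shown, and the starting guess and tolerances are assumptions): solve F(x) = u for each uniform draw u, using the pdf and its derivative as the first and second derivatives of the objective.

```python
import math, random

def erlang_cdf(x, k, lam):
    return 1.0 - math.exp(-lam * x) * sum((lam * x) ** n / math.factorial(n)
                                          for n in range(k))

def erlang_pdf(x, k, lam):
    return lam ** k * x ** (k - 1) * math.exp(-lam * x) / math.factorial(k - 1)

def erlang_quantile(u, k, lam, tol=1e-12):
    """Invert the Erlang CDF with Halley's method (mean as starting guess)."""
    x = k / lam
    for _ in range(100):
        g = erlang_cdf(x, k, lam) - u
        d1 = erlang_pdf(x, k, lam)
        d2 = d1 * ((k - 1) / x - lam)            # derivative of the pdf
        step = 2 * g * d1 / (2 * d1 * d1 - g * d2)  # Halley update
        x = max(x - step, 1e-12)                 # keep the iterate positive
        if abs(step) < tol:
            break
    return x

random.seed(3)
k, lam = 4, 2.0
draws = [erlang_quantile(random.random(), k, lam) for _ in range(20000)]
mean = sum(draws) / len(draws)
print(mean)  # should be close to k/lam = 2.0
```

Negative binomial variates can be generated the same way by inverting the discrete CDF's continuous extension, which is where the generalized Lambert W function enters in the paper.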