{"title":"Optimal learning of Markov k-tree topology","authors":"Di Chang , Liang Ding , Russell Malmberg , David Robinson , Matthew Wicker , Hongfei Yan , Aaron Martinez , Liming Cai","doi":"10.1016/j.jcmds.2022.100046","DOIUrl":"10.1016/j.jcmds.2022.100046","url":null,"abstract":"<div><p>The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the result to Markov networks of tree-width <span><math><mrow><mo>≤</mo><mi>k</mi></mrow></math></span>, for every fixed <span><math><mrow><mi>k</mi><mo>≥</mo><mn>2</mn></mrow></math></span>. In particular, we prove that approximation of a finite probabilistic system with such Markov networks has the minimum information loss when the network topology is achieved with a maximum spanning <span><math><mi>k</mi></math></span>-tree. While constructing a maximum spanning <span><math><mi>k</mi></math></span>-tree is intractable for even <span><math><mrow><mi>k</mi><mo>=</mo><mn>2</mn></mrow></math></span>, we show that polynomial algorithms can be ensured by a sufficient condition accommodated by many meaningful applications. In particular, we show an efficient algorithm for learning the optimal topology of higher order correlations among random variables that belong to an underlying linear structure. 
As an application, we demonstrate the effectiveness of this efficient algorithm applied to biomolecular 3D structure prediction.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"4 ","pages":"Article 100046"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S277241582200013X/pdfft?md5=aad4f1525f5d10d5657975c4606226da&pid=1-s2.0-S277241582200013X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77532056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding the assumptions of an SEIR compartmental model using agentization and a complexity hierarchy","authors":"Elizabeth Hunter, John D. Kelleher","doi":"10.1016/j.jcmds.2022.100056","DOIUrl":"10.1016/j.jcmds.2022.100056","url":null,"abstract":"<div><p>Equation-based and agent-based models are popular methods in understanding disease dynamics. Although there are many types of equation-based models, the most common is the SIR compartmental model that assumes homogeneous mixing and populations. One way to understand the effects of these assumptions is by agentization. Equation-based models can be agentized by creating a simple agent-based model that replicates the results of the equation-based model, then by adding complexity to these agentized models it is possible to break the assumptions of homogeneous mixing and populations and test how breaking these assumptions results in different outputs. We report a set of experiments comparing the outputs of an SEIR model and a set of agent-based models of varying levels of complexity, using as a case study a measles outbreak in a town in Ireland. We define and use a six level complexity hierarchy for agent-based models to create a set of progressively more complex variants of an agentized SEIR model for the spread of infectious disease. We then compare the results of the agent-based model at each level of complexity with results of the SEIR model to determine when the agentization breaks. Our analysis shows this occurs on the fourth step of complexity, when scheduled movements are added into the model. 
When agents' networks and behaviours are complex, the peak of the outbreak is shifted to the right and is lower than in the SEIR model, suggesting that heterogeneous populations and mixing patterns lead to slower outbreaks compared with homogeneous populations and mixing patterns.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"4 ","pages":"Article 100056"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000189/pdfft?md5=8f941ae4468af0fee67c94e73183f140&pid=1-s2.0-S2772415822000189-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77558484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational treatment of MHD Maxwell nanofluid flow across a stretching sheet considering higher-order chemical reaction and thermal radiation","authors":"Rajib Biswas , Md. Shahadat Hossain , Rafiqul Islam , Sarder Firoz Ahmmed , S.R. Mishra , Mohammad Afikuzzaman","doi":"10.1016/j.jcmds.2022.100048","DOIUrl":"10.1016/j.jcmds.2022.100048","url":null,"abstract":"<div><p>The present analysis reports a computational study of the magnetohydrodynamic (MHD) flow behaviour of a 2D Maxwell nanofluid across a stretched sheet in the presence of Brownian motion. Thermal radiation and chemical reaction terms are included throughout the current research. Nanofluids are usually chosen by researchers because of their rheological properties, which are important in determining their appropriateness for convective heat transfer. The present research reveals that the fluid velocity increases with larger values of all the parameters. The heat source and radiation parameters ensure that there is enough heat in the fluid, which implies that the thermal boundary layer thickens as the radiation parameter increases. Moreover, streamlines and isotherms have been investigated for different parameter values. 
The suggested model is valuable because it has a wide range of applications in domains including medical sciences (treatment of cancer therapeutics), microelectronics, biomedicine, biology, and industrial production processes.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"4 ","pages":"Article 100048"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000141/pdfft?md5=901981bc7e4956837055a6b712d8d47e&pid=1-s2.0-S2772415822000141-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88860933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Thermodynamic analysis of a tangent hyperbolic hydromagnetic heat generating fluid in quadratic Boussinesq approximation","authors":"A.R. Hassan , S.O. Salawu , A.B. Disu , O.R. Aderele","doi":"10.1016/j.jcmds.2022.100058","DOIUrl":"10.1016/j.jcmds.2022.100058","url":null,"abstract":"<div><p>The current investigation examines the compound impact of an electromagnetically induced force and an internal heat source on a tangent hyperbolic fluid in the quadratic Boussinesq approximation. The current hyperbolic tangent liquid flow and heat transport formulation model adequately predicts and characterizes the shear-stricken event. The nonlinear dimensionless heat transfer flow equations are solved completely using weighted residual solution procedures coupled with a Galerkin approximation integration approach. The results in the table and graphs reveal that the magnetic field strength has a substantial impact on the fluid flow and heat propagation, as well as the internal heat source. Therefore, the entropy generation is optimized through an enhanced thermodynamic equilibrium and adequate control of heat generating terms and energy loss.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"4 ","pages":"Article 100058"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000190/pdfft?md5=8078865678a19d4ad4b600d103f6351a&pid=1-s2.0-S2772415822000190-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88840844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of anomalies in the proximity of a railway line: A case study","authors":"Pierluigi Amodio , Marcello De Giosa , Felice Iavernaro , Roberto La Scala , Arcangelo Labianca , Monica Lazzo , Francesca Mazzia , Lorenzo Pisani","doi":"10.1016/j.jcmds.2022.100052","DOIUrl":"https://doi.org/10.1016/j.jcmds.2022.100052","url":null,"abstract":"<div><p>A point cloud describing a railway environment is considered in a case study aimed at presenting a workflow for the automatic detection of external objects that, coming too close to the railway infrastructure, may cause potential risks for its correct functioning. The approach combines classical semantic segmentation methodologies with a novel geometric and numerical procedure to define a <em>region of interest</em>, consisting of a lower tube enveloping the 3D space occupied by the train during its transit and an upper tube enclosing the overhead contact lines. One useful application could be automatic vegetation monitoring in the proximity of the railway structure, which would help with planning maintenance pruning activities.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"4 ","pages":"Article 100052"},"PeriodicalIF":0.0,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000165/pdfft?md5=39ce7dbb7fdd23f164ad540509765339&pid=1-s2.0-S2772415822000165-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"137407105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A general framework for hypercomplex-valued extreme learning machines","authors":"Guilherme Vieira, Marcos Eduardo Valle","doi":"10.1016/j.jcmds.2022.100032","DOIUrl":"https://doi.org/10.1016/j.jcmds.2022.100032","url":null,"abstract":"<div><p>This paper aims to establish a framework for extreme learning machines (ELMs) on general hypercomplex algebras. Hypercomplex neural networks are machine learning models that feature higher-dimension numbers as parameters, inputs, and outputs. Firstly, we review broad hypercomplex algebras and show a framework to operate in these algebras through real-valued linear algebra operations in a robust manner. We proceed to explore a handful of well-known four-dimensional examples. Then, we propose the hypercomplex-valued ELMs and derive their learning using a hypercomplex-valued least-squares problem. Finally, we compare real and hypercomplex-valued ELM models’ performance in an experiment on time-series prediction and another on color image auto-encoding. The computational experiments highlight the excellent performance of hypercomplex-valued ELMs to treat multi-dimensional data, including models based on unusual hypercomplex algebras.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"3 ","pages":"Article 100032"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000062/pdfft?md5=a9358c110cb7cefa5f7093886926f21f&pid=1-s2.0-S2772415822000062-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72243327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EUPHORIA: A neural multi-view approach to combine content and behavioral features in review spam detection","authors":"Giuseppina Andresini , Andrea Iovine , Roberto Gasbarro , Marco Lomolino , Marco de Gemmis , Annalisa Appice","doi":"10.1016/j.jcmds.2022.100036","DOIUrl":"10.1016/j.jcmds.2022.100036","url":null,"abstract":"<div><p>Nowadays, online reviews are the main source of customer opinions. They are especially important in the realm of e-commerce, where reviews regarding products and services influence the purchase decisions of customers, as well as the reputation of the commerce websites. Unfortunately, not all online reviews are truthful and trustworthy. Therefore, it is crucial to develop machine learning techniques to detect review spam. This study describes <span>EUPHORIA</span>, a novel classification approach to distinguish spam from truthful reviews. This approach couples multi-view learning with deep learning in order to gain accuracy by accounting for the variety of information possibly associated with both the reviews’ content and the reviewers’ behavior. Experiments carried out on two real review datasets from Yelp.com – Hotel and Restaurant – show that the use of multi-view learning can improve the performance of a deep learning classifier trained for review spam detection. 
In particular, the proposed approach achieves AUC-ROC equal to 0.813 and 0.708 in Hotel and Restaurant, respectively.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"3 ","pages":"Article 100036"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000086/pdfft?md5=2d7de96c79d3f46c848780e22dd8e576&pid=1-s2.0-S2772415822000086-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81237745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved K-medoids clustering approach based on the crow search algorithm","authors":"Nitesh Sureja , Bharat Chawda , Avani Vasant","doi":"10.1016/j.jcmds.2022.100034","DOIUrl":"https://doi.org/10.1016/j.jcmds.2022.100034","url":null,"abstract":"<div><p>The K-medoids clustering algorithm is a simple yet effective algorithm that has been applied to solve many clustering problems. Instead of using the mean point as the centre of a cluster, K-medoids uses an actual point to represent it. The medoid is the most centrally located object of the cluster, with a minimum sum of distances to other points. K-medoids can correctly represent the cluster centre as it is robust to outliers. However, the K-medoids algorithm is unsuitable for clustering arbitrarily shaped groups of objects and large-scale datasets. This is because it uses compactness as a clustering criterion instead of connectivity. An improved K-medoids algorithm based on the crow search algorithm is proposed to overcome the above problems. This research uses the crow search algorithm to improve the balance between the exploration and exploitation processes of the K-medoids algorithm. 
A comparison of experimental results shows that the proposed algorithm performs better than its competitors.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"3 ","pages":"Article 100034"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000074/pdfft?md5=51264beac75b1244da73f110e16c4c0a&pid=1-s2.0-S2772415822000074-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72243328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revealing influence of meteorological conditions and flight factors on delays Using XGBoost","authors":"Yinghan Wu, Gang Mei, Kaixuan Shao","doi":"10.1016/j.jcmds.2022.100030","DOIUrl":"https://doi.org/10.1016/j.jcmds.2022.100030","url":null,"abstract":"<div><p>With the growing demand for air transportation, the negative impact of flight delays has received more and more attention, especially in the hubs of large cities. By examining flight delay data and analyzing the main factors affecting flight delays, the causes of flight delays can be found and effectively avoided. In this paper, we collect meteorological data and flight data of New York’s John F. Kennedy International Airport (JFK), LaGuardia Airport (LGA), and Newark Liberty International Airport (EWR). By consulting relevant data, we select the factors that may have a strong correlation with flight delays, and we simplify and classify the data. Based on the preliminary analysis of the relationship between a single factor and flight delays, we use XGBoost to predict and analyze flight delays. We find that: (1) the effect of a single feature on flight delays is limited; (2) departure time, carrier, and precipitation have a great influence on flight delays; and (3) the accuracy of the prediction results of the change of delay duration during flight is better than the departure delay and arrival delay. 
Our research results can help airports combine meteorological conditions and forecasts to arrange flights properly and reduce the rate of flight delays and the losses to airlines and passengers.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"3 ","pages":"Article 100030"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000050/pdfft?md5=bee0b2b1da153dcda474586e7f45857c&pid=1-s2.0-S2772415822000050-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136550813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MicroRNA signature for interpretable breast cancer classification with subtype clue","authors":"Paolo Andreini , Simone Bonechi , Monica Bianchini , Filippo Geraci","doi":"10.1016/j.jcmds.2022.100042","DOIUrl":"https://doi.org/10.1016/j.jcmds.2022.100042","url":null,"abstract":"<div><p>MicroRNAs (miRNAs) are short non-coding RNAs engaged in cellular regulation by suppressing genes at their post-transcriptional stage. Evidence of their involvement in breast cancer and the possibility of quantifying their concentration in the blood have sparked the hope of using them as reliable, inexpensive and non-invasive biomarkers.</p><p>While differential expression analysis succeeded in identifying groups of dysregulated miRNAs among tumor and healthy samples, its intrinsic dual nature makes it inadequate for cancer subtype detection. Using artificial intelligence or machine learning to uncover complex profiles of miRNA expression associated with different breast cancer subtypes has been poorly investigated, and only a few recent works have explored this possibility. However, the use of the same dataset both for training and testing leaves the issue of the robustness of these results still open.</p><p>In this paper, we propose a two-stage method that leverages two ad-hoc classifiers for tumor/healthy classification and subtype identification. We assess our results using two completely independent datasets: TGCA for training and GSE68085 for testing. Experiments show that our strategy is extraordinarily effective, especially for tumor/healthy classification, where we achieved an accuracy of 0.99. 
Moreover, by means of a feature importance mechanism, our method can display which miRNAs lead to each individual sample classification, enabling a personalized medicine approach to therapy as well as the algorithm explainability required by the EU GDPR and similar legislation.</p></div>","PeriodicalId":100768,"journal":{"name":"Journal of Computational Mathematics and Data Science","volume":"3 ","pages":"Article 100042"},"PeriodicalIF":0.0,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772415822000116/pdfft?md5=5ebd30b1a40a0f15df580e1b4efa8552&pid=1-s2.0-S2772415822000116-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72292921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}