Annals of Data Science: Latest Articles

Utilization of Priori Information in the Estimation of Population Mean for Time-Based Surveys
Annals of Data Science Pub Date : 2023-06-05 DOI: 10.1007/s40745-023-00472-6
Sanjay Kumar, Priyanka Chhaparwal
Abstract: Use of a priori information is very common at an estimation stage to form an estimator of a population parameter. Estimation problems can lead to more accurate and efficient estimates using prior information. In this study, we utilized the information from the past surveys along with the information available from the current surveys in the form of a hybrid exponentially weighted moving average to suggest the estimator of the population mean using a known coefficient of variation of the study variable for time-based surveys. We derived the expression of the mean square error of the suggested estimator and established the mathematical conditions to prove the efficiency of the suggested estimator. The results showed that the utilization of information from past surveys and current surveys excels the estimator's efficiency. A simulation study and a real-life example are provided to support using the suggested estimator.
Citations: 0
LADDERS: Log Based Anomaly Detection and Diagnosis for Enterprise Systems
Annals of Data Science Pub Date : 2023-06-04 DOI: 10.1007/s40745-023-00471-7
Sakib A. Mondal, Prashanth Rv, Sagar Rao, Arun Menon
Abstract: Enterprise software can fail due to not only malfunction of application servers, but also due to performance degradation or non-availability of other servers or middle layers. Consequently, valuable time and resources are wasted in trying to identify the root cause of software failures. To address this, we have developed a framework called LADDERS. In LADDERS, anomalous incidents are detected from log events generated by various systems and KPIs (Key Performance Indicators) through an ensemble of supervised and unsupervised models. Without transaction identifiers, it is not possible to relate various events from different systems. LADDERS implements Recursive Parallel Causal Discovery (RPCD) to establish causal relationships among log events. The framework builds coresets using BICO to manage high volumes of log data during training and inferencing. An anomaly can cause a number of anomalies throughout the systems. LADDERS makes use of RPCD again to discover causal relationships among these anomalous events. Probable root causes are revealed from the causal graph and anomaly rating of events using a k-shortest path algorithm. We evaluated LADDERS using live logs from an enterprise system. The results demonstrate its effectiveness and efficiency for anomaly detection.
Citations: 0
Jump-Drop Adjusted Prediction of Cumulative Infected Cases Using the Modified SIS Model
Annals of Data Science Pub Date : 2023-05-15 DOI: 10.1007/s40745-023-00467-3
Rashi Mohta, Sravya Prathapani, Palash Ghosh
Abstract: Accurate prediction of cumulative COVID-19 infected cases is essential for effectively managing the limited healthcare resources in India. Historically, epidemiological models have helped in controlling such epidemics. Models require accurate historical data to predict future outcomes. In our data, there were days exhibiting erratic, apparently anomalous jumps and drops in the number of daily reported COVID-19 infected cases that did not conform with the overall trend. Including those observations in the training data would most likely worsen model predictive accuracy. However, with existing epidemiological models it is not straightforward to determine, for a specific day, whether or not an outcome should be considered anomalous. In this work, we propose an algorithm to automatically identify anomalous ‘jump’ and ‘drop’ days, and then based upon the overall trend, the number of daily infected cases for those days is adjusted and the training data is amended using the adjusted observations. We applied the algorithm in conjunction with a recently proposed, modified Susceptible-Infected-Susceptible (SIS) model to demonstrate that prediction accuracy is improved after adjusting training data counts for apparent erratic anomalous jumps and drops.
Citations: 0
A Novel Stock Trading Model based on Reinforcement Learning and Technical Analysis
Annals of Data Science Pub Date : 2023-05-11 DOI: 10.1007/s40745-023-00469-1
Zahra Pourahmadi, Dariush Fareed, Hamid Reza Mirzaei
Abstract: This study investigates the potential of using reinforcement learning (RL) to establish a financial trading system (FTS), taking into account the main constraint imposed by the stock market, e.g., transaction costs. More specifically, this paper shows the inferior performance of the pure reinforcement learning model when it is applied in a multi-dimensional and noisy stock market environment. To solve this problem and to get a practical and reasonable trading strategies process, a modified RL model is proposed based on the actor-critic method where we have amended the actor by incorporating three metrics from technical analysis. The results show significant improvement compared with traditional trading strategies. The reliability of the model is verified by experimental results on financial data (S&P500 index) and a fair evaluation of the proposed method and pure RL and three benchmarks is demonstrated. Statistical analysis proves that a combination of a) technical analysis (role-based strategies) and b) RL (machine learning strategies) and c) restricting the action of the RL policy network with a few realistic conditions results in trading decisions with higher investment return rates.
Citations: 0
Platform Resource Scheduling Method Based on Branch-and-Bound and Genetic Algorithm
Annals of Data Science Pub Date : 2023-05-11 DOI: 10.1007/s40745-023-00470-8
Yanfen Zhang, Jinyao Ma, Haibin Zhang, Bin Yue
Abstract: Platform resource scheduling is an operational research optimization problem of matching tasks and platform resources, which has important applications in production or marketing arrangement layout, combat task planning, etc. The existing algorithms are inflexible in task planning sequence and have poor stability. Aiming at this defect, the branch-and-bound algorithm is combined with the genetic algorithm in this paper. Branch-and-bound algorithm can adaptively adjust the next task to be planned and calculate a variety of feasible task planning sequences. Genetic algorithm is used to assign a platform combination to the selected task. Besides, we put forward a new lower bound calculation method and pruning rule. On the basis of the processing time of the direct successor tasks, the influence of the resource requirements of tasks on the priority of tasks is considered. Numerical experiments show that the proposed algorithm has good performance in platform resource scheduling problem.
Citations: 0
Bayesian Estimation of Multiple Covariate of Autoregressive (MC-AR) Model
Annals of Data Science Pub Date : 2023-05-04 DOI: 10.1007/s40745-023-00468-2
Jitendra Kumar, Ashok Kumar, Varun Agiwal
Abstract: In present scenario, handling covariate/explanatory variable with the model is one of most important factor to study with the models. The main advantages of covariate are it’s dependency on past observations. So, study variable is modelled after explaining both on own past and past and future observation of covariates. Present paper deals estimation of parameters of autoregressive model with multiple covariates under Bayesian approach. A simulation and empirical study is performed to check the applicability of the model and recorded the better results.
Citations: 0
A Bayes Analysis of Random Walk Model Under Different Error Assumptions
Annals of Data Science Pub Date : 2023-04-22 DOI: 10.1007/s40745-023-00465-5
Praveen Kumar Tripathi, Manika Agarwal
Abstract: In this paper, the Bayesian analyses for the random walk models have been performed under the assumptions of normal distribution, log-normal distribution and the stochastic volatility model, for the error component, one by one. For the various parameters, in each model, some suitable choices of informative and non-informative priors have been made and the posterior distributions are calculated. For the first two choices of error distribution, the posterior samples are easily obtained by using the gamma generating routine in R software. For a random walk model, having stochastic volatility error, the Gibbs sampling with intermediate independent Metropolis–Hastings steps is employed to obtain the desired posterior samples. The whole procedure is numerically illustrated through a real data set of crude oil prices from April 2014 to March 2022. The models are, then, compared on the basis of their accuracies in forecasting the true values. Among the other choices, the random walk model with stochastic volatile errors outperformed for the data in hand.
Citations: 0
Count Regression and Machine Learning Techniques for Zero-Inflated Overdispersed Count Data: Application to Ecological Data
Annals of Data Science Pub Date : 2023-04-13 DOI: 10.1007/s40745-023-00464-6
Bonelwa Sidumo, Energy Sonono, Isaac Takaidza
Abstract: The aim of this study is to investigate the overdispersion problem that is rampant in ecological count data. In order to explore this problem, we consider the most commonly used count regression models: the Poisson, the negative binomial, the zero-inflated Poisson and the zero-inflated negative binomial models. The performance of these count regression models is compared with the four proposed machine learning (ML) regression techniques: random forests, support vector machines, k-nearest neighbors and artificial neural networks. The mean absolute error was used to compare the performance of count regression models and ML regression models. The results suggest that ML regression models perform better compared to count regression models. The performance shown by ML regression techniques is a motivation for further research in improving methods and applications in ecological studies.
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s40745-023-00464-6.pdf
Citations: 0
Inferences Based on Correlated Randomly Censored Gumbel’s Type-I Bivariate Exponential Distribution
Annals of Data Science Pub Date : 2023-01-31 DOI: 10.1007/s40745-023-00463-7
Hare Krishna, Rajni Goel
Abstract: The formal random censoring plan has been extensively studied earlier in statistical literature by numerous researchers to deal with dropouts or unintentional random removals in life-testing experiments. All of them considered failure time and censoring time to be independent. But there are several situations in which one observes that as the failure time of an item increases, the censoring time decreases. In medical studies or especially in clinical trials, the occurrence of dropouts or unintentional removals is frequently observed in such a way that as the treatment (failure) time increases, the dropout (censoring) time decreases. No work has yet been found that deals with such correlated failure and censoring times. Therefore, in this article, we assume that the failure time is negatively correlated with censoring time, and they follow Gumbel’s type-I bivariate exponential distribution. We compute the maximum likelihood estimates of the model parameters. Using the Monte Carlo Markov chain methods, the Bayesian estimators of the parameters are calculated. The expected experimental time is also evaluated. Finally, for illustrative purposes, a numerical study and a real data set analysis are given.
Citations: 0
Bayesian Hierarchical Spatial Modeling of COVID-19 Cases in Bangladesh
Annals of Data Science Pub Date : 2023-01-22 DOI: 10.1007/s40745-022-00461-1
Md. Rezaul Karim, Sefat-E-Barket
Abstract: This research aimed to investigate the spatial autocorrelation and heterogeneity throughout Bangladesh’s 64 districts. Moran I and Geary C are used to measure spatial autocorrelation. Different conventional models, such as Poisson-Gamma and Poisson-Lognormal, and spatial models, such as Conditional Autoregressive (CAR) Model, Convolution Model, and modified CAR Model, have been employed to detect the spatial heterogeneity. Bayesian hierarchical methods via Gibbs sampling are used to implement these models. The best model is selected using the Deviance Information Criterion. Results revealed Dhaka has the highest relative risk due to the city’s high population density and growth rate. This study identifies which district has the highest relative risk and which districts adjacent to that district also have a high risk, which allows for the appropriate actions to be taken by the government agencies and communities to mitigate the risk effect.
Citations: 0