Annals of Data Science: Latest Articles

On Poisson Moment Exponential Distribution with Associated Regression and INAR(1) Process
Annals of Data Science Pub Date: 2023-06-08 DOI: 10.1007/s40745-023-00476-2
R. Maya, Jie Huang, M. R. Irshad, Fukang Zhu
Volume 11, Issue 5, pp. 1741-1759
Abstract: Numerous studies have emphasised the significance of count data modeling and its applications to phenomena that occur in the real world. From this perspective, this article examines the traits and applications of the Poisson-moment exponential (PME) distribution in the contexts of time series analysis and regression analysis for real-world phenomena. The PME distribution is a novel one-parameter discrete distribution that can be used as a powerful alternative for the existing distributions for modeling over-dispersed count datasets. The advantages of the PME distribution, including the simplicity of the probability mass function and the explicit expressions of the functions of all the statistical properties, drove us to develop the inferential aspects and learn more about its practical applications. The unknown parameter is estimated using both maximum likelihood and moment estimation methods. Also, we present a parametric regression model based on the PME distribution for the count datasets. To strengthen the utility of the suggested distribution, we propose a new first-order integer-valued autoregressive (INAR(1)) process with PME innovations based on binomial thinning for modeling integer-valued time series with over-dispersion. Application to four real datasets confirms the empirical significance of the proposed model.
Citations: 0
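
The INAR(1) recursion with binomial thinning mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the PME probability mass function is not given in this listing, so a Poisson innovation is used purely as a stand-in, and the names `binomial_thinning`, `poisson_innovation`, `simulate_inar1` and their parameters are hypothetical.

```python
import math
import random


def binomial_thinning(x, alpha, rng):
    """Return alpha ∘ x: the number of successes in x Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson_innovation(lam, rng):
    """Stand-in innovation sampler (Knuth's method). This is NOT the PME distribution."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_inar1(n, alpha, lam, x0=0, seed=0):
    """Simulate n >= 1 values of X_t = alpha ∘ X_{t-1} + eps_t, starting from x0."""
    rng = random.Random(seed)
    series = [x0]
    for _ in range(n - 1):
        survivors = binomial_thinning(series[-1], alpha, rng)  # thinned carry-over
        series.append(survivors + poisson_innovation(lam, rng))  # add new arrivals
    return series
```

A call such as `simulate_inar1(200, alpha=0.4, lam=1.5)` returns a list of non-negative integers; swapping `poisson_innovation` for a sampler of another count distribution changes only the innovation step.
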
A New Compound Distribution and Its Applications in Over-dispersed Count Data
Annals of Data Science Pub Date: 2023-06-07 DOI: 10.1007/s40745-023-00478-0
Peer Bilal Ahmad, Mohammad Kafeel Wani
Volume 11, Issue 5, pp. 1799-1820
Abstract: Every time variance exceeds mean, over-dispersed models are typically employed. This is the reason that over-dispersed models are such an important aspect of statistical modeling. In this work, the parameter of Poisson distribution is assumed to follow a new lifespan distribution called as Chris-Jerry distribution. The resulting compound distribution is an over-dispersed model known as the Poisson-Chris-Jerry distribution. As a result of deriving a general expression for the rth factorial moment, we acquired the moments about origin and the central moments. In addition to this, moment’s related measurements, generating functions, over-dispersion property, reliability characteristics, recurrence relation for probability, and other statistical qualities, have also been described. For the goal of estimating parameter of the suggested model, the maximum likelihood estimation and method of moment estimation have been addressed. The usefulness of maximum likelihood estimates has also been taken into consideration through a simulation study. We employed four real life data sets, examined the goodness-of-fit test, and considered additional standards such as the Akaike’s information criterion and Bayesian information criterion. The outcomes are compared with several potential models.
Citations: 0
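
The over-dispersion condition stated in the abstract above (variance exceeding the mean) is commonly checked with the variance-to-mean ratio. The sketch below is an illustration only; the function name `dispersion_index` and the sample counts are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; values above 1 suggest over-dispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m  # variance() uses the n - 1 denominator


# Hypothetical counts: mean 2, sample variance 9.2, so the index is well above 1.
example_counts = [0, 0, 1, 1, 2, 8]
index = dispersion_index(example_counts)
```

For a Poisson sample the index is close to 1, which is why a markedly larger value motivates compound models such as the one described in this entry.
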
Utilization of Priori Information in the Estimation of Population Mean for Time-Based Surveys
Annals of Data Science Pub Date: 2023-06-05 DOI: 10.1007/s40745-023-00472-6
Sanjay Kumar, Priyanka Chhaparwal
Volume 11, Issue 5, pp. 1675-1685
Abstract: Use of a priori information is very common at an estimation stage to form an estimator of a population parameter. Estimation problems can lead to more accurate and efficient estimates using prior information. In this study, we utilized the information from the past surveys along with the information available from the current surveys in the form of a hybrid exponentially weighted moving average to suggest the estimator of the population mean using a known coefficient of variation of the study variable for time-based surveys. We derived the expression of the mean square error of the suggested estimator and established the mathematical conditions to prove the efficiency of the suggested estimator. The results showed that the utilization of information from past surveys and current surveys excels the estimator's efficiency. A simulation study and a real-life example are provided to support using the suggested estimator.
Citations: 0
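
As background for the abstract above, the following sketch shows a plain exponentially weighted moving average (EWMA) over a sequence of survey means. It is deliberately not the hybrid EWMA estimator proposed in the paper, whose exact form is not given in this listing; the function name `ewma`, the smoothing weight `lam`, and the starting value `start` are assumptions made for illustration.

```python
def ewma(values, lam, start):
    """Plain EWMA: z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = start."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    smoothed = []
    z = start
    for x in values:
        z = lam * x + (1.0 - lam) * z  # blend the current value with past information
        smoothed.append(z)
    return smoothed
```

A smaller `lam` gives more weight to past surveys, while `lam = 1` reproduces the current values unchanged.
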
LADDERS: Log Based Anomaly Detection and Diagnosis for Enterprise Systems
Annals of Data Science Pub Date: 2023-06-04 DOI: 10.1007/s40745-023-00471-7
Sakib A. Mondal, Prashanth Rv, Sagar Rao, Arun Menon
Volume 11, Issue 4, pp. 1165-1183
Abstract: Enterprise software can fail due to not only malfunction of application servers, but also due to performance degradation or non-availability of other servers or middle layers. Consequently, valuable time and resources are wasted in trying to identify the root cause of software failures. To address this, we have developed a framework called LADDERS. In LADDERS, anomalous incidents are detected from log events generated by various systems and KPIs (Key Performance Indicators) through an ensemble of supervised and unsupervised models. Without transaction identifiers, it is not possible to relate various events from different systems. LADDERS implements Recursive Parallel Causal Discovery (RPCD) to establish causal relationships among log events. The framework builds coresets using BICO to manage high volumes of log data during training and inferencing. An anomaly can cause a number of anomalies throughout the systems. LADDERS makes use of RPCD again to discover causal relationships among these anomalous events. Probable root causes are revealed from the causal graph and anomaly rating of events using a k-shortest path algorithm. We evaluated LADDERS using live logs from an enterprise system. The results demonstrate its effectiveness and efficiency for anomaly detection.
Citations: 0
Jump-Drop Adjusted Prediction of Cumulative Infected Cases Using the Modified SIS Model
Annals of Data Science Pub Date: 2023-05-15 DOI: 10.1007/s40745-023-00467-3
Rashi Mohta, Sravya Prathapani, Palash Ghosh
Volume 11, Issue 3, pp. 959-978
Abstract: Accurate prediction of cumulative COVID-19 infected cases is essential for effectively managing the limited healthcare resources in India. Historically, epidemiological models have helped in controlling such epidemics. Models require accurate historical data to predict future outcomes. In our data, there were days exhibiting erratic, apparently anomalous jumps and drops in the number of daily reported COVID-19 infected cases that did not conform with the overall trend. Including those observations in the training data would most likely worsen model predictive accuracy. However, with existing epidemiological models it is not straightforward to determine, for a specific day, whether or not an outcome should be considered anomalous. In this work, we propose an algorithm to automatically identify anomalous ‘jump’ and ‘drop’ days, and then based upon the overall trend, the number of daily infected cases for those days is adjusted and the training data is amended using the adjusted observations. We applied the algorithm in conjunction with a recently proposed, modified Susceptible-Infected-Susceptible (SIS) model to demonstrate that prediction accuracy is improved after adjusting training data counts for apparent erratic anomalous jumps and drops.
Citations: 0
A Novel Stock Trading Model based on Reinforcement Learning and Technical Analysis
Annals of Data Science Pub Date: 2023-05-11 DOI: 10.1007/s40745-023-00469-1
Zahra Pourahmadi, Dariush Fareed, Hamid Reza Mirzaei
Volume 11, Issue 5, pp. 1653-1674
Abstract: This study investigates the potential of using reinforcement learning (RL) to establish a financial trading system (FTS), taking into account the main constraint imposed by the stock market, e.g., transaction costs. More specifically, this paper shows the inferior performance of the pure reinforcement learning model when it is applied in a multi-dimensional and noisy stock market environment. To solve this problem and to get a practical and reasonable trading strategies process, a modified RL model is proposed based on the actor-critic method where we have amended the actor by incorporating three metrics from technical analysis. The results show significant improvement compared with traditional trading strategies. The reliability of the model is verified by experimental results on financial data (S&P500 index) and a fair evaluation of the proposed method and pure RL and three benchmarks is demonstrated. Statistical analysis proves that a combination of a) technical analysis (role-based strategies) and b) RL (machine learning strategies) and c) restricting the action of the RL policy network with a few realistic conditions results in trading decisions with higher investment return rates.
Citations: 0
Platform Resource Scheduling Method Based on Branch-and-Bound and Genetic Algorithm
Annals of Data Science Pub Date: 2023-05-11 DOI: 10.1007/s40745-023-00470-8
Yanfen Zhang, Jinyao Ma, Haibin Zhang, Bin Yue
Volume 10, Issue 5, pp. 1421-1445
Abstract: Platform resource scheduling is an operational research optimization problem of matching tasks and platform resources, which has important applications in production or marketing arrangement layout, combat task planning, etc. The existing algorithms are inflexible in task planning sequence and have poor stability. Aiming at this defect, the branch-and-bound algorithm is combined with the genetic algorithm in this paper. Branch-and-bound algorithm can adaptively adjust the next task to be planned and calculate a variety of feasible task planning sequences. Genetic algorithm is used to assign a platform combination to the selected task. Besides, we put forward a new lower bound calculation method and pruning rule. On the basis of the processing time of the direct successor tasks, the influence of the resource requirements of tasks on the priority of tasks is considered. Numerical experiments show that the proposed algorithm has good performance in platform resource scheduling problem.
Citations: 0
Bayesian Estimation of Multiple Covariate of Autoregressive (MC-AR) Model
Annals of Data Science Pub Date: 2023-05-04 DOI: 10.1007/s40745-023-00468-2
Jitendra Kumar, Ashok Kumar, Varun Agiwal
Volume 11, Issue 4, pp. 1291-1301
Abstract: In present scenario, handling covariate/explanatory variable with the model is one of most important factor to study with the models. The main advantages of covariate are its dependency on past observations. So, study variable is modelled after explaining both on own past and past and future observation of covariates. Present paper deals estimation of parameters of autoregressive model with multiple covariates under Bayesian approach. A simulation and empirical study is performed to check the applicability of the model and recorded the better results.
Citations: 0
A Bayes Analysis of Random Walk Model Under Different Error Assumptions
Annals of Data Science Pub Date: 2023-04-22 DOI: 10.1007/s40745-023-00465-5
Praveen Kumar Tripathi, Manika Agarwal
Volume 11, Issue 5, pp. 1635-1652
Abstract: In this paper, the Bayesian analyses for the random walk models have been performed under the assumptions of normal distribution, log-normal distribution and the stochastic volatility model, for the error component, one by one. For the various parameters, in each model, some suitable choices of informative and non-informative priors have been made and the posterior distributions are calculated. For the first two choices of error distribution, the posterior samples are easily obtained by using the gamma generating routine in R software. For a random walk model, having stochastic volatility error, the Gibbs sampling with intermediate independent Metropolis–Hastings steps is employed to obtain the desired posterior samples. The whole procedure is numerically illustrated through a real data set of crude oil prices from April 2014 to March 2022. The models are, then, compared on the basis of their accuracies in forecasting the true values. Among the other choices, the random walk model with stochastic volatile errors outperformed for the data in hand.
Citations: 0
Count Regression and Machine Learning Techniques for Zero-Inflated Overdispersed Count Data: Application to Ecological Data
Annals of Data Science Pub Date: 2023-04-13 DOI: 10.1007/s40745-023-00464-6
Bonelwa Sidumo, Energy Sonono, Isaac Takaidza
Volume 11, Issue 3, pp. 803-817
Open access PDF: https://link.springer.com/content/pdf/10.1007/s40745-023-00464-6.pdf
Abstract: The aim of this study is to investigate the overdispersion problem that is rampant in ecological count data. In order to explore this problem, we consider the most commonly used count regression models: the Poisson, the negative binomial, the zero-inflated Poisson and the zero-inflated negative binomial models. The performance of these count regression models is compared with the four proposed machine learning (ML) regression techniques: random forests, support vector machines, k-nearest neighbors and artificial neural networks. The mean absolute error was used to compare the performance of count regression models and ML regression models. The results suggest that ML regression models perform better compared to count regression models. The performance shown by ML regression techniques is a motivation for further research in improving methods and applications in ecological studies.
Citations: 0
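
The mean absolute error used as the comparison metric in the abstract above can be computed as in the sketch below. The function name `mean_absolute_error` and the example values are hypothetical; they are not data or code from the study.

```python
def mean_absolute_error(observed, predicted):
    """Average of |observed_i - predicted_i| over paired observations."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed counts and model predictions: errors 1, 0, 2 give MAE 1.0.
mae = mean_absolute_error([0, 2, 4], [1, 2, 2])
```

Because the metric is on the same scale as the counts, it can be applied unchanged to both count regression and ML predictions, which is what makes the comparison in this entry possible.
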