Annals of Data Science: Latest Articles (Page 2)

Sentiment-Based Hierarchical Deep Learning Framework Using Hybrid Optimization for Course Recommendation in E-learning

Annals of Data Science Pub Date : 2024-12-05 DOI: 10.1007/s40745-024-00580-x

A. Madhavi, A. Nagesh, A. Govardhan

Abstract: Course recommendation (CD) is essential for success in a student's educational journey. Because students' knowledge systems vary, selecting course content from online educational platforms can be difficult. This problem is addressed by introducing a Political Jellyfish Search Optimization (PJSO) based Hierarchical Deep Learning for Text (HDLTex) model for sentiment classification (SC) in CD. The input data is taken from the E-khool database and is used to compute the learner/course agglomerative matrix. The courses are then grouped using Bayesian Fuzzy Clustering (BFC). When a query is given, bi-level matching is performed, and the learner retrieves the preferred items once the best course group is found. Course review data is then tokenized using Bidirectional Encoder Representations from Transformers (BERT). Finally, features are extracted and SC is performed using HDLTex, which is trained by the proposed PJSO. PJSO combines the Political Optimizer (PO) and Jellyfish Search Optimization (JSO). The devised PJSO-based HDLTex achieves a maximum precision of 0.904, a maximum recall of 0.915, and a maximum F-measure of 0.904.

Volume 12, Issue 5, pp. 1661-1690

Citations: 0

On the xgamma k-record values and associated inference

Annals of Data Science Pub Date : 2024-11-25 DOI: 10.1007/s40745-024-00582-9

Masoud Bazari Jamkhaneh, S. M. T. K. MirMostafaee, Marziye Jadidi

Abstract: The xgamma distribution was first introduced by Sen et al. [1] as an alternative to the exponential model. It exhibits a bathtub-shaped hazard rate function, so it is suitable for many lifetime phenomena. In this paper, we consider the upper k-record values from the xgamma distribution. We obtain exact explicit expressions for the moments of k-record values and compute the means, variances, and covariances of the upper k-records. Using these computed values, we find the best linear unbiased estimators (BLUEs) and the best linear invariant estimators (BLIEs) of the location and scale parameters of the xgamma model. In addition, we consider the prediction of a future k-record value: we find the best linear unbiased predictor (BLUP) and the best linear invariant predictor (BLIP) of a future k-record value, and discuss another linear predictor. A simulation study is performed to assess the proposed estimators and predictors. We also present a real data example to illustrate the application of the theoretical results. The paper closes with several concluding remarks.

Volume 12, Issue 5, pp. 1717-1745

Citations: 0

Deep Enhancement in Supplychain Management with Adaptive Serial Cascaded Autoencoder with Long Short Term Memory and Multi-layered Perceptron Framework

Annals of Data Science Pub Date : 2024-11-18 DOI: 10.1007/s40745-024-00576-7

Ashok Kumar Sarkar, Anupam Das

Abstract: Recognizing and reducing risk is a major part of Supply Chain Management (SCM). Several companies invest in Supply Chain Risk Management (SCRM); they understand the procurement occupancies within their organizations and take steps to secure this potent source of strategic value. These companies also yield the highest returns with the lowest financial risk. Reducing financial risk in the SCM network requires thoughtful analysis and a proactive strategy. Hence, this work aims to assess financial risk in SCM with deep learning techniques based on big data. Financial risk-related big data is collected from the Kaggle database and passed to the data transformation phase. The transformed data is used to evaluate financial risk with an Adaptive Serial Cascaded Autoencoder with Long Short-Term Memory and Multi-Layered Perceptron (ASCALSMLP). The parameters of the deep learning components, LSTM and MLP, are tuned by the hybrid Sandpiper Galactic Swarm Optimization (SGSO) algorithm to enhance the efficacy of the approach. In the results analysis, the accuracy of the developed model is 91.12% better than DHOA, 92.5% better than COA, 93.75% better than GSO, and 94.62% better than SOA models. The results therefore demonstrate effective prediction of financial risks.

Volume 12, Issue 5, pp. 1577-1606

Citations: 0

Statistical Data-Driven Modelling and Forecasting: An Application to COVID-19 Pandemic

Annals of Data Science Pub Date : 2024-11-18 DOI: 10.1007/s40745-024-00583-8

Shalabh, Subhra Sankar Dhar, Sabara Parshad Rajeshbhai

Abstract: One of the key objectives of statistics is to provide a model compatible with the data generated by an unknown random process. Often, it happens that the unknown process is intractable, and no prior data or information associated with the unknown process is available. Under such circumstances, well-known techniques like regression modelling techniques may not work. As a result, an alternative approach may be to observe the general features of the process from the available data. Afterward, a suitable statistical distribution, like a mixture of certain distributions, can be fitted to the existing available data, and future observations can be predicted using this fitting. For example, one may consider the prediction related to the COVID-19 pandemic. As it occurred for the first time, no prior data was available to apprehend the behaviour and progression of the COVID-19 pandemic. For such cases, a data-based statistical modelling procedure can be adopted to predict future occurrences based on a small data set. This article presents such an application-oriented, data-based statistical modelling procedure with an implementation on the COVID-19 data. The proposed procedure can be used for a wide range of modelling and forecasting of future events.

Volume 12, Issue 5, pp. 1747-1770

Citations: 0

Beyond Regular SPC: Bridging the \(C_{pk}\) Capability Index for (a)Symmetric Data

Annals of Data Science Pub Date : 2024-10-28 DOI: 10.1007/s40745-024-00577-6

Pedro Luiz Ramos, Ana Paula Silva Figueiredo, Diego Carvalho do Nascimento, Fernando Moala, Edilson Flores

Abstract: The advancement of technology has increased competitiveness, especially in the manufacturing industry. Alongside Statistical Process Control (SPC), capability indices are tools used to measure the quality of processes and are useful for establishing standards in manufacturing products. This study proposes a new control chart based on the capability index \(C_{pk}\), which is particularly useful for real-time monitoring over short time frames and in longitudinal studies. Our methodology proposes a graphical monitoring tool obtained by combining the rolling capability index with standard distributions (Normal, Gamma, or Weibull) and bootstrap intervals based on closed-form estimators. Simulations and real-world applications demonstrate the utility of the framework, which is computationally inexpensive and applicable to real-time monitoring (useful for longitudinal or time-varying processes), and show that the modified-Cpk for asymmetric processes is more accurate than point estimation based on normality. Moreover, the example data from a chocolate factory showed an acceptable process trend in 25% of the observed rolling windows (from the first modified-Cpk estimations), whereas the normal-based index barely detected this pattern (only one in-sample period). This corresponds to a reduction of \(\approx 22\%\) in quality-improvement interventions (that is, false alarms).

Volume 12, Issue 5, pp. 1607-1633
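
The rolling capability index mentioned in this abstract can be illustrated with the textbook normal-based definition, C_pk = min(USL - mean, mean - LSL) / (3 * sd), evaluated over sliding windows. The sketch below is only that baseline: the function names, the window size, and the specification limits are hypothetical, and it does not reproduce the paper's modified-Cpk for Gamma or Weibull data or its bootstrap intervals.

```python
import statistics


def cpk(sample, lsl, usl):
    """Normal-based C_pk of one sample, given lower/upper specification limits."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return min(usl - mean, mean - lsl) / (3 * sd)


def rolling_cpk(values, window, lsl, usl):
    """C_pk for every consecutive window of length `window`."""
    return [
        cpk(values[i:i + window], lsl, usl)
        for i in range(len(values) - window + 1)
    ]


# Hypothetical measurements and specification limits, for illustration only.
measurements = [9, 10, 11, 10, 9]
indices = rolling_cpk(measurements, window=3, lsl=7, usl=13)
```

Each element of `indices` is the capability estimate for one window, so plotting the list against the window position gives a simple normal-based monitoring chart.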

Citations: 0

Modeling and Analysis of Trading Volume and Stock Return Data Using Bivariate q-Gaussian Distribution

Annals of Data Science Pub Date : 2024-10-23 DOI: 10.1007/s40745-024-00578-5

T. Princy

Abstract: Two known characteristics of the distribution of stock returns (price fluctuations) and, more recently, of the distribution of financial asset volumes are power laws and scaling. These power laws can be viewed as the asymptotic behaviour of distributions derived from nonextensive statistics, as demonstrated by a large number of instances in physics. In this study, we explain the application of a model based on nonextensive statistics to trading volume and stock price data. We present some novel theoretical results, obtained from entropy optimisation, for the correlation between the trading volume distribution and stock return volatility. We call the resulting probability distribution a bivariate q-Gaussian distribution, since it is expressed in terms of the q-exponential function and tends to the bivariate normal distribution as q tends to 1. The primary characteristics of the novel model are thoroughly examined. Parameter estimation is carried out by maximum likelihood, a conventional technique. The utility of the model is illustrated using BSE Sensex data.

Volume 12, Issue 5, pp. 1635-1659
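
The q-exponential function that this abstract refers to is commonly written as e_q(x) = [1 + (1 - q) x]^(1 / (1 - q)) when the bracket is positive, and it reduces to exp(x) as q tends to 1. The sketch below shows only this one-dimensional building block and an unnormalised univariate q-Gaussian kernel; the function names and the `beta` parameter are illustrative assumptions, and the paper's bivariate density is not reproduced here.

```python
import math


def q_exp(x, q):
    """Tsallis q-exponential; intended here for x <= 0, as in a q-Gaussian kernel."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)  # ordinary exponential in the limit q -> 1
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0  # compact-support cutoff for q < 1
        raise ValueError("q-exponential is not finite here for q > 1")
    return base ** (1.0 / (1.0 - q))


def q_gaussian_kernel(x, q, beta=1.0):
    """Unnormalised univariate q-Gaussian kernel e_q(-beta * x**2)."""
    return q_exp(-beta * x * x, q)


heavy_tail = q_gaussian_kernel(1.0, q=2.0)   # (1 + 1)**(-1) = 0.5
near_normal = q_gaussian_kernel(1.0, q=1.0)  # exp(-1)
```

For q > 1 the kernel decays as a power law rather than exponentially, which is the heavy-tail behaviour the abstract associates with returns and volumes.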

Citations: 0

A Novel Finite Mixture Model Based on the Generalized t Distributions with Two-Sided Censored Data

Annals of Data Science Pub Date : 2024-09-25 DOI: 10.1007/s40745-024-00572-x

Ruijie Guan, Yaohua Rong, Weihu Cheng, Zhenyu Xin

Abstract: In light of the rapid technological advancements witnessed in recent decades, numerous disciplines have been inundated with voluminous datasets characterized by multimodality, heavy-tailed distributions, and prevalent missing information. Consequently, the task of effectively modeling such intricate data poses a formidable yet indispensable challenge. This paper endeavors to address this challenge by introducing a novel finite mixture model predicated upon the generalized t distribution, tailored specifically to accommodate two-sided censored observations, thereby establishing a foundational framework for modeling this complex data structure. To facilitate parameter estimation within this model, we devise a variant of the EM-type algorithm, amalgamating the profile likelihood approach with the classical Expectation Conditional Maximization algorithm. Notably, this hybridized methodology affords analytical expressions in the E-step and a tractable M-step, thereby substantially enhancing computational expediency and efficiency. Furthermore, we furnish closed-form expressions delineating the observed information matrix, pivotal for approximating the asymptotic covariance matrix of the MLEs within this mixture model. To empirically evaluate the efficacy of the proposed algorithm, a series of simulation studies are conducted, demonstrating promising performance across various artificial datasets. Additionally, the practical applicability of the proposed methodology is elucidated through its deployment on two real-world datasets, thereby underscoring its feasibility and utility in practical settings.

Volume 12, Issue 1, pp. 341-379

Citations: 0

Exploring the Potential of the Kumaraswamy Discrete Half-Logistic Distribution in Data Science Scanning and Decision-Making

Annals of Data Science Pub Date : 2024-09-24 DOI: 10.1007/s40745-024-00558-9

Hend S. Shahen, Mohamed S. Eliwa, Mahmoud El-Morshedy

Abstract: Data science often employs discrete probability distributions to model and analyze various phenomena. These distributions are particularly useful when dealing with data that can be categorized into distinct outcomes or events. This study presents a discrete random probability model, supported on the non-negative integers, formulated from the well-established Kumaraswamy family through a recognized discretization method that preserves the functional structure of the survival function. Various significant statistical properties, such as the hazard rate function, crude moments, index of dispersion, skewness, kurtosis, quantile function, L-moments, and entropies, are derived. The new probability mass function allows for the analysis of asymmetric dispersion data across different kurtosis forms, including mesokurtic, platykurtic, and leptokurtic distributions. Furthermore, the model effectively handles excess zeros and the under- and over-dispersion commonly encountered in diverse fields. Additionally, the hazard rate function demonstrates considerable flexibility, encompassing monotonically decreasing, bathtub, monotonically increasing, and bathtub-constant failure rate characteristics. Following the theoretical introduction of the new discrete model, its parameters are estimated by maximum likelihood, and the performance of this technique is discussed through a simulation study. Finally, three real-world applications employing count data demonstrate the significance and adaptability of this novel discrete distribution.

Volume 12, Issue 3, pp. 1013-1040
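
A discretization that preserves the survival function, as this abstract describes, is usually taken to mean assigning P(X = k) = S(k) - S(k + 1) for k = 0, 1, 2, ..., where S is a continuous survival function. The sketch below illustrates that construction under this assumption, with an exponential survival function as a stand-in; it is not the Kumaraswamy half-logistic form used in the paper, and the function names and the rate value are hypothetical.

```python
import math


def discretize(survival, k):
    """Probability mass at integer k: P(X = k) = S(k) - S(k + 1)."""
    return survival(k) - survival(k + 1)


def exp_survival(x, rate=0.5):
    """Stand-in continuous survival function S(x) = exp(-rate * x)."""
    return math.exp(-rate * x)


# Probability masses on 0..199; the sum telescopes to S(0) - S(200).
pmf = [discretize(exp_survival, k) for k in range(200)]

# Discrete hazard rate h(k) = P(X = k) / S(k), evaluated on the first few points.
hazard = [pmf[k] / exp_survival(k) for k in range(5)]
```

With the exponential stand-in, the result is a geometric distribution with a constant hazard; a more flexible survival function changes the shape of `hazard` while the same differencing step still yields a valid probability mass function.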

Citations: 0

Determining the Correlation among the Users' Satisfaction and Familiarity with Malay Entrepreneurs Food Delivery Mobile Applications in Malaysia

Annals of Data Science Pub Date : 2024-09-13 DOI: 10.1007/s40745-024-00568-7

Muhamad Redha Iqbal Bin Daud, Norhidayah Abdullah, Lovelyna Benedict Jipiu

Abstract: The rise of mobile technology has significantly transformed numerous aspects of everyday life, especially food delivery services. This investigation explores satisfaction (SAT) with food delivery mobile apps (FDMA) and the influence of familiarity (FAM). Data were gathered through online questionnaires from 381 individuals in Shah Alam, Selangor, who had experience using FDMA services. The findings indicate that user satisfaction (US) with FDMA is strongly influenced by how familiar users are with the platform, and that satisfaction is strongly linked to how easy users find the platform to use. The research makes a distinctive contribution by exploring the influence of familiarity on US with FDMA. Investigating how users' prior experiences and comfort levels affect their satisfaction provides valuable insights for enhancing app design and user experience in the rapidly evolving food delivery industry. The study also elucidates the significant impact of FAM on FDMA satisfaction, which helps refine app design and strategies to enhance user experience. It suggests optimizing FDMA by prioritizing features that enhance user FAM, ultimately raising SAT levels and improving the overall user experience. The findings indicate a notable correlation between US and the inclination to continue using FDMA systems.

Volume 12, Issue 5, pp. 1431-1462

Citations: 0

Designing Supply Chain Management Pattern in Small Scale Integrated Commercial Agriculture

Annals of Data Science Pub Date : 2024-08-22 DOI: 10.1007/s40745-024-00574-9

Seyed Hasan Hosseini Khesht Masjedi, Sahar Dehyouri, Seyed Jamal Farajolah Hosseini, Maryam Omidi Najafabadi

Citations: 0