{"title":"Stock Prices Forecasting by Using a Novel Hybrid Method Based on the MFO-Optimized GRU Network","authors":"Xinjian Zhang, Guanlin Liu","doi":"10.1007/s40745-025-00616-w","DOIUrl":"10.1007/s40745-025-00616-w","url":null,"abstract":"<div><p>With the social economy growing at a quick pace and the stock market seeing constant developments, more and more people are voicing concerns about investing in stocks. The importance of forecasting stock values has increased in the domain of engineering's use of cognitive computing. Utilizing data-driven tactics for forecasting stock prices, investors can effectively mitigate risks and enhance profits. Investors can use projections based on historical values and textual data to make well-informed judgments about future patterns in stock prices. Stock price anticipation is a pivotal undertaking in the financial sector that has substantial consequences for traders and investors. This article presents an in-depth comparison analysis of machine learning tactics for forecasting price fluctuations in stocks. The research deploys historical stock data and diverse technical indicators. This paper presents the Gated Recurrent Unit (GRU) model for Nasdaq stock index anticipation, which is optimized by Particle swarm optimization (PSO), Biogeography-based optimization (BBO), and Moth flame optimization (MFO). Among these optimizers, MFO has the best outcomes. Compared to the GRU scheme, the optimized PSO-GRU, BBO-GRU, and MFO-GRU models achieve coefficients of determination (<span><i>R</i><sup>2</sup></span>) of 0.9807, 0.9824, and 0.9904, respectively, which shows the improvement of the presented scheme. 
The criteria used to evaluate this model are mean absolute error, root mean absolute error, and <span><i>R</i><sup>2</sup></span>.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 4","pages":"1369 - 1387"},"PeriodicalIF":0.0,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145166219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimization of Oil and Gas Pipeline Leakage Data and Defect Identification Based on Graph Neural Processing","authors":"Lizhen Zhang","doi":"10.1007/s40745-025-00619-7","DOIUrl":"10.1007/s40745-025-00619-7","url":null,"abstract":"<div><p>With the increasing complexity of oil and gas pipeline networks, early identification of leaks and defects is crucial to ensure the safe operation of pipelines. This study proposes a graph neural network (GNN) method for data processing and defect identification aimed at optimizing monitoring and maintenance strategies for oil and gas pipelines. Through the analysis of historical leakage data, we constructed a graph database containing 5000 samples, each containing 10 features such as pressure, flow, temperature, etc. Using graph convolutional network and graph attention network (GAT) to perform feature extraction and pattern recognition on nodes in pipeline network, our model achieves 92% accuracy in defect recognition, which is 15% higher than traditional methods. In addition, we have developed a leakage prediction model based on time series analysis, which is able to predict potential leakage risks 24 h in advance with an accuracy of 85%. 
The results of this study not only improve the safety management level of oil and gas pipelines, but also provide a new technical path for future intelligent pipeline maintenance.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 4","pages":"1413 - 1430"},"PeriodicalIF":0.0,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s40745-025-00619-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145165694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Predictive Accuracy in Writing Assessment Through Advanced Machine Learning Techniques","authors":"Xiao Zhang","doi":"10.1007/s40745-025-00618-8","DOIUrl":"10.1007/s40745-025-00618-8","url":null,"abstract":"<div><p>This research investigates the application of the Machine Learning (ML) model for effective and equitable essay scoring in education. Unlike their human counterpart, ML models have the capacity to rapidly analyze scores of essays, providing timely and equitable scores that take into account varying student demographics and styles of writing. This function helps in the identification of classroom problems and supports the design of focused teaching methodologies. For the study, a Light Gradient Boosting Classification (LGBC) model was optimized by three optimizers: Black Widow Optimization (BWO), Zebra Optimization Algorithm (ZOA), and Leader Harris Hawks Optimization (LHHO), for the development of the hybrid models with a focus on improved prediction quality. Comparison of these hybrid models with the base LGBC model was performed through different phases, such as Training, Validation, and Testing. The findings show that the LGLH model exhibited improved performance with an accuracy rate of 0.981, followed by the LGZO model with 0.971 and the LGBW model with 0.963. The lowest rate of accuracy was observed in the base LGBC model, which was 0.946. The results demonstrate the efficacy of hybrid models, which harness the optimality of several optimization techniques and provide more robust results for complicated tasks. 
The study emphasizes the importance of selecting the appropriate model architecture to achieve optimal performance, providing valuable insights into model efficacy at various stages of evaluation.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 4","pages":"1389 - 1412"},"PeriodicalIF":0.0,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncovering University Application Patterns Through Graph Representation Learning","authors":"Hendrik Santoso Sugiarto, Yozef Tjandra","doi":"10.1007/s40745-025-00611-1","DOIUrl":"10.1007/s40745-025-00611-1","url":null,"abstract":"<div><p>In university admissions, interaction networks naturally emerge between prospective students and available majors. Understanding hidden patterns in such a vast network is crucial for decision-making but poses technical challenges due to its complexity and data limitations. Many existing models rely heavily on user profiling, raising privacy concerns and making data collection difficult. Instead, this work extracts meaningful insights using only the adjacency information of the network, avoiding the need for personal data. We leverage Graph Convolutional Networks (GCN) to generate compact representations for major recommendation and clustering tasks. Our GCN-based approach outperforms classical methods such as popularity-based and Non-negative Matrix Factorization (NMF), as well as the neural Generalized Matrix Factorization (GMF) model, achieving up to 61.06% and 12.17% improvements in smaller (dimension 40) and larger (dimension 80) embeddings, respectively. Furthermore, hierarchical clustering on these embeddings reveals implicit patterns in student preferences, particularly regarding fields of study and geographic locations, even without explicit data on these attributes. 
These findings demonstrate that meaningful insights can be derived from interaction networks while mitigating privacy concerns associated with user profiling.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 4","pages":"1343 - 1368"},"PeriodicalIF":0.0,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145163350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rumor Governance Under Uncertain Conditions: An Evolutionary Game Theory Analysis","authors":"Xuefan Dong, Lei Tang","doi":"10.1007/s40745-025-00606-y","DOIUrl":"10.1007/s40745-025-00606-y","url":null,"abstract":"<div><p>In the rapidly evolving landscape of online information dissemination, managing rumors has become an imperative challenge for governments worldwide. This study employs a tripartite evolutionary game model to examine the behavior evolution of the government, online media, and netizens in the process of rumor propagation under uncertain conditions. The innovation of the model lies in considering the probability of successful rumor detection under government regulation, the uncertainty of rumor dissemination by online media and netizens, and introducing a dynamic government penalty mechanism. Through simulation and analysis, we identify the evolutionarily stable strategies of each participant under different scenarios and provide specific governance strategies for each party involved. The results reveal that appropriate government penalties, proactive regulation by online media, and rational choices by netizens can effectively curb rumor spreading. In uncertain environments, adopting flexible policies and dynamic adjustment mechanisms is crucial for effective rumor governance. 
This study not only enriches the application of evolutionary game theory but also offers practical strategic recommendations for policymakers to address the challenges of rumor propagation.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 3","pages":"1073 - 1111"},"PeriodicalIF":0.0,"publicationDate":"2025-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Individual Selection Algorithm Based on Layer Proximity and Branch Distance Functions","authors":"An Yingjian, La Ping","doi":"10.1007/s40745-025-00600-4","DOIUrl":"10.1007/s40745-025-00600-4","url":null,"abstract":"<div><p>Automatic generation of test cases using heuristic methods is a hot research topic nowadays. Although its advantages are obvious, it is slightly insufficient in the selection of optimal individuals. Aiming at the existing problems in the evaluation and selection of the optimal individual, this paper proposes a test case evaluation algorithm based on the comprehensive analysis of the characteristics of layer proximity and branch distance function, which is a joint structure of “layer proximity and branch distance function”. The basic idea of this algorithm is that when selecting pilot individuals in the evolutionary process, we first select the individuals with high proximity between the actual execution path and the target path, and then select the individuals with the smallest branching distances among these individuals, so as to obtain the individuals with the optimal piloting ability. Experiments show that the proposed algorithm can quickly find the optimal test cases, especially for the test case generation of multi-layer nested programs.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 3","pages":"1041 - 1054"},"PeriodicalIF":0.0,"publicationDate":"2025-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PNAP-YOLO: An Improved Prompts-Based Naturalistic Adversarial Patch Model for Object Detectors","authors":"Jun Li, Chenwu Shan, Liyan Shen, Yawei Ren, Jiajie Zhang","doi":"10.1007/s40745-025-00604-0","DOIUrl":"10.1007/s40745-025-00604-0","url":null,"abstract":"<div><p>Detectors have been extensively utilized in various scenarios such as autonomous driving and video surveillance. Nonetheless, recent studies have revealed that these detectors are vulnerable to adversarial attacks, particularly adversarial patch attacks. Adversarial patches are specifically crafted to disrupt deep learning models by disturbing image regions, thereby misleading the deep learning models when added to normal images. Traditional adversarial patches often lack semantics, posing challenges in maintaining concealment in physical world scenarios. To tackle this issue, this paper proposes a Prompt-based Natural Adversarial Patch generation method, which creates patches controllable by textual descriptions to ensure flexibility in application. This approach leverages the latest text-to-image generation model—Latent Diffusion Model (LDM) to produce adversarial patches. We optimize the attack performance of the patches by updating the latent variables of LDM through a combined loss function. 
Experimental results indicate that our method can generate more natural, semantically rich adversarial patches, achieving effective attacks on various detectors.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 3","pages":"1055 - 1072"},"PeriodicalIF":0.0,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145161140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sustainable Development of Green Tourism Supply Chain Considering Blockchain Traceability and Government Subsidies","authors":"Jixia Zheng, Rui Chen, Qinggen Zeng, Yanan Chen, Qianlin Ye","doi":"10.1007/s40745-025-00588-x","DOIUrl":"10.1007/s40745-025-00588-x","url":null,"abstract":"<div><p>This study examines the issue of green information distortion and its impact on tourists’ purchasing decisions, as well as the associated high transaction costs within the green tourism supply chain. By selecting a green tourism supply chain with varying government subsidy schemes as the focus of the research, the objective is to explore optimal subsidy strategies and assess the implications of blockchain integration. A three-level Stackelberg game model is established, featuring the government as the leader and a scenic spot (SS) and travel agency (TA) as participants. Key findings include: (1) Production subsidies are more effective in boosting market demand than environmental investment subsidies, particularly when tourist green trust and preferences are high. (2) Blockchain enhances greenness, market demand, and social welfare, positively influencing the green tourism supply chain (GTSC). (3) Tourist green preference and trust significantly affect GTSC optimization, especially as preferences increase. 
Additionally, a cost-sharing smart contract mechanism is designed to mitigate environmental investment's negative impact and optimize social welfare and product greenness.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 4","pages":"1315 - 1342"},"PeriodicalIF":0.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145167864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying the Intents Behind Website Visits by Employing Unsupervised Machine Learning Models","authors":"Judah Soobramoney, Retius Chifurira, Temesgen Zewotir, Knowledge Chinhamu","doi":"10.1007/s40745-024-00586-5","DOIUrl":"10.1007/s40745-024-00586-5","url":null,"abstract":"<div><p>With digitisation globally on the rise, corporates are compelled to better understand the usage of their websites. In doing so, corporates will be empowered to better understand consumers, and make necessary adjustments to ultimately improve the corporate’s stance in the competitive global landscape of this modern age. However, the online website visit data has proven to be highly complex, big in data volume, and highly transactional with users expressing unique behaviours. Thus, extracting insight can be a complex problem to solve. This study aimed to employ unsupervised machine learning models to identify the intentions behind the visits on the observed website. The data studied was sourced from the Google Analytics tracking tool that was deployed on a corporate informative website. The study employed a k-means, hierarchical and dbscan unsupervised machine learning models to understand the intents behind visitors on the studied website. All three models detected five major intents that were expressed within the observed data. The intents identified were labelled as “accidentals”, “drop-offs”, “engrossed”, “get-in-touch” and “seekers”. On the observed data, all three unsupervised machine learning methods have performed well. 
However, in the context of the study, which investigated the intents that drove online visits, the hierarchical clustering method yielded superior results by maintaining the best balance between cluster homogeneity (stronger silhouette coefficients) and cluster size.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 1","pages":"413 - 437"},"PeriodicalIF":0.0,"publicationDate":"2025-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s40745-024-00586-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143521572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized Alpha Power Inverted Weibull Distribution: Application of Air Pollution in Kathmandu, Nepal","authors":"Govinda Prasad Dhungana, Arun Kumar Chaudhary, Ramesh Prasad Tharu, Vijay Kumar","doi":"10.1007/s40745-024-00581-w","DOIUrl":"10.1007/s40745-024-00581-w","url":null,"abstract":"<div><p>A novel probability distribution, the Generalized Alpha Power Inverted Weibull (GAPIW) distribution, is derived from the generalization of the <span>α</span><i>-</i>power family and compounded with the inverted Weibull distribution. The researchers looked into a lot of different sub-models and found important properties of the GAPIW distribution, such as the quantile function, median, mode, moments, mean residual lifetime, and stress-strength reliability. The estimation of distribution parameters was carried out through maximum likelihood estimation methods.</p><p>To gain insights into the characteristics of the GAPIW distribution, the study applied it to the analysis of air pollution data, specifically PM2.5, PM10, and TSP data from multiple stations in the Kathmandu Valley. Notably, the findings indicate that air quality in these areas was significantly worse during winter than in other seasons. Also, the ratio (PM2.5/PM10) of particulate matter is higher, indicating air pollution from anthropogenesis particles in the Valley<i>.</i></p><p>The results demonstrate that the GAPIW distribution is validated through different diagrammatic representations, such as P-P plots, Q-Q plots, and mathematical calculations like the K-S test. The findings reveal that, on average, only three days per month or one month per year predict air pollution levels below the threshold in the Kathmandu Valley. Furthermore, compared to other <span>α</span><i>-</i>power families of distributions available in the literature, the proposed GAPIW distribution stands as a viable alternative model for assessing and understanding air pollution data and related environmental data. 
This research has the potential to make valuable contributions to the field of environmental science and air quality monitoring.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"12 5","pages":"1691 - 1715"},"PeriodicalIF":0.0,"publicationDate":"2024-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144905057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}