DiabSense: early diagnosis of non-insulin-dependent diabetes mellitus using smartphone-based human activity recognition and diabetic retinopathy analysis with Graph Neural Network
Md Nuho Ul Alam, Ibrahim Hasnine, Erfanul Hoque Bahadur, Abdul Kadar Muhammad Masum, Mercedes Briones Urbano, Manuel Masias Vergara, Jia Uddin, Imran Ashraf, Md. Abdus Samad
Journal of Big Data, 2024-08-03. DOI: 10.1186/s40537-024-00959-w

Abstract: Non-Insulin-Dependent Diabetes Mellitus (NIDDM) is a chronic condition caused by high blood sugar levels that, if not treated early, can lead to serious complications such as blindness. Human Activity Recognition (HAR) offers potential for early NIDDM diagnosis and is emerging as a key application of the technology. This research introduces DiabSense, a state-of-the-art smartphone-based system for early staging of NIDDM. DiabSense combines HAR with Diabetic Retinopathy (DR) analysis, leveraging two different Graph Neural Networks (GNNs): HAR covers a comprehensive array of 23 human activities resembling diabetes symptoms, while DR is a prevalent complication of NIDDM. A Graph Attention Network (GAT) achieved 98.32% accuracy on HAR sensor data, while a Graph Convolutional Network (GCN) scored 84.48% on the APTOS 2019 dataset, surpassing other state-of-the-art models. The trained GCN analyzed retinal images of four experimental human subjects to generate DR reports, and the GAT produced their average daily activity durations over 30 days. Daily activities recorded during the non-diabetic periods of diabetic patients were compared with those of the experimental subjects to derive risk factors. Fusing these risk factors with DR conditions enabled early diagnosis recommendations for the experimental subjects despite the absence of any apparent symptoms. The system's outcomes were compared against clinical diagnosis reports based on the A1C test, which confirmed that the system accurately assessed the experimental subjects' need for early diagnosis. Overall, DiabSense exhibits significant potential for ensuring early NIDDM treatment, improving millions of lives worldwide.
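The core idea behind the GAT used here (weighting each neighbour's contribution by a learned attention score before aggregating) can be sketched in a few lines. This is a generic, single-head illustration over scalar node features, not DiabSense's actual architecture; the parameters `w`, `a_self`, and `a_neigh` are illustrative stand-ins for the learned weights:

```python
import math

def gat_layer(features, adj, w, a_self, a_neigh):
    """Simplified single-head graph attention aggregation over scalar node
    features. features: list of floats; adj: dict node -> neighbour ids.
    Real GAT uses vector features, a LeakyReLU on the attention logits,
    and multiple heads; this keeps only the attention-softmax skeleton."""
    h = [w * x for x in features]                 # shared linear transform
    out = []
    for i, hi in enumerate(h):
        neigh = adj[i] + [i]                      # self-loop, as in GAT
        logits = [math.tanh(a_self * hi + a_neigh * h[j]) for j in neigh]
        exps = [math.exp(s) for s in logits]
        total = sum(exps)
        alpha = [e / total for e in exps]         # softmax attention weights
        out.append(sum(a * h[j] for a, j in zip(alpha, neigh)))
    return out
```

With the attention parameters set to zero the weights become uniform and the layer degenerates to mean aggregation over each node's neighbourhood, which is a handy sanity check.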
TC-Llama 2: fine-tuning LLM for technology and commercialization applications
Jeyoon Yeom, Hakyung Lee, Hoyoon Byun, Yewon Kim, Jeongeun Byun, Yunjeong Choi, Sungjin Kim, Kyungwoo Song
Journal of Big Data, 2024-08-02. DOI: 10.1186/s40537-024-00963-0

Abstract: This paper introduces TC-Llama 2, a novel application of large language models (LLMs) to the technology-commercialization field. Traditional methods in this field, reliant on statistical learning and expert knowledge, often struggle with the complex and diverse nature of technology-commercialization data. TC-Llama 2 addresses these limitations by exploiting the advanced generalization capabilities of LLMs, specifically adapting them to this intricate domain. The model, based on the open-source LLM framework Llama 2, is customized through instruction tuning on bilingual Korean-English datasets. The approach transforms technology-commercialization data into formats compatible with LLMs, enabling the model to learn detailed technological knowledge and product hierarchies effectively. A dedicated evaluation strategy, built on new matching and generation tasks, verifies how well TC-Llama 2 captures the technology-commercialization relationship. The results, derived from refining task-specific instructions for inference, provide valuable insights into customizing language models for specific sectors, potentially opening new applications in technology categorization, utilization, and predictive product development.
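"Transforming data into formats compatible with LLMs" for instruction tuning typically means rendering each record into a prompt/completion pair. A minimal sketch, using an Alpaca-style template; the field names and instruction text are illustrative assumptions, not TC-Llama 2's actual schema:

```python
def to_instruction_example(tech_name, description, product):
    """Render one technology-commercialization record as an
    instruction-tuning sample (Alpaca-style prompt template)."""
    instruction = "Suggest a commercial product category for the following technology."
    prompt = (
        "### Instruction:\n" + instruction + "\n\n"
        "### Input:\n" + tech_name + ": " + description + "\n\n"
        "### Response:\n"
    )
    # The trainer maximizes the likelihood of `completion` given `prompt`.
    return {"prompt": prompt, "completion": product}
```

A corpus of such pairs is what instruction tuning consumes; the bilingual aspect would simply mean emitting the template in both Korean and English.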
An ensemble machine learning model for predicting one-year mortality in elderly coronary heart disease patients with anemia
Longcan Cheng, Yan Nie, Hongxia Wen, Yan Li, Yali Zhao, Qian Zhang, Mingxing Lei, Shihui Fu
Journal of Big Data, 2024-07-24. DOI: 10.1186/s40537-024-00966-x

Objective: This study was designed to develop and validate a robust predictive model for one-year mortality in elderly coronary heart disease (CHD) patients with anemia using machine learning methods.

Methods: Demographics, laboratory tests, comorbidities, and drugs were collected for a cohort of 974 elderly patients with CHD. A prospective analysis evaluated the predictive performance of the developed models. External validation was performed on a series of 112 elderly CHD patients with anemia.

Results: Overall one-year mortality was 43.6%. Risk factors included heart rate, chronic heart failure, tachycardia, and β-receptor blockers. Protective factors included hemoglobin, albumin, high-density lipoprotein cholesterol, estimated glomerular filtration rate (eGFR), left ventricular ejection fraction (LVEF), aspirin, clopidogrel, calcium channel blockers, angiotensin-converting enzyme inhibitors (ACEIs)/angiotensin receptor blockers (ARBs), and statins. Compared with other algorithms, an ensemble machine learning model performed best, with an area under the curve (AUC) of 0.828 (95% CI 0.805–0.870) and a Brier score of 0.170. Calibration and density curves further confirmed its favorable predicted probabilities and discriminative ability. External validation of the ensemble model also showed good performance, with an AUC of 0.825 (95% CI 0.734–0.916) and a Brier score of 0.185. Patients in the high-risk group had a more than six-fold probability of one-year mortality compared with those in the low-risk group (P < 0.001). SHapley Additive exPlanations (SHAP) identified the top five factors associated with one-year mortality as hemoglobin, albumin, eGFR, LVEF, and ACEIs/ARBs.

Conclusions: The model identifies key risk and protective factors, providing valuable insights for improving risk assessment, informing clinical decision-making, and performing targeted interventions. It outperforms other algorithms in predictive performance and offers significant opportunities for personalized risk mitigation strategies, with clinical implications for improving patient care.
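The two ingredients reported above, ensemble prediction and the Brier score, are both simple to state precisely. A minimal sketch of soft-voting (averaging per-model probabilities) and of the Brier score as used here; this is a generic illustration, not the paper's specific ensemble:

```python
def soft_vote(prob_lists):
    """Soft-voting ensemble: average each patient's predicted probability
    across models. prob_lists is one probability list per model."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]

def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and the 0/1
    outcome; 0 is perfect, 0.25 is the score of always predicting 0.5."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)
```

A Brier score of 0.170, as reported for the ensemble, is therefore well below the 0.25 of an uninformative constant-0.5 predictor.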
Hate speech detection in the Bengali language: a comprehensive survey
Abdullah Al Maruf, Ahmad Jainul Abidin, Md. Mahmudul Haque, Zakaria Masud Jiyad, Aditi Golder, Raaid Alubady, Zeyar Aung
Journal of Big Data, 2024-07-23. DOI: 10.1186/s40537-024-00956-z

Abstract: Detecting hate speech (HS) on online platforms has become extremely important for maintaining a safe and inclusive environment. While significant progress has been made in English-language HS detection, methods for other languages, such as Bengali, remain comparatively underexplored. In this survey, we outline the key challenges specific to HS detection in Bengali, including the scarcity of labeled datasets, linguistic nuances, and contextual variations. We also examine the approaches and methodologies researchers have employed to address these challenges, from classical machine learning techniques and ensemble approaches to recent deep learning advancements. Furthermore, we review the metrics used for evaluation, including accuracy, precision, recall, the receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), sensitivity, specificity, and F1 score, providing insights into the effectiveness of the proposed models. Finally, we identify the limitations of and future directions for research in Bengali HS detection, highlighting the need for larger annotated datasets, cross-lingual transfer learning techniques, and the incorporation of contextual information to improve detection accuracy. This survey provides a comprehensive overview of state-of-the-art HS detection methods for Bengali text and serves as a valuable resource for researchers and practitioners interested in the advancements, challenges, and opportunities in addressing HS in Bengali, ultimately assisting the creation of reliable and effective detection systems for online platforms.
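The evaluation metrics the survey compares all derive from the four cells of a binary confusion matrix; a compact reference implementation makes their relationships explicit:

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts:
    tp/fp/fn/tn = true/false positives and negatives."""
    precision = tp / (tp + fp)                 # of predicted HS, how much was HS
    recall = tp / (tp + fn)                    # a.k.a. sensitivity
    specificity = tn / (tn + fp)               # true-negative rate
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "accuracy": accuracy}
```

ROC/AUC additionally requires ranking examples by score rather than a single threshold, which is why the survey lists it separately from these point metrics.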
Predictive modelling of MapReduce job performance in cloud environments using machine learning techniques
Mohammed Bergui, Soufiane Hourri, Said Najah, Nikola S. Nikolov
Journal of Big Data, 2024-07-23. DOI: 10.1186/s40537-024-00964-z

Abstract: Within the Hadoop ecosystem, MapReduce stands as a cornerstone for managing, processing, and mining large-scale datasets. Yet the absence of efficient solutions for precisely estimating job execution times poses a persistent challenge, impacting task allocation and distribution within Hadoop clusters. In this study, we present a comprehensive machine learning approach for predicting the execution time of MapReduce jobs, encompassing data collection, preprocessing, feature engineering, and model evaluation. Leveraging a rich dataset derived from comprehensive Hadoop MapReduce job traces, we explore the intricate relationship between cluster parameters and job performance. Through a comparative analysis of machine learning models (linear regression, decision tree, random forest, and gradient-boosted regression trees), we identify the random forest model as the most effective, demonstrating superior predictive accuracy and robustness. Our findings underscore the critical role of features such as data size and resource allocation in determining job performance. With this work, we aim to enhance resource management efficiency and enable more effective utilisation of cloud-based Hadoop clusters for large-scale data processing tasks.
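Execution-time predictors like these are regression models, so model comparison typically comes down to error metrics over held-out job traces. A minimal sketch of the two standard ones (the paper's exact evaluation protocol is not restated here):

```python
import math

def mae(actual, predicted):
    """Mean absolute error, e.g. in seconds of job runtime."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large runtime misses more than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

RMSE exceeding MAE on the same predictions indicates a few badly mispredicted jobs, which matters for scheduling since one long-running straggler can dominate cluster allocation.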
Introducing Mplots: scaling time series recurrence plots to massive datasets
Maryam Shahcheraghi, Ryan Mercer, João Manuel de Almeida Rodrigues, Audrey Der, Hugo Filipe Silveira Gamboa, Zachary Zimmerman, Kerry Mauck, Eamonn Keogh
Journal of Big Data, 2024-07-20. DOI: 10.1186/s40537-024-00954-1

Abstract: Time series similarity matrices (informally, recurrence plots or dot plots) are useful tools for time series data mining. They can guide data exploration, and various useful features can be derived from them and fed into downstream analytics. However, similarity matrices suffer from very poor scalability, taxing both time and memory requirements. In this work, we introduce novel ideas that scale the largest examinable time series similarity matrices by several orders of magnitude. The first is a novel algorithm that computes the matrices in a way that removes the dependency on subsequence length. This algorithm is so fast that it lets us address datasets where memory limitations begin to dominate. Our second contribution is a multiscale algorithm that computes an approximation of the matrix suited to the limitations of the user's memory or screen resolution, then performs a local, just-in-time recomputation of any region the user wishes to zoom in on. Given that this largely removes the time and space barriers, human visual attention becomes the bottleneck. We therefore further introduce algorithms that search massive matrices with quadrillions of cells and prioritize regions for later examination by humans or algorithms. We demonstrate the utility of these ideas for data exploration, segmentation, and classification in domains as diverse as astronomy, bioinformatics, entomology, and wildlife monitoring.
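To see why these matrices scale so poorly, it helps to write down the naive computation the paper is improving on: the z-normalized Euclidean distance between every pair of length-m subsequences, which costs O(n² · m) time and O(n²) memory. This brute-force sketch is the baseline, not the paper's algorithm:

```python
import math

def znorm(seq):
    """Z-normalize a subsequence (zero mean, unit variance)."""
    mu = sum(seq) / len(seq)
    sd = math.sqrt(sum((x - mu) ** 2 for x in seq) / len(seq)) or 1.0  # guard flat subsequences
    return [(x - mu) / sd for x in seq]

def similarity_matrix(ts, m):
    """Brute-force recurrence/similarity matrix over all length-m
    subsequences of ts. O(n^2 * m): exactly the cost Mplots avoids."""
    subs = [znorm(ts[i:i + m]) for i in range(len(ts) - m + 1)]
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t))) for t in subs]
            for s in subs]
```

At a million points this matrix already has a trillion cells, which is why the paper's subsequence-length-independent computation and multiscale, zoom-on-demand approximation are needed.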
Emotion AWARE: an artificial intelligence framework for adaptable, robust, explainable, and multi-granular emotion analysis
Gihan Gamage, Daswin De Silva, Nishan Mills, Damminda Alahakoon, Milos Manic
Journal of Big Data, 2024-07-10. DOI: 10.1186/s40537-024-00953-2

Abstract: Emotions are fundamental to human behaviour. How we feel, individually and collectively, determines how humanity evolves and advances into our shared future. The rapid digitalisation of our personal, social, and professional lives means we frequently use digital media to express, understand, and respond to emotions. Although recent developments in Artificial Intelligence (AI) can analyse sentiment and detect emotions, they are not effective at comprehending the complexity and ambiguity of digital emotion expressions in the knowledge-focused activities of customers, people, and organizations. In this paper, we address this challenge with a novel AI framework for the adaptable, robust, and explainable detection of multi-granular assembles of emotions. The framework consolidates lexicon generation and fine-tuned Large Language Model (LLM) approaches to formulate assembles of two, eight, and fourteen emotions. It is robust to ambiguous emotion expressions implied in conversation, adaptable to domain-specific emotion semantics, and its assembles are explainable through constituent terms and their intensity. We conducted nine empirical studies on datasets representing diverse human emotion behaviours. The results comprehensively demonstrate and evaluate the core capabilities of the framework, which consistently outperforms state-of-the-art approaches in adaptable, robust, and explainable multi-granular emotion detection.
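The lexicon half of such a pipeline is easy to illustrate: score a text against a term-to-(emotion, intensity) lexicon and keep the matched terms as the explanation. This is a generic sketch of lexicon-based scoring, not the framework's actual lexicon-generation method:

```python
def emotion_scores(text, lexicon):
    """Score text against a term -> (emotion, intensity) lexicon.
    Returns per-emotion intensity totals plus the constituent terms that
    fired, which is what makes the result explainable."""
    scores, evidence = {}, {}
    for token in text.lower().split():
        if token in lexicon:
            emotion, intensity = lexicon[token]
            scores[emotion] = scores.get(emotion, 0.0) + intensity
            evidence.setdefault(emotion, []).append(token)
    return scores, evidence
```

The LLM component would then handle what a lexicon cannot: emotions that are implied by context rather than signalled by any single term.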
Examining ALS: reformed PCA and random forest for effective detection of ALS
Abdullah Alqahtani, Shtwai Alsubai, Mohemmed Sha, Ashit Kumar Dutta
Journal of Big Data, 2024-07-10. DOI: 10.1186/s40537-024-00951-4

Abstract: Amyotrophic Lateral Sclerosis (ALS) is a fatal neurodegenerative disease of the human motor system, one of a group of progressive diseases affecting the nerve cells in the brain and spinal cord that control muscle movement. Detecting and classifying ALS at the right time is therefore vital and can save lives. Various studies have applied AI techniques to ALS detection, but these methods have proven ineffectual at identifying the disease because of the algorithms employed. The proposed model therefore uses Modified Principal Component Analysis (MPCA) and Modified Random Forest (MRF): MPCA reduces the dimensionality of the candidate features for classifying the presence or absence of ALS-causing mutations in the corresponding gene. The MPCA is adapted to capture low-importance (LI) data transformations through three steps: computing the correlation of the covariance matrix, performing eigenvector-eigenvalue decomposition, and selecting the desired principal components. Choosing these components without loss of features ensures better attribute selection for classifying ALS-causing genes. Classification then uses the Modified Random Forest, which updates the clump detector technique: the clump detector applies K-means clustering, and the dimension-reduced data are grouped accordingly. Each cluster is analyzed as either carrying or not carrying an ALS-causing mutation. Finally, the model's performance is assessed using evaluation metrics such as accuracy, recall, F1 score, and precision, and the proposed model is compared with existing models to assess its efficacy.
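The three PCA steps named above (covariance matrix, eigendecomposition, component selection) can be shown end to end for two-dimensional data, where the eigenvalues come straight from the quadratic formula. This illustrates plain PCA, not the paper's modified variant:

```python
import math

def pca_2d(points):
    """PCA for 2-D points in three steps: build the sample covariance
    matrix, eigendecompose it, and rank components by explained variance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Step 1: sample covariance matrix [[cxx, cxy], [cxy, cyy]].
    cxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    cyy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Step 2: eigenvalues of a symmetric 2x2 matrix via the quadratic formula.
    tr, det = cxx + cyy, cxx * cyy - cxy * cxy
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    l1, l2 = tr / 2 + root, tr / 2 - root
    # Step 3: rank components; keep the share of variance the first explains.
    explained = l1 / (l1 + l2)
    return l1, l2, explained
```

For perfectly correlated data the first component explains all the variance, so the second can be dropped with no loss of information, which is the intuition behind selecting components "without any loss of features".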
Exploring AI-driven approaches for unstructured document analysis and future horizons
Supriya V. Mahadevkar, Shruti Patil, Ketan Kotecha, Lim Way Soong, Tanupriya Choudhury
Journal of Big Data, 2024-07-05. DOI: 10.1186/s40537-024-00948-z

Abstract: In the current industrial landscape, many sectors grapple with the challenges posed by unstructured data, which incurs financial losses amounting to millions annually. Harnessed effectively, this data could substantially boost operational efficiency. Traditional methods for extracting information have their limitations, and solutions powered by artificial intelligence (AI) could provide a more fitting alternative. There is an evident gap in scholarly research concerning a comprehensive evaluation of AI-driven techniques for extracting information from unstructured content. This systematic literature review aims to identify, assess, and deliberate on prospective research directions in unstructured document information extraction. Prevailing extraction methods primarily depend on static patterns or rules and often prove inadequate for the complex document structures typically encountered in real-world scenarios, such as medical records. Publicly available datasets suffer from low quality and are tailored to specific tasks only, underscoring an urgent need for new datasets that accurately reflect the complex issues encountered in practice. The review reveals that AI-based techniques show promise in autonomously extracting information from diverse unstructured documents, encompassing both printed and handwritten text, though varied document layouts remain a challenge. Proposing a framework built on hybrid AI-based approaches, this review envisions processing a high-quality dataset for automatic information extraction from unstructured documents. It also emphasizes the importance of collaboration between organizations and researchers to address the diverse challenges of unstructured data analysis.
New custom rating for improving recommendation system performance
Tora Fahrudin, Dedy Rahman Wijaya
Journal of Big Data, 2024-07-02. DOI: 10.1186/s40537-024-00952-3

Abstract: Recommendation systems are currently attracting the interest of many researchers. Various new businesses have surfaced with the rise of online marketing (e-commerce) in response to the Covid-19 pandemic. This phenomenon allows items to be recommended through Collaborative Filtering (CF), aiming to improve users' shopping experience. Typically, the effectiveness of CF relies on the precise identification of similar-profile users by similarity algorithms, and traditional similarity measures are based on the user-item rating matrix. Here, four custom rating (CR) schemes were used along with a new rating formula, termed New Custom Rating (NCR), derived from the popularity of users and items in addition to the original rating. NCR optimizes recommendation system performance by using the popularity of users and items to determine new rating values rather than relying solely on the original ratings. These formulas improve the representativeness of the new rating values and the accuracy of the similarity calculations, and consequently the accuracy of the recommendation system. The implementation of NCR across the four CR algorithms was examined on five public datasets. The experimental results show that NCR significantly increased recommendation accuracy, as evidenced by reductions in RMSE, MSE, and MAE and by increases in FCP and hit rate. By incorporating the popularity of users and items into the rating calculations, NCR reduced RMSE, MSE, and MAE by up to 62.10%, 53.62%, and 65.97%, respectively, while increasing FCP and hit rate by up to 11.89% and 31.42%, respectively.
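The abstract does not state the NCR formula itself, but the idea of blending an original rating with user and item popularity can be sketched as follows. This is an illustrative popularity-adjusted rating, explicitly not the paper's actual NCR formula; the normalization and the `weight` parameter are assumptions:

```python
def popularity_adjusted_rating(rating, user_count, item_count, max_count, weight=0.5):
    """Hypothetical NCR-style adjustment: blend the original rating with a
    popularity-scaled copy of it. user_count / item_count are how many
    ratings the user has given / the item has received; max_count is the
    largest such count in the dataset, so popularity lies in [0, 1]."""
    popularity = (user_count + item_count) / (2 * max_count)
    return (1 - weight) * rating + weight * rating * popularity
```

Under this sketch, ratings from highly active users on heavily rated items keep their full value, while ratings in sparse corners of the matrix are attenuated, changing which neighbours the similarity algorithm selects.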