Frontiers in Big Data: Latest Articles

ULBERT: a domain-adapted BERT model for bilingual information retrieval from Pakistan's constitution.

IF 2.4

Frontiers in Big Data Pub Date : 2025-09-22 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1448785

Qaiser Abbas, Waqas Nawaz, Sadia Niazi, Muhammad Awais

Abstract:

Introduction: Navigating legal texts like a national constitution is notoriously difficult due to specialized jargon and complex internal references. For the Constitution of Pakistan, no automated, user-friendly search tool existed to address this challenge. This paper introduces ULBERT, a novel AI-powered information retrieval framework designed to make the constitution accessible to all users, from legal experts to ordinary citizens, in both English and Urdu.

Methods: The system is built around a custom AI model that moves beyond keyword matching to understand the semantic meaning of a user's query. It processes questions in English or Urdu and compares them to the constitutional text, identifying the most relevant passages based on contextual and semantic similarity.

Results: In performance testing, the ULBERT framework proved highly effective. It successfully retrieved the correct constitutional information with an accuracy of 86% for English queries and 73% for Urdu queries.

Discussion: These results demonstrate a significant breakthrough in enhancing the accessibility of foundational legal documents through artificial intelligence. The framework provides an effective and intuitive tool for legal inquiry, empowering a broader audience to understand the Constitution of Pakistan.

Volume 8, article 1448785. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12497596/pdf/

Citations: 0

Enhancing intelligence source performance management through two-stage stochastic programming and machine learning techniques.

IF 2.4

Frontiers in Big Data Pub Date : 2025-09-22 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1640539

Lucas Wafula Wekesa, Stephen Korir

Abstract:

Introduction: The effectiveness of intelligence operations depends heavily on the reliability and performance of human intelligence (HUMINT) sources. Yet, source behavior is often unpredictable, deceptive or shaped by operational context, complicating resource allocation and tasking decisions.

Methods: This study developed a hybrid framework combining Machine Learning (ML) techniques and Two-Stage Stochastic Programming (TSSP) for HUMINT source performance management under uncertainty. A synthetic dataset reflecting HUMINT operational patterns was generated and used to train classification and regression models. Extreme Gradient Boosting (XGBoost) and Support Vector Machines (SVM) were applied for behavioral classification and prediction of reliability and deception scores. The predictive outputs were then transformed into scenario probabilities and integrated into the TSSP model to optimize task allocation under varying behavioral uncertainties.

Results: The classifiers achieved 98% overall accuracy, with XGBoost exhibiting higher precision and SVM demonstrating superior recall for rare but operationally significant categories. The regression models achieved R-squared scores of 93% for reliability and 81% for deception. These predictive outputs were transformed into scenario probabilities for integration into the TSSP model, optimizing task allocation under varying behavioral risks. When compared to a deterministic optimization baseline, the hybrid framework delivered a 16.8% reduction in expected tasking costs and a 19.3% improvement in mission success rates.

Discussion and conclusion: The findings demonstrated that scenario-based probabilistic planning offers significant advantages over static heuristics in managing uncertainty in HUMINT operations. While the simulation results are promising, validation through field data is required before operational deployment.

Volume 8, article 1640539. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12498342/pdf/

Citations: 0
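The abstract above describes turning predicted source behavior into scenario probabilities and feeding them into a two-stage stochastic program. As a rough illustration of that idea only, the sketch below picks one source to task by minimizing first-stage tasking cost plus probability-weighted second-stage (recourse) cost, and contrasts it with a baseline that plans for the single most likely scenario. The sources, costs, penalty, and probabilities are all hypothetical toy values, and the enumeration stands in for a real solver; none of it is taken from the paper.

```python
# Hypothetical toy data: none of these numbers come from the paper.
TASKING_COST = {"A": 10.0, "B": 6.0}  # first-stage cost of tasking each source
RETASK_PENALTY = 20.0                 # second-stage cost if the source proves unreliable

# Each scenario is (probability, {source: behaves reliably?}).
# In the paper's setting these probabilities would come from the ML models.
SCENARIOS = [
    (0.6, {"A": True, "B": True}),
    (0.3, {"A": True, "B": False}),
    (0.1, {"A": False, "B": False}),
]


def recourse_cost(source, outcome):
    """Second-stage cost paid once the source's actual behavior is revealed."""
    return 0.0 if outcome[source] else RETASK_PENALTY


def expected_total_cost(source, scenarios=SCENARIOS):
    """First-stage cost plus the expectation of the recourse cost."""
    stage_two = sum(p * recourse_cost(source, outcome) for p, outcome in scenarios)
    return TASKING_COST[source] + stage_two


def best_source(scenarios=SCENARIOS):
    """Stochastic choice: minimize expected total cost over all scenarios."""
    return min(TASKING_COST, key=lambda s: expected_total_cost(s, scenarios))


def deterministic_choice(scenarios=SCENARIOS):
    """Baseline: plan only for the single most likely scenario."""
    _, outcome = max(scenarios, key=lambda sc: sc[0])
    return min(TASKING_COST, key=lambda s: TASKING_COST[s] + recourse_cost(s, outcome))
```

With these toy numbers the deterministic baseline picks the cheaper source B, whose expected total cost is 14, while the scenario-based choice picks source A at an expected total cost of 12. A realistic model would have many tasks, sources, and constraints, and would be passed to an optimization solver instead of being enumerated.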

Multistakeholder fairness in tourism: what can algorithms learn from tourism management?

IF 2.4

Frontiers in Big Data Pub Date : 2025-09-18 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1632766

Peter Müllner, Anna Schreuer, Simone Kopeinik, Bernhard Wieser, Dominik Kowald

Abstract:

Algorithmic decision-support systems, i.e., recommender systems, are popular digital tools that help tourists decide which places and attractions to explore. However, algorithms often unintentionally direct tourist streams in a way that negatively affects the environment, local communities, or other stakeholders. This issue can be partly attributed to the computer science community's limited understanding of the complex relationships and trade-offs among stakeholders in the real world. In this work, we draw on the practical findings and methods from tourism management to inform research on multistakeholder fairness in algorithmic decision-support. Leveraging a semi-systematic literature review, we synthesize literature from tourism management as well as literature from computer science. Our findings suggest that tourism management actively tries to identify the specific needs of stakeholders and utilizes qualitative, inclusive and participatory methods to study fairness from a normative and holistic research perspective. In contrast, computer science lacks sufficient understanding of the stakeholder needs and primarily considers fairness through descriptive factors, such as measurable discrimination, while heavily relying on a few mathematically formalized fairness criteria that fail to capture the multidimensional nature of fairness in tourism. With the results of this work, we aim to illustrate the shortcomings of purely algorithmic research and stress the potential and particular need for future interdisciplinary collaboration. We believe such a collaboration is a fundamental and necessary step to enhance algorithmic decision-support systems toward understanding and supporting true multistakeholder fairness in tourism.

Volume 8, article 1632766. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12488424/pdf/

Citations: 0

FAST-framework for AI-based surgical transformation.

IF 2.4

Frontiers in Big Data Pub Date : 2025-09-12 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1655260

Harmehr Sekhon, Farid Al Zoubi, Paul E Beaulé, Pascal Fallavollita

Abstract:

Background: The use of machine learning (ML) in surgery to date has largely focused on prediction of surgical variables, which has not been found to significantly improve operating room (OR) efficiencies and surgical success rates (SSR). Due to long surgery wait times, limited health care resources and an increased population need, innovative ML models are needed. Thus, the Framework for AI-based Surgical Transformation (FAST) was created to make real-time recommendations to improve OR efficiency.

Methods: The FAST model was developed and evaluated using a dataset of n=4796 orthopedic cases that utilizes surgery- and team-specific variables (e.g., specific team composition, OR turnover time, procedure duration), along with regular positive deviance (PD) seminars with the stakeholders for adherence and uptake. FAST was created using six ML algorithms, including decision trees and neural networks. FAST was implemented in orthopedic surgeries at a hospital in Canada's capital (Ottawa).

Results: FAST was found to be feasible and implementable in the hospital orthopedic OR, with good team engagement due to the PD seminars. FAST led to an SSR of 93% over 23 weeks (57 arthroplasty surgery days) compared to 39% at baseline. Key variables impacting SSR included starting the first surgery on time, turnover time, and team composition.

Conclusions: FAST is a novel ML framework that can provide real-time feedback for improving OR efficiency and SSR. Stakeholder integration is key to its success in uptake and adherence. This unique framework can be implemented in different hospitals and for diverse surgeries, offering a novel and innovative application of ML for improving OR efficiency without additional resources.

Volume 8, article 1655260. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12463642/pdf/

Citations: 0

Secure aggregation of sufficiently many private inputs.

IF 2.4

Frontiers in Big Data Pub Date : 2025-09-10 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1638307

Thijs Veugen, Gabriele Spini, Frank Muller

Citations: 0

Toward more realistic career path prediction: evaluation and methods.

IF 2.4

Frontiers in Big Data Pub Date : 2025-08-25 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1564521

Elena Senger, Yuri Campbell, Rob van der Goot, Barbara Plank

Abstract:

Predicting career trajectories is a complex yet impactful task, offering significant benefits for personalized career counseling, recruitment optimization, and workforce planning. However, effective career path prediction (CPP) modeling faces challenges including highly variable career trajectories, free-text resume data, and limited publicly available benchmark datasets. In this study, we present a comprehensive comparative evaluation of CPP models, namely linear projection, multilayer perceptron (MLP), LSTM, and large language models (LLMs), across multiple input settings and two recently introduced public datasets. Our contributions are threefold: (1) we propose novel model variants, including an MLP extension and a standardized LLM approach, (2) we systematically evaluate model performance across input types (titles only vs. title+description, standardized vs. free-text), and (3) we investigate the role of synthetic data and fine-tuning strategies in addressing data scarcity and improving model generalization. Additionally, we provide a detailed qualitative analysis of prediction behaviors across industries, career lengths, and transitions. Our findings establish new baselines, reveal the trade-offs of different modeling strategies, and offer practical insights for deploying CPP systems in real-world settings.

Volume 8, article 1564521. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12415007/pdf/

Citations: 0

Automated road surface classification in OpenStreetMap using MaskCNN and aerial imagery.

IF 2.4

Frontiers in Big Data Pub Date : 2025-08-13 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1657320

R Parvathi, V Pattabiraman, Nancy Saxena, Aakarsh Mishra, Utkarsh Mishra, Ansh Pandey

Abstract:

Introduction: OpenStreetMap (OSM) road surface data is critical for navigation, infrastructure monitoring, and urban planning but is often incomplete or inconsistent. This study addresses the need for automated validation and classification of road surfaces by leveraging high-resolution aerial imagery and deep learning techniques.

Methods: We propose a MaskCNN-based deep learning model enhanced with attention mechanisms and a hierarchical loss function to classify road surfaces into four types: asphalt, concrete, gravel, and dirt. The model uses NAIP (National Agriculture Imagery Program) aerial imagery aligned with OSM labels. Preprocessing includes georeferencing, data augmentation, label cleaning, and class balancing. The architecture comprises a ResNet-50 encoder with squeeze-and-excitation blocks and a U-Net-style decoder with spatial attention. Evaluation metrics include accuracy, mIoU, precision, recall, and F1-score.

Results: The proposed model achieved an overall accuracy of 92.3% and a mean Intersection over Union (mIoU) of 83.7%, outperforming baseline models such as SVM (81.2% accuracy), Random Forest (83.7%), and standard U-Net (89.6%). Class-wise performance showed high precision and recall even for challenging surface types like gravel and dirt. Comparative evaluations against state-of-the-art models (COANet, SA-UNet, MMFFNet) also confirmed superior performance.

Discussion: The results demonstrate that combining NAIP imagery with attention-guided CNN architectures and hierarchical loss functions significantly improves road surface classification. The model is robust across varied terrains and visual conditions and shows potential for real-world applications such as OSM data enhancement, infrastructure analysis, and autonomous navigation. Limitations include label noise in OSM and class imbalance, which can be addressed through future work involving semi-supervised learning and multimodal data integration.

Volume 8, article 1657320. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12382388/pdf/

Citations: 0
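The road surface abstract above reports mean Intersection over Union (mIoU) alongside accuracy. For readers unfamiliar with the metric, here is a minimal sketch of how mIoU is commonly computed from per-pixel labels. The class indices (0 = asphalt, 1 = concrete, 2 = gravel, 3 = dirt) and the tiny label lists are illustrative assumptions, not data from the paper, and skipping classes absent from both inputs is one common convention among several.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU over classes that occur in y_true or y_pred."""
    ious = []
    for c in range(num_classes):
        # Pixels where both the label and the prediction are class c.
        intersection = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where either the label or the prediction is class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union:  # skip classes absent from both inputs
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps for six pixels.
labels = [0, 0, 1, 1, 2, 3]
predictions = [0, 1, 1, 1, 2, 2]
score = mean_iou(labels, predictions, num_classes=4)
```

Here the per-class IoUs are 1/2, 2/3, 1/2, and 0, so the mean is about 0.417 even though four of six pixels are correct. This is why mIoU is usually reported next to overall accuracy: it penalizes classes that are rarely predicted correctly.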

Editorial: Interdisciplinary approaches to complex systems: highlights from FRCCS 2023/24.

IF 2.4

Frontiers in Big Data Pub Date : 2025-08-12 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1666305

Roberto Interdonato, Hocine Cherifi

Citations: 0

Artificial intelligence for surgical outcome prediction in glaucoma: a systematic review.

IF 2.4

Frontiers in Big Data Pub Date : 2025-08-08 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1605018

Zeena Kailani, Lauren Kim, Joshua Bierbrier, Michael Balas, David J Mathew

Abstract:

Introduction: Glaucoma is a leading cause of irreversible blindness, and its rising global prevalence has led to a significant increase in glaucoma surgeries. However, predicting postoperative outcomes remains challenging due to the complex interplay of patient factors, surgical techniques, and postoperative care. Artificial intelligence (AI) has emerged as a promising tool for enhancing predictive accuracy in clinical decision-making.

Methods: This systematic review was conducted to evaluate the current evidence on the use of AI to predict surgical outcomes in glaucoma patients. A comprehensive search of Medline, Embase, Web of Science, and Scopus was performed. Studies were included if they applied AI models to glaucoma surgery outcome prediction.

Results: Six studies met inclusion criteria, collectively analyzing 4,630 surgeries. A variety of algorithms were applied, including random forests, support vector machines, and neural networks. Overall, AI models consistently outperformed traditional statistical approaches, with the best-performing model achieving an accuracy of 87.5%. Key predictors of outcomes included demographic factors (e.g., age), systemic health indicators (e.g., smoking status and body mass index), and ophthalmic parameters (e.g., baseline intraocular pressure, central corneal thickness, mitomycin C use).

Discussion: While AI models demonstrated superior performance to traditional statistical approaches, the lack of external validation and standardized surgical success definitions limit their clinical applicability. This review highlights both the promise and the current limitations of artificial intelligence in glaucoma surgery outcome prediction, emphasizing the need for prospective, multicenter studies, publicly available datasets, and standardized evaluation metrics to enhance the generalizability and clinical utility of future models.

Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/view/CRD42024621758, identifier: CRD42024621758.

Volume 8, article 1605018. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12370750/pdf/

Citations: 0

A fashion product recommendation based on adaptive VPKNN-NET algorithm without fuzzy similar image.

IF 2.4

Frontiers in Big Data Pub Date : 2025-08-07 eCollection Date: 2025-01-01 DOI: 10.3389/fdata.2025.1557779

R Sabitha, D Sundar

Abstract:

Introduction: Recommender systems are essential in e-commerce for assisting users in navigating large product catalogs, particularly in visually driven domains like fashion. Traditional keyword-based systems often struggle to capture subjective style preferences.

Methods: This study proposes a novel fashion recommendation framework using an Adaptive VPKNN-net algorithm. The model integrates deep visual feature extraction using a pre-trained VGG16 Convolutional Neural Network (CNN), dimensionality reduction through Principal Component Analysis (PCA), and a modified K-Nearest Neighbors (KNN) algorithm that combines Euclidean and cosine similarity metrics to enhance visual similarity assessment.

Results: Experiments were conducted using the "Fashion Product Images (Small)" dataset from Kaggle. The proposed system achieved high accuracy (98.69%) and demonstrated lower RMSE (0.8213) and MAE (0.6045) compared to baseline models such as Random Forest, SVM, and standard KNN.

Discussion: The proposed Adaptive VPKNN-net framework significantly improves the precision, interpretability, and efficiency of visual fashion recommendations. It eliminates the limitations of fuzzy similarity models and offers a scalable solution for visually oriented e-commerce platforms, particularly in cold-start scenarios and low-data conditions.

Volume 8, article 1557779. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12367692/pdf/

Citations: 0
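The VPKNN-net abstract above mentions a modified KNN that combines Euclidean and cosine similarity, but does not say how the two are combined. One plausible reading, shown purely as an assumption, is a weighted sum of Euclidean distance and cosine distance over the reduced feature vectors. The weight `alpha`, the function names, and the two-dimensional toy vectors (standing in for VGG16 features after PCA) are hypothetical and are not the authors' implementation.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def hybrid_distance(a, b, alpha=0.5):
    """Assumed combination: alpha * Euclidean + (1 - alpha) * cosine distance."""
    return alpha * math.dist(a, b) + (1.0 - alpha) * cosine_distance(a, b)


def nearest_items(query, catalog, k=2, alpha=0.5):
    """Return the names of the k catalog items closest to the query vector."""
    ranked = sorted(catalog, key=lambda name: hybrid_distance(query, catalog[name], alpha))
    return ranked[:k]


# Hypothetical 2-D feature vectors standing in for PCA-reduced image features.
catalog = {
    "shirt_a": [1.0, 0.0],
    "shirt_b": [0.9, 0.1],
    "shoe": [0.0, 1.0],
}
recommended = nearest_items([1.0, 0.05], catalog, k=2)
```

Because Euclidean distance is unbounded while cosine distance lies in [0, 2], a weighted sum like this is sensitive to feature scale. Normalizing the feature vectors first, or tuning `alpha` on validation data, would be reasonable refinements.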