Frontiers in Artificial Intelligence: Latest Articles

Socially interactive agents for robotic neurorehabilitation training: conceptualization and proof-of-concept study.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-28 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1441955
Rhythm Arora, Pooja Prajod, Matteo Lavit Nicora, Daniele Panzeri, Giovanni Tauro, Rocco Vertechy, Matteo Malosio, Elisabeth André, Patrick Gebhard
Abstract:
Introduction: Individuals with diverse motor abilities often benefit from intensive and specialized rehabilitation therapies aimed at enhancing their functional recovery. Nevertheless, the challenge lies in the restricted availability of neurorehabilitation professionals, hindering the effective delivery of the necessary level of care. Robotic devices hold great potential in reducing the dependence on medical personnel during therapy but, at the same time, they generally lack the crucial human interaction and motivation that traditional in-person sessions provide.
Methods: To bridge this gap, we introduce an AI-based system aimed at delivering personalized, out-of-hospital assistance during neurorehabilitation training. This system includes a rehabilitation training device, affective signal classification models, training exercises, and a socially interactive agent as the user interface. With the assistance of a professional, the envisioned system is designed to be tailored to accommodate the unique rehabilitation requirements of an individual patient. Conceptually, after a preliminary setup and instruction phase, the patient is equipped to continue their rehabilitation regimen autonomously in the comfort of their home, facilitated by a socially interactive agent functioning as a virtual coaching assistant. Our approach involves the integration of an interactive socially-aware virtual agent into a neurorehabilitation robotic framework, with the primary objective of recreating the social aspects inherent to in-person rehabilitation sessions. We also conducted a feasibility study to test the framework with healthy patients.
Results and discussion: The results of our preliminary investigation indicate that participants demonstrated a propensity to adapt to the system. Notably, the presence of the interactive agent during the proposed exercises did not act as a source of distraction; instead, it positively impacted users' engagement.
Journal Article. Volume 7, article 1441955. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11634856/pdf/
Citations: 0
Implementation of deep reinforcement learning models for emotion detection and personalization of learning in hybrid educational environments.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-28 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1458230
Jaime Govea, Alexandra Maldonado Navarro, Santiago Sánchez-Viteri, William Villegas-Ch
Abstract:
The integration of artificial intelligence in education has shown great potential to improve students' learning experience through emotion detection and the personalization of learning. Many educational settings lack adequate mechanisms to dynamically adapt to students' emotions, which can negatively impact their academic performance and engagement. This study addresses this problem by implementing a deep reinforcement learning model to detect emotions in real time and personalize teaching strategies in a hybrid educational environment. Using data from 500 students, captured through cameras, microphones, and biometric sensors and pre-processed with advanced techniques such as histogram equalization and noise reduction, the deep reinforcement learning model was trained and validated to improve the detection accuracy of emotions and the personalization of learning. The results showed a significant improvement in the accuracy of emotion detection, going from 72.4% before the implementation of the system to 89.3% after. Real-time adaptability also increased from 68.5% to 87.6%, while learning personalization rose from 70.2% to 90.1%. K-fold cross-validation with k = 10 confirmed the robustness and generalization of the model, with consistently high scores in all evaluated metrics. This study demonstrates that integrating reinforcement learning models for emotion detection and learning personalization can transform education, providing a more adaptive and student-centered learning experience. These findings identify the potential of these technologies to improve academic performance and student engagement, offering a solid foundation for future research and implementation.
Journal Article. Volume 7, article 1458230. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11634863/pdf/
Citations: 0
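The abstract above reports K-fold cross-validation with k = 10 on data from 500 students. As a minimal sketch of how such a split is formed, each sample is held out for testing exactly once. The function name `k_fold_indices`, the shuffling seed, and the use of plain index lists are illustrative assumptions, not the authors' pipeline.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical fixed seed for repeatability
    # Spread any remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx


# 500 samples and k = 10 mirror the figures quoted in the abstract.
folds = list(k_fold_indices(500, k=10))
```

In practice a model would be trained on each `train_idx` and scored on the matching `test_idx`, and the ten scores would then be averaged.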
MLR-predictor: a versatile and efficient computational framework for multi-label requirements classification.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-27 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1481581
Summra Saleem, Muhammad Nabeel Asim, Ludger Van Elst, Markus Junker, Andreas Dengel
Abstract:
Introduction: Requirements classification is an essential task for developing successful software that incorporates all relevant aspects of users' needs. Additionally, it aids in the identification of project failure risks and helps achieve project milestones in a more comprehensive way. Several machine learning predictors have been developed for binary or multi-class requirements classification. However, only a few predictors are designed for multi-label classification, and they are not practically useful because of their low predictive performance.
Method: MLR-Predictor uses the innovative OkapiBM25 model to transform requirements text into statistical vectors by computing informative word patterns. Moreover, the predictor transforms multi-label requirements classification data into a multi-class classification problem and uses a logistic regression classifier to categorize requirements. The performance of the proposed predictor is evaluated and compared with 123 machine learning and 9 deep learning-based predictive pipelines across three public benchmark requirements classification datasets using eight different evaluation measures.
Results: The large-scale experimental results demonstrate that the proposed MLR-Predictor outperforms the 123 adopted machine learning and 9 deep learning predictive pipelines, as well as the state-of-the-art requirements classification predictor. Specifically, in comparison to the state-of-the-art predictor, it achieves a 13% improvement in macro F1-measure on the PROMISE dataset, a 1% improvement on the EHR-binary dataset, and a 2.5% improvement on the EHR-multiclass dataset.
Discussion: As a case study, the generalizability of the proposed predictor is evaluated on software customer review classification data. In this context, the proposed predictor outperformed the state-of-the-art BERT language model by 1.4% in F1 score. These findings underscore the robustness and effectiveness of the proposed MLR-Predictor in various contexts, establishing its utility as a promising solution for the requirements classification task.
Journal Article. Volume 7, article 1481581. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11632133/pdf/
Citations: 0
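The MLR-Predictor abstract above says requirements text is turned into statistical vectors with Okapi BM25. The sketch below shows the standard BM25 term weight with a non-negative IDF variant. The parameters `k1 = 1.5` and `b = 0.75`, the whitespace tokenization, and the three sample requirements are assumptions made for illustration; the abstract does not state the authors' settings.

```python
import math
from collections import Counter


def bm25_vectors(docs, k1=1.5, b=0.75):
    """Return one {term: BM25 weight} dict per tokenized document."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        # Length normalization shared by every term in this document.
        norm = k1 * (1 - b + b * len(d) / avg_len)
        vec = {}
        for term, f in tf.items():
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            vec[term] = idf * f * (k1 + 1) / (f + norm)
        vectors.append(vec)
    return vectors


# Hypothetical requirement statements, not taken from any benchmark dataset.
requirements = [
    "the system shall encrypt user data".split(),
    "the system shall respond within two seconds".split(),
    "user data shall be backed up daily".split(),
]
vectors = bm25_vectors(requirements)
```

A word such as "encrypt", which appears in one requirement only, receives a larger weight than "shall", which appears in all three. Vectors of this kind could then be passed to a classifier such as logistic regression.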
Predicting financial distress in TSX-listed firms using machine learning algorithms.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-27 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1466321
Mark Eshwar Lokanan, Sana Ramzan
Abstract:
Introduction: This study investigates the application of machine learning (ML) algorithms, a subset of artificial intelligence (AI), to predict financial distress in companies. Given the critical need for reliable financial health indicators, this research evaluates the predictive capabilities of various ML techniques on firm-level financial data.
Methods: The dataset comprises financial ratios and firm-specific variables from 464 firms listed on the TSX. Multiple ML models were tested, including decision trees, random forests, support vector machines (SVM), and artificial neural networks (ANN). Recursive feature elimination with cross-validation (RFECV) and bootstrapped CART were also employed to enhance model stability and feature selection.
Results: The findings highlight key predictors of financial distress, such as revenue growth, dividend growth, cash-to-current liabilities, and gross profit margins. Among the models tested, the ANN classifier achieved the highest accuracy at 98%, outperforming other algorithms.
Discussion: The results suggest that ANN provides a robust and reliable method for financial distress prediction. The use of RFECV and bootstrapped CART contributes to the model's stability, underscoring the potential of ML tools in financial health monitoring. These insights carry valuable implications for auditors, regulators, and company management in enhancing practices around financial oversight and fraud detection.
Journal Article. Volume 7, article 1466321. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631907/pdf/
Citations: 0
Adoption of artificial intelligence and machine learning in banking systems: a qualitative survey of board of directors.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-27 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1440051
Abdullah Eskandarany
Abstract:
The aim of the paper is twofold: first, to examine the role of the board of directors in facilitating the adoption of AI and ML in the Saudi Arabian banking sector; second, to explore the effectiveness of artificial intelligence and machine learning in protecting the Saudi Arabian banking sector from cyberattacks. A qualitative research approach was applied using in-depth interviews with 17 board members from prominent Saudi Arabian banks. The present study highlights both the opportunities and challenges of integrating advanced artificial intelligence and machine learning technologies in this highly regulated industry. Findings reveal that advanced artificial intelligence and machine learning technologies offer substantial benefits, particularly in areas like threat detection, fraud prevention, and process automation, enabling banks to meet regulatory standards and mitigate cyber threats efficiently. However, the research also identifies significant barriers, including limited technological infrastructure, a lack of cohesive artificial intelligence strategies, and ethical concerns around data privacy and algorithmic bias. Interviewees emphasized the board of directors' critical role in providing strategic direction, securing resources, and fostering partnerships with artificial intelligence technology providers. The study further highlights the importance of aligning artificial intelligence and machine learning initiatives with national development goals, such as Saudi Vision 2030, to ensure sustained growth and competitiveness. The findings from the present study offer valuable implications for banking policymakers in navigating the complexities of artificial intelligence and machine learning adoption in financial services, particularly in emerging markets.
Journal Article. Volume 7, article 1440051. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631877/pdf/
Citations: 0
Multimodal driver emotion recognition using motor activity and facial expressions.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-27 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1467051
Carlos H Espino-Salinas, Huizilopoztli Luna-García, José M Celaya-Padilla, Cristian Barría-Huidobro, Nadia Karina Gamboa Rosales, David Rondon, Klinge Orlando Villalba-Condori
Abstract:
Driving performance can be significantly impacted when a person experiences intense emotions behind the wheel. Research shows that emotions such as anger, sadness, agitation, and joy can increase the risk of traffic accidents. This study introduces a methodology to recognize four specific emotions using an intelligent model that processes and analyzes signals from motor activity and driver behavior, which are generated by interactions with basic driving elements, along with facial geometry images captured during emotion induction. The research applies machine learning to identify the most relevant motor activity signals for emotion recognition. Furthermore, a pre-trained Convolutional Neural Network (CNN) model is employed to extract probability vectors from images corresponding to the four emotions under investigation. These data sources are integrated through a unidimensional network for emotion classification. The main proposal of this research was to develop a multimodal intelligent model that combines motor activity signals and facial geometry images to accurately recognize four specific emotions (anger, sadness, agitation, and joy) in drivers, achieving a 96.0% accuracy in a simulated environment. The study confirmed a significant relationship between drivers' motor activity, behavior, facial geometry, and the induced emotions.
Journal Article. Volume 7, article 1467051. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11631879/pdf/
Citations: 0
Toward explainable deep learning in healthcare through transition matrix and user-friendly features.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-25 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1482141
Oleksander Barmak, Iurii Krak, Sergiy Yakovlev, Eduard Manziuk, Pavlo Radiuk, Vladislav Kuznetsov
Abstract:
Modern artificial intelligence (AI) solutions often face challenges due to the "black box" nature of deep learning (DL) models, which limits their transparency and trustworthiness in critical medical applications. In this study, we propose and evaluate a scalable approach based on a transition matrix to enhance the interpretability of DL models in medical signal and image processing by translating complex model decisions into user-friendly and justifiable features for healthcare professionals. The criteria for choosing interpretable features were clearly defined, incorporating clinical guidelines and expert rules to align model outputs with established medical standards. The proposed approach was tested on two medical datasets: electrocardiography (ECG) for arrhythmia detection and magnetic resonance imaging (MRI) for heart disease classification. The performance of the DL models was compared with expert annotations using Cohen's Kappa coefficient to assess agreement, achieving coefficients of 0.89 for the ECG dataset and 0.80 for the MRI dataset. These results demonstrate strong agreement, underscoring the reliability of the approach in providing accurate, understandable, and justifiable explanations of DL model decisions. The scalability of the approach suggests its potential applicability across various medical domains, enhancing the generalizability and utility of DL models in healthcare while addressing practical challenges and ethical considerations.
Journal Article. Volume 7, article 1482141. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11625760/pdf/
Citations: 0
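The abstract above measures agreement between model outputs and expert annotations with Cohen's Kappa, defined as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. A minimal sketch follows; the six toy labels are invented for illustration and are unrelated to the ECG and MRI results reported in the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical labels: the two raters disagree on one item out of six.
model = ["normal", "arrhythmia", "normal", "normal", "arrhythmia", "normal"]
expert = ["normal", "arrhythmia", "normal", "arrhythmia", "arrhythmia", "normal"]
kappa = cohens_kappa(model, expert)
```

Here the observed agreement is 5/6 and the chance agreement is 0.5, so kappa is about 0.67. Values near 1 indicate agreement well beyond chance.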
Accuracy improvement in financial sanction screening: is natural language processing the solution?
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-22 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1374323
Seihee Kim, ShengYun Yang
Abstract:
Sanction screening is a crucial banking compliance process that protects financial institutions from inadvertently engaging with internationally sanctioned individuals or organizations. Given the severe consequences, including financial crime risks and potential loss of banking licenses, effective execution is essential. One of the major challenges in this process is balancing the high rate of false positives, which exceed 90% and lead to inefficiencies due to increased human oversight, with the more critical issue of false negatives, which pose severe regulatory and financial risks by allowing sanctioned entities to go undetected. This study explores the use of Natural Language Processing (NLP) to enhance the accuracy of sanction screening, with a particular focus on reducing false negatives. Using an experimental approach, we evaluated a prototype NLP program on a dataset of sanctioned entities and transactions, assessing its performance in minimising false negatives and understanding its effect on false positives. Our findings demonstrate that while NLP significantly improves sensitivity by detecting more true positives, it also increases false positives, resulting in a trade-off between improved detection and reduced overall accuracy. Given the heightened risks associated with false negatives, this research emphasizes the importance of prioritizing their reduction. The study provides practical insights into how NLP can enhance sanction screening, while recognizing the need for ongoing adaptation to the dynamic nature of the field.
Journal Article. Volume 7, article 1374323. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11621073/pdf/
Citations: 0
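The trade-off described in the abstract above, higher sensitivity alongside lower overall accuracy, can be made concrete with confusion-matrix arithmetic. The counts below are hypothetical and chosen only to show the effect; they are not results from the study, and `screening_metrics` is an illustrative helper name.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute basic screening metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),          # share of sanctioned entities caught
        "false_positive_rate": fp / (fp + tn),  # share of clean entities flagged
        "accuracy": (tp + tn) / total,
    }


# Hypothetical screening of 1,000 transactions, 100 of which involve sanctioned entities.
baseline = screening_metrics(tp=80, fp=100, tn=800, fn=20)
nlp_like = screening_metrics(tp=95, fp=200, tn=700, fn=5)
```

With these assumed counts, sensitivity rises from 0.80 to 0.95 while accuracy falls from 0.88 to 0.795, because the additional false positives outnumber the additional true positives.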
Active learning with human heuristics: an algorithm robust to labeling bias.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-19 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1491932
Sriram Ravichandran, Nandan Sudarsanam, Balaraman Ravindran, Konstantinos V Katsikopoulos
Abstract:
Active learning enables prediction models to achieve better performance faster by adaptively querying an oracle for the labels of data points. Sometimes the oracle is a human, for example when a medical diagnosis is provided by a doctor. According to the behavioral sciences, people, because they employ heuristics, might sometimes exhibit biases in labeling. How does modeling the oracle as a human heuristic affect the performance of active learning algorithms? If there is a drop in performance, can one design active learning algorithms robust to labeling bias? The present article provides answers. We investigate two established human heuristics (fast-and-frugal tree, tallying model) combined with four active learning algorithms (entropy sampling, multi-view learning, conventional information density, and, our proposal, inverse information density) and three standard classifiers (logistic regression, random forests, support vector machines), and apply their combinations to 15 datasets where people routinely provide labels, such as health and other domains like marketing and transportation. There are two main results. First, we show that if a heuristic provides labels, the performance of active learning algorithms significantly drops, sometimes below random. Hence, it is key to design active learning algorithms that are robust to labeling bias. Our second contribution is to provide such a robust algorithm. The proposed inverse information density algorithm, which is inspired by human psychology, achieves an overall improvement of 87% over the best of the other algorithms. In conclusion, designing and benchmarking active learning algorithms can benefit from incorporating the modeling of human heuristics.
Journal Article. Volume 7, article 1491932. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611880/pdf/
Citations: 0
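Of the four active learning strategies named in the abstract above, entropy sampling is the simplest to illustrate: the learner asks the oracle to label the point whose predicted class distribution is most uncertain. The sketch below covers only that baseline strategy. The pool of predicted probabilities is made up, and the authors' inverse information density algorithm is not reproduced here.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_sampling(predicted_probs):
    """Return the index of the unlabeled point with the highest predictive entropy."""
    return max(range(len(predicted_probs)), key=lambda i: entropy(predicted_probs[i]))


# Hypothetical class probabilities for three unlabeled points.
pool = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]
query_index = entropy_sampling(pool)
```

The second point, with probabilities closest to 50/50, is selected for labeling. A biased oracle, as studied in the article, would then return a label that may differ from the ground truth.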
Vision-language models for medical report generation and visual question answering: a review.
IF 3
Frontiers in Artificial Intelligence Pub Date : 2024-11-19 eCollection Date: 2024-01-01 DOI: 10.3389/frai.2024.1430984
Iryna Hartsock, Ghulam Rasool
Abstract:
Medical vision-language models (VLMs) combine computer vision (CV) and natural language processing (NLP) to analyze visual and textual medical data. Our paper reviews recent advancements in developing VLMs specialized for healthcare, focusing on publicly available models designed for medical report generation and visual question answering (VQA). We provide background on NLP and CV, explaining how techniques from both fields are integrated into VLMs, with visual and language data often fused using Transformer-based architectures to enable effective learning from multimodal data. Key areas we address include the exploration of 18 public medical vision-language datasets, in-depth analyses of the architectures and pre-training strategies of 16 recent noteworthy medical VLMs, and comprehensive discussion on evaluation metrics for assessing VLMs' performance in medical report generation and VQA. We also highlight current challenges facing medical VLM development, including limited data availability, concerns with data privacy, and lack of proper evaluation metrics, among others, while also proposing future directions to address these obstacles. Overall, our review summarizes the recent progress in developing VLMs to harness multimodal medical data for improved healthcare applications.
Journal Article. Volume 7, article 1430984. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11611889/pdf/
Citations: 0