AI Magazine: Latest Articles

Explainable AI, energy and critical infrastructure systems

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-23 DOI: 10.1002/aaai.70033

Francesco Leofante, André Artelt, Demetrios Eliades, Anna Korre, Francesca Toni, Tim Miller

Citations: 0

Open-source AI at scale: Establishing an enterprise AI strategy through modular frameworks

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-22 DOI: 10.1002/aaai.70032

Serdar Kadıoğlu

Citations: 0

Multimodal AI Teacher: Integrating Edge Computing and Reasoning Models for Enhanced Student Error Analysis

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-21 DOI: 10.1002/aaai.70030

Tianlong Xu, Yi-Fan Zhang, Zhendong Chu, Qingsong Wen

Abstract: This paper extends our previously published work on the virtual AI teacher (VATE) system, presented at IAAI-25. VATE is designed to autonomously analyze and correct student errors in mathematical problem-solving using advanced large language models (LLMs). By incorporating student draft images as a primary input for reasoning, the system provides fine-grained error cause analysis and supports real-time, multi-round AI-student dialogues. In this extended version, we introduce a new snap-to-solve module for handling low-reasoning tasks using edge-deployed LLMs, enabling faster and partially offline interaction. We also include expanded benchmarking experiments, including human expert evaluations and ablation studies, to assess model performance and learning outcomes. Deployed on the Squirrel AI platform, VATE demonstrates high accuracy (78.3%) in error analysis and improves student learning efficiency, with strong user satisfaction. These results suggest that VATE is a scalable, cost-effective solution with the potential to transform educational practices.

Volume 46, Issue 3. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70030

Citations: 0

Automated vulnerability evaluation with large language models and vulnerability ontologies

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-15 DOI: 10.1002/aaai.70031

Rikhiya Ghosh, Hans-Martin von Stockhausen, Martin Schmitt, George Marica Vasile, Sanjeev Kumar Karn, Oladimeji Farri

Abstract: The National Vulnerability Database (NVD) publishes over a thousand new vulnerabilities monthly, with a projected 25 percent increase in 2024, highlighting the crucial need for rapid vulnerability identification to mitigate cybersecurity attacks and save costs and resources. In this work, we propose using large language models (LLMs) to learn vulnerability evaluation from historical assessments of medical device vulnerabilities in a single manufacturer's portfolio. We highlight the effectiveness and challenges of using LLMs for automatic vulnerability evaluation and introduce a method to enrich historical data with cybersecurity ontologies, enabling the system to understand new vulnerabilities without retraining the LLM. Our LLM system integrates with the in-house Cybersecurity Management System (CSMS) to help Siemens Healthineers (SHS) product cybersecurity experts efficiently assess the vulnerabilities in our products. We also present a comprehensive set of experiments that helps showcase the properties of the LLM and dataset, the various guardrails we have implemented to safeguard the system in production, and the guidelines for efficient integration of LLMs into the cybersecurity tool.

Volume 46, Issue 3. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70031

Citations: 0

OnAIR: Applications of the NASA on-board artificial intelligence research platform

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-15 DOI: 10.1002/aaai.70020

Evana Gizzi, Connor Firth, Caleb Adams, James Berck, P. Timothy Chase Jr, Christian Cassamajor-Paul, Rachael Chertok, Lily Clough, Jonathan Davis, Melissa De La Cruz, Matthew Dosberg, Alan Gibson, Jonathan Hammer, Ibrahim Haroon, Michael A. Johnson, Brian Kempa, James Marshall, Patrick Maynard, Brett McKinney, Leyton McKinney, Michael Monaghan, Robin Onsay, Hayley Owens, Sam Pedrotty, Daniel Rogers, Mahmooda Sultana, Jivko Sinapov, Bethany Theiling, Aaron Woodard, Caroline Zouloumian, Connor Williams

Abstract: Infusing artificial intelligence algorithms into production aerospace systems can be challenging due to costs, timelines, and a risk-averse industry. We introduce the Onboard Artificial Intelligence Research (OnAIR) platform, an open-source software pipeline and cognitive architecture tool that enables full life cycle AI research for on-board intelligent systems. We begin with a description and user walk-through of the OnAIR tool. Next, we describe four use cases of OnAIR for both research and deployed onboard applications, detailing their use of OnAIR and the benefits it provided to the development and function of each respective scenario. Lastly, we describe two upcoming planned deployments which will leverage OnAIR for crucial mission outcomes. We conclude with remarks on future work and goals for the forward progression of OnAIR as a tool to enable a larger AI and aerospace research community.

Volume 46, Issue 3. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70020

Citations: 0

Developing generative recommender systems for government subsidy programs with a new RQ-VAE model: Wello and the Korean government case

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-08 DOI: 10.1002/aaai.70029

Ji Won Kim, Jae Hong Park, Yuri Anna Kim, Sang Jun Lee

Abstract: According to an industry survey, many people miss opportunities to apply for government subsidy programs because they do not know how to apply. People also need to search manually and check whether these programs are suitable for them. To address this issue, our study developed a new generative recommender system with both users' information and government subsidy documents. Within our recommender system framework, we modify the existing Residual Quantization Variational Auto-Encoder (RQ-VAE) model to capture deep and abstract information from subsidy documents. Using semantic IDs generated for approximately 185,610 user click-stream histories and 240,000 documents, we train our recommender system to predict the semantic IDs of the next subsidy policy documents in which a user might be interested. In 2024, we successfully deployed our generative recommender system in Wello, a Korean Gov-Tech startup. In collaboration with the Korean government, our generative recommender system helped enhance program effectiveness by saving $7.8 million in unused funds and achieved $27.4 million in advertising efficiency gains. Also, Wello observed a 68% relative improvement in click-through ratio (CTR), increasing from 41.4% in the third quarter of 2024 to 69.6% in the fourth quarter of 2024. We thus anticipate that our generative recommender system will have a significant impact on both individuals and the government.

Volume 46, Issue 3. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70029

Citations: 0
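
Note on semantic IDs: the abstract above says documents are mapped to semantic IDs with a modified RQ-VAE. The sketch below illustrates only the generic residual quantization step that gives RQ-VAE its name, not the authors' modified model. The function names and the hand-picked toy codebooks are hypothetical; a real RQ-VAE learns its codebooks together with an encoder and decoder, which are omitted here.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def semantic_id(vec, codebooks):
    """Quantize vec level by level.

    Each level picks the nearest entry to the residual left by the
    previous level; the tuple of chosen indices is the semantic ID.
    """
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(codes), residual


# Hypothetical toy codebooks (assumption): 3 levels, 4 entries each, 2-D vectors,
# with each level covering a finer scale than the one before.
codebooks = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
    [[0.3, 0.0], [0.0, 0.3], [-0.3, 0.0], [0.0, -0.3]],
    [[0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]],
]

codes, residual = semantic_id([0.9, 0.35], codebooks)
print(codes)  # (0, 1, 2)
```

Under these assumptions, items with similar embeddings share leading code indices, which is what would let a sequence model predict the next item's ID one index at a time.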

Evaluation and incident prevention in an enterprise AI assistant

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-08 DOI: 10.1002/aaai.70028

Akash V. Maharaj, David Arbour, Daniel Lee, Uttaran Bhattacharya, Anup Rao, Austin Zane, Avi Feller, Kun Qian, Sajjadur Rahman, Yunyao Li

Abstract: Enterprise AI Assistants are increasingly deployed in domains where accuracy is paramount, making each erroneous output a potentially significant incident. This paper presents a comprehensive framework for monitoring, benchmarking, and continuously improving such complex, multi-component systems under active development by multiple teams. Our approach encompasses three key elements: (1) a hierarchical “severity” framework for incident detection that identifies and categorizes errors while attributing component-specific error rates, facilitating targeted improvements; (2) a scalable and principled methodology for benchmark construction, evaluation, and deployment, designed to accommodate multiple development teams, mitigate overfitting risks, and assess the downstream impact of system modifications; and (3) a continual improvement strategy leveraging multidimensional evaluation, enabling the identification and implementation of diverse enhancement opportunities. By adopting this holistic framework, organizations can systematically enhance the reliability and performance of their AI Assistants, ensuring their efficacy in critical enterprise environments. We conclude by discussing how this multifaceted approach opens avenues for various classes of enhancements, including human-AI collaborative evaluation, paving the way for more robust and trustworthy AI systems.

Volume 46, Issue 3. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70028

Citations: 0

Introduction to the special issue on innovative applications of artificial intelligence (IAAI 2025)

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-08 DOI: 10.1002/aaai.70027

Serdar Kadıoğlu, Sean McGregor, Jan Seyler

Citations: 0

Recent advances in finetuning multimodal large language models

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-09-03 DOI: 10.1002/aaai.70025

Zhen Wang, Lin Li, Long Chen

Citations: 0

Toward robust, interactive, and human-aligned AI systems

IF 3.2, Zone 4 (Computer Science)

AI Magazine Pub Date: 2025-08-29 DOI: 10.1002/aaai.70024

Daniel S. Brown

Abstract: Ensuring that AI systems do what we, as humans, actually want them to do is one of the biggest open research challenges in AI alignment and safety. My research seeks to directly address this challenge by enabling AI systems to interact with humans to learn aligned and robust behaviors. The way robots and other AI systems behave is often the result of optimizing a reward function. However, manually designing good reward functions is highly challenging and error-prone, even for domain experts. Although reward functions are often difficult to manually specify, human feedback in the form of demonstrations or preferences is often much easier to obtain but can be difficult to interpret due to ambiguity and noise. Thus, it is critical that AI systems take into account epistemic uncertainty over the human's true intent. As part of the AAAI New Faculty Highlight Program, I will give an overview of my research progress along the following fundamental research areas: (1) efficiently quantifying uncertainty over human intent, (2) directly optimizing behavior to be robust to uncertainty over human intent, and (3) actively querying for additional human input to reduce uncertainty over human intent.

Volume 46, Issue 3. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/aaai.70024

Citations: 0