Enabling GPTs for Expert-Level Environmental Engineering Question Answering
Jun-Jie Zhu, Meiqi Yang, Jinyue Jiang, Yiming Bai, Danqi Chen, and Zhiyong Jason Ren*
Environmental Science & Technology Letters, 11 (12), 1327–1333. Published 2024-11-07.
DOI: 10.1021/acs.estlett.4c00665
https://pubs.acs.org/doi/10.1021/acs.estlett.4c00665
Abstract
Artificial intelligence (AI) holds significant potential for advancing research and development in the field of environmental science and engineering (ESE), but the development of domain-specific large language models (LLMs) in this field has not been reported. This study addresses this gap by evaluating the performance of advanced LLMs in answering expert-level, closed-book environmental engineering questions. We assessed two generative pretrained transformer (GPT) models and five fine-tuned models (FTMs) on an expert-level question answering (QA) data set, focusing on relevance (from 0 to 1), factuality (0 to 1), format, richness, QA difficulty level, and domain topic. Results show that GPT-4 achieves a relevance score of 0.644 and a factuality score of 0.791 based on 286 questions, indicating room for improvement, particularly for more difficult questions (scores dropped to below 0.5). Notably, FTMs with larger data sets resisted factuality degradation, highlighting the need for high-quality training materials. Inaccuracies and format issues are often linked to overtraining and catastrophic interference. This first investigation leverages expert-level textbooks to enhance LLM performance, thereby providing valuable insights and setting the stage for developing more robust domain-specific LLMs for environmental applications.
About the journal:
Environmental Science & Technology Letters serves as an international forum for brief communications on experimental or theoretical results of exceptional timeliness in all aspects of environmental science, both pure and applied. Published as soon as accepted, these communications are summarized in monthly issues. Additionally, the journal features short reviews on emerging topics in environmental science and technology.