Shengzhi Huang, Qicong Wang, Wei Lu, Lingyu Liu, Zhenzhen Xu, Yong Huang
{"title":"论文评估:一种通用的、定量的、可解释的论文评估方法,由多代理系统提供动力","authors":"Shengzhi Huang , Qicong Wang , Wei Lu , Lingyu Liu , Zhenzhen Xu , Yong Huang","doi":"10.1016/j.ipm.2025.104225","DOIUrl":null,"url":null,"abstract":"<div><div>The immediate and efficient evaluation of scientific papers is crucial for advancing scientific progress. However, traditional peer review faces numerous challenges, including reviewer bias, limited expertise, and an overwhelming volume of publications. Recent advancements in large language models (LLMs) suggest their potential as promising evaluators, capable of approximating human cognition and understanding both ordinary and scientific language. In this study, we propose a novel AI-empowered paper evaluation method, PaperEval (PE), which utilizes a multi-agent system powered by LLMs to design evaluation criteria, assess paper quality along different dimensions, and generate explainable scores. We also introduce two variants of PE, Multi-round PaperEval (MPE) and Self-correcting PaperEval (SPE), which produce comparable scores and iteratively refine the evaluation criteria, respectively. To test our methods, we conducted a comprehensive analysis of three curated datasets, encompassing about 66,000 target papers of varying quality across the fields of mathematics, physics, chemistry, and medicine. The results show that our methods can effectively discern between high- and low-quality papers based on scores derived in four dimensions: Question, Method, Result, and Conclusion. Moreover, the results highlight the evaluation’s stability over time, the impact of comparative papers, the advantages of the multi-round evaluation strategy, and the varying correlation between AI ratings and scientific impact across different disciplines. Our method can seamlessly integrate into the existing scientific evaluation system, offering valuable insights for the development of AI-driven scientific evaluation.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 6","pages":"Article 104225"},"PeriodicalIF":7.4000,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PaperEval: A universal, quantitative, and explainable paper evaluation method powered by a multi-agent system\",\"authors\":\"Shengzhi Huang , Qicong Wang , Wei Lu , Lingyu Liu , Zhenzhen Xu , Yong Huang\",\"doi\":\"10.1016/j.ipm.2025.104225\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The immediate and efficient evaluation of scientific papers is crucial for advancing scientific progress. However, traditional peer review faces numerous challenges, including reviewer bias, limited expertise, and an overwhelming volume of publications. Recent advancements in large language models (LLMs) suggest their potential as promising evaluators, capable of approximating human cognition and understanding both ordinary and scientific language. In this study, we propose a novel AI-empowered paper evaluation method, PaperEval (PE), which utilizes a multi-agent system powered by LLMs to design evaluation criteria, assess paper quality along different dimensions, and generate explainable scores. We also introduce two variants of PE, Multi-round PaperEval (MPE) and Self-correcting PaperEval (SPE), which produce comparable scores and iteratively refine the evaluation criteria, respectively. 
To test our methods, we conducted a comprehensive analysis of three curated datasets, encompassing about 66,000 target papers of varying quality across the fields of mathematics, physics, chemistry, and medicine. The results show that our methods can effectively discern between high- and low-quality papers based on scores derived in four dimensions: Question, Method, Result, and Conclusion. Moreover, the results highlight the evaluation’s stability over time, the impact of comparative papers, the advantages of the multi-round evaluation strategy, and the varying correlation between AI ratings and scientific impact across different disciplines. Our method can seamlessly integrate into the existing scientific evaluation system, offering valuable insights for the development of AI-driven scientific evaluation.</div></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":\"62 6\",\"pages\":\"Article 104225\"},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2025-05-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306457325001669\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457325001669","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
PaperEval: A universal, quantitative, and explainable paper evaluation method powered by a multi-agent system
The immediate and efficient evaluation of scientific papers is crucial for advancing scientific progress. However, traditional peer review faces numerous challenges, including reviewer bias, limited expertise, and an overwhelming volume of publications. Recent advancements in large language models (LLMs) suggest their potential as promising evaluators, capable of approximating human cognition and understanding both ordinary and scientific language. In this study, we propose a novel AI-empowered paper evaluation method, PaperEval (PE), which utilizes a multi-agent system powered by LLMs to design evaluation criteria, assess paper quality along different dimensions, and generate explainable scores. We also introduce two variants of PE, Multi-round PaperEval (MPE) and Self-correcting PaperEval (SPE), which produce comparable scores and iteratively refine the evaluation criteria, respectively. To test our methods, we conducted a comprehensive analysis of three curated datasets, encompassing about 66,000 target papers of varying quality across the fields of mathematics, physics, chemistry, and medicine. The results show that our methods can effectively discern between high- and low-quality papers based on scores derived in four dimensions: Question, Method, Result, and Conclusion. Moreover, the results highlight the evaluation’s stability over time, the impact of comparative papers, the advantages of the multi-round evaluation strategy, and the varying correlation between AI ratings and scientific impact across different disciplines. Our method can seamlessly integrate into the existing scientific evaluation system, offering valuable insights for the development of AI-driven scientific evaluation.
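To make the described pipeline concrete, below is a minimal, illustrative sketch of a multi-agent evaluation loop in the spirit of PaperEval: one agent per dimension scores the paper against a criterion, and the results are aggregated into an overall score. All names here (EvaluationAgent, PaperEvaluator, call_llm) are hypothetical placeholders introduced for illustration, not the authors' implementation; the real system designs the criteria with LLM agents and, in the MPE and SPE variants, compares papers across rounds and iteratively refines the criteria.

```python
# Hypothetical sketch of a per-dimension, multi-agent paper evaluator.
# Not the authors' code: call_llm is a stub to keep the example runnable offline.

from dataclasses import dataclass, field
from statistics import mean

# The four scoring dimensions named in the abstract.
DIMENSIONS = ["Question", "Method", "Result", "Conclusion"]


def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; swap in a real model client in practice."""
    # Returns a fixed mid-scale score so the sketch runs without external services.
    return "3"


@dataclass
class EvaluationAgent:
    """One agent scores a paper along a single dimension against a criterion."""
    dimension: str

    def score(self, paper_text: str, criterion: str) -> float:
        prompt = (
            f"Criterion for {self.dimension}: {criterion}\n"
            f"Paper:\n{paper_text}\n"
            f"Rate the {self.dimension} on a 1-5 scale. Reply with a number only."
        )
        return float(call_llm(prompt))


@dataclass
class PaperEvaluator:
    """Aggregates per-dimension agent scores into one overall score."""
    criteria: dict = field(
        default_factory=lambda: {
            d: f"Assess the quality of the paper's {d.lower()}." for d in DIMENSIONS
        }
    )

    def evaluate(self, paper_text: str) -> dict:
        scores = {
            d: EvaluationAgent(d).score(paper_text, self.criteria[d])
            for d in DIMENSIONS
        }
        scores["overall"] = mean(scores[d] for d in DIMENSIONS)
        return scores


if __name__ == "__main__":
    evaluator = PaperEvaluator()
    print(evaluator.evaluate("Example paper text ..."))
```

Under these assumptions, a multi-round variant would simply re-run evaluate on a target paper alongside comparative papers and reconcile the rounds, while a self-correcting variant would update the criteria dictionary between rounds based on the agents' explanations.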
Journal Introduction:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.