{"title":"Medical Education and artificial intelligence: Responsible and effective practice requires human oversight","authors":"Kevin W. Eva","doi":"10.1111/medu.15495","DOIUrl":null,"url":null,"abstract":"<p>I have a confession to make. I have been slow to generate an official policy statement for <i>Medical Education</i> about artificial intelligence (AI) because I find the discussion terribly boring. Don't confuse that statement with lack of interest—I consider the technology exhilarating, use it routinely, and marvel at its potential.<span><sup>1</sup></span> Don't confuse it either with being dismissive—I recognise, appreciate, and wish to help guard against the ethical harms that could be done from, among other things, loss of intellectual property and reinforcement of systemic bias.<span><sup>2</sup></span> However, I find most discussion about the use of AI in publishing (be it about writing, enabling better and faster peer review, or the need to guard against unscrupulous practices) to boil down to the same basic sentiment: Responsible and effective practice requires human oversight.</p><p>With over 14 500 seemingly viable AI resources readily available,<span><sup>3</sup></span> there is great risk of overgeneralization and I will not profess to having deep knowledge of the means through which each has been generated. I do, however, believe this class of technologies, as a whole, to best be conceived of as tools (that happen to be proliferating at unprecedented speed and with little empirical testing).<span><sup>4, 5</sup></span> Some of the panic the rate of development creates amounts to worry that we ourselves will become tools, used by the computers, but that is not the reality we are dealing with at the moment and there are very good reasons to not believe the futurists in that regard.<span><sup>6</sup></span> As such, we must focus on what all tools require for responsible and effective practice: Human, or at least biological,<span><sup>7</sup></span> oversight. So let's consider the role each group involved in journal publication has to play in that regard.</p><p>We encourage authors to use AI <span>if and when</span> it helps strengthen their capacity to improve awareness of pre-existing literature,<span><sup>8</sup></span> to formulate stronger research questions or to bolster research designs and analyses (i.e. any time it helps to make their scholarship better). We are not going to force disclosure of every way in which AI influenced their submissions because it would be impossible to craft a sufficiently detailed guideline (especially given that people are often unaware of how AI has been embedded in common software packages). Further, a dominant theme in our International Editorial Advisory Board's debate about this issue was that requiring such disclosure is likely to be increasingly nonsensical, tantamount to needing to disclose the use of Google, spell-check, a keyboard, or any other tool that is similarly omnipresent in academic work. If using AI was of fundamental importance to your project, then what made it so should be disclosed in the body of your paper. That standard, however, is the same as has always been applied to disclosing use of tools like nVivo or SPSS for primary analyses or any of countless databases for literature searches: authors are responsible for clearly describing aspects of their efforts that readers need to know to understand the rigour and replicability of the study. 
In doing so, it is of course important to keep in mind, just as we do routinely with other technologies, that any tool can be misused, requiring caution and investment in learning about the tool's strengths and limitations.<span><sup>1, 9, 10</sup></span> Responsible and effective practice requires human oversight.</p><p>Optimistically, we hope this position will improve equity in the field by reducing barriers for those who wish to publish in our pages despite their first language not being English. We understand the risk of other barriers being created by virtue of privileging those who can afford the technology, but remain hopeful that is a lesser challenge given the truism that computer technology gets cheaper with time.<span><sup>11</sup></span> There can be no doubt that AI hallucinates,<span><sup>10</sup></span> requiring individuals to double check its claims about the world while recognising that the author, not the computer, is accountable for the final text.<span><sup>12</sup></span> For those reasons, I would never myself dream of submitting a paper in a language other than one in which I was fluent without careful and triangulated effort to confirm that the translation said exactly what I intended it to. Responsible and effective practice requires human oversight.</p><p>Whether authors have used AI or not, peer review remains the best tool we have at our disposal for improving the work we are collectively undertaking as a field of study.<span><sup>13</sup></span> Given that AI is currently built on a corpus of knowledge that is predominantly English,<span><sup>2</sup></span> we need reviewers to raise questions about whether a project responsibly represents the state of knowledge in the world. That standard, however, is the same as has always been applied in attempts to judge the adequacy of a paper's framing. Similarly, while it would be inappropriate to submit a manuscript one received for peer review to an AI device without permission to do so, that aligns with the same confidentiality standard that has existed for decades. Asking a question of AI to clarify one's thinking or to contemplate clearer (or more courteous) ways of conveying one's concerns is encouraged <span>if and when</span> it improves the reviewer's capacity to offer feedback to authors or professional development for reviewers themselves.<span><sup>14</sup></span> Out of curiosity, I once submitted some of my own writing to ChatGPT with a request to ‘write a rejection letter’ (to see if it could predict the objections peer reviewers might raise). After the first response largely parroted back the claims I had made in the abstract, I instructed the computer to try again, stressing that I wanted a rejection letter. Its response was informative: ‘I am not programmed for critical appraisal.’ Even the AI ‘knew’ that responsible and effective practice requires human oversight.</p><p>As curators and stewards of the journal, we too pledge to use AI <span>if and when</span> it helps to improve reader, author, and reviewer experience. For example, AI has enabled implementation of a ‘free format’ submission system at <i>Medical Education</i>, so our authors no longer have to go through the tedium of formatting references in a specific way; authors will also note that most of the effort involved in making a submission now amounts to uploading a manuscript and confirming that the software has accurately identified author names, the title, abstract and so on. 
AI has been used for years to try to detect unethical publication practices such as duplicate submission and plagiarism. Similarly, the software we use to manage peer review has AI embedded that suggests reviewers it thinks to be particularly well suited to the content of the manuscript under consideration. While these systems will undoubtedly continue to improve, each is far from perfect. Truly fraudulent behaviour cannot be caught by any existing software while far more common lesser transgressions are generally over-called. Further, our editors must remain cognizant of the value of hearing diverse voices to inform our peer reviews if we are to facilitate a truly inclusive academic community.<span><sup>15</sup></span> As a result, we will not automate decision-making or empower AI to conduct it. Instead, these resources will continue to be used as tools, taking advantage of their greater capacity to flag potential issues and opportunities while continuing to investigate them with the care and thoughtfulness required to yield the best outcomes we can achieve. Responsible and effective practice requires human oversight.</p><p>While less of a policy than a perspective, I offer this editorial as proof that I did (eventually) conclude it necessary to share these views for the sake of transparency, however boring (i.e. non-reactive or reinforcing of the status quo) they happen to be. As technology changes, our policies will continue to evolve, but for now we encourage everyone involved in academic publishing to use whatever tools they have available for improving the field and its capacity to improve health through education. That, to my mind, defines responsible and effective oversight.</p>","PeriodicalId":18370,"journal":{"name":"Medical Education","volume":"58 11","pages":"1260-1261"},"PeriodicalIF":4.9000,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/medu.15495","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical Education","FirstCategoryId":"95","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/medu.15495","RegionNum":1,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"EDUCATION, SCIENTIFIC DISCIPLINES","Score":null,"Total":0}
Abstract
I have a confession to make. I have been slow to generate an official policy statement for Medical Education about artificial intelligence (AI) because I find the discussion terribly boring. Don't confuse that statement with lack of interest—I consider the technology exhilarating, use it routinely, and marvel at its potential.1 Don't confuse it either with being dismissive—I recognise, appreciate, and wish to help guard against the ethical harms that could be done from, among other things, loss of intellectual property and reinforcement of systemic bias.2 However, I find most discussion about the use of AI in publishing (be it about writing, enabling better and faster peer review, or the need to guard against unscrupulous practices) to boil down to the same basic sentiment: Responsible and effective practice requires human oversight.
With over 14 500 seemingly viable AI resources readily available,3 there is great risk of overgeneralisation and I will not profess to have deep knowledge of the means through which each has been generated. I do, however, believe this class of technologies, as a whole, to best be conceived of as tools (that happen to be proliferating at unprecedented speed and with little empirical testing).4, 5 Some of the panic created by the rate of development amounts to worry that we ourselves will become tools, used by the computers, but that is not the reality we are dealing with at the moment, and there are very good reasons not to believe the futurists in that regard.6 As such, we must focus on what all tools require for responsible and effective practice: Human, or at least biological,7 oversight. So let's consider the role each group involved in journal publication has to play in that regard.
We encourage authors to use AI if and when it helps strengthen their capacity to improve awareness of pre-existing literature,8 to formulate stronger research questions or to bolster research designs and analyses (i.e. any time it helps to make their scholarship better). We are not going to force disclosure of every way in which AI influenced their submissions because it would be impossible to craft a sufficiently detailed guideline (especially given that people are often unaware of how AI has been embedded in common software packages). Further, a dominant theme in our International Editorial Advisory Board's debate about this issue was that requiring such disclosure is likely to be increasingly nonsensical, tantamount to needing to disclose the use of Google, spell-check, a keyboard, or any other tool that is similarly omnipresent in academic work. If using AI was of fundamental importance to your project, then what made it so should be disclosed in the body of your paper. That standard, however, is the same as has always been applied to disclosing use of tools like NVivo or SPSS for primary analyses or any of countless databases for literature searches: authors are responsible for clearly describing aspects of their efforts that readers need to know to understand the rigour and replicability of the study. In doing so, it is of course important to keep in mind, just as we do routinely with other technologies, that any tool can be misused, requiring caution and investment in learning about the tool's strengths and limitations.1, 9, 10 Responsible and effective practice requires human oversight.
Optimistically, we hope this position will improve equity in the field by reducing barriers for those who wish to publish in our pages despite their first language not being English. We understand the risk of other barriers being created by virtue of privileging those who can afford the technology, but remain hopeful that this is a lesser challenge given the truism that computer technology gets cheaper with time.11 There can be no doubt that AI hallucinates,10 requiring individuals to double-check its claims about the world while recognising that the author, not the computer, is accountable for the final text.12 For those reasons, I would never myself dream of submitting a paper in a language other than one in which I was fluent without careful and triangulated effort to confirm that the translation said exactly what I intended it to. Responsible and effective practice requires human oversight.
Whether authors have used AI or not, peer review remains the best tool we have at our disposal for improving the work we are collectively undertaking as a field of study.13 Given that AI is currently built on a corpus of knowledge that is predominantly English,2 we need reviewers to raise questions about whether a project responsibly represents the state of knowledge in the world. That standard, however, is the same as has always been applied in attempts to judge the adequacy of a paper's framing. Similarly, while it would be inappropriate to submit a manuscript one received for peer review to an AI device without permission to do so, that aligns with the same confidentiality standard that has existed for decades. Asking a question of AI to clarify one's thinking or to contemplate clearer (or more courteous) ways of conveying one's concerns is encouraged if and when it improves the reviewer's capacity to offer feedback to authors or professional development for reviewers themselves.14 Out of curiosity, I once submitted some of my own writing to ChatGPT with a request to ‘write a rejection letter’ (to see if it could predict the objections peer reviewers might raise). After the first response largely parroted back the claims I had made in the abstract, I instructed the computer to try again, stressing that I wanted a rejection letter. Its response was informative: ‘I am not programmed for critical appraisal.’ Even the AI ‘knew’ that responsible and effective practice requires human oversight.
As curators and stewards of the journal, we too pledge to use AI if and when it helps to improve reader, author, and reviewer experience. For example, AI has enabled implementation of a ‘free format’ submission system at Medical Education, so our authors no longer have to go through the tedium of formatting references in a specific way; authors will also note that most of the effort involved in making a submission now amounts to uploading a manuscript and confirming that the software has accurately identified author names, the title, abstract and so on. AI has been used for years to try to detect unethical publication practices such as duplicate submission and plagiarism. Similarly, the software we use to manage peer review has AI embedded that suggests reviewers it thinks to be particularly well suited to the content of the manuscript under consideration. While these systems will undoubtedly continue to improve, each is far from perfect. Truly fraudulent behaviour cannot be caught by any existing software while far more common lesser transgressions are generally over-called. Further, our editors must remain cognizant of the value of hearing diverse voices to inform our peer reviews if we are to facilitate a truly inclusive academic community.15 As a result, we will not automate decision-making or empower AI to conduct it. Instead, these resources will continue to be used as tools, taking advantage of their greater capacity to flag potential issues and opportunities while continuing to investigate them with the care and thoughtfulness required to yield the best outcomes we can achieve. Responsible and effective practice requires human oversight.
While less of a policy than a perspective, I offer this editorial as proof that I did (eventually) conclude it necessary to share these views for the sake of transparency, however boring (i.e. non-reactive or reinforcing of the status quo) they happen to be. As technology changes, our policies will continue to evolve, but for now we encourage everyone involved in academic publishing to use whatever tools they have available for improving the field and its capacity to improve health through education. That, to my mind, defines responsible and effective oversight.
Journal Introduction
Medical Education seeks to be the pre-eminent journal in the field of education for health care professionals, and publishes material of the highest quality, reflecting worldwide or provocative issues and perspectives.
The journal welcomes high-quality papers on all aspects of health professional education, including:
- undergraduate education
- postgraduate training
- continuing professional development
- interprofessional education