Large Language Model-Assisted Surgical Consent Forms in Non-English Language: Content Analysis and Readability Evaluation
Namkee Oh, Jongman Kim, Sunghae Park, Sunghyo An, Eunjin Lee, Hayeon Do, Jiyoung Baik, Suk Min Gwon, Jinsoo Rhu, Gyu-Seong Choi, Seonmin Park, Jai Young Cho, Hae Won Lee, Boram Lee, Eun Sung Jeong, Jeong-Moo Lee, YoungRok Choi, Jieun Kwon, Kyeong Deok Kim, Seok-Hwan Kim, Gwang-Sik Chun
Journal of Medical Internet Research, volume 27, e73222. Published 2025-06-19. DOI: 10.2196/73222 (https://doi.org/10.2196/73222)
Abstract
Background: Surgical consent forms convey critical information, yet their complex language can limit patient comprehension. Large language models (LLMs) can simplify complex information and improve readability, but evidence of the impact of LLM-generated modifications on content preservation in non-English consent forms is lacking.
Objective: This study evaluates the impact of LLM-assisted editing on the readability and content quality of Korean-language surgical consent forms, specifically standardized consent documents for liver resection, across multiple institutions.
Methods: Standardized liver resection consent forms were collected from 7 South Korean medical institutions and simplified using ChatGPT-4o. Readability was then assessed using the KReaD and Natmal indices, while text structure was evaluated based on character count, word count, sentence count, words per sentence, and difficult word ratio. Content quality was analyzed across 4 domains (risk, benefit, alternative, and overall impression) using evaluations from 7 liver resection specialists. Statistical comparisons were conducted using paired 2-sided t tests, and a linear mixed-effects model was applied to account for institutional and evaluator variability.
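As a rough illustration of the kind of text-structure metrics and paired comparison described here, the Python sketch below is not the authors' code: sentence splitting and the difficult-word list are simplified placeholders, the sample scores are hypothetical, and the proprietary KReaD and Natmal Korean readability indices are not reimplemented.

```python
# Minimal sketch (assumptions, not the study's pipeline): basic text-structure
# metrics for a consent-form text plus a paired 2-sided t test on hypothetical
# before/after readability scores from 7 institutions.
import re
from scipy import stats

def text_structure_metrics(text, difficult_words=frozenset()):
    """Return character count, word count, sentence count, words per sentence,
    and difficult word ratio. The difficult-word set is a placeholder."""
    sentences = [s for s in re.split(r"[.!?]\s*", text) if s.strip()]
    words = text.split()
    n_difficult = sum(1 for w in words if w in difficult_words)
    return {
        "characters": len(text),
        "words": len(words),
        "sentences": len(sentences),
        "words_per_sentence": len(words) / max(len(sentences), 1),
        "difficult_word_ratio": n_difficult / max(len(words), 1),
    }

print(text_structure_metrics("수술은 간을 절제하는 과정입니다. 출혈 위험이 있습니다."))

# Hypothetical per-institution readability scores before and after LLM editing.
before = [1775.0, 1810.2, 1740.5, 1790.1, 1760.8, 1785.3, 1778.9]
after = [1340.2, 1290.7, 1410.5, 1301.4, 1355.0, 1330.8, 1320.6]

t_stat, p_value = stats.ttest_rel(before, after)  # paired 2-sided t test
print(f"t = {t_stat:.2f}, P = {p_value:.4f}")
```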
Results: Artificial intelligence-assisted editing significantly improved readability, reducing the KReaD score from 1777 (SD 28.47) to 1335.6 (SD 59.95) (P<.001) and the Natmal score from 1452.3 (SD 88.67) to 1245.3 (SD 96.96) (P=.007). Sentence length and difficult word ratio decreased significantly, contributing to increased accessibility (P<.05). However, content quality analysis showed a decline in the risk description scores (before: 2.29, SD 0.47 vs after: 1.92, SD 0.32; P=.06) and overall impression scores (before: 2.21, SD 0.49 vs after: 1.71, SD 0.64; P=.13). The linear mixed-effects model confirmed significant reductions in risk descriptions (β₁=-0.371; P=.01) and overall impression (β₁=-0.500; P=.03), suggesting potential omissions in critical safety information. Despite this, qualitative analysis indicated that evaluators did not find explicit omissions but perceived the text as overly simplified and less professional.
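The abstract does not report the exact model specification, so the following sketch is a hypothetical illustration of a linear mixed-effects model with a fixed effect for editing phase and crossed random effects for institution and evaluator, fit with statsmodels; the column names, effect sizes, and simulated ratings are assumptions.

```python
# Minimal sketch (assumed specification): content-quality score modeled with a
# fixed effect for editing phase (before vs after) and crossed random effects for
# institution and evaluator, expressed as variance components over a single group
# (the documented statsmodels pattern for crossed random effects).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
rows = []
for inst in [f"inst{i+1}" for i in range(7)]:
    inst_eff = rng.normal(0, 0.15)
    for ev in [f"eval{j+1}" for j in range(7)]:
        ev_eff = rng.normal(0, 0.15)
        for phase, shift in [("before", 0.0), ("after", -0.4)]:
            rows.append({
                "score": 2.2 + inst_eff + ev_eff + shift + rng.normal(0, 0.2),
                "phase": phase, "institution": inst, "evaluator": ev,
            })
df = pd.DataFrame(rows)
df["group"] = 1  # single group so both random effects enter as crossed variance components

model = smf.mixedlm(
    "score ~ C(phase, Treatment(reference='before'))",
    data=df,
    groups="group",
    vc_formula={"institution": "0 + C(institution)",
                "evaluator": "0 + C(evaluator)"},
)
result = model.fit()
print(result.summary())  # the phase coefficient plays the role of the abstract's beta_1
```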
Conclusions: Although LLM-assisted surgical consent forms significantly enhance readability, they may compromise certain aspects of content completeness, particularly in risk disclosure. These findings highlight the need for a balanced approach that maintains accessibility while ensuring medical and legal accuracy. Future research should include patient-centered evaluations to assess comprehension and informed decision-making as well as broader multilingual validation to determine LLM applicability across diverse health care settings.
Journal Introduction
The Journal of Medical Internet Research (JMIR) is a highly respected publication in the field of health informatics and health services. Founded in 1999, JMIR has been a pioneer in the field for over two decades.
As a leader in the industry, the journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It is recognized as a top publication in these disciplines, ranking in the first quartile (Q1) by Impact Factor.
Notably, JMIR holds the prestigious position of being ranked #1 on Google Scholar within the "Medical Informatics" discipline.