Biomedical Language Analysis as a Tool for Curriculum Analysis and Mapping
Stephan Bandelow, Mark Clunes
Journal of Medical Education and Curricular Development, volume 13, article 23821205261441390. Published 2026-04-07 (eCollection).
DOI: https://doi.org/10.1177/23821205261441390
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13058177/pdf/
Abstract
Background: Content outlines for medical school curricula commonly rely on hierarchically structured learning objectives (LOs) at program, course, module and lecture level. At the most fine-grained level, these LOs contain specific biomedical terminology. The biomedical terms can be classified and augmented with semantic and relational information via the Unified Medical Language System (UMLS).
Methods: We analyzed the LOs in the preclinical years of a spiral MD curriculum, using natural language processing (NLP) and the UMLS database to add semantic information, to assess the progression of analytical complexity and the spiral design of the curriculum. The complete set of lecture-level LOs for the 2 years of preclinical teaching comprised 6086 unique LOs with 6612 sentences. To analyze progression over time, the LOs were grouped by teaching module in temporal order of delivery.
Results: We extracted 6189 action verbs and assigned them numerical scores according to Bloom's taxonomy. Bloom scores per module showed the use of increasingly complex action verbs as the curriculum progressed. Matching the LOs against the UMLS database yielded 6454 unique biomedical concepts. Scoring each concept as novel only on its first appearance showed that the proportion of novel concepts decreased over time. According to the UMLS semantic tags, the proportion of disease-related concepts increased as the curriculum progressed.
Conclusions: To our knowledge, this is the first systematic NLP analysis of a medical school curriculum that incorporates standardized medical language dictionaries. The results show a clear progression toward increasingly complex analytical tasks and increasing clinical content over the course of the curriculum. The decreasing proportion of novel concepts indicates that concepts are revisited, which supports the design goals of a spiral curriculum. Systematic parsing of large bodies of natural language information, such as the lecture-level LO content analyzed here, can improve the objectivity and depth of curriculum evaluations and provide evidence for accreditation.
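The two per-module measures described in the Results (Bloom scores for action verbs, and the proportion of concepts that are novel on first appearance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the verb-to-level table, the function names, the module labels, and the placeholder concept identifiers are all hypothetical, and the sketch assumes that the novel proportion is computed over the unique concepts in each module, which the abstract does not specify.

```python
# Hypothetical verb-to-level table (1 = remember ... 6 = create).
# The verb list and scores used in the paper are not given in the abstract.
BLOOM_LEVELS = {
    "define": 1, "list": 1,
    "describe": 2, "explain": 2,
    "apply": 3,
    "analyze": 4, "compare": 4,
    "evaluate": 5,
    "design": 6,
}


def mean_bloom_by_module(verbs_by_module):
    """Return {module: mean Bloom level} over verbs present in BLOOM_LEVELS.

    Verbs missing from the table are skipped; a module with no scorable
    verbs maps to None.
    """
    means = {}
    for module, verbs in verbs_by_module.items():
        scores = [BLOOM_LEVELS[v.lower()] for v in verbs
                  if v.lower() in BLOOM_LEVELS]
        means[module] = sum(scores) / len(scores) if scores else None
    return means


def novel_proportion_by_module(concepts_by_module):
    """Return [(module, proportion of novel concepts)] in delivery order.

    concepts_by_module is a list of (module, [concept ids]) pairs, already
    sorted by temporal order of delivery. A concept counts as novel only in
    the module where it first appears. Assumption: the denominator is the
    number of unique concepts in that module.
    """
    seen = set()
    result = []
    for module, concepts in concepts_by_module:
        unique = set(concepts)
        novel = unique - seen          # concepts not met in earlier modules
        proportion = len(novel) / len(unique) if unique else 0.0
        result.append((module, proportion))
        seen |= unique                 # later modules see these as revisited
    return result


# Toy data with placeholder identifiers (not real UMLS concept identifiers).
demo_verbs = {
    "M1": ["define", "list", "describe"],
    "M2": ["explain", "analyze", "evaluate"],
}
demo_concepts = [
    ("M1", ["concept_a", "concept_b"]),
    ("M2", ["concept_b", "concept_c", "concept_d"]),
    ("M3", ["concept_a", "concept_d"]),
]

bloom_means = mean_bloom_by_module(demo_verbs)
novel_props = novel_proportion_by_module(demo_concepts)
```

In this toy data the third module introduces no new concepts, so its novel proportion is zero, which is the kind of pattern a spiral curriculum would be expected to show in later modules.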