MEDI-SLATE: medical imaging slide-lecture aligned teaching ensemble
Authors: Motaleb Hossen Manik, Zabirul Islam, Ge Wang
Journal: Visual Computing for Industry Biomedicine and Art, 9(1)
Publication date: 2026-04-14
Publication type: Journal Article
DOI: https://doi.org/10.1186/s42492-026-00218-0
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13079244/pdf/
Abstract
Slide-based lectures remain the primary means by which undergraduate students learn about the mathematical, physical, and systems-level foundations of medical imaging. However, despite their central educational role, no openly available dataset pairs imaging lecture slides with clean, well-aligned explanatory narration suitable for scientific and educational research. The authors introduced MEDI-SLATE: medical imaging slide-lecture aligned teaching ensemble, constructed from a complete undergraduate biomedical engineering medical imaging course. The dataset contains 1117 high-resolution slides paired with refined narration derived from classroom audio through automatic speech recognition, followed by careful manual cleanup. MEDI-SLATE encompasses linear systems, Fourier analysis, signal processing, X-ray physics, computed tomography, positron emission tomography/single photon emission computed tomography, magnetic resonance imaging, ultrasound, and optical imaging. In addition to the slide-text pairs, the dataset includes lecture-level difficulty tags, key ideas, common student misunderstandings, and practice questions sourced directly from the instructor's materials. A fully reproducible preprocessing pipeline covering slide extraction, narration refinement, alignment, and corpus-level analyses is provided. MEDI-SLATE offers a high-fidelity, openly available resource for medical imaging education, curriculum development, multimodal learning research, and creation of artificial intelligence-assisted instructional tools, with all data and code released for transparent use and future extension.