Morgan Vaterkowski, Nadir Ammour, Christel Daniel, Emmanuelle Kempf
How to (Semi)-Automatically Spot Prescreening Oriented Eligibility Criteria.
Studies in health technology and informatics, vol. 332, pp. 288-292. Published 2025-10-02. Journal Article. DOI: 10.3233/SHTI251546 (https://doi.org/10.3233/SHTI251546)
Abstract
Clinical Trial (CT) Recruitment Support Systems (CTRSS) that query Electronic Health Records (EHR) for patient-trial matching during CT execution are becoming more widespread. Because free-text CT eligibility criteria (EC) are not readily suited to automated EHR querying, configuring an EHR-based CTRSS requires time-consuming and usually manual processing of EC, focusing on those most relevant at the pre-inclusion (prescreening) step. The aim of this study is to provide a methodological approach to semi-automatically detect Prescreening-Oriented Eligibility Criteria (POEC) and to build a library of POEC usable for the development and evaluation of EHR-based CTRSS. We propose an approach that decomposes free-text EC into standardized elements, together with a rule-based algorithm that semi-automatically detects POEC. In addition, this paper describes the characteristics of a publicly available POEC library usable for CTRSS evaluation. An annotation framework consisting of 96 patterns of elementary EC, categorized into 17 domains, was used to annotate 381 free-text EC from 20 CT dedicated to various cancer types. This training dataset was used to develop a rule-based algorithm for detecting POEC. This study provides a methodological approach to (semi)-automatically spot POEC and store them in a library, taking into account advances in the field of CTRSS. The PENELOPE-C2Q pipeline is designed to feed the PENELOPE POEC library; both have the potential to facilitate the reuse of EHR data and to improve patient participation in research.
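To make the idea of rule-based POEC detection more concrete, here is a minimal Python sketch. It is not the authors' PENELOPE-C2Q implementation. The abstract does not list the 96 patterns or the 17 domains, so the patterns, domain names, and the set of "prescreening" domains below are all hypothetical, chosen only to illustrate the general technique: annotate each free-text criterion with the domains of the patterns it matches, then apply a rule over those domains.

```python
import re

# Hypothetical (domain, regex) pairs standing in for elementary-EC patterns.
# The paper's actual 96 patterns and 17 domains are not given in the abstract.
PATTERNS = [
    ("demographics", re.compile(r"\b(aged?|years? old)\b", re.I)),
    ("diagnosis", re.compile(r"\b(carcinoma|cancer|lymphoma|tumou?r)\b", re.I)),
    ("lab_result", re.compile(r"\b(ha?emoglobin|creatinine|platelets?)\b", re.I)),
    ("consent", re.compile(r"\binformed consent\b", re.I)),
]

# Assumption: domains that can plausibly be checked against EHR data
# at the prescreening step. This choice is illustrative only.
PRESCREENING_DOMAINS = {"demographics", "diagnosis", "lab_result"}


def annotate(criterion: str) -> list[str]:
    """Return the domains whose pattern matches the criterion, in PATTERNS order."""
    return [domain for domain, rx in PATTERNS if rx.search(criterion)]


def is_poec(criterion: str) -> bool:
    """Illustrative rule: flag the criterion if any matched domain
    belongs to the assumed prescreening domains."""
    return any(d in PRESCREENING_DOMAINS for d in annotate(criterion))


if __name__ == "__main__":
    examples = [
        "Age 18 years or older",
        "Histologically confirmed breast cancer",
        "Signed informed consent",
    ]
    for ec in examples:
        print(f"{ec!r}: domains={annotate(ec)} poec={is_poec(ec)}")
```

In a semi-automatic workflow of the kind the abstract describes, the flagged criteria would still be reviewed by a human before being stored in a library; the sketch covers only the automatic flagging step.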