Development and evaluation of codelists for identifying marginalised groups in primary care
Tetyana Perchyk, Isabella de Vere Hunt, Brian D Nicholson, Luke Mounce, Kate Sykes, Yoryos Lyratzopoulos, Agnieszka Lemanska, Katriina L Whitaker, Robert S Kerrison
medRxiv - Primary Care Research, published 2024-09-13. DOI: 10.1101/2024.09.11.24313391 (https://doi.org/10.1101/2024.09.11.24313391)
Abstract
Background. Primary care electronic health records provide a rich source of information for inequalities research. However, the reliability and validity of research derived from these records depend on the completeness and resolution of the codelists used to identify marginalised populations.

Aim. To develop comprehensive codelists for identifying ethnic minorities, people with learning disabilities (LD), people with severe mental illness (SMI) and people who are transgender.

Design and setting. A codelist development project, conducted using primary care data from the United Kingdom.

Method. Groups of interest were defined a priori. Relevant clinical codes were identified by searching Clinical Practice Research Datalink (CPRD) publications, codelist repositories and the CPRD code browser. Relevant codelists were downloaded and merged according to marginalised group. Duplicates were removed and the remaining codes were reviewed by two general practitioners (GPs). Comprehensiveness was assessed in a representative CPRD population of 10,966,759 people by comparing the number of individuals identified using the curated codelists with the number identified using commonly used alternatives.

Results. A total of 52 codelists were identified. After removal of duplicates and GP review, 1,420 unique codes were selected. Compared with the comparator codelists, the curated codelists identified an additional 48,017 (76.6%), 52,953 (68.9%) and 508 (36.9%) people with an LD, SMI or transgender code, respectively. The frequencies identified for ethnicity were consistent with expectations for the UK population.

Conclusion. The codelists curated through this project will improve inequalities research by raising the standard of identification of marginalised groups in primary care data.
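
The merging, de-duplication and comparison steps described in the Method could be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names, the (patient ID, code) record layout and the toy codes such as "A1" are all hypothetical, and the GP review step is not modelled. "Additional" is assumed here to mean people found by the curated codelist but not by the comparator.

```python
from collections import defaultdict


def merge_codelists(codelists):
    """Merge codelists by group and drop duplicate codes.

    `codelists` is an iterable of (group, codes) pairs (hypothetical layout).
    """
    merged = defaultdict(set)
    for group, codes in codelists:
        # Strip whitespace so the same code is not kept twice.
        merged[group].update(code.strip() for code in codes)
    return dict(merged)


def identify_patients(records, codes):
    """Return the IDs of patients with at least one record whose code is in `codes`."""
    return {patient_id for patient_id, code in records if code in codes}


def compare_codelists(records, curated, comparator):
    """Count people identified by each codelist and those found only by the curated one."""
    curated_ids = identify_patients(records, curated)
    comparator_ids = identify_patients(records, comparator)
    return {
        "curated": len(curated_ids),
        "comparator": len(comparator_ids),
        # Assumption: "additional" = identified by curated but not by comparator.
        "additional": len(curated_ids - comparator_ids),
    }


# Toy data: the codes and patient IDs are invented for illustration only.
source_codelists = [
    ("LD", ["A1", "A2"]),
    ("LD", ["A2 ", "A3"]),  # "A2 " duplicates "A2" once whitespace is stripped
    ("SMI", ["B1"]),
]
merged = merge_codelists(source_codelists)

records = [(1, "A1"), (2, "A3"), (3, "B1"), (1, "A2"), (4, "Z9")]
result = compare_codelists(records, merged["LD"], comparator={"A1"})
print(result)
```

Sets are used so that duplicate codes and repeated records for the same patient collapse automatically; a real analysis on CPRD-scale data would more likely run in a database or dataframe library.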