Understanding South Australia’s blood products usage patterns and outcomes, using data linkage.
M. Palfy, Christopher Radbone
International Journal of Population Data Science, published 2022-08-25
DOI: 10.23889/ijpds.v7i3.1975 (https://doi.org/10.23889/ijpds.v7i3.1975)
Abstract
Objectives
The purpose of this work was to establish confidence in the technical capability to extract, link, and integrate public hospital inpatient data, public pathology blood transfusion records, and blood test records, and to optimise record linkage so that patterns and trends could then be analysed reliably.
Approach
The SURE secure data platform was essential for meeting data governance and security requirements while integrating health data spanning 18 months (January 2018 to June 2019). Data sources came in multiple formats and varied in quality. R was chosen for its data-wrangling capabilities and reproducibility.
The phases were:
1. Loading and cleaning source data
2. Linking hospital inpatient and blood transfusion records
3. Summarising linked transfusion data
4. Linking hospital inpatient and blood test records
5. Summarising linked blood test data
6. Integrating hospital data with the summarised transfusion and blood test data
7. Deriving additional variables from the summarised data
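
The linking and summarising phases can be illustrated with a minimal sketch. The study itself used R; Python is used here only for illustration. The record layouts, identifiers, and the matching rule (same patient, transfusion date within the admission period) are all assumptions, since the abstract does not state how records were matched.

```python
from collections import defaultdict
from datetime import date

# Hypothetical inpatient episodes: (episode_id, patient_id, admitted, discharged)
episodes = [
    ("E1", "P001", date(2018, 3, 1), date(2018, 3, 5)),
    ("E2", "P001", date(2018, 9, 10), date(2018, 9, 12)),
    ("E3", "P002", date(2019, 1, 20), date(2019, 1, 25)),
]

# Hypothetical transfusion records: (patient_id, transfusion_date, product)
transfusions = [
    ("P001", date(2018, 3, 2), "red cells"),
    ("P001", date(2018, 3, 4), "platelets"),
    ("P002", date(2019, 1, 21), "red cells"),
    ("P003", date(2018, 6, 1), "red cells"),  # patient has no inpatient episode
]

# Index episodes by patient so each lookup only scans that patient's stays.
episodes_by_patient = defaultdict(list)
for episode_id, patient_id, admitted, discharged in episodes:
    episodes_by_patient[patient_id].append((episode_id, admitted, discharged))


def link_to_episode(patient_id, event_date):
    """Return the episode whose stay covers event_date, or None (assumed rule)."""
    for episode_id, admitted, discharged in episodes_by_patient.get(patient_id, []):
        if admitted <= event_date <= discharged:
            return episode_id
    return None


# Linking phase: attach an episode (or None) to every transfusion record.
linked = [(link_to_episode(pid, day), product) for pid, day, product in transfusions]

# Summarising phase: count linked transfusion records per episode.
transfusions_per_episode = defaultdict(int)
for episode_id, _product in linked:
    if episode_id is not None:
        transfusions_per_episode[episode_id] += 1

# Share of transfusion records that found an inpatient episode.
match_rate = sum(1 for episode_id, _ in linked if episode_id is not None) / len(linked)
```

With this toy data, three of the four transfusion records link to an episode. The per-episode summary is what would then be joined back onto the hospital data in the integration phase.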
Results
Of 143,192 transfusion records, 55,053 (38.4%) were excluded because they did not meet the inclusion criteria (e.g., the hospital or blood product was out of scope).
Of 7,897,451 blood test records, 238,013 (3.0%) were excluded, mostly because of poor data quality (missing or invalid hospital code).
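
The exclusion percentages can be checked directly from the reported counts. The short sketch below does that arithmetic. The retained counts it produces are derived here by subtraction and are not stated in the abstract, and the function name is illustrative.

```python
def exclusion_summary(total, excluded):
    """Return (records retained, percentage excluded rounded to one decimal)."""
    retained = total - excluded
    pct_excluded = round(100 * excluded / total, 1)
    return retained, pct_excluded


# Counts as reported in the abstract.
transfusion_retained, transfusion_pct = exclusion_summary(143_192, 55_053)
test_retained, test_pct = exclusion_summary(7_897_451, 238_013)

print(transfusion_retained, transfusion_pct)  # 88139 38.4
print(test_retained, test_pct)                # 7659438 3.0
```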
Initially, 91.4% of transfusion records were matched with hospital inpatient records. The initial linkage rate for blood test records was 62.3%; this lower rate was attributed to the blood test data being state-wide and therefore including tests not performed on public hospital patients.
The linkage process was improved by adding patient codes drawn from public pathology’s internal patient identifiers. The linkage rate rose to 95.5% for transfusion records and 64.4% for blood test records.
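
One way to picture the improvement step is as a two-pass match: try the hospital identifier recorded on the pathology record first, then fall back to a lookup keyed on the pathology-internal identifier. The sketch below is an assumption about how such a fallback could work, not the authors' method. It is written in Python rather than the R used in the study, and every identifier and the crosswalk table are hypothetical.

```python
# Hypothetical hospital patient identifiers present in the inpatient data.
hospital_ids = {"H100", "H200", "H300"}

# Hypothetical crosswalk: pathology-internal identifier -> hospital identifier.
pathology_to_hospital = {"LAB-7": "H200", "LAB-9": "H300"}

# Hypothetical pathology records: (record_id, recorded_hospital_id, internal_id)
records = [
    ("R1", "H100", "LAB-1"),  # matches directly
    ("R2", None, "LAB-7"),    # hospital ID missing, recoverable via crosswalk
    ("R3", "H999", "LAB-9"),  # hospital ID invalid, recoverable via crosswalk
    ("R4", None, "LAB-5"),    # no usable identifier
]


def resolve_patient(recorded_id, internal_id):
    """Return (hospital_id or None, how the match was made)."""
    if recorded_id in hospital_ids:
        return recorded_id, "direct"
    alternative = pathology_to_hospital.get(internal_id)
    if alternative in hospital_ids:
        return alternative, "crosswalk"
    return None, "unmatched"


results = {rid: resolve_patient(recorded, internal) for rid, recorded, internal in records}

# Linkage rate before and after the additional identifiers are used.
first_pass_rate = sum(1 for _, how in results.values() if how == "direct") / len(records)
final_rate = sum(1 for pid, _ in results.values() if pid is not None) / len(records)
```

Recording how each match was made keeps the contribution of the extra identifiers auditable. For comparison, the reported gains were 4.1 percentage points for transfusion records (91.4% to 95.5%) and 2.1 points for blood test records (62.3% to 64.4%).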
Conclusion
Twelve data sources, with differing file types and formats, required coding to produce standardised results and to enable future reproducibility. Over one hundred business rules were implemented to produce a robust solution for future data updates. Analysis of the end results found that linkage and integration quality exceeded previous similar attempts in terms of match rate and accuracy.
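
The abstract does not describe the business rules themselves. As a hedged illustration of how such rules might be kept reproducible, the sketch below expresses each rule as a small function applied in a fixed order. It is in Python rather than the R used in the study, and the field names, rule names, and hospital codes are all hypothetical.

```python
# Hypothetical set of in-scope hospital codes.
VALID_HOSPITAL_CODES = frozenset({"HOSP_A", "HOSP_B"})


def rule_normalise_hospital_code(record):
    """Trim whitespace and upper-case the hospital code when it is a string."""
    code = record.get("hospital_code")
    if isinstance(code, str):
        record["hospital_code"] = code.strip().upper()
    return record


def rule_flag_invalid_hospital(record):
    """Mark records whose hospital code is missing or not in scope."""
    record["valid_hospital"] = record.get("hospital_code") in VALID_HOSPITAL_CODES
    return record


# Rules run in list order, so normalisation happens before validation.
RULES = [rule_normalise_hospital_code, rule_flag_invalid_hospital]


def apply_rules(record, rules=RULES):
    """Apply every rule to a copy of the record, leaving the input untouched."""
    result = dict(record)
    for rule in rules:
        result = rule(result)
    return result


raw_records = [
    {"hospital_code": " hosp_a "},
    {"hospital_code": None},
    {"hospital_code": "XYZ"},
]
cleaned = [apply_rules(record) for record in raw_records]
```

Keeping rules as an ordered list means a later data update reruns exactly the same steps, and adding a rule is a one-line change.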