Using self-generated identification codes to match anonymous longitudinal data in a sexual health study of secondary school students: a cohort study
Edmond Pui Hang Choi, Ellie Bostwick Andres, Heidi Sze Lok Fan, Lai Ming Ho, Alice Wai Chi Fung, Kevin Wing Chung Lau, Neda Hei Tung Ng, Monique Yeung, Janice Mary Johnston
BMC Medical Informatics and Decision Making, 25(1):201. Published 2025-06-02. DOI: 10.1186/s12911-025-03028-1 (https://doi.org/10.1186/s12911-025-03028-1). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12131654/pdf/
Abstract
Objective: This study aimed to (i) describe the procedures for generating self-generated identification codes (SGICs) in a prospective longitudinal evaluation of a sexual health program for secondary school students in Hong Kong; (ii) outline the matching strategies and processes; (iii) examine rates of successful matching and associated factors; and (iv) compare the responses of participants whose data could be matched to those whose data could not.
Methods: A prospective longitudinal cohort study was conducted. The SGIC comprised a 5-element code with 4 digits and 3 letters. A matching algorithm was developed to link baseline and follow-up data collected from students in Years 1 to 3 (n = 1,064) during the 2019-2020 school year. Matching success and associated factors were analyzed, and responses from matched and unmatched participants were compared.
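The abstract does not spell out the matching rules, so the following is only a minimal sketch of how a two-pass SGIC matcher could work, not the authors' algorithm. It assumes each code is stored as a tuple of five elements, that a perfect match means all five elements agree, and that a partial match means at least four of five agree with exactly one unused baseline candidate. The function names, the `min_partial` threshold, and the example element values are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Code = Tuple[str, ...]  # one string per SGIC element (hypothetical layout)


def count_agreements(a: Code, b: Code) -> int:
    """Number of positions at which two codes hold the same element."""
    return sum(x == y for x, y in zip(a, b))


def match_codes(
    baseline: Dict[str, Code],
    followup: Dict[str, Code],
    min_partial: int = 4,  # assumed threshold, not taken from the paper
) -> Dict[str, Tuple[Optional[str], str]]:
    """Map each follow-up record to (baseline id or None, match status)."""
    results: Dict[str, Tuple[Optional[str], str]] = {}
    used = set()  # baseline ids already linked to a follow-up record

    # Pass 1: accept a perfect match only when exactly one baseline code is identical.
    for fid, fcode in followup.items():
        hits = [b for b, bcode in baseline.items() if b not in used and bcode == fcode]
        if len(hits) == 1:
            results[fid] = (hits[0], "perfect")
            used.add(hits[0])

    # Pass 2: accept a partial match only when exactly one unused candidate is close enough.
    for fid, fcode in followup.items():
        if fid in results:
            continue
        hits = [
            b for b, bcode in baseline.items()
            if b not in used and count_agreements(fcode, bcode) >= min_partial
        ]
        if len(hits) == 1:
            results[fid] = (hits[0], "partial")
            used.add(hits[0])
        else:
            results[fid] = (None, "unmatched")
    return results


# Made-up codes: 2 digits, letter, letter, 2 digits, letter (4 digits and 3 letters in total).
baseline = {
    "b1": ("12", "A", "K", "05", "M"),
    "b2": ("03", "C", "T", "11", "W"),
    "b3": ("27", "L", "H", "09", "S"),
}
followup = {
    "f1": ("12", "A", "K", "05", "M"),  # identical to b1
    "f2": ("03", "C", "T", "12", "W"),  # differs from b2 in one element
    "f3": ("30", "Z", "Q", "01", "B"),  # no close baseline code
}
result = match_codes(baseline, followup)
```

Requiring a unique candidate in each pass trades some recall for fewer false links; a real study would likely also restrict candidates by school and class before comparing codes.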
Results: The rate of perfectly matched cases was 49.06%, while 23.59% were partially matched, and 27.35% were unmatched. Logistic regression analysis revealed that male students (adjusted odds ratio [aOR]: 0.63) and Year 1 students (vs. Year 3; aOR: 0.56) were less likely to be perfectly matched. Compared to unmatched cases, perfectly and partially matched cases were less likely to have missing values and more likely to exhibit positive attitudes toward the sexual health program and related topics, such as the importance of sexual health, equal relationships, and condom use.
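As a consistency check on these rates, the short calculation below back-computes the case counts they imply. It assumes every percentage uses the full sample (n = 1,064) as its denominator, which the abstract does not state explicitly, so the counts are an inference rather than reported figures.

```python
N = 1064  # sample size reported in the Methods
reported_pct = {"perfect": 49.06, "partial": 23.59, "unmatched": 27.35}

# Assumption: each percentage is a share of N.
implied_counts = {k: round(N * p / 100) for k, p in reported_pct.items()}

# Recompute the percentages from the implied counts to confirm they round back.
recomputed_pct = {k: round(100 * c / N, 2) for k, c in implied_counts.items()}

# Combined share of perfectly and partially matched cases.
matched_pct = round(
    100 * (implied_counts["perfect"] + implied_counts["partial"]) / N, 2
)
```

Under that assumption the three categories account for the whole sample, and the perfect and partial rates together give the overall matched share.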
Conclusion: The use of SGICs successfully matched 72.65% of the study sample (perfect and partial matches combined) over a one-year period. These findings highlight the potential of SGICs as a tool for longitudinal data matching while underscoring the need for further refinement of code generation processes and matching algorithms to minimize data wastage and improve effectiveness.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.