{"title":"Exploring the use of Artificial Genomes for Genome-wide Association Studies through the lens of Utility and Privacy.","authors":"Xinyue Wang, Sitao Min, Jaideep Vaidya","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Collaborative Genome-wide association studies (GWAS) have the potential to uncover rare genetic variant-trait associations by leveraging larger datasets and diverse population samples. Despite this potential, privacy concerns and cumbersome review processes for data validation and collaborator selection hinder their broader implementation. Advances in generative models present a possible solution by generating synthetic datasets that closely resemble real genomic data, thus enhancing privacy and expediting the review process. This study assesses the capability of deep generative models to produce artificial genomic data for GWAS applications. We evaluate two state-of-the-art models on real-world datasets, identifying significant limitations in their ability to generate high-quality artificial genomes. Furthermore, we demonstrate that prevailing privacy measures, mainly based on membership inference attacks, are inadequate for providing insightful privacy evaluations. Our findings highlight the critical challenges and suggest future directions for the effective use of artificial genomes in GWAS.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"1196-1205"},"PeriodicalIF":0.0000,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099349/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AMIA ... Annual Symposium proceedings. AMIA Symposium","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Collaborative Genome-wide association studies (GWAS) have the potential to uncover rare genetic variant-trait associations by leveraging larger datasets and diverse population samples. Despite this potential, privacy concerns and cumbersome review processes for data validation and collaborator selection hinder their broader implementation. Advances in generative models present a possible solution by generating synthetic datasets that closely resemble real genomic data, thus enhancing privacy and expediting the review process. This study assesses the capability of deep generative models to produce artificial genomic data for GWAS applications. We evaluate two state-of-the-art models on real-world datasets, identifying significant limitations in their ability to generate high-quality artificial genomes. Furthermore, we demonstrate that prevailing privacy measures, mainly based on membership inference attacks, are inadequate for providing insightful privacy evaluations. Our findings highlight the critical challenges and suggest future directions for the effective use of artificial genomes in GWAS.