SOX7: Autism Associated Gene Identified by Analysis of Multi-Omics Data
Samantha Gonzales, Jane Zizhen Zhao, Na Young Choi, Prabha Acharya, Sehoon Jeong, Xuexia Wang, Moo-Yeal Lee
bioRxiv : the preprint server for biology. Publication date (as listed): 2025-03-09. DOI: https://doi.org/10.1101/2023.05.26.542456
PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/4a/22/nihpp-2023.05.26.542456v1.PMC10245991.pdf
Abstract
Genome-wide association studies (GWAS) and next-generation sequencing analyses of DNA have identified thousands of mutations associated with autism spectrum disorder (ASD). However, more than 99% of the identified mutations are non-coding, so it is unclear which of them are functional and therefore potentially causal. Transcriptomic profiling using total RNA sequencing has been one of the most widely used approaches to link protein levels to genetic information at the molecular level. The transcriptome captures molecular genomic complexity that the DNA sequence alone does not. Some mutations alter a gene's DNA sequence but do not necessarily change expression and/or protein function. To date, few common variants have been reliably associated with ASD diagnosis status despite consistently high estimates of heritability. In addition, there are no reliable biomarkers for diagnosing ASD and no established molecular mechanisms that define its severity. Therefore, it is necessary to integrate DNA and RNA testing to identify true causal genes and propose useful biomarkers for ASD. We performed gene-based association studies with an adaptive test using GWAS summary statistics from two large GWAS datasets (ASD 2019 data: 18,382 ASD cases and 27,969 controls [discovery data]; ASD 2017 data: 6,197 ASD cases and 7,377 controls [replication data]) obtained from the Psychiatric Genomics Consortium (PGC). In addition, for the genes identified in the gene-based GWAS, we investigated differential expression between ASD cases and controls in two RNA-seq datasets (GSE211154: 20 cases and 19 controls; GSE30573: 3 cases and 3 controls). We identified 5 genes significantly associated with ASD in the ASD 2019 data (KIZ-AS1, p = 8.67×10⁻¹⁰; KIZ, p = 1.16×10⁻⁹; XRN2, p = 7.73×10⁻⁹; SOX7, p = 2.22×10⁻⁷; LOC101929229, also known as PINX1-DT, p = 2.14×10⁻⁶).
Among these 5 genes, SOX7 (p = 0.00087) and LOC101929229 (p = 0.009) were replicated in the ASD 2017 data. KIZ-AS1 (p = 0.059) and KIZ (p = 0.06) were close to the boundary of replication in the ASD 2017 data. SOX7 showed significant expression differences between cases and controls in the GSE211154 RNA-seq data (p = 0.036 in all samples; p = 0.044 in white samples). Furthermore, SOX7 was upregulated in cases relative to controls in the GSE30573 RNA-seq data (p = 0.0017; Benjamini-Hochberg adjusted p = 0.0085). SOX7 encodes a member of the SOX (SRY-related HMG-box) family of transcription factors, which play a pivotal role in determining cell fate and identity in many lineages. The encoded protein may act as a transcriptional regulator after forming a complex with other proteins, which may contribute to autism. As a member of this transcription factor family, SOX7 could be associated with ASD. This finding may provide new diagnostic and therapeutic strategies for ASD.
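
The Benjamini-Hochberg adjustment reported for SOX7 in the GSE30573 data can be illustrated with a minimal sketch. The abstract does not state how many tests were adjusted together; the numbers are consistent with one reading, assumed here, in which the five candidate genes were tested and SOX7 had the smallest raw p-value (0.0017 × 5 / 1 = 0.0085). Only the SOX7 raw p-value comes from the abstract; the other four p-values and the names `GENE_A` to `GENE_D` are hypothetical placeholders, and `benjamini_hochberg` is an illustrative helper, not the authors' code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# SOX7 raw p-value is from the abstract; the other four are HYPOTHETICAL
# placeholders chosen only so that SOX7 has the smallest p-value.
raw = {
    "SOX7": 0.0017,
    "GENE_A": 0.20,
    "GENE_B": 0.35,
    "GENE_C": 0.50,
    "GENE_D": 0.80,
}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
print(f"SOX7 adjusted p = {adj['SOX7']:.4f}")
```

Under these assumptions the adjusted value for SOX7 matches the reported 0.0085; with a different number of tests or a different rank for SOX7 the adjusted value would differ.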