Structural variation, selection, and diversification of the NPIP gene family from the human pangenome
Philip C Dishuck, Katherine M Munson, Alexandra P Lewis, Max L Dougherty, Jason G Underwood, William T Harvey, PingHsun Hsieh, Tomi Pastinen, Evan E Eichler
Cell Genomics, article 100977. Journal article, published 2025-08-19. DOI: 10.1016/j.xgen.2025.100977 (https://doi.org/10.1016/j.xgen.2025.100977)
Abstract
The NPIP gene family is among the most positively selected gene families in humans/apes and drives independent duplication in primate lineages. These duplications promote genetic instability, leading to recurrent disease-associated microduplication and microdeletion syndromes. Despite its importance, little is known about its function or variation in humans, as short-read sequencing cannot distinguish high-identity duplications. Using long-read assemblies of 169 human haplotypes, we find extreme variation in the content and organization of NPIP loci. We identify fixed and polymorphic paralogs and observe ongoing positive selection. With long-read RNA sequencing (RNA-seq), we create paralog-specific gene models, the majority of which were not previously documented, and observe paralog-specific tissue specificity. This analysis of an exceptionally dynamic gene family provides candidates for future functional study.