Integrative exome sequencing and machine learning identify MICB and interferon pathway genes as contributors to SSc risk.

IF 20.3, Zone 1 (Medicine), Q1 RHEUMATOLOGY
Shamika Ketkar, Hongzheng Dai, Lindsay Burrage, David Murdock, Brian Dawson, Marialbert Acosta-Herrera, Martin Kerick, Javier Martin, Kevin Wilhelm, Jennifer Kay Asmussen, Olivier Lichtarge, Regeneron Genetics Center, Shervin Assassi, Maureen D Mayes, Brendan H Lee
Annals of the Rheumatic Diseases, published 2025-06-12. DOI: 10.1016/j.ard.2025.05.009 (https://doi.org/10.1016/j.ard.2025.05.009)
Citations: 0

Abstract

Objectives: Systemic sclerosis (SSc) is a complex autoimmune disease with both known and unidentified genetic contributors. While genome-wide association studies (GWAS) have implicated multiple loci, many reside in noncoding regions. We aimed to identify novel protein-coding variants and pathogenic pathways using exome sequencing (ES) integrated with an Evolutionary Action-Machine Learning (EAML) framework, single-cell RNA sequencing (scRNA-seq), and expression quantitative trait locus (eQTL) analysis.

Methods: GWAS was conducted in 2,559 SSc cases and 893 controls of Caucasian ancestry, with replication in 9,846 cases and 18,333 controls of European ancestry. EAML prioritized genes with high-impact missense variants predictive of disease. Public scRNA-seq data from SSc and control skin biopsies were analyzed to localize gene expression across cell types. Whole blood eQTL data were used to identify regulatory effects of risk variants.

Results: A novel SSc risk locus at MICB (rs2516497, P = 3.66 × 10⁻¹³) was identified and replicated. EAML highlighted 284 genes enriched in interferon signaling. scRNA-seq localized MICB and NOTCH4 to fibroblasts and endothelial cells, while HLA class II genes were enriched in macrophages and fibroblasts. eQTL analysis confirmed regulatory effects at MICB, NOTCH4, and other prioritized genes, linking SSc-associated variants to transcriptional dysregulation.

Conclusions: This integrative genomic study identifies novel risk loci and mechanistic pathways in SSc, highlighting MICB, NOTCH4, and interferon-related genes. The findings provide insight into the cellular and regulatory architecture of SSc and support the utility of combining ES, machine learning, scRNA-seq, and eQTL data in complex disease genetics.
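
As a reading aid, the figures quoted in the abstract can be put in context with a short sketch. It adds up the stated discovery and replication cohort sizes and compares the reported MICB P value (3.66e-13) with the conventional genome-wide significance threshold of 5e-8. That threshold is an assumption here: the abstract does not say which cutoff the authors applied, and the function names are illustrative only.

```python
import math

# Conventional genome-wide significance threshold.
# ASSUMPTION: the abstract does not state which threshold the study used.
GENOME_WIDE_THRESHOLD = 5e-8

# Cohort sizes as stated in the Methods paragraph of the abstract.
DISCOVERY = {"cases": 2559, "controls": 893}
REPLICATION = {"cases": 9846, "controls": 18333}


def cohort_size(cohort: dict) -> int:
    """Total number of participants (cases plus controls) in a cohort."""
    return cohort["cases"] + cohort["controls"]


def neg_log10_p(p_value: float) -> float:
    """-log10(P), the scale normally used on the y-axis of a Manhattan plot."""
    return -math.log10(p_value)


def is_genome_wide_significant(p_value: float,
                               threshold: float = GENOME_WIDE_THRESHOLD) -> bool:
    """True if the P value falls below the (assumed) significance threshold."""
    return p_value < threshold


if __name__ == "__main__":
    p_micb = 3.66e-13  # rs2516497 at MICB, as reported in the Results
    print("Discovery cohort size:", cohort_size(DISCOVERY))
    print("Replication cohort size:", cohort_size(REPLICATION))
    print("-log10(P) for rs2516497:", round(neg_log10_p(p_micb), 2))
    print("Below 5e-8:", is_genome_wide_significant(p_micb))
```

On the -log10 scale the reported association sits at about 12.4, against roughly 7.3 for the assumed 5e-8 cutoff.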

Source journal
Annals of the Rheumatic Diseases (Medicine: Rheumatology)
CiteScore: 35.00
Self-citation rate: 9.90%
Articles published: 3728
Review time: 1.4 months
About the journal: Annals of the Rheumatic Diseases (ARD) is an international peer-reviewed journal covering all aspects of rheumatology, which includes the full spectrum of musculoskeletal conditions, arthritic disease, and connective tissue disorders. ARD publishes basic, clinical, and translational scientific research, including the most important recommendations for the management of various conditions.