Direct causal variable discovery leveraging the invariance principle: application in biomedical studies
Liangying Yin, Menghui Liu, Yujia Shi, Jinghong Qiu, Hon-cheong So
medRxiv - Epidemiology, 2024-08-29. DOI: https://doi.org/10.1101/2024.08.29.24312763
Abstract
Accurate identification of the direct causal (parental) variables of a target is of primary interest in many applications, especially in biomedicine. It can improve our understanding of the underlying pathophysiological mechanisms and facilitate the discovery of new biomarkers and therapeutic targets for the clinical outcomes under study. However, many researchers resort to association-based machine learning methods to identify outcome-associated variables, and many of the variables identified this way may prove to be irrelevant. At the same time, efficient methods for reliably identifying the parental set are lacking, especially in high-dimensional settings (e.g., biomedicine).