{"title":"特征对齐的深度对比学习:来自住房-家庭关系推断的见解","authors":"Xiao Qian, Shangjia Dong, Rachel Davidson","doi":"10.1016/j.compenvurbsys.2025.102328","DOIUrl":null,"url":null,"abstract":"<div><div>Housing and household characteristics are key determinants of social and economic well-being, yet our understanding of their interrelationships remains limited. This study addresses this knowledge gap by developing a deep contrastive learning (DCL) model to infer housing-household relationships using the American Community Survey (ACS) Public Use Microdata Sample (PUMS). More broadly, the proposed model is suitable for a class of problems where the goal is to learn joint relationships between two distinct entities without explicitly labeled ground truth data. Our proposed dual-encoder DCL approach leverages co-occurrence patterns in PUMS and introduces a bisect K-means clustering method to overcome the absence of ground truth labels. The dual-encoder DCL architecture is designed to handle the semantic differences between housing (building) and household (people) features while mitigating noise introduced by clustering. To validate the model, we generate a synthetic ground truth dataset and conduct comprehensive evaluations. The model further demonstrates its superior performance in capturing housing-household relationships in Delaware compared to state-of-the-art methods. A transferability test in North Carolina confirms its generalizability across diverse sociodemographic and geographic contexts. Finally, the post-hoc explainable AI analysis using SHAP values reveals that tenure status and mortgage information play a more significant role in housing-household matching than traditionally emphasized factors such as the number of persons and rooms.</div></div>","PeriodicalId":48241,"journal":{"name":"Computers Environment and Urban Systems","volume":"121 ","pages":"Article 102328"},"PeriodicalIF":8.3000,"publicationDate":"2025-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Deep contrastive learning for feature alignment: Insights from housing-household relationship inference\",\"authors\":\"Xiao Qian, Shangjia Dong, Rachel Davidson\",\"doi\":\"10.1016/j.compenvurbsys.2025.102328\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Housing and household characteristics are key determinants of social and economic well-being, yet our understanding of their interrelationships remains limited. This study addresses this knowledge gap by developing a deep contrastive learning (DCL) model to infer housing-household relationships using the American Community Survey (ACS) Public Use Microdata Sample (PUMS). More broadly, the proposed model is suitable for a class of problems where the goal is to learn joint relationships between two distinct entities without explicitly labeled ground truth data. Our proposed dual-encoder DCL approach leverages co-occurrence patterns in PUMS and introduces a bisect K-means clustering method to overcome the absence of ground truth labels. The dual-encoder DCL architecture is designed to handle the semantic differences between housing (building) and household (people) features while mitigating noise introduced by clustering. To validate the model, we generate a synthetic ground truth dataset and conduct comprehensive evaluations. The model further demonstrates its superior performance in capturing housing-household relationships in Delaware compared to state-of-the-art methods. A transferability test in North Carolina confirms its generalizability across diverse sociodemographic and geographic contexts. Finally, the post-hoc explainable AI analysis using SHAP values reveals that tenure status and mortgage information play a more significant role in housing-household matching than traditionally emphasized factors such as the number of persons and rooms.</div></div>\",\"PeriodicalId\":48241,\"journal\":{\"name\":\"Computers Environment and Urban Systems\",\"volume\":\"121 \",\"pages\":\"Article 102328\"},\"PeriodicalIF\":8.3000,\"publicationDate\":\"2025-08-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers Environment and Urban Systems\",\"FirstCategoryId\":\"89\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S019897152500081X\",\"RegionNum\":1,\"RegionCategory\":\"地球科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ENVIRONMENTAL STUDIES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers Environment and Urban Systems","FirstCategoryId":"89","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S019897152500081X","RegionNum":1,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENVIRONMENTAL STUDIES","Score":null,"Total":0}
Deep contrastive learning for feature alignment: Insights from housing-household relationship inference
Housing and household characteristics are key determinants of social and economic well-being, yet our understanding of their interrelationships remains limited. This study addresses this knowledge gap by developing a deep contrastive learning (DCL) model to infer housing-household relationships using the American Community Survey (ACS) Public Use Microdata Sample (PUMS). More broadly, the proposed model is suitable for a class of problems where the goal is to learn joint relationships between two distinct entities without explicitly labeled ground truth data. Our proposed dual-encoder DCL approach leverages co-occurrence patterns in PUMS and introduces a bisect K-means clustering method to overcome the absence of ground truth labels. The dual-encoder DCL architecture is designed to handle the semantic differences between housing (building) and household (people) features while mitigating noise introduced by clustering. To validate the model, we generate a synthetic ground truth dataset and conduct comprehensive evaluations. The model further demonstrates its superior performance in capturing housing-household relationships in Delaware compared to state-of-the-art methods. A transferability test in North Carolina confirms its generalizability across diverse sociodemographic and geographic contexts. Finally, the post-hoc explainable AI analysis using SHAP values reveals that tenure status and mortgage information play a more significant role in housing-household matching than traditionally emphasized factors such as the number of persons and rooms.
期刊介绍:
Computers, Environment and Urban Systemsis an interdisciplinary journal publishing cutting-edge and innovative computer-based research on environmental and urban systems, that privileges the geospatial perspective. The journal welcomes original high quality scholarship of a theoretical, applied or technological nature, and provides a stimulating presentation of perspectives, research developments, overviews of important new technologies and uses of major computational, information-based, and visualization innovations. Applied and theoretical contributions demonstrate the scope of computer-based analysis fostering a better understanding of environmental and urban systems, their spatial scope and their dynamics.