Uncovering University Application Patterns Through Graph Representation Learning
Hendrik Santoso Sugiarto, Yozef Tjandra
Annals of Data Science, vol. 12, no. 4, pp. 1343-1368. Published 2025-05-08.
DOI: 10.1007/s40745-025-00611-1
https://link.springer.com/article/10.1007/s40745-025-00611-1
Citation count: 0
Abstract
In university admissions, interaction networks naturally emerge between prospective students and available majors. Understanding hidden patterns in such a vast network is crucial for decision-making but poses technical challenges due to its complexity and data limitations. Many existing models rely heavily on user profiling, raising privacy concerns and making data collection difficult. Instead, this work extracts meaningful insights using only the adjacency information of the network, avoiding the need for personal data. We leverage Graph Convolutional Networks (GCN) to generate compact representations for major recommendation and clustering tasks. Our GCN-based approach outperforms classical methods such as popularity-based and Non-negative Matrix Factorization (NMF), as well as the neural Generalized Matrix Factorization (GMF) model, achieving up to 61.06% and 12.17% improvements in smaller (dimension 40) and larger (dimension 80) embeddings, respectively. Furthermore, hierarchical clustering on these embeddings reveals implicit patterns in student preferences, particularly regarding fields of study and geographic locations, even without explicit data on these attributes. These findings demonstrate that meaningful insights can be derived from interaction networks while mitigating privacy concerns associated with user profiling.
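To make the "adjacency information only" idea concrete, here is a minimal sketch of one graph-convolution step on a student-major bipartite graph. It is not the authors' model: the abstract does not specify the architecture, so the sketch assumes the common symmetric-normalized propagation rule, D^{-1/2}(A + I)D^{-1/2} H W, with identity node features and random, untrained weights. The toy application list, the embedding dimension and every function name are hypothetical.

```python
import math
import random


def normalized_adjacency(n_students, n_majors, applications):
    """Return D^{-1/2} (A + I) D^{-1/2} for the student-major bipartite graph."""
    n = n_students + n_majors
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # self-loops, so every node keeps part of its own signal
    for student, major in applications:
        j = n_students + major  # major nodes are indexed after student nodes
        a[student][j] = 1.0
        a[j][student] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical toy data: (student index, major index) application pairs.
applications = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (3, 2)]
n_students, n_majors, dim = 4, 3, 4
n = n_students + n_majors

a_hat = normalized_adjacency(n_students, n_majors, applications)
# Identity features: no personal attributes, only each node's position in the graph.
features = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
rng = random.Random(0)
weights = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]  # untrained
embeddings = gcn_layer(a_hat, features, weights)


def score(student, major):
    """Dot-product affinity between a student embedding and a major embedding."""
    s, m = embeddings[student], embeddings[n_students + major]
    return sum(p * q for p, q in zip(s, m))
```

In a real pipeline the weights would be trained, for example against held-out applications, and the resulting major embeddings could then be passed to a hierarchical clustering routine. With random weights the scores here carry no meaning; the sketch only shows how embeddings can be computed from the interaction graph alone.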
About the journal:
Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results and case studies in data science. Although data science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management and virtualization to discover knowledge from Big Data, it should have its own scientific content, such as axioms, laws and rules, which are fundamentally important for experts in different fields who explore their own interests in Big Data. ADS encourages contributors to address such challenging problems on this exchange platform. At present, the question of how to discover knowledge from heterogeneous data in a Big Data environment needs to be addressed. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors are responsible for the call for papers and the review process for high-quality contributions in their volumes.