Jody Clay-Warner, Hui Yi, Tenshi Kawashima, Jiacheng Li, David Okech, Fred Hassan Konteh
A comparison of top-coding strategies for aggregated relational data
Social Networks, volume 83, pages 50-61. Published 2025-06-12. DOI: 10.1016/j.socnet.2025.05.006
https://www.sciencedirect.com/science/article/pii/S0378873325000371
Abstract
Aggregated relational data are commonly used in conjunction with scale-up methods to measure network size. In this approach, the number of people respondents report knowing in subpopulations of known size is scaled up to estimate the size of their personal network. Because this method is sensitive to reporting errors, researchers often top-code responses about subpopulations of known size, although there is no consensus on how to select the top-code value. Here, we compare several top-coding methods, including new approaches that utilize Dunbar’s number, using datasets collected from two aggregated relational data surveys, one from Shanghai and one from Kambia, Sierra Leone. We employ three metrics to evaluate the top-coding strategies: mean error rates in the estimation of the subpopulations of known size, error rate in the estimation of the target population, and degree mean. We find that the top-coding strategies all perform equally well in the estimation of the subpopulations of known size in both datasets. The strategies based on Dunbar’s number, however, perform better than the other strategies in the estimation of the target population in Kambia. In addition, the approaches based on Dunbar’s number produce substantially smaller degree means in both datasets. We examine these findings holistically and provide suggestions for how researchers should approach top-coding decisions. We ultimately conclude that there is no one-size-fits-all solution for top-coding and that researchers should systematically examine key indicators from the data to determine whether top-coding is necessary and, if so, which top-coding strategy is appropriate.
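To make the mechanics concrete, the following is a minimal sketch of how top-coding interacts with a basic scale-up degree estimate. It is an illustration under stated assumptions, not the procedure used in the paper: the estimator shown is the simple ratio form (total population times reported contacts, divided by the combined size of the known subpopulations), and the function names, the response values, the subpopulation sizes, and the cap of 30 are all hypothetical. The abstract does not specify how the Dunbar's number strategies derive their top-code value, so no such rule is implemented here.

```python
from typing import Sequence


def top_code(responses: Sequence[float], cap: float) -> list[float]:
    """Replace every response above `cap` with `cap` (hypothetical helper)."""
    return [min(y, cap) for y in responses]


def scale_up_degree(
    responses: Sequence[float],
    subpop_sizes: Sequence[float],
    total_population: float,
) -> float:
    """Basic ratio-form scale-up estimate of one respondent's network size.

    degree = total_population * sum(reported contacts) / sum(subpopulation sizes)
    """
    return total_population * sum(responses) / sum(subpop_sizes)


# Hypothetical data for one respondent: contacts reported in four
# subpopulations of known size. None of these values come from the paper.
responses = [2, 0, 40, 1]
subpop_sizes = [5_000, 2_000, 10_000, 3_000]
total_population = 1_000_000
cap = 30  # assumed top-code value, chosen only for illustration

raw_degree = scale_up_degree(responses, subpop_sizes, total_population)
capped_degree = scale_up_degree(
    top_code(responses, cap), subpop_sizes, total_population
)

print(f"degree without top-coding: {raw_degree:.0f}")
print(f"degree with top-coding at {cap}: {capped_degree:.0f}")
```

In this toy case a single large response (40) drives the uncapped estimate, and capping it lowers the estimated degree. This is the sensitivity to reporting errors that motivates top-coding, and it is consistent with the abstract's observation that the choice of top-code value affects the degree mean.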
About the journal:
Social Networks is an interdisciplinary and international quarterly. It provides a common forum for representatives of anthropology, sociology, history, social psychology, political science, human geography, biology, economics, communications science and other disciplines who share an interest in the study of the empirical structure of social relations and associations that may be expressed in network form. It publishes both theoretical and substantive papers. Critical reviews of major theoretical or methodological approaches using the notion of networks in the analysis of social behaviour are also included, as are reviews of recent books dealing with social networks and social structure.