This study investigates the historical diffusion and migration patterns of Chinese surnames by analyzing their spatial correlograms. The primary objectives are to identify typical correlogram categories, characterize each category, and explore the factors influencing the historical diffusion and migration processes that have shaped the spatial distributions of Chinese surnames.
The data come from China's National Citizen Identity Information Center (NCIIC), which provides surname and prefecture information for 1.28 billion individuals. We calculate spatial correlograms to assess surname autocorrelation across varying geographic distances, then apply cluster analysis to classify the 380 most common surnames, which together cover 97% of the population, into five categories based on their correlograms. We examine the characteristics of the correlograms in each category and propose an index to capture the overall geographic distribution of the surnames in a category.
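As an illustration of what a spatial correlogram measures, the following minimal sketch computes one autocorrelation value per distance class for a single variable observed at a set of locations. It assumes Moran's I as the autocorrelation statistic and binary distance-class weights; the text above does not specify either choice, and the function names, the toy coordinates, and the distance classes are hypothetical.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for one weight matrix (zero diagonal assumed)."""
    z = values - values.mean()
    w_sum = weights.sum()
    denom = (z ** 2).sum()
    if w_sum == 0 or denom == 0:
        return float("nan")  # no pairs in this class, or no variation
    return len(values) / w_sum * (z @ weights @ z) / denom

def correlogram(coords, values, bin_edges):
    """Moran's I per distance class; a pair counts if lo < distance <= hi."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    result = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        # Binary weights: 1 for pairs inside the class, 0 otherwise.
        # With bin_edges[0] >= 0 the diagonal (distance 0) is excluded.
        w = ((dist > lo) & (dist <= hi)).astype(float)
        result.append(morans_i(values, w))
    return np.array(result)

# Toy example (hypothetical data): ten locations on a line whose
# values form a smooth gradient, i.e. an idealized cline.
coords = np.column_stack([np.arange(10.0), np.zeros(10)])
values = np.arange(10.0)
I = correlogram(coords, values, bin_edges=[0.0, 1.5, 4.5, 9.5])
print(I)
```

For this toy gradient the statistic is positive in the shortest distance class and negative in the longest. In the setting described above, `values` would be a surname's relative frequency per prefecture and `coords` the prefecture locations, so each surname yields one such vector that can then be passed to a clustering routine.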
We identify five distinct categories of spatial correlograms: C (cline), SC (slight cline), IBD (isolation by distance), D (depression), and IBD + D (isolation by distance + depression). Surnames in category C exhibit a broad and even distribution, with high autocorrelation in adjacent regions and a large diffusion range. Surnames in category SC show lower autocorrelation than those in category C but still exhibit a large diffusion range. Surnames in category IBD are highly concentrated in specific regions, with low autocorrelation and a smaller diffusion range. Surnames in categories D and IBD + D both display long-distance autocorrelation, visible as a distinct depression in their correlograms.
Surnames with long histories and significant influence, such as those in category C, tend to be broadly and evenly distributed, reflecting prolonged diffusion processes. Conversely, surnames with more recent origins or those that have experienced isolation, such as those in category IBD, typically exhibit more concentrated distributions. The study also highlights the role of large-scale, long-distance migration events in shaping Chinese surname distributions, particularly for surnames in categories D and IBD + D.