Natural Sciences Meet Social Sciences: Census Data Analytics for Detecting Home Language Shifts

Christian M. Choy, M. Co, Matthew J. Fogel, Clarke D. Garrioch, C. Leung, Ekaterina Martchenko
DOI: 10.1109/IMCOM51814.2021.9377412 (https://doi.org/10.1109/IMCOM51814.2021.9377412)
Published in: 2021 15th International Conference on Ubiquitous Information Management and Communication (IMCOM)
Publication date: 2021-01-04
Citations: 8

Abstract

Natural Sciences Meet Social Sciences: Census Data Analytics for Detecting Home Language Shifts
As we live in a global environment, it is not unusual for more than one language or dialect to be used in a country. Examples include Canada in the Americas, Singapore in Asia, and Switzerland in Europe. With globalization, many people immigrate to or live in a country other than their birthplace. As a result, different people in the same country may have different home languages (i.e., first languages). For instance, as a nation with a highly linguistically diverse population, Canada provides a unique opportunity to study the factors that cause certain languages (or language families) to be lost over subsequent generations among allophones (i.e., people whose mother tongue is neither English nor French). In this paper, we focus on census data analytics. Specifically, we analyze census microdata using machine learning and data mining techniques (such as decision tree induction, random forest, and categorical naive Bayes) to study the influence of various social and economic factors on the probability that allophones adopt official languages as their language spoken at home. This study is a showcase where natural sciences and engineering (NSE) meet social sciences, in which NSE solutions (e.g., census data analytics) are applicable to the study of social-science phenomena (e.g., successful detection of shifts in home languages).
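To make one of the named techniques concrete, the following is a minimal sketch of categorical naive Bayes with Laplace smoothing, written with the Python standard library only. It is not the authors' implementation: the class name `CategoricalNB`, the feature names (age group, education, years in country), the labels `official` and `heritage`, and every record are hypothetical and invented for illustration; none of them comes from the census microdata used in the paper.

```python
import math
from collections import Counter


class CategoricalNB:
    """Naive Bayes for categorical features with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        self.class_counts_ = Counter(y)
        n_features = len(X[0])
        # Distinct values observed for each feature (used in the smoothing denominator).
        self.values_ = [{row[j] for row in X} for j in range(n_features)]
        # counts_[c][j][v] = number of class-c rows whose feature j equals v.
        self.counts_ = {
            c: [Counter() for _ in range(n_features)] for c in self.classes_
        }
        for row, label in zip(X, y):
            for j, value in enumerate(row):
                self.counts_[label][j][value] += 1
        return self

    def predict_proba(self, row):
        total = sum(self.class_counts_.values())
        log_post = {}
        for c in self.classes_:
            # log prior + sum of log smoothed conditional probabilities
            lp = math.log(self.class_counts_[c] / total)
            for j, value in enumerate(row):
                num = self.counts_[c][j][value] + self.alpha
                den = self.class_counts_[c] + self.alpha * len(self.values_[j])
                lp += math.log(num / den)
            log_post[c] = lp
        # Normalise in log space so the returned probabilities sum to 1.
        peak = max(log_post.values())
        unnorm = {c: math.exp(lp - peak) for c, lp in log_post.items()}
        z = sum(unnorm.values())
        return {c: u / z for c, u in unnorm.items()}

    def predict(self, row):
        proba = self.predict_proba(row)
        return max(proba, key=proba.get)


# Hypothetical toy records (NOT real census data):
# (age_group, education, years_in_country) -> language spoken at home.
X = [
    ("young", "university", "long"),
    ("young", "secondary", "long"),
    ("young", "university", "short"),
    ("middle", "university", "long"),
    ("senior", "secondary", "short"),
    ("senior", "secondary", "long"),
    ("middle", "secondary", "short"),
    ("senior", "university", "short"),
]
y = ["official"] * 4 + ["heritage"] * 4

model = CategoricalNB(alpha=1.0).fit(X, y)
query = ("young", "university", "long")
print(model.predict(query), model.predict_proba(query))
```

The per-class probabilities returned by `predict_proba` are the quantity of interest here: in the paper's framing, they correspond to an estimated probability that an allophone with a given profile speaks an official language at home. Decision trees and random forests would address the same question with different model assumptions.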