Names, Nicknames, and Spelling Errors: Protecting Participant Identity in Learning Analytics of Online Discussions

Elaine Farrow, Johanna D. Moore, D. Gašević
Published in: LAK23: 13th International Learning Analytics and Knowledge Conference
Publication date: 2023-03-13
DOI: 10.1145/3576050.3576070 (https://doi.org/10.1145/3576050.3576070)

Abstract

Messages exchanged between participants in online discussion forums often contain personal names and other details that need to be redacted before the data is used for research purposes in learning analytics. However, removing the names entirely makes it harder to track the exchange of ideas between individuals within a message thread and across threads, and thereby reduces the value of this type of conversational data. In contrast, the consistent use of pseudonyms allows contributions from individuals to be tracked across messages, while also hiding the real identities of the contributors. Several factors can make it difficult to identify all instances of personal names that refer to the same individual, including spelling errors and the use of shortened forms. We developed a semi-automated approach for replacing personal names with consistent pseudonyms. We evaluated our approach on a data set of over 1,700 messages exchanged during a distance-learning course, and compared it to a general-purpose pseudonymisation tool that used deep neural networks to identify names to be redacted. We found that our tailored approach out-performed the general-purpose tool in both precision and recall, correctly identifying all but 31 substitutions out of 2,888.
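
The idea of consistent pseudonyms can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not describe in detail. The roster, the nickname lists, the `Person1`-style pseudonyms, the function names, and the 0.8 similarity cutoff are all assumptions made for illustration. Exact matches against known name variants are replaced directly. Capitalised words that merely resemble a known variant are also replaced, but are logged so that a person can check them, which is one plausible reading of "semi-automated".

```python
import difflib
import re

# Hypothetical roster: canonical participant name -> known shortened forms.
ROSTER = {
    "Katherine": ["Kate", "Kathy"],
    "Michael": ["Mike"],
}


def build_pseudonym_map(roster):
    """Give each participant one pseudonym shared by all known name variants."""
    variant_to_pseudonym = {}
    for index, (name, nicknames) in enumerate(sorted(roster.items()), start=1):
        pseudonym = f"Person{index}"
        for variant in [name, *nicknames]:
            variant_to_pseudonym[variant.lower()] = pseudonym
    return variant_to_pseudonym


def pseudonymise(text, variant_to_pseudonym, cutoff=0.8):
    """Replace names in text; return the new text and fuzzy matches to review."""
    known_variants = list(variant_to_pseudonym)
    needs_review = []

    def replace(match):
        word = match.group(0)
        key = word.lower()
        if key in variant_to_pseudonym:
            return variant_to_pseudonym[key]
        # Treat only capitalised words as candidate misspelled names
        # (an assumption that keeps ordinary words from being replaced).
        if word[0].isupper():
            close = difflib.get_close_matches(key, known_variants, n=1, cutoff=cutoff)
            if close:
                needs_review.append((word, close[0]))
                return variant_to_pseudonym[close[0]]
        return word

    return re.sub(r"[A-Za-z]+", replace, text), needs_review


mapping = build_pseudonym_map(ROSTER)
message = "Thanks Kate, I agree with Micheal and Katherine."
redacted, review = pseudonymise(message, mapping)
print(redacted)  # Thanks Person1, I agree with Person2 and Person1.
print(review)    # [('Micheal', 'michael')]
```

Because "Kate" and "Katherine" map to the same pseudonym, the thread of conversation stays traceable after redaction, while the misspelling "Micheal" is caught by fuzzy matching and queued for manual confirmation. A real pipeline would need more careful handling of names that are also ordinary words.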