Bantuweb: a digital library for resource scarce South African languages

A. Holy, Alon Bresler, Osher Shuman, Catherine Chavula, H. Suleman
DOI: 10.1145/3129416.3129446 (https://doi.org/10.1145/3129416.3129446)
Published in: Research Conference of the South African Institute of Computer Scientists and Information Technologists, 2017-09-26
Citations: 26

Abstract

South Africa is a linguistically diverse country: it is home to 11 official languages, of which nine (all except English and Afrikaans) are Resource Scarce Languages (RSLs). Accordingly, many South Africans struggle to access information written in their native languages on the Web. Unfortunately, lack of access to information hinders socio-economic growth. This paper proposes a Web-based digital library to act as a central repository for content written in these languages, whether crawled from the Web or generated and contributed by a community of users. Gamification features have been incorporated into the digital library to motivate users to contribute content, both to strengthen the collection of resources and to increase community participation. Specifically, the paper: (i) proposes a ranking algorithm, smart interleaving, to aggregate and rank multilingual search results effectively from collections of varying size; and (ii) investigates which gamification features, among leaderboard, notifications, virtual points and level, motivate users to contribute content in the context of South African RSLs. The results show that users were more motivated to contribute content by reaching the next level than by improving their leaderboard ranking or earning virtual points. Further, the overall results on merging and ranking multilingual search results show no significant improvement from using smart interleaving.
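The abstract names smart interleaving but does not describe how it works. As a point of reference only, the sketch below shows plain round-robin interleaving, a common way to merge ranked lists from several collections. It is not the paper's algorithm. The function name `round_robin_interleave`, the language keys, and the document IDs are all hypothetical and chosen for illustration.

```python
from itertools import zip_longest


def round_robin_interleave(ranked_lists):
    """Merge per-collection ranked lists by taking one result from each in turn.

    ranked_lists: dict mapping a collection name (here, a language) to a list
    of document IDs already ranked within that collection.
    NOTE: this is generic round-robin merging, not the paper's smart interleaving.
    """
    merged = []
    seen = set()
    missing = object()  # sentinel marking a collection that has run out of results
    # zip_longest groups the results rank by rank: all rank-1 results, then rank-2, ...
    for tier in zip_longest(*ranked_lists.values(), fillvalue=missing):
        for doc in tier:
            if doc is missing or doc in seen:
                continue  # skip exhausted collections and duplicate documents
            seen.add(doc)
            merged.append(doc)
    return merged


# Hypothetical collections of varying size
results = {
    "isiZulu": ["zu-1", "zu-2", "zu-3"],
    "isiXhosa": ["xh-1"],
    "Sesotho": ["st-1", "st-2"],
}
merged = round_robin_interleave(results)
print(merged)
```

With collections of very different sizes, round-robin merging gives a small collection the same share of top positions as a large one, which is the kind of imbalance a size-aware merging strategy would have to address.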