Automatic Vocal Completion for Indonesian Language Based on Recurrent Neural Network

Agi Prasetiadi, Asti Dwi Sripamuji, Risa Riski Amalia, Julian Saputra, Imada Ramadhanti
IT Journal Research and Development, vol. 12. Published 2024-07-18. DOI: 10.25299/itjrd.2024.14171 (https://doi.org/10.25299/itjrd.2024.14171)

Abstract

Most Indonesian social media users under the age of 25 communicate using a variety of words, now commonly referred to as slang, including abbreviations. This variation poses challenges for the natural language processing of Indonesian. Previous researchers tried to improve a Recurrent Neural Network to correct errors at the character level, reaching an accuracy of 83.76%. This study aims to normalize abbreviated Indonesian words into complete words using Recurrent Neural Networks in the form of Bidirectional Long Short-Term Memory and Gated Recurrent Unit. The dataset is built with several weight configurations from 3-gram to 6-gram, consisting of words without vowels and complete words with vowels. Our model is the first in the world that attempts to restore incomplete Indonesian words into fully lettered sentences, and it reaches an accuracy of 97.44%.
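As a rough illustration of how training pairs of "words without vowels" and "complete words with vowels" might be constructed, the following Python sketch strips vowels from word n-grams of size 3 to 6. The abstract does not describe the authors' actual preprocessing, so this is an assumption throughout: the n-grams are taken at the word level, only the five Latin vowels are removed, and the helper names `strip_vowels` and `ngram_pairs` are hypothetical.

```python
import re

# Assumption: only the five Latin vowels are treated as removable.
VOWELS = re.compile(r"[aeiouAEIOU]")


def strip_vowels(text):
    """Remove every vowel, keeping consonants, spaces, and punctuation."""
    return VOWELS.sub("", text)


def ngram_pairs(sentence, n_min=3, n_max=6):
    """Return (vowel-stripped, complete) pairs for each word n-gram.

    Assumption: n-grams are word-level windows of n_min..n_max words;
    the paper may define its 3-gram to 6-gram configurations differently.
    """
    words = sentence.split()
    pairs = []
    for n in range(n_min, n_max + 1):
        for i in range(len(words) - n + 1):
            target = " ".join(words[i:i + n])
            pairs.append((strip_vowels(target), target))
    return pairs


# Illustrative sentence, not taken from the paper's dataset.
pairs = ngram_pairs("saya tidak bisa datang hari ini")
for source, target in pairs[:3]:
    print(source, "->", target)
```

In a setup like this, the vowel-stripped string would serve as the model input and the complete string as the target that a sequence model such as a bidirectional LSTM or GRU learns to reproduce.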