Harmony search for hyperparameters optimization of a low resource language transformer model trained with a novel parallel corpus Ocelotl Nahuatl – Spanish

Máximo Enrique Pacheco Martínez, Maya Carrillo Ruiz, María de Lourdes Sandoval Solís
Systems and Soft Computing, vol. 6, Article 200152. Published 2024-09-27. DOI: 10.1016/j.sasc.2024.200152
https://www.sciencedirect.com/science/article/pii/S2772941924000814
Citations: 0

Abstract

Nahuatl is a low-resource language with no online translation application; existing resources are limited to dictionaries, web pages, and digital books. Given this situation, it is vital to support the language as much as possible. This research aims to improve the BLEU score in machine translation by applying the harmony search heuristic to state-of-the-art transformer models, using it to find the optimal hyperparameter settings for the models. The models are trained and tested on a new, moderate-size parallel corpus of 1.5k phrases. With harmony search, the study shows an improvement of 2.569% in the BLEU score. Achieving this requires considering various factors related to the hyperparameters. Taking these considerations into account, the application of harmony search with transformers can be extended to other parallel corpora or models.
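The abstract does not give the search procedure in detail, so the following is only a minimal sketch of a generic harmony search over a discrete hyperparameter grid, not the authors' implementation. The search space `SPACE`, the function names, the parameter values (`memory_size`, `hmcr`, `par`), and the objective `toy_score` are all assumptions made for illustration. In a real setting the objective would train a transformer with the given configuration and return its validation BLEU.

```python
import random


def harmony_search(objective, space, memory_size=8, iterations=200,
                   hmcr=0.9, par=0.3, seed=0):
    """Maximize `objective` over `space` (dict: name -> ordered list of candidates).

    hmcr: harmony memory considering rate; par: pitch adjusting rate.
    """
    rng = random.Random(seed)
    names = list(space)
    # Initialise the harmony memory with random configurations.
    memory = [{n: rng.choice(space[n]) for n in names} for _ in range(memory_size)]
    scores = [objective(h) for h in memory]

    for _ in range(iterations):
        new = {}
        for n in names:
            values = space[n]
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = rng.choice(memory)[n]
                if rng.random() < par:
                    # Pitch adjustment: step to a neighbouring candidate value.
                    i = values.index(value) + rng.choice((-1, 1))
                    value = values[min(max(i, 0), len(values) - 1)]
            else:
                # Random selection from the full candidate list.
                value = rng.choice(values)
            new[n] = value

        # Replace the worst stored harmony if the new one scores higher.
        score = objective(new)
        worst = min(range(memory_size), key=scores.__getitem__)
        if score > scores[worst]:
            memory[worst], scores[worst] = new, score

    best = max(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best]


# Hypothetical search space; the paper's actual hyperparameters are not listed here.
SPACE = {
    "num_layers": [2, 3, 4, 6],
    "num_heads": [2, 4, 8],
    "dropout": [0.1, 0.2, 0.3],
    "learning_rate": [1e-4, 3e-4, 1e-3],
}


def toy_score(cfg):
    # Placeholder for "train the model and return validation BLEU".
    return -(abs(cfg["num_layers"] - 4)
             + abs(cfg["num_heads"] - 4) / 4
             + abs(cfg["dropout"] - 0.2) * 10
             + abs(cfg["learning_rate"] - 3e-4) * 1000)


best_cfg, best_score = harmony_search(toy_score, SPACE)
```

Because each evaluation of a real objective means a full training run, the iteration count and memory size would in practice be chosen to fit the available compute budget.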