Farhan Dhanani, Muhammad Rafi, Muhammad Atif Tahir
Computer Speech and Language, vol. 90, Article 101739. Published 2024-10-22. Impact factor 3.1 (JCR Q2, Computer Science, Artificial Intelligence).
DOI: 10.1016/j.csl.2024.101739
https://www.sciencedirect.com/science/article/pii/S0885230824001220
Tickling translations: Small but mighty open-sourced transformers bring English PUN-ny entities to life in French!
Transformer-based language models have made substantial progress in translation quality. Challenges nonetheless persist in translating playful requests, especially when users intentionally introduce humor. Deciphering the pun hidden in such requests remains one of the major difficulties for modern language models and leads to user dissatisfaction. This paper targets a specific niche of humor translation: translating English named entities that contain puns into French using small-scale, open-source transformer models. The transformer architecture, which underlies popular language models such as ChatGPT, can learn long-range contextual relationships within sequences. The main novelty of the paper is an extractive question-answering (Q/A) style technique, built on transformers, that finds relevant translations for given English nouns using openly available parallel corpora. To evaluate the method, we use a dataset provided by the JOKER CLEF 2022 automatic pun and humor translation team. The dataset contains single-word nouns from popular novels, anime, movies, and games, each containing a pun. The methodology and experimental framework are adaptable and can be extended to any language pair for which an openly available parallel corpus exists. This flexibility underscores the broader applicability of our findings and suggests potential for improving humor translation across other language combinations.
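The abstract does not spell out how the extractive Q/A framing is set up, so the following is only a minimal sketch of one plausible reading: the English noun becomes the question, a French sentence from a parallel corpus becomes the context, and the French term (when known) is the answer span. The names `QAExample` and `make_qa_example`, the question template, and the sample sentence are all assumptions made for illustration; they are not taken from the paper or from the JOKER dataset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QAExample:
    """One SQuAD-style extractive Q/A record (hypothetical layout)."""
    question: str
    context: str
    answer_text: Optional[str]   # None when no span is known
    answer_start: Optional[int]  # character offset of the span in context


def make_qa_example(english_term: str,
                    french_sentence: str,
                    french_term: Optional[str] = None) -> QAExample:
    """Frame a translation lookup as extractive question answering.

    Assumption: the question template below is illustrative only.
    """
    question = f"What is the French translation of '{english_term}'?"
    start: Optional[int] = None
    if french_term is not None:
        idx = french_sentence.find(french_term)
        if idx >= 0:
            start = idx
        else:
            # The reference term does not occur in this sentence,
            # so the record carries no answer span.
            french_term = None
    return QAExample(question, french_sentence, french_term, start)


# Illustrative pair (not from the JOKER data): "Hufflepuff" is rendered
# as "Poufsouffle" in the French Harry Potter translations.
example = make_qa_example(
    "Hufflepuff",
    "Cédric appartient à la maison Poufsouffle.",
    "Poufsouffle",
)
print(example.question)
print(example.context[example.answer_start:])
```

Records of this shape match the question/context/answer-span layout that extractive Q/A models are commonly fine-tuned on, which is why the sketch stores a character offset rather than only the answer string.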
About the journal:
Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language.
The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.