{"title":"Funny words detection via Contrastive Representations and Pre-trained Language Model","authors":"Yiming Du, Zelin Tian","doi":"10.1109/AINIT54228.2021.00078","DOIUrl":null,"url":null,"abstract":"Funniness detection of news headlines is a challenging task in computational linguistics. However, most existing works on funniness detection mainly tackle the scenario by simply judging whether a sentence is humorous, whose result is unstable due to factors such as sentence length. To solve this issue, in this paper, our idea is to fine-grained mine the detailed information of the words and the contextual relationship between different words in the sentence, which help to evaluate the correlation between keywords and the funniness of news headlines quantitatively. Specifically, we propose a funny words detection algorithm based on the contrastive representations learning and BERT model. To quantify the impact of different words on the degree of humor, we first subtract the funniness grades of the original news headlines and the funniness grades of the original news headlines with a single word replaced. Both funniness grades are predicted with a pre-trained model, which is supervised by a a threshold to limit the amount of data and ensure the validity of data. To ensure the accuracy of our prediction, we further introduce the contrastive learning to constrain the differences of news headlines before and after word replacement. Finally, according to the Root Mean Square Error (RMSE) matrix in our experiment, we develop a BERT model with mixed sequence embedding to generate a table about words and their corresponding funniness improvement about the news headlines.","PeriodicalId":326400,"journal":{"name":"2021 2nd International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 2nd International Seminar on Artificial Intelligence, Networking and Information Technology (AINIT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AINIT54228.2021.00078","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Funniness detection in news headlines is a challenging task in computational linguistics. Most existing work, however, simply judges whether a sentence is humorous, a coarse binary result that is unstable under factors such as sentence length. To address this, we mine fine-grained information about individual words and the contextual relationships between words in a sentence, which helps to quantitatively evaluate the correlation between keywords and the funniness of news headlines. Specifically, we propose a funny-word detection algorithm based on contrastive representation learning and a BERT model. To quantify the impact of individual words on the degree of humor, we first compute the difference between the funniness grade of an original news headline and that of the same headline with a single word replaced. Both grades are predicted by a pre-trained model, and a threshold is applied to limit the amount of data and ensure its validity. To improve the accuracy of the prediction, we further introduce contrastive learning to constrain the representations of news headlines before and after word replacement. Finally, guided by the Root Mean Square Error (RMSE) metric in our experiments, we develop a BERT model with mixed sequence embedding to generate a table of words and the corresponding funniness improvement they bring to news headlines.
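Since the abstract only outlines the method, the following is a minimal sketch of the word-level funniness-delta idea it describes: score a headline and its single-word edit with a BERT regression head, attribute the grade difference to the replaced word, and discard deltas below a threshold. This is not the authors' code; the checkpoint name, threshold value, and helper names (`funniness`, `word_funniness_delta`, `contrastive_penalty`) are illustrative assumptions.

```python
# Sketch of the funniness-delta computation described in the abstract.
# Assumes a BERT checkpoint fine-tuned to regress humor grades; the names
# and values below are placeholders, not the paper's released artifacts.
import torch
import torch.nn.functional as F
from transformers import BertTokenizer, BertForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # assumed: a humor-grade regression checkpoint
tokenizer = BertTokenizer.from_pretrained(MODEL_NAME)
# num_labels=1 turns the classification head into a scalar regressor.
model = BertForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=1)
model.eval()

def funniness(headline: str) -> float:
    """Predict a scalar funniness grade for one headline."""
    inputs = tokenizer(headline, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits.item()

def word_funniness_delta(headline: str, index: int, replacement: str,
                         threshold: float = 0.1):
    """Funniness improvement attributed to replacing the word at `index`.

    Deltas with magnitude below `threshold` are discarded, mirroring the
    abstract's threshold that limits the data to informative replacements.
    """
    words = headline.split()
    original_grade = funniness(headline)
    words[index] = replacement
    delta = funniness(" ".join(words)) - original_grade
    return delta if abs(delta) >= threshold else None

def contrastive_penalty(emb_before: torch.Tensor,
                        emb_after: torch.Tensor) -> torch.Tensor:
    """One plausible reading of the contrastive constraint: pull the sentence
    embeddings of a headline and its edit together, so the grade difference
    reflects the swapped word rather than unrelated representation drift."""
    return 1.0 - F.cosine_similarity(emb_before, emb_after, dim=-1).mean()
```

Aggregating the retained deltas over a corpus of edited headlines would then yield the word-to-funniness-improvement table the paper describes.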