Assessment of Humorous Speech by Automatic Heuristic-based Feature Selection
Authors: Derry Pramono Adi, Agustinus Bimo Gumelar, Ralin Pramasuri Arta Meisa, Siska Susilowati
Published in: 2020 International Seminar on Application for Technology of Information and Communication (iSemantic), 2020-09-19
DOI: 10.1109/iSemantic50169.2020.9234228 (https://doi.org/10.1109/iSemantic50169.2020.9234228)
Citations: 1
Abstract
As the amount of data and the file sizes grow, the dimensionality of the extracted features grows with them, and the computational load multiplies accordingly. As technology has progressed, recordings have become clearer, yielding more High Definition (HD) data and correspondingly larger files. Because many records are critically needed for further analysis, reducing the number of files or sacrificing audio clarity is not feasible. To select the features that best represent humorous speech, we therefore apply Feature Selection (FS) techniques. FS helps when computing with more than ten features/attributes. The purpose of this research is to find the FS technique that yields the highest Random Forest classification accuracy, specifically for humorous speech. Unlike the usual FS techniques, we chose heuristic-based FS techniques, namely Particle Swarm Optimization, Ant Colony Optimization, Cuckoo Search, and the Firefly Algorithm. We applied the FS techniques in WEKA for its ease of use, and we extracted features with the GUI-based jAudio for the same reason. We used speech data from the UR-FUNNY dataset, which comprised 10,000 sound clips of both humorous and non-humorous speech by TED Talks speakers.
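To make the idea of heuristic-based feature selection concrete, the following is a minimal sketch of wrapper-style selection with a binary Particle Swarm Optimization, one of the four heuristics named above. It is not the WEKA workflow used in the study. Several things here are assumptions made purely for illustration: the data are synthetic, a simple nearest-centroid classifier stands in for Random Forest so that the example needs only the Python standard library, and all function names, parameter values, and the per-feature penalty of 0.01 are hypothetical.

```python
import math
import random


def make_data(rng, n_samples=120, n_features=8, n_informative=2):
    """Synthetic two-class data; only the first n_informative features carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def fitness(mask, X, y):
    """Nearest-centroid training accuracy on the selected features,
    minus a small penalty per selected feature (stand-in for Random Forest)."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X, y):
        dists = {label: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                 for label, cen in centroids.items()}
        correct += min(dists, key=dists.get) == t
    return correct / len(X) - 0.01 * len(cols)


def binary_pso(X, y, rng, n_particles=10, n_iters=20,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Binary PSO: each particle is a 0/1 mask over the features."""
    n_features = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, X, y) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness seen so far, recorded per iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][j]
                     + c1 * r1 * (pbest[i][j] - pos[i][j])
                     + c2 * r2 * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))  # clamp the velocity
                # The sigmoid of the velocity is the probability of selecting feature j.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i], X, y)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
            if fit > gbest_fit:
                gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


rng = random.Random(0)  # fixed seed so repeated runs give the same result
X, y = make_data(rng)
best_mask, best_fit, history = binary_pso(X, y, rng)
print("selected feature indices:", [j for j, keep in enumerate(best_mask) if keep])
print("fitness:", round(best_fit, 3))
```

In a setup closer to the one described in the abstract, the fitness function would presumably be replaced by the accuracy of a Random Forest trained on the selected jAudio features, ideally measured with cross-validation rather than on the training data as in this sketch.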