Text Mining an Automatic Short Answer Grading (ASAG), Comparison of Three Methods of Cosine Similarity, Jaccard Similarity and Dice's Coefficient
Tri Wahyuningsih, Henderi Henderi, W. Winarno
Journal of Applied Data Sciences, published 2021-05-01
DOI: 10.47738/JADS.V2I2.31 (https://doi.org/10.47738/JADS.V2I2.31)
Citations: 11
Abstract
This study assesses correlation in Automatic Short Answer Grading (ASAG) by comparing three methods, Cosine Similarity, Jaccard Similarity, and the Dice Coefficient, using a single reference answer. The computations were done in Python and the data were processed in spreadsheets. The Dice Coefficient had the highest average correlation at 0.76, followed by Cosine Similarity, also reported at 0.76, while the Jaccard method had the lowest at 0.69. The contribution of this study is that it applies all three methods to the same data set, whereas previous research applied only one or two methods to a given data set. As a result, this study offers a more complete and accurate comparison.
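The three measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the whitespace tokenizer, the function names, and the two example answers are assumptions made for illustration, and the paper's own preprocessing (for example stemming or stop-word removal) is not described in the abstract. Cosine similarity is computed here on term-frequency vectors, while Jaccard and Dice are computed on sets of distinct tokens.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between the term-frequency vectors of a and b."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    denom = norm_a * norm_b
    return dot / denom if denom else 0.0


def jaccard_similarity(a, b):
    """|A intersect B| / |A union B| over the sets of distinct tokens."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice_coefficient(a, b):
    """2 * |A intersect B| / (|A| + |B|) over the sets of distinct tokens."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


# Hypothetical reference answer and student answer, not taken from the paper.
reference = "the cat sat on the mat"
student = "the cat lay on the mat"

if __name__ == "__main__":
    print("cosine :", round(cosine_similarity(reference, student), 3))
    print("jaccard:", round(jaccard_similarity(reference, student), 3))
    print("dice   :", round(dice_coefficient(reference, student), 3))
```

For this hypothetical pair, the two answers share 4 of 6 distinct tokens, so Jaccard is 4/6 and Dice is 8/10, while the term-frequency cosine is 7/8. For any pair with partial overlap, Dice is larger than Jaccard, since Dice = 2J / (1 + J).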