Indonesian Parsing using Probabilistic Context-Free Grammar (PCFG) and Viterbi-Cocke Younger Kasami (Viterbi-CYK)
Authors: D. E. Cahyani, L. Gumilar, Ajie Pangestu
Published in: 2020 3rd International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), 2020-12-10
DOI: 10.1109/ISRITI51436.2020.9315395 (https://doi.org/10.1109/ISRITI51436.2020.9315395)
Citations: 3
Abstract
Parsing is a tool for understanding the grammatical patterns of natural language. Structural ambiguity often arises when identifying sentence patterns during parsing. Syntactic parsing with a Probabilistic Context-Free Grammar (PCFG) and the Viterbi-Cocke Younger Kasami (Viterbi-CYK) algorithm is one approach to resolving structural ambiguity. Parsing Indonesian also requires a large amount of Indonesian language resources as machine knowledge. This research builds a parser for Indonesian sentence patterns from an Indonesian Tagged corpus resource and resolves the ambiguity in parsing those patterns using the PCFG and Viterbi-CYK algorithms. The corpus data are processed to obtain PCFG grammar rules. Each sentence in the corpus is then parsed with the generated PCFG rules, and the Viterbi-CYK algorithm selects the parse tree with the highest probability. The evaluation yielded a highest average production-rule similarity of 92.95%. This shows that the parser successfully parses Indonesian sentences and can resolve structural ambiguity in the parsing of Indonesian sentence patterns.
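The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-sentence treebank, the label set, and the function names `estimate_pcfg` and `viterbi_cyk` are hypothetical, and the sketch assumes trees already in Chomsky normal form and rule probabilities estimated by relative frequency.

```python
from collections import defaultdict

# Hypothetical toy treebank (not the paper's corpus), in Chomsky normal form.
# A node is (label, left_subtree, right_subtree) or (tag, word).
TREEBANK = [
    ("S", ("NP", "saya"), ("VP", ("V", "makan"), ("NP", "nasi"))),
    ("S", ("NP", "dia"), ("VP", ("V", "minum"), ("NP", "air"))),
    ("S", ("NP", "saya"), ("VP", ("V", "minum"), ("NP", "air"))),
]

def count_rules(tree, counts):
    """Count the production used at every node of one tree."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        counts[(label, (children[0],))] += 1  # lexical rule: tag -> word
        return
    counts[(label, tuple(child[0] for child in children))] += 1
    for child in children:
        count_rules(child, counts)

def estimate_pcfg(treebank):
    """Relative-frequency estimate: P(A -> rhs) = count(A -> rhs) / count(A)."""
    counts = defaultdict(int)
    for tree in treebank:
        count_rules(tree, counts)
    lhs_totals = defaultdict(int)
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}

def viterbi_cyk(words, pcfg, start="S"):
    """Return (best_tree, probability), or (None, 0.0) if no parse exists."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][label] = (probability, backpointer)
    # Fill width-1 spans from lexical rules.
    for i, word in enumerate(words):
        for (lhs, rhs), p in pcfg.items():
            if rhs == (word,):
                best[(i, i + 1)][lhs] = (p, word)
    binary = [(lhs, rhs, p) for (lhs, rhs), p in pcfg.items() if len(rhs) == 2]
    # Fill wider spans bottom-up, keeping only the most probable analysis.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, (b, c), p in binary:
                    if b in best[(i, k)] and c in best[(k, j)]:
                        prob = p * best[(i, k)][b][0] * best[(k, j)][c][0]
                        if prob > best[(i, j)].get(lhs, (0.0, None))[0]:
                            best[(i, j)][lhs] = (prob, (k, b, c))
    if start not in best[(0, n)]:
        return None, 0.0

    def build(i, j, label):
        """Follow backpointers to rebuild the highest-probability tree."""
        back = best[(i, j)][label][1]
        if isinstance(back, str):
            return (label, back)
        k, b, c = back
        return (label, build(i, k, b), build(k, j, c))

    return build(0, n, start), best[(0, n)][start][0]
```

Calling `viterbi_cyk("dia makan air".split(), estimate_pcfg(TREEBANK))` returns a tree in the same tuple format as the treebank together with its probability. This toy grammar is unambiguous. With a grammar that licenses several derivations for the same span, the `prob >` comparison is the step that keeps only the most probable one, which is how a Viterbi-style CYK parser selects among structurally ambiguous analyses.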