Amelia Devi Putri Ariyanto, C. Fatichah, Diana Purwitasari
2023 International Seminar on Intelligent Technology and Its Applications (ISITIA), published 2023-07-26. DOI: https://doi.org/10.1109/ISITIA59021.2023.10221008
Semantic Role Labeling for Information Extraction on Indonesian Texts: A Literature Review
The information extraction process includes Semantic Role Labeling (SRL) as one of its sub-tasks. SRL determines the semantic role of each entity in a sentence by examining the meaning of the predicate, which helps recover the sentence structure by identifying the relationships between predicates and their corresponding arguments. SRL is developed less often than Named Entity Recognition (NER) for information extraction because the SRL annotation process is complicated and the labeling results are sometimes ambiguous. For event extraction, however, NER alone is insufficient: location entities identified by NER can still be inaccurate, because their geographic coordinates may indicate places irrelevant to the actual event. SRL, by contrast, can detect locations precisely and in depth according to the actual event. Although annotation is complicated, SRL can be adapted to the required domain and its ontology so that it extracts location entities down to the event level. This research offers a comprehensive analysis of the development of SRL for extracting information from Indonesian texts. Indonesian is a low-resource language whose characteristics differ from those of English and for which little literature exists, which makes it worth studying. The reviewed papers were drawn from IEEE, ScienceDirect, and Google Scholar for the period 2013 to 2023, and 15 papers matched the research objectives. The results show that most papers use Indonesian-language news articles as their dataset because news is written in formal language, which usually has good linguistic structure. Most of the SRL methods used are rule-based. A weakness of rule-based development is that the rules depend heavily on a particular language or problem domain. Further work can therefore use a transformer-based deep learning approach to perform SRL on Indonesian-language texts.
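To make the predicate-argument idea concrete, the following is a minimal toy sketch of a rule-based labeler, not a method taken from any of the reviewed papers. The predicate lexicon, the preposition-to-role table, the PropBank-style role names (ARG0, ARG1, ARGM-LOC, ARGM-TMP), and the example sentence are all illustrative assumptions. It also shows the weakness noted in the abstract: every rule is tied to Indonesian function words and to a small hand-built vocabulary.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon (assumption): a real system needs far broader coverage.
PREDICATES = {"menangkap", "melanda", "mengguncang"}
# Assumed mapping from Indonesian prepositions to PropBank-style modifier roles.
PREPOSITION_ROLES = {"di": "ARGM-LOC", "pada": "ARGM-TMP"}


@dataclass
class Argument:
    role: str
    text: str


def label_roles(sentence: str) -> tuple[str, list[Argument]]:
    """Label one clause with a single predicate using hand-written rules."""
    tokens = sentence.rstrip(".").split()
    pred_idx = next(
        (i for i, tok in enumerate(tokens) if tok.lower() in PREDICATES), None
    )
    if pred_idx is None:
        raise ValueError("no known predicate found in sentence")

    args: list[Argument] = []
    # Rule 1: everything before the predicate is treated as the agent (ARG0).
    if pred_idx > 0:
        args.append(Argument("ARG0", " ".join(tokens[:pred_idx])))

    # Rule 2: text after the predicate is ARG1 until a known preposition
    # appears; each preposition opens a new modifier span.
    role, span = "ARG1", []
    for tok in tokens[pred_idx + 1:]:
        if tok.lower() in PREPOSITION_ROLES:
            if span:
                args.append(Argument(role, " ".join(span)))
            role, span = PREPOSITION_ROLES[tok.lower()], []
        else:
            span.append(tok)
    if span:
        args.append(Argument(role, " ".join(span)))
    return tokens[pred_idx], args


if __name__ == "__main__":
    # "Police arrested the perpetrator in Surabaya on Monday."
    predicate, arguments = label_roles(
        "Polisi menangkap pelaku di Surabaya pada Senin."
    )
    print(predicate)
    for arg in arguments:
        print(arg.role, arg.text)
```

In this sketch the location is attached to the event through its predicate, which is the property that plain entity tagging lacks; a sentence with an unlisted verb or a different preposition already falls outside the rules.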