Building customized Named Entity Recognition models for specific process automation tasks
Vasile Ionut Iga, G. Silaghi
2022 24th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), September 2022
DOI: 10.1109/SYNASC57785.2022.00041
Citations: 0
Abstract
In the context of a project aiming to build human-behaving robots for process automation, named entity recognition (NER) becomes one of the first tasks to solve. This paper presents our experience in building NER models that recognize specific entities of interest, with the help of the state-of-the-art pre-trained BERT model. Noticing that a model built on a general-knowledge dataset performs poorly when retrieving entities specific to our particular use cases, we constructed two datasets tailored to our context and trained BERT-based models on them. We show that properly constructing these specific datasets is sufficient to obtain good entity classification performance, without further increasing the model's training time.
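The abstract stresses dataset construction rather than model changes. As a hedged illustration (the paper's actual entity types and annotation format are not given here), the sketch below shows how span-annotated sentences might be converted into the BIO tagging scheme commonly used to fine-tune BERT-style models for NER; the entity labels `DOC` and `APP` are invented examples.

```python
# Hypothetical sketch: turning span-annotated sentences into BIO tags,
# the token-level format typically fed to a BERT token-classification
# head for NER. Labels below are illustrative, not from the paper.

def to_bio(tokens, spans):
    """tokens: list of words.
    spans: list of (start, end, label) with end exclusive,
    indices referring to positions in the token list."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags

tokens = ["Open", "the", "invoice", "in", "SAP"]
spans = [(2, 3, "DOC"), (4, 5, "APP")]
print(to_bio(tokens, spans))
# → ['O', 'O', 'B-DOC', 'O', 'B-APP']
```

Pairs of (token, tag) sequences in this shape are what a typical NER fine-tuning pipeline consumes, so dataset quality reduces largely to how carefully such spans are annotated.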