Path-constrained Viterbi Algorithm: an Alternative to State-transition Feature for Conditional Random Fields
Yulin Ren, Dehua Li
2018 International Computers, Signals and Systems Conference (ICOMSSC), 2018-09-01
DOI: https://doi.org/10.1109/icomssc45026.2018.8941699
Citations: 0
Abstract
The state-transition feature, which models the transition between states, is an important feature in conditional random fields (CRFs). However, integrating it into a CRF model results in a long training time, and it is believed that no prior study has focused on this problem. This paper therefore proposes a path-constrained Viterbi algorithm as a substitute for the state-transition feature; the core idea is to prune, during Viterbi decoding, the paths (state transitions) that do not occur in real-world data. The proposed method is simple but effective. Experimental results on four natural language processing (NLP) tasks, namely Chinese word segmentation (CWS), named entity recognition (NER), text chunking, and part-of-speech (POS) tagging, show that the proposed method achieves performance close to that obtained with the state-transition feature while saving as much as half of the total training time.
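The pruning idea described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formulation, so the following rests on assumptions: the model supplies only per-position state scores (no transition scores), the set of permitted transitions is collected from the tag bigrams seen in the training data, and the names `collect_allowed_transitions` and `path_constrained_viterbi` are hypothetical. The toy tags and scores are invented for illustration only.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Transition = Tuple[str, str]


def collect_allowed_transitions(tag_sequences: Sequence[Sequence[str]]) -> Set[Transition]:
    """Collect every (previous_tag, next_tag) pair observed in the training tags."""
    allowed: Set[Transition] = set()
    for tags in tag_sequences:
        allowed.update(zip(tags, tags[1:]))
    return allowed


def path_constrained_viterbi(
    emissions: Sequence[Dict[str, float]], allowed: Set[Transition]
) -> List[str]:
    """Return the best tag path using per-position scores only.

    Assumption: no transition scores are used; a transition is either
    permitted (seen in training data) or pruned from the search.
    """
    if not emissions:
        return []
    best: Dict[str, float] = dict(emissions[0])   # best path score ending in each tag
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointer: Dict[str, str] = {}
        for tag, score in scores.items():
            best_prev: Optional[str] = None
            best_prev_score = float("-inf")
            for prev, prev_score in best.items():
                # Prune transitions that never occurred in the training data.
                if (prev, tag) in allowed and prev_score > best_prev_score:
                    best_prev, best_prev_score = prev, prev_score
            if best_prev is not None:
                new_best[tag] = best_prev_score + score
                pointer[tag] = best_prev
        if not new_best:
            raise ValueError("no permitted path through this position")
        best = new_best
        backpointers.append(pointer)
    # Trace back from the best final tag.
    path = [max(best, key=lambda t: best[t])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    path.reverse()
    return path


# Toy data (invented): BIO-style tags where "O" -> "I" never occurs in training.
train_tags = [["O", "B", "I", "O"], ["B", "I", "I", "O"], ["O", "O", "B"]]
allowed = collect_allowed_transitions(train_tags)
emissions = [
    {"O": 2.0, "B": 0.5, "I": 0.1},
    {"O": 0.3, "B": 0.4, "I": 1.5},
    {"O": 1.0, "B": 0.2, "I": 0.1},
]
path = path_constrained_viterbi(emissions, allowed)
greedy = [max(scores, key=lambda t: scores[t]) for scores in emissions]
```

In this toy case, picking the highest-scoring tag at each position independently (`greedy`) yields a sequence containing the transition `O` to `I`, which never appears in the training tags, whereas the constrained decoder returns only sequences built from observed transitions. Because the constraint is applied at decoding time, the sketch needs no transition weights to be learned, which is the source of the training-time saving the abstract reports.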