Dynamic cascades with bidirectional bootstrapping for spontaneous facial action unit detection

2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops. Pub Date: 2009-12-08. DOI: 10.1109/ACII.2009.5349603

Yunfeng Zhu, F. D. L. Torre, J. Cohn, Yujin Zhang


Citations: 29

Abstract

A relatively unexplored problem in facial expression analysis is how to select the positive and negative samples with which to train classifiers for expression recognition. Typically, for each action unit (AU) or other expression, the peak frames are selected as the positive class and the negative samples are selected from other AUs. This approach suffers from at least two drawbacks. One, because many state-of-the-art classifiers, such as Support Vector Machines (SVMs), fail to scale well with increases in the number of training samples (e.g. for the worst case in SVM), it may be infeasible to use all potential training data. Two, it is often unclear how best to choose the positive and negative samples. If we label only the peaks as positive samples, a large imbalance will result between positive and negative samples, especially for infrequent AUs. On the other hand, if all frames from onset to offset are labeled as positive, many may differ minimally or not at all from the negative class, because frames near onsets and offsets often differ little from those that precede them. In this paper, we propose Dynamic Cascades with Bidirectional Bootstrapping (DCBB) to address these issues. DCBB optimally selects positive and negative class samples in training sets. In experimental evaluations on non-posed video from the RU-FACS Database, DCBB yielded improved performance for action unit recognition relative to alternative approaches.
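The labeling trade-off described in the abstract can be illustrated with a small sketch. This is not the DCBB algorithm, which the abstract does not specify in detail; it only contrasts the two naive labeling schemes the abstract mentions (peak-only versus onset-to-offset). The event tuples, the frame count, and the helper names `label_frames` and `negatives_per_positive` are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# Hypothetical annotation format: (onset, peak, offset) frame indices, inclusive.
Event = Tuple[int, int, int]


def label_frames(n_frames: int, events: List[Event], scheme: str) -> List[int]:
    """Return one 0/1 label per frame under the chosen positive-labeling scheme."""
    labels = [0] * n_frames
    for onset, peak, offset in events:
        if scheme == "peak":
            # Only the peak frame of each AU event counts as positive.
            labels[peak] = 1
        elif scheme == "onset_to_offset":
            # Every frame from onset to offset counts as positive.
            for t in range(onset, offset + 1):
                labels[t] = 1
        else:
            raise ValueError(f"unknown scheme: {scheme}")
    return labels


def negatives_per_positive(labels: List[int]) -> float:
    """Number of negative frames for each positive frame."""
    positives = sum(labels)
    return (len(labels) - positives) / positives


# Made-up example: two AU events in a 1000-frame video.
events = [(100, 120, 150), (400, 410, 430)]
n_frames = 1000

peak_labels = label_frames(n_frames, events, "peak")
full_labels = label_frames(n_frames, events, "onset_to_offset")

print(sum(peak_labels), negatives_per_positive(peak_labels))
print(sum(full_labels), negatives_per_positive(full_labels))
```

Under these assumed annotations, peak-only labeling leaves very few positives per video, while onset-to-offset labeling yields many more positives but includes low-intensity frames near the onset and offset that may look much like negatives. That tension is what the abstract says DCBB is meant to resolve.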
