Video parsing for abnormality detection
Borislav Antic, B. Ommer
2011 International Conference on Computer Vision, published 2011-11-06
DOI: 10.1109/ICCV.2011.6126525 (https://doi.org/10.1109/ICCV.2011.6126525)

Detecting abnormalities in video is a challenging problem because the class of all irregular objects and behaviors is infinite, so no (or by far not enough) abnormal training samples are available. Consequently, the standard setting is to find abnormalities without actually knowing what they are, because no abnormal examples have been shown during training. However, although the training data does not define what an abnormality looks like, the main paradigm in this field is to search directly for individual abnormal local patches or image regions, independently of one another. To address this problem, we parse video frames by establishing a set of hypotheses that jointly explain all the foreground while, at the same time, trying to find normal training samples that explain the hypotheses. Consequently, we can avoid detecting abnormalities directly. They are discovered indirectly as those hypotheses that are needed to cover the foreground but for which no explanation by normal samples is found. We present a probabilistic model that localizes abnormalities using statistical inference. On the challenging dataset of [15], it outperforms the state of the art by 7% to achieve a frame-based abnormality classification performance of 91%, and the localization performance improves by 32% to 76%.
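
The indirect discovery idea in the abstract can be illustrated with a toy sketch. This is not the probabilistic model or the inference procedure of the paper: it swaps in a simple greedy set cover, and the names `Hypothesis`, `normal_score`, `parse_frame`, and the threshold value are all hypothetical, introduced only for illustration. Each hypothesis covers some foreground pixels and carries an assumed score for how well normal training samples explain it. Hypotheses are selected until the foreground is covered, and those that were needed but are poorly explained are reported as abnormal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container: a candidate explanation for part of the foreground."""
    name: str
    pixels: frozenset      # foreground pixel ids this hypothesis would cover
    normal_score: float    # assumed similarity to the best normal training sample, in [0, 1]


def parse_frame(foreground, hypotheses, threshold=0.5):
    """Greedy cover of the foreground (a stand-in for statistical inference).

    Returns the selected hypotheses, the subset flagged as abnormal,
    and any foreground pixels that no hypothesis could cover.
    """
    uncovered = set(foreground)
    remaining = list(hypotheses)
    selected = []
    while uncovered:
        # Prefer hypotheses that cover many new pixels and are well explained
        # by normal samples (the weighting is an arbitrary illustrative choice).
        best = max(
            remaining,
            key=lambda h: len(h.pixels & uncovered) * (1.0 + h.normal_score),
            default=None,
        )
        if best is None or not (best.pixels & uncovered):
            break  # nothing left that covers any uncovered pixel
        selected.append(best)
        uncovered -= best.pixels
        remaining.remove(best)
    # Abnormal: needed to cover the foreground, yet not explained by normal data.
    abnormal = [h for h in selected if h.normal_score < threshold]
    return selected, abnormal, uncovered


# Toy usage with made-up pixel ids and scores.
foreground = {1, 2, 3, 4, 5, 6}
hypotheses = [
    Hypothesis("walker", frozenset({1, 2, 3}), 0.9),
    Hypothesis("cyclist", frozenset({4, 5, 6}), 0.1),
    Hypothesis("overlap", frozenset({3, 4}), 0.8),
]
selected, abnormal, uncovered = parse_frame(foreground, hypotheses)
print([h.name for h in selected], [h.name for h in abnormal], uncovered)
```

In this toy frame, the poorly explained hypothesis is flagged only because it is required to cover pixels that no well-explained hypothesis accounts for, which mirrors the abstract's point that abnormalities are found as a by-product of explaining the whole foreground and not by scoring regions in isolation.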