{"title":"A characterization of visual feature recognition","authors":"B. Mathew, A. Davis, R. Evans","doi":"10.1109/WWC.2003.1249052","DOIUrl":null,"url":null,"abstract":"Natural human interfaces are a key to realizing the dream of ubiquitous computing. This implies that embedded systems must be capable of sophisticated perception tasks. This paper analyzes the nature of a visual feature recognition workload. Visual feature recognition is a key component of a number of important applications, e.g., gesture-based interfaces, lip tracking to augment speech recognition, smart cameras, automated surveillance systems, robotic vision, etc. Given the power-sensitive nature of the embedded space and the natural conflict between low-power and high-performance implementations, a precise understanding of these algorithms is an important step in developing efficient visual feature recognition applications for the embedded space. In particular, this work analyzes the performance characteristics of flesh toning, face detection and face recognition codes based on well-known algorithms. We show that the problem can be decomposed into a pipeline of filters, which could lead to efficient implementations as stream processors. With a hit rate better than 92% for a modest 16 KB L1 data cache, the algorithms have memory system behavior commensurate with embedded processors. However, our results indicate that their execution requirements strain the performance available on current embedded systems.","PeriodicalId":432745,"journal":{"name":"2003 IEEE International Conference on Communications (Cat. No.03CH37441)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2003 IEEE International Conference on Communications (Cat. No.03CH37441)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WWC.2003.1249052","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Natural human interfaces are a key to realizing the dream of ubiquitous computing. This implies that embedded systems must be capable of sophisticated perception tasks. This paper analyzes the nature of a visual feature recognition workload. Visual feature recognition is a key component of a number of important applications, e.g., gesture-based interfaces, lip tracking to augment speech recognition, smart cameras, automated surveillance systems, robotic vision, etc. Given the power-sensitive nature of the embedded space and the natural conflict between low-power and high-performance implementations, a precise understanding of these algorithms is an important step in developing efficient visual feature recognition applications for the embedded space. In particular, this work analyzes the performance characteristics of flesh toning, face detection and face recognition codes based on well-known algorithms. We show that the problem can be decomposed into a pipeline of filters, which could lead to efficient implementations as stream processors. With a hit rate better than 92% for a modest 16 KB L1 data cache, the algorithms have memory system behavior commensurate with embedded processors. However, our results indicate that their execution requirements strain the performance available on current embedded systems.