Platform-scalable Task Partition and Multilevel Buffering in Multi-processor Plessey Corner Detector
Guan Yu, G. Lafruit, P. Schelkens
Seventh International Conference on Application of Concurrency to System Design (ACSD 2007), 2007-07-10
DOI: 10.1109/ACSD.2007.58
The Plessey corner detector is a key component in scene analysis, stereo matching, and object tracking. Because of its high computational complexity, earlier fast implementations were mainly realized in hardware. This paper explores the viability of a multi-processor software implementation. A scalable task partitioning is proposed for efficiently mapping the Plessey algorithm onto a multi-processor platform. The task partition ensures platform scalability, low inter-processor communication overhead, and a well-balanced workload across tasks. In addition, a multilevel buffering scheme is presented that reduces the external memory accesses in each task to one image pixel read per calculated corner response value. The effectiveness of the proposed task partition and buffering scheme has been verified on (i) a cycle-accurate simulator with shared memory and (ii) a board with multiple TI C64 DSPs using a message-passing paradigm. The proposed solution combines good platform scalability with an additional 30% speedup over straightforward parallelization schemes.
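
To make the idea of partitioning the corner response computation concrete, the following is a minimal sketch and not the scheme proposed in the paper, whose details the abstract does not give. It assumes the commonly used Plessey (Harris) response R = det(M) - k*trace(M)^2, central-difference gradients, a 3x3 box window, and a simple split into horizontal strips with overlapping halo rows. The names `corner_response`, `partitioned_response`, and `HALO`, as well as the value k = 0.04, are illustrative assumptions.

```python
HALO = 2  # assumed: 1 row for the gradient plus 1 row for the 3x3 window


def corner_response(img, k=0.04):
    """Plessey/Harris response R = det(M) - k * trace(M)**2.

    img is a list of rows of floats. The result covers only pixels at
    least HALO away from every border, so it has (h - 4) x (w - 4) values.
    """
    h, w = len(img), len(img[0])
    ixx = [[0.0] * w for _ in range(h)]
    iyy = [[0.0] * w for _ in range(h)]
    ixy = [[0.0] * w for _ in range(h)]
    # Gradient products from central differences (valid one pixel inside).
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            ixx[y][x] = gx * gx
            iyy[y][x] = gy * gy
            ixy[y][x] = gx * gy
    out = []
    for y in range(HALO, h - HALO):
        row = []
        for x in range(HALO, w - HALO):
            a = b = c = 0.0
            # Sum the gradient products over the 3x3 window.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    a += ixx[y + dy][x + dx]
                    b += iyy[y + dy][x + dx]
                    c += ixy[y + dy][x + dx]
            row.append(a * b - c * c - k * (a + b) ** 2)
        out.append(row)
    return out


def partitioned_response(img, n_tasks, k=0.04):
    """Compute the same response strip by strip, one strip per task.

    Each task gets a near-equal share of output rows plus HALO input rows
    above and below, so tasks need no data from each other while running.
    The loop is sequential here; each iteration stands in for one processor.
    """
    out_rows = len(img) - 2 * HALO
    result = []
    for t in range(n_tasks):
        start = HALO + (out_rows * t) // n_tasks
        stop = HALO + (out_rows * (t + 1)) // n_tasks
        if start == stop:
            continue  # more tasks than rows: this task has no work
        strip = img[start - HALO:stop + HALO]
        result.extend(corner_response(strip, k))
    return result
```

In this sketch every strip re-reads its halo rows, so input rows near a strip boundary are fetched by two tasks. That duplicated traffic is the kind of cost a buffering scheme would aim to reduce; how the paper achieves one pixel read per response value is not specified in the abstract and is not modelled here.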