{"title":"A Performance Review of Recent Corner Detectors","authors":"M. Awrangjeb, Guojun Lu","doi":"10.1109/DICTA.2013.6691475","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691475","url":null,"abstract":"Contour-based corner detectors directly or indirectly estimate a significance measure (eg, curvature) on the points of a planar curve and select the curvature extrema points as corners. A number of promising contour-based corner detectors have recently been proposed. They mainly differ in how the curvature is estimated on each point of the given curve. As the curvature on a digital curve can only be approximated, it is important to estimate a curvature that remains stable against significant noises, for example, geometric transformations and compression, on the curve. Moreover, in many applications, for instance, in content-based image retrieval, a fast corner detector is a prerequisite. So, it is also a primary characteristic that how much time a corner detector takes for corner detection in a given image. In addition, different authors evaluated their detectors on different platforms using different evaluation systems. Evaluation systems that depend on human judgements and visual identification of corners are manual and too subjective. Application of a manual system on a large test database will be expensive. Therefore, it is important to evaluate the detectors on a common platform using an automatic evaluation system. This paper first reviews six most recent and highly performed corner detectors and analyse their theoretical running time. Then it uses an automatic evaluation system to analyse their performance. Both the robustness to noise and efficiency are estimated to rank the detectors.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121420034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pose Estimation of Ad-Hoc Mobile Camera Networks","authors":"Zsolt Sánta, Z. Kato","doi":"10.1109/DICTA.2013.6691514","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691514","url":null,"abstract":"An algorithm is proposed for the pose estimation of ad-hoc mobile camera networks with overlapping views. The main challenge is to estimate camera parameters with respect to the 3D scene without any specific calibration pattern, hence allowing for a consistent, camera-independent world coordinate system. The only assumption about the scene is that it contains a planar surface patch of a low-rank texture, which is visible in at least two cameras. Such low-rank patterns are quite common in urban environments. The proposed algorithm consists of three main steps: relative pose estimation of the cameras within the network, followed by the localization of the network within the 3D scene using a low-rank surface patch, and finally the estimation of a consistent scale for the whole system. The algorithm follows a distributed architecture, hence the computing power of the participating mobile devices are efficiently used. The performance and robustness of the proposed algorithm have been analyzed on both synthetic and real data. Experimental results confirmed the relevance and applicability of the method.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128477930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Improved Restoration Method for Non-Uniformly Warped Images Using Optical Flow Technique","authors":"K. K. Halder, M. Tahtali, S. Anavatti","doi":"10.1109/DICTA.2013.6691481","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691481","url":null,"abstract":"A high precision and fast image restoration method is proposed to restore a geometrically corrected image from the atmospheric turbulence degraded video sequence of a static scenery. In this approach, we employ an optical flow technique to register all the frames of the distorted video to a reference frame and determine the flow fields. We use the First Register Then Average And Subtract-variant (FRTAASv) method to correct the geometric distortions using the computed flow fields. We present a performance comparison between our proposed restoration method and earlier Minimum Sum of Squared Differences (MSSD) image registration based FRTAASv method in terms of computational time and accuracy. Simulation experiments show that our proposed method provides higher accuracy with quicker processing time.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128918931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-Scale Analysis of Formations in Soccer","authors":"Xinyu Wei, Long Sha, P. Lucey, S. Morgan, S. Sridharan","doi":"10.1109/DICTA.2013.6691503","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691503","url":null,"abstract":"Due to the demand for better and deeper analysis in sports, organizations (both professional teams and broadcasters) are looking to use spatiotemporal data in the form of player tracking information to obtain an advantage over their competitors. However, due to the large volume of data, its unstructured nature, and lack of associated team activity labels (e.g. strategic/tactical), effective and efficient strategies to deal with such data have yet to be deployed. A bottleneck restricting such solutions is the lack of a suitable representation (i.e. ordering of players) which is immune to the potentially infinite number of possible permutations of player orderings, in addition to the high dimensionality of temporal signal (e.g. a game of soccer last for 90 mins). Leveraging a recent method which utilizes a \"role-representation\", as well as a feature reduction strategy that uses a spatiotemporal bilinear basis model to form a compact spatiotemporal representation. Using this representation, we find the most likely formation patterns of a team associated with match events across nearly 14 hours of continuous player and ball tracking data in soccer. Additionally, we show that we can accurately segment a match into distinct game phases and detect highlights. (i.e. shots, corners, free-kicks, etc) completely automatically using a decision-tree formulation.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128484516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-Binary Based Video Features for Activity Representation","authors":"Sabanadesan Umakanthan, S. Denman, C. Fookes, S. Sridharan","doi":"10.1109/DICTA.2013.6691527","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691527","url":null,"abstract":"Efficient and effective feature detection and representation is an important consideration when processing videos, and a large number of applications such as motion analysis, 3D scene understanding, tracking etc depend on this. Amongst several feature description methods, local features are becoming increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational complexity, their performance is still too limited for real world applications. Furthermore, rapid increases in the uptake of mobile devices has increased the demand for algorithms that can run with reduced memory and computational requirements. In this paper we propose a semi binary based feature detector-descriptor based on the BRISK detector, which can detect and represent videos with significantly reduced computational requirements, while achieving comparable performance to the state of the art spatio- temporal feature descriptors. First, the BRISK feature detector is applied on a frame by frame basis to detect interest points, then the detected key points are compared against consecutive frames for significant motion. Key points with significant motion are encoded with the BRISK descriptor in the spatial domain and Motion Boundary Histogram in the temporal domain. This descriptor is not only lightweight but also has lower memory requirements because of the binary nature of the BRISK descriptor, allowing the possibility of applications using hand held devices. We evaluate the combination of detector-descriptor performance in the context of action classification with a standard, popular bag-of-features with SVM framework. Experiments are carried out on two popular datasets with varying complexity and we demonstrate comparable performance with other descriptors with reduced computational complexity.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132723667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 3D-2D Image Registration Algorithm for Kinematic Analysis of the Knee after Total Knee Arthroplasty (TKA)","authors":"M. Hossain, A. Muhit, M. Pickering, J. Scarvell, Paul N. Smith","doi":"10.1109/DICTA.2013.6691472","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691472","url":null,"abstract":"Total Knee Arthroplasty or TKA is a surgical procedure for relief of significant, disabling pain caused by severe arthritis. In TKA component design, 3D-2D registration using single plane fluoroscopy is very important for analyzing 3D knee kinematics. The purpose of this study is to determine the precision provided by a new 3D-2D registration algorithm for the kinematic analysis of TKA components. In this paper, we compare kinematic measurements obtained by our new 3D-2D registration algorithm with measurements provided by the gold standard Roentgen Stereo analysis (RSA). The main focus of the study is on the out-of-plane translation and rotation movements which are difficult to measure precisely using a single plane approach. From our experimental results we found that in the proposed algorithm the standard deviation of the error for out-of-plane translation is 0.38mm for the implant which compares favorably to RSA and is significantly lower than other previously proposed approaches can provide.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123815260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incremental Learning with Soft-Biometric Features for People Re-Identification in Multi-Camera Environments","authors":"Daniela Moctezuma, C. Conde, Isaac Martín de Diego, E. Cabello","doi":"10.1109/DICTA.2013.6691500","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691500","url":null,"abstract":"In this paper, a solution for the appearance based people re-identification problem in a non-overlapping multicamera surveillance environment is presented. For this purpose, an incremental learning approach and a SVM classifier have been considered. The proposed methods update the appearance model across different camera conditions in three different ways: based on time lapses, on change of camera and on the automatic selection of the most representative samples. In order to test the proposed methods, a complete database was acquired at Barajas international airport (the MUBA proposed database). Further the well known PETS 2006 and PETS 2009 databases were considered. The system has been designed for video surveillance security. The main idea of this system is that, in an initial point, the suspect is manually identified by the user. Then, from that moment, the system is able to identify the selected subject across the different cameras in the surveillance area. The results obtained show the importance of the model update and the huge potential of the incremental learning approach.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130627927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sketch-Based Image Retrieval by Size-Adaptive and Noise-Robust Feature Description","authors":"Houssem Chatbri, K. Kameyama, P. Kwan","doi":"10.1109/DICTA.2013.6691528","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691528","url":null,"abstract":"We review available methods for Sketch-Based Image Retrieval (SBIR) and we discuss their limitations. Then, we present two SBIR algorithms: The first algorithm extracts shape features by using support regions calculated for each sketch point, and the second algorithm adapts the Shape Context descriptor to make it scale invariant and enhances its performance in presence of noise. Both algorithms share the property of calculating the feature extraction window according to the sketch size. Experiments and comparative evaluation with state-of-the-art methods show that the proposed algorithms are competitive in distinctiveness capability and robust against noise.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"87 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130885896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Swimmer Localization from a Moving Camera","authors":"Long Sha, P. Lucey, S. Morgan, D. Pease, S. Sridharan","doi":"10.1109/DICTA.2013.6691533","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691533","url":null,"abstract":"At the highest level of competitive sport, nearly all performances of athletes (both training and competitive) are chronicled using video. Video is then often viewed by expert coaches/analysts who then manually label important performance indicators to gauge performance. Stroke-rate and pacing are important performance measures in swimming, and these are previously digitised manually by a human. This is problematic as annotating large volumes of video can be costly, and time-consuming. Further, since it is difficult to accurately estimate the position of the swimmer at each frame, measures such as stroke rate are generally aggregated over an entire swimming lap. Vision-based techniques which can automatically, objectively and reliably track the swimmer and their location can potentially solve these issues and allow for large-scale analysis of a swimmer across many videos. However, the aquatic environment is challenging due to fluctuations in scene from splashes, reflections and because swimmers are frequently submerged at different points in a race. In this paper, we temporally segment races into distinct and sequential states, and propose a multimodal approach which employs individual detectors tuned to each race state. Our approach allows the swimmer to be located and tracked smoothly in each frame despite a diverse range of constraints. We test our approach on a video dataset compiled at the 2012 Australian Short Course Swimming Championships.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115980958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D Registration in Dark Environments Using RGB-D Cameras","authors":"K. Yousif, A. Bab-Hadiashar, R. Hoseinnezhad","doi":"10.1109/DICTA.2013.6691470","DOIUrl":"https://doi.org/10.1109/DICTA.2013.6691470","url":null,"abstract":"This paper presents a new approach to align corresponding 3D points obtained by different frames in environments with varied illumination using an RGB-D camera (Microsoft Kinect). Our method switches between the RGB and IR images for feature extraction based on the brightness level of the images. The corresponding visual features are matched using their descriptors and outliers (false matches) are removed using a rank ordered statistics based robust estimation approach.The estimated 3D transformations are finally refined using an Iterative Closest Point (ICP) approach. We show that our method is able to obtain accurate transformation estimation between frames in dark environments (typical office environments with no artificial lighting). We finally present a real-time Visual Odometry (VO) system that concatenates the estimated camera transformations between sequential frames and obtains a global camera pose estimate with respect to a fixed reference frame that outperforms the state-of-the-art methods in both lit and dark environments.","PeriodicalId":231632,"journal":{"name":"2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)","volume":"288 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122474369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}