{"title":"Emphatic human interaction analysis for cognitive environments","authors":"C. Regazzoni","doi":"10.1145/1877868.1877870","DOIUrl":"https://doi.org/10.1145/1877868.1877870","url":null,"abstract":"Understanding the dynamic evolution of complex scenes where multiple patterns interact according to a hidden semantic goal is an issue of current intelligent environments. This issue is made somehow more complex due to the more spread and intensive use of camera systems to help human operators in the monitoring task. Analyzing multimedia data provided by wide set of cameras simultaneously monitoring different environments makes it necessary not only to focus the attention of human operators on relevant occurring events, but also to actively support their decision about optimal reactions to be taken to manage abnormal situations. Cognitive tasks to be modeled in integrated intelligent systems become not only multisensor data processing and scene understanding, but also proactive decision making: a recognized abnormal interactive situation occurring in the scene must be possibly controlled in such a way that divergence from normal event flow can not compromise security level of an environment.\u0000 Cognitive environments often aim at friendly improving the usefulness of a given physical space by humans according to a given paradigm and objective of use. To this end, they often employ pervasive communications tools to send messages to cooperative humans in a given environment to help me in real time situations they are living, in order to help them to accomplish their tasks in a more smooth and effective way. To do so, they can use situation assessment tools interpreting available sensor data in terms of dynamic state and events generated by objects present in their scene and their interactions. In many cases, assessed situation can be not only estimated but also predicted, if dynamic models of it are available.\u0000 Capability of predicting behavior of objects along a given interaction situation can be interpreted as a way to directly evaluate not only evolution of actions of a given object in a contextual framework determined by the interacting object, but also as a way to estimate and to predict (based on a indirect observation and an appropriate model) the subjective emotional and motivational hidden variables that carried the object to decide a certain action to be performed on the basis of subjectively sensed data. Therefore, if appropriate models are available a sort of empathic interaction analysis can be performed that should allow a cognitive environment to be \"immersively\" connected with interacting entities, being able to predict actions they will take in given contextual situation.\u0000 Cognitive environments can take advantage of such an empathic interaction analysis in case they can be in communication with some of the humans involved in a given interaction, for example by using wireless terminals or varying message panels in a physical environment. 
In this case it comes out that it becomes interesting to study which architecture and processing methods can be used to design cognitive environments intelligence as a set of concurring continuous loo","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114702667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic fish classification for underwater species behavior understanding","authors":"C. Spampinato, D. Giordano, R. Salvo, Y. Chen-Burger, Robert B. Fisher, G. Nadarajan","doi":"10.1145/1877868.1877881","DOIUrl":"https://doi.org/10.1145/1877868.1877881","url":null,"abstract":"The aim of this work is to propose an automatic fish classification system that operates in the natural underwater environment to assist marine biologists in understanding subehavior. Fish classification is performed by combining two types of features: 1) Texture features extracted by using statistical moments of the gray-level histogram, spatial Gabor filtering and properties of the co-occurrence matrix and 2) Shape Features extracted by using the Curvature Scale Space transform and the histogram of Fourier descriptors of boundaries. An affine transformation is also applied to the acquired images to represent fish in 3D by multiple views for the feature extraction. The system was tested on a database containing 360 images of ten different species achieving as average correct rate of about 92%. Then, fish trajectories extracted using the proposed fish classification combined with a tracking system, are analyzed in order to understand anomalous behavior. In detail, the tracking layer computer fish trajectories, the classification layer associates trajectories to fish species and then by clustering these trajectories we are able to detect unusual fish behaviors to be further investigated by marine biologists.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121773932","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human gesture recognition using 3.5-dimensional trajectory features for hands-free user interface","authors":"Masaki Takahashi, Mahito Fujii, M. Naemura, S. Satoh","doi":"10.1145/1877868.1877872","DOIUrl":"https://doi.org/10.1145/1877868.1877872","url":null,"abstract":"We present a new human motion recognition technique for a hands-free user interface. Although many motion recognition technologies for video sequences have been reported, no man-machine interface that recognizes enough variety of motions has been developed. The difficulty was the lack of spatial information that could be acquired from video sequences captured by a normal camera. The proposed system uses a depth image in addition to a normal grayscale image from a time-of-flight camera that measures the depth to objects, so various motions are accurately recognized. The main functions of this system are gesture recognition and posture measurement. The former is performed using the bag-of-words approach. The trajectories of tracked key points around the human body are used as features in this approach. The main technical contribution of the proposed method is the use of 3.5D spatiotemporal trajectory features, which contain horizontal, vertical, time, and depth information. The latter is obtained through face detection and object tracking technology. The proposed user interface is useful and natural because it does not require any contact-type devices, such as a motion sensor controller. The effectiveness of the proposed 3.5D spatiotemporal features was confirmed through a comparative experiment with conventional 3.0D spatiotemporal features. The generality of the system was proven by an experiment with multiple people. The usefulness of the system as a pointing device was also proven by a practical simulation.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134187698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining inertial and visual sensing for human action recognition in tennis","authors":"Ciarán Ó Conaire, Damien Connaghan, Philip Kelly, N. O’Connor, Mark Gaffney, J. Buckley","doi":"10.1145/1877868.1877882","DOIUrl":"https://doi.org/10.1145/1877868.1877882","url":null,"abstract":"In this paper, we present a framework for both the automatic extraction of the temporal location of tennis strokes within a match and the subsequent classification of these as being either a serve, forehand or backhand. We employ the use of low-cost visual sensing and low-cost inertial sensing to achieve these aims, whereby a single modality can be used or a fusion of both classification strategies can be adopted if both modalities are available within a given capture scenario. This flexibility allows the framework to be applicable to a variety of user scenarios and hardware infrastructures. Our proposed approach is quantitatively evaluated using data captured from elite tennis players. Results point to the extremely accurate performance of the proposed approach irrespective of input modality configuration","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"436 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132525309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human pose estimation with implicit shape models","authors":"Jürgen Müller, Michael Arens","doi":"10.1145/1877868.1877873","DOIUrl":"https://doi.org/10.1145/1877868.1877873","url":null,"abstract":"We address the problem of articulated 2D human pose estimation in natural images. A well-known person detector -- the Implicit Shape Model (ISM) approach introduced by Leibe et al. -- is shown not only to be well suited to detect persons, but can also be exploited to derive a person's pose. Therefore, we extend the original voting approach of ISM and let all visual words that contribute to a person hypothesis also vote for the positions of the person's body parts. Since this approach is not constrained to a certain feature type and different feature types can even be fused during the pose estimation process, the approach is highly flexible. We show preliminary evaluation results of our approach using on the public available HumanEva dataset which comprises ground-truth pose data and thereby provides training and evaluation data.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129904882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implicit visual concept modeling in image / video annotation","authors":"K. Ntalianis, A. Doulamis, N. Tsapatsoulis","doi":"10.1145/1877868.1877878","DOIUrl":"https://doi.org/10.1145/1877868.1877878","url":null,"abstract":"In this paper a novel approach for automatically annotating image databases is proposed. Despite most current approaches that are just based on spatial content analysis, the proposed method properly combines implicit feedback information and visual concept models for semantically annotating images. Our method can be easily adopted by any multimedia search engine, providing an intelligent way to even annotate completely non-annotated content. The proposed approach currently provides very interesting results in limited-content environments and it is expected to add significant value to billions of non-annotated images existing in the Web. Furthermore expert annotators can gain important knowledge relevant to user new trends, language idioms and styles of searching.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134636302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Event detection in video using motion analysis","authors":"Ricardo Castellanos, H. Kalva, Oge Marques, B. Furht","doi":"10.1145/1877868.1877884","DOIUrl":"https://doi.org/10.1145/1877868.1877884","url":null,"abstract":"Digital video is being used widely in a variety of applications such as entertainment, surveillance and security. Large amount of video in surveillance and security requires systems capable of processing video to automatically detect and recognize events to alleviate the load on humans and enable preventive actions when events are detected. The main objective of this work is the analysis of computer vision techniques and algorithms to perform automatic detection of specific events in video sequences. This paper presents a surveillance system based on motion analysis and introduces the idea of event probability zones. Advantages, limitations, capabilities and possible solution alternatives are also discussed. The result is a system capable of detecting events of objects moving in opposing direction in a predefined context or running in the scene; the results showed precision greater than 50% and recall greater than 80%.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131046581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-sensor registration for objects motion detection","authors":"L. Cinque, F. Renzo, G. Foresti, C. Micheloni, Gabriele Morrone","doi":"10.1145/1877868.1877890","DOIUrl":"https://doi.org/10.1145/1877868.1877890","url":null,"abstract":"The first step in order to achieve low-level multi-sensor fusion is the registration of images from multiple types of sensors. This is a very important task: it can be useful to improve the detection or the tracking of a moving object in an area. Putting together the information of an IR (infrared) and a visual camera we can use the information of the heat emanated from a human body to detect a pedestrian in the video.\u0000 Basically we can align the IR and visual data knowing the calibration of the sensors, and always moving them together. In a real situation, it can be useful to align the images without imposing anything on the starting condition of the cameras and their relative position. In this paper, we present a method to automatically register IR with visual image data. The method uses geometric structures that are matched with a partial graph matching algorithm. We also introduce an iterative method to map IR and visual sequences using the homography matrix between frames. This method can be used to improve the multi-sensor motion detection: from an initial detection of a moving object in the visual image we can obtain the corresponding region in the thermal image.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"188 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123323813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tools for semi-automatic monitoring of industrial workflows","authors":"R. Mörzinger, M. Sardis, Igor Rosenberg, H. Grabner, G. Veres, Imed Bouchrika, M. Thaler, René Schuster, Albert Hofmann, G. Thallinger, Vasilios Anagnostopoulos, D. Kosmopoulos, A. Voulodimos, C. Lalos, N. Doulamis, T. Varvarigou, Rolando Palma Zelada, Ignacio Jubert Soler, S. Stalder, L. Gool, L. Middleton, Z. Sabeur, Banafshe Arbab-Zavar, J. Carter, M. Nixon","doi":"10.1145/1877868.1877889","DOIUrl":"https://doi.org/10.1145/1877868.1877889","url":null,"abstract":"This paper describes a tool chain for monitoring complex workflows. Statistics obtained from automatic workflow monitoring in a car assembly environment assist in improving industrial safety and process quality. To this end, we propose automatic detection and tracking of humans and their activity in multiple networked cameras. The described tools offer human operators retrospective analysis of a huge amount of pre-recorded and analyzed footage from multiple cameras in order to get a comprehensive overview of the workflows. Furthermore, the tools help technical administrators in adjusting algorithms by letting the user correct detections (for relevance feedback) and ground truth for evaluation. Another important feature of the tool chain is the capability to inform the employees about potentially risky conditions using the tool for automatic detection of unusual scenes.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128856591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-object particle filter tracking with automatic event analysis","authors":"Yifan Zhou, J. Benois-Pineau, H. Nicolas","doi":"10.1145/1877868.1877876","DOIUrl":"https://doi.org/10.1145/1877868.1877876","url":null,"abstract":"The automatic video content analysis is an important step to provide the content-based video coding, indexing and retrieval. It is also a key issue to the event analysis in video surveillance. In this paper, an automatic event analysis approach is presented. It is based on our previous method of Multi-object Particle Filter Tracking with Dual Consistency Check. The multiple non-rigid objects are first tracked individually in parallel by multi-resolution technique and particle filter method. The events including object presence and occlusion identification are then detected and analyzed by measuring the Goodness-of-Fit Coefficient based on Schwartz's inequality and the Backward Projection. The method is then tested in different indoor and outdoor environments with cluttered background. The experimental results show the robustness and the effectiveness of the method.","PeriodicalId":360789,"journal":{"name":"ACM/IEEE international workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Stream","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126494455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}