A Multimodal Audio Visible and Infrared Surveillance System (MAVISS)
A. Mittal, P. Kumar
2005 3rd International Conference on Intelligent Sensing and Information Processing, 2005-12-14
DOI: 10.1109/ICISIP.2005.1619428 (https://doi.org/10.1109/ICISIP.2005.1619428)
Citation count: 9
Abstract
This paper presents a low-cost surveillance system that uses multimodal information (visible, infrared, and audio signals) to monitor small areas and detect alarming events. To ensure efficient and robust operation, the system captures different aspects of the environment using audio and video information. Infrared imagery is used at night and in other low-light situations. The visual processing module uses a motion-based approach to detect objects and a Kalman filter model to track their motion. Environmental sound is recognized by processing audio signals to extract features in the form of Mel-frequency cepstral coefficients (MFCCs), which are then classified using the dynamic time warping (DTW) technique. Semantic rules are proposed to identify alarming events using information from the audio and video modules. Experimental results are shown on some typical sequences and on a publicly available dataset.
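The abstract does not say how the DTW classifier is set up, so the following is only a minimal sketch of the general idea: a query sequence of feature vectors is compared against labeled templates, and the label with the smallest DTW distance is chosen. The function names (`dtw_distance`, `classify`), the nearest-template decision rule, the Euclidean frame distance, and the toy 2-dimensional vectors standing in for MFCC frames are all assumptions made for illustration, not details taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors.

    The frame-to-frame cost is Euclidean distance (an assumption; the
    paper's abstract does not name the local distance it uses).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Hypothetical toy templates; real input would be per-frame MFCC vectors.
templates = {
    "rising": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)],
    "flat": [(1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0)],
}

# A time-stretched version of the "rising" template.
query = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 2.0), (3.0, 3.0)]
label = classify(query, templates)
```

Because DTW allows frames to be repeated during alignment, the stretched query matches the "rising" template at zero cost even though the two sequences differ in length. That tolerance to differences in timing is what makes DTW a natural fit for comparing sound events of varying duration.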