Computer-aided classification of bowhead whale call categories for mitigation monitoring
D. Mathias, A. Thode, S. B. Blackwell, C. Greene
2008 New Trends for Environmental Monitoring Using Passive Systems, published 2008-10-01
DOI: 10.1109/PASSIVE.2008.4786985 (https://doi.org/10.1109/PASSIVE.2008.4786985)
Citations: 5
Abstract
Since 2001, Directional Autonomous Seafloor Acoustic Recorders (DASARs) have been used to record and localize bowhead whale (Balaena mysticetus) calls during the whales' annual migration. In 2007, DASARs were deployed at 35 locations over a 280 km swath of the Beaufort Sea during seismic exploration activities (Fig. 1) to monitor potential changes in the animals' location and/or acoustic activity. The large amount of acoustic data generated (about 50 days per DASAR) motivated the development of computer-aided methods to assist in detecting and classifying bowhead whale calls. Bowhead whale calls can be classified in various ways. Here, we divide calls into six categories: (1) upsweeps, (2) downsweeps, (3) constant calls, (4) u-shaped and (5) n-shaped undulated calls, and (6) complex calls, a catch-all category that covers both frequency-modulated calls with multiple inflections and amplitude-modulated calls such as warbles, growls, and similar sounds. In addition, walrus and bearded seal calls can produce similar features in a spectrogram, so these two species are added as categories, yielding a total of eight classification categories. The frequency range, duration, and fine structure of individual calls vary considerably even within each category, which makes simple matched filtering or spectrogram correlation difficult to apply. A manually reviewed test dataset was assembled, containing examples from each call category, arranged by signal-to-noise ratio (SNR) in 5 dB bins ranging from 5 to 40 dB. The dataset was then used to test several methods for extracting relevant parameters from the signal for subsequent classification. Contour tracing methods that estimate frequency bandwidth, inflection points, and duration were examined, as well as other boundary descriptors that utilize standard image segmentation techniques. An optimization procedure was then used to determine appropriate decision boundaries for optimum statistical classifiers.
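To make the contour-based parameters concrete, the following is a minimal sketch, not the authors' method. It assumes a call contour has already been traced as a list of (time, frequency) samples with strictly increasing times, and it treats an "inflection point" as a sign change in the contour slope. The function names and the two thresholds (`slope_tol`, `flat_bw`) are hypothetical; the abstract does not give the actual decision boundaries, which were found by optimization. Amplitude-modulated calls and the walrus and bearded seal categories are outside what this sketch can separate.

```python
from typing import List, Sequence, Tuple


def contour_features(times: Sequence[float], freqs: Sequence[float],
                     slope_tol: float = 5.0) -> Tuple[float, float, List[int]]:
    """Return (duration in s, bandwidth in Hz, slope signs with flat steps dropped).

    Assumes times are strictly increasing. slope_tol (Hz/s) is a
    hypothetical tolerance below which a step counts as flat.
    """
    if len(times) != len(freqs) or len(times) < 2:
        raise ValueError("need two or more matching time/frequency samples")
    duration = times[-1] - times[0]
    bandwidth = max(freqs) - min(freqs)
    signs: List[int] = []
    for i in range(1, len(freqs)):
        slope = (freqs[i] - freqs[i - 1]) / (times[i] - times[i - 1])
        if abs(slope) > slope_tol:
            signs.append(1 if slope > 0 else -1)
    return duration, bandwidth, signs


def count_turns(signs: Sequence[int]) -> int:
    """Count slope sign changes, used here as a stand-in for inflection points."""
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def classify_contour(times: Sequence[float], freqs: Sequence[float],
                     flat_bw: float = 10.0, slope_tol: float = 5.0) -> str:
    """Map a contour to one of the six call shapes using illustrative rules."""
    _, bandwidth, signs = contour_features(times, freqs, slope_tol)
    if bandwidth < flat_bw or not signs:  # flat_bw (Hz) is a hypothetical threshold
        return "constant"
    turns = count_turns(signs)
    if turns == 0:
        return "upsweep" if signs[0] > 0 else "downsweep"
    if turns == 1:
        # Falling then rising is u-shaped; rising then falling is n-shaped.
        return "u-shaped" if signs[0] < 0 else "n-shaped"
    return "complex"


if __name__ == "__main__":
    t = [0.0, 0.25, 0.5, 0.75, 1.0]
    print(classify_contour(t, [100, 120, 140, 160, 180]))
    print(classify_contour(t, [180, 140, 120, 140, 180]))
```

Fixed thresholds like these would be fragile on noisy contours, which is consistent with the abstract's choice to tune decision boundaries on an SNR-binned test set rather than set them by hand.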