Can We Identify Unknown Audio Recording Environments in Forensic Scenarios?
Authors: Denise Moussa, Germans Hirsch, Christian Riess
arXiv - CS - Sound, 2024-05-03
https://doi.org/arxiv-2405.02119
Citation count: 0
Abstract
Audio recordings may provide important evidence in criminal investigations.
One such case is the forensic association of the recorded audio with the
recording location. For example, a voice message may be the only investigative
cue to narrow down the candidate sites for a crime. To date, several works
have provided tools for closed-set recording environment classification under
relatively clean recording conditions. However, in forensic investigations, the
candidate locations are case-specific. Thus, closed-set tools are not
applicable without retraining on a sufficient number of training samples for
each case and respective candidate set. In addition, a forensic tool has to
deal with audio material from uncontrolled sources with variable properties and
quality. In this work, we therefore attempt a major step towards practical forensic
application scenarios. We propose a representation learning framework called
EnvId, short for environment identification. EnvId avoids case-specific
retraining. Instead, it is the first tool for robust few-shot classification of
unseen environment locations. We demonstrate that EnvId can handle forensically
challenging material. It provides good-quality predictions even under unseen
signal degradations, environment characteristics or recording position
mismatches. Our code and datasets will be made publicly available upon acceptance.
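
The abstract does not describe how EnvId performs few-shot classification, so the following is only a generic sketch of the idea, not the authors' method. It assumes that some pretrained encoder (not shown) has already mapped each recording to a fixed-length embedding, and it assigns a query recording to the candidate location whose mean embedding (prototype) is most similar under cosine similarity. The function names, location labels, and 3-dimensional vectors are all hypothetical and chosen for illustration.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_prototypes(support):
    """Average the few reference embeddings available per candidate location.

    support: dict mapping a location label to a list of embedding vectors.
    """
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Return the label whose prototype is most similar to the query embedding."""
    return max(
        prototypes,
        key=lambda label: cosine_similarity(query, prototypes[label]),
    )


# Hypothetical embeddings: two reference recordings per candidate location.
# In practice these would come from a learned encoder, which is assumed here.
support = {
    "stairwell": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "office": [[0.1, 0.9, 0.1], [0.0, 0.8, 0.2]],
}

prototypes = build_prototypes(support)
predicted = classify([0.7, 0.3, 0.1], prototypes)
print(predicted)
```

Because the candidate set enters only through the support embeddings, a new case with different locations needs a new `support` dictionary but no retraining of the encoder, which is the property the abstract highlights.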