Johnny Reilly, John D Goodwin, Sihao Lu, Andriy S Kozlov
{"title":"用于自然刺激合成的双向生成对抗表征学习。","authors":"Johnny Reilly, John D Goodwin, Sihao Lu, Andriy S Kozlov","doi":"10.1152/jn.00421.2023","DOIUrl":null,"url":null,"abstract":"<p><p>Thousands of species use vocal signals to communicate with one another. Vocalizations carry rich information, yet characterizing and analyzing these complex, high-dimensional signals is difficult and prone to human bias. Moreover, animal vocalizations are ethologically relevant stimuli whose representation by auditory neurons is an important subject of research in sensory neuroscience. A method that can efficiently generate naturalistic vocalization waveforms would offer an unlimited supply of stimuli with which to probe neuronal computations. Although unsupervised learning methods allow for the projection of vocalizations into low-dimensional latent spaces learned from the waveforms themselves, and generative modeling allows for the synthesis of novel vocalizations for use in downstream tasks, we are not aware of any model that combines these tasks to synthesize naturalistic vocalizations in the waveform domain for stimulus playback. In this paper, we demonstrate BiWaveGAN: a bidirectional generative adversarial network (GAN) capable of learning a latent representation of ultrasonic vocalizations (USVs) from mice. We show that BiWaveGAN can be used to generate, and interpolate between, realistic vocalization waveforms. We then use these synthesized stimuli along with natural USVs to probe the sensory input space of mouse auditory cortical neurons. We show that stimuli generated from our method evoke neuronal responses as effectively as real vocalizations, and produce receptive fields with the same predictive power. BiWaveGAN is not restricted to mouse USVs but can be used to synthesize naturalistic vocalizations of any animal species and interpolate between vocalizations of the same or different species, which could be useful for probing categorical boundaries in representations of ethologically relevant auditory signals.<b>NEW & NOTEWORTHY</b> A new type of artificial neural network is presented that can be used to generate animal vocalization waveforms and interpolate between them to create new vocalizations. We find that our synthetic naturalistic stimuli drive auditory cortical neurons in the mouse equally well and produce receptive field features with the same predictive power as those obtained with natural mouse vocalizations, confirming the quality of the stimuli produced by the neural network.</p>","PeriodicalId":16563,"journal":{"name":"Journal of neurophysiology","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Bidirectional generative adversarial representation learning for natural stimulus synthesis.\",\"authors\":\"Johnny Reilly, John D Goodwin, Sihao Lu, Andriy S Kozlov\",\"doi\":\"10.1152/jn.00421.2023\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Thousands of species use vocal signals to communicate with one another. Vocalizations carry rich information, yet characterizing and analyzing these complex, high-dimensional signals is difficult and prone to human bias. Moreover, animal vocalizations are ethologically relevant stimuli whose representation by auditory neurons is an important subject of research in sensory neuroscience. A method that can efficiently generate naturalistic vocalization waveforms would offer an unlimited supply of stimuli with which to probe neuronal computations. Although unsupervised learning methods allow for the projection of vocalizations into low-dimensional latent spaces learned from the waveforms themselves, and generative modeling allows for the synthesis of novel vocalizations for use in downstream tasks, we are not aware of any model that combines these tasks to synthesize naturalistic vocalizations in the waveform domain for stimulus playback. In this paper, we demonstrate BiWaveGAN: a bidirectional generative adversarial network (GAN) capable of learning a latent representation of ultrasonic vocalizations (USVs) from mice. We show that BiWaveGAN can be used to generate, and interpolate between, realistic vocalization waveforms. We then use these synthesized stimuli along with natural USVs to probe the sensory input space of mouse auditory cortical neurons. We show that stimuli generated from our method evoke neuronal responses as effectively as real vocalizations, and produce receptive fields with the same predictive power. BiWaveGAN is not restricted to mouse USVs but can be used to synthesize naturalistic vocalizations of any animal species and interpolate between vocalizations of the same or different species, which could be useful for probing categorical boundaries in representations of ethologically relevant auditory signals.<b>NEW & NOTEWORTHY</b> A new type of artificial neural network is presented that can be used to generate animal vocalization waveforms and interpolate between them to create new vocalizations. We find that our synthetic naturalistic stimuli drive auditory cortical neurons in the mouse equally well and produce receptive field features with the same predictive power as those obtained with natural mouse vocalizations, confirming the quality of the stimuli produced by the neural network.</p>\",\"PeriodicalId\":16563,\"journal\":{\"name\":\"Journal of neurophysiology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2024-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of neurophysiology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1152/jn.00421.2023\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/8/28 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q3\",\"JCRName\":\"NEUROSCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of neurophysiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1152/jn.00421.2023","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/8/28 0:00:00","PubModel":"Epub","JCR":"Q3","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
Bidirectional generative adversarial representation learning for natural stimulus synthesis.
Thousands of species use vocal signals to communicate with one another. Vocalizations carry rich information, yet characterizing and analyzing these complex, high-dimensional signals is difficult and prone to human bias. Moreover, animal vocalizations are ethologically relevant stimuli whose representation by auditory neurons is an important subject of research in sensory neuroscience. A method that can efficiently generate naturalistic vocalization waveforms would offer an unlimited supply of stimuli with which to probe neuronal computations. Although unsupervised learning methods allow for the projection of vocalizations into low-dimensional latent spaces learned from the waveforms themselves, and generative modeling allows for the synthesis of novel vocalizations for use in downstream tasks, we are not aware of any model that combines these tasks to synthesize naturalistic vocalizations in the waveform domain for stimulus playback. In this paper, we demonstrate BiWaveGAN: a bidirectional generative adversarial network (GAN) capable of learning a latent representation of ultrasonic vocalizations (USVs) from mice. We show that BiWaveGAN can be used to generate, and interpolate between, realistic vocalization waveforms. We then use these synthesized stimuli along with natural USVs to probe the sensory input space of mouse auditory cortical neurons. We show that stimuli generated from our method evoke neuronal responses as effectively as real vocalizations, and produce receptive fields with the same predictive power. BiWaveGAN is not restricted to mouse USVs but can be used to synthesize naturalistic vocalizations of any animal species and interpolate between vocalizations of the same or different species, which could be useful for probing categorical boundaries in representations of ethologically relevant auditory signals.NEW & NOTEWORTHY A new type of artificial neural network is presented that can be used to generate animal vocalization waveforms and interpolate between them to create new vocalizations. We find that our synthetic naturalistic stimuli drive auditory cortical neurons in the mouse equally well and produce receptive field features with the same predictive power as those obtained with natural mouse vocalizations, confirming the quality of the stimuli produced by the neural network.
期刊介绍:
The Journal of Neurophysiology publishes original articles on the function of the nervous system. All levels of function are included, from the membrane and cell to systems and behavior. Experimental approaches include molecular neurobiology, cell culture and slice preparations, membrane physiology, developmental neurobiology, functional neuroanatomy, neurochemistry, neuropharmacology, systems electrophysiology, imaging and mapping techniques, and behavioral analysis. Experimental preparations may be invertebrate or vertebrate species, including humans. Theoretical studies are acceptable if they are tied closely to the interpretation of experimental data and elucidate principles of broad interest.