Title: Convolution Neural Networks of Dynamically Sized Filters with Modified Stochastic Gradient Descent Optimizer for Sound Classification
Authors: Manu Pratap Singh, Pratibha Rashmi
Journal: Journal of Computer Science, volume 20 7
DOI: 10.3844/jcssp.2024.69.87 (https://doi.org/10.3844/jcssp.2024.69.87)
Publication date: 2024-01-01
Publication type: Journal Article
Citation count: 0
Abstract
Deep Neural Networks (DNNs), and specifically Convolutional Neural Networks (CNNs), are well suited to sound classification because they can capture patterns in both the time and frequency domains. CNNs are mostly trained and tested on time-frequency patches of sound samples in the form of 2D pattern vectors. Existing pre-trained CNN models generally use static-sized filters in all convolution layers. In the present work, we consider three different types of CNN architectures with variable-size filters. The training-set pattern vectors, with time and frequency dimensions, are constructed from spectrogram input samples. In the proposed architectures, the size and the number of kernels vary on a scale of variable length, instead of using fixed-size filters and static channels. The paper further presents a reformulation of the minibatch stochastic gradient descent optimizer with adaptive learning rate parameters to suit the proposed architectures. The experimental results are obtained on an existing dataset of sound samples. The simulated results show that the proposed CNN architectures outperform existing pre-trained networks on the same dataset.
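To make the contrast between variable-size and static-size filters concrete, the following is a minimal sketch that traces how a spectrogram patch shrinks through a stack of convolution layers and how many parameters each layer holds. The abstract does not give the actual layer configurations, so the 128x128 patch, the kernel sizes (7, 5, 3), and the channel counts (16, 32, 64) are hypothetical values chosen only for illustration, as are the names `ConvSpec`, `conv_output_size`, and `trace_shapes`.

```python
from dataclasses import dataclass


@dataclass
class ConvSpec:
    """One convolution layer: square kernel side and number of kernels."""
    kernel: int
    channels: int
    stride: int = 1
    padding: int = 0


def conv_output_size(size: int, spec: ConvSpec) -> int:
    # Standard output-length formula for a convolution along one axis.
    return (size + 2 * spec.padding - spec.kernel) // spec.stride + 1


def trace_shapes(height, width, in_channels, specs):
    """Return (height, width, channels, parameter_count) after each layer."""
    shapes = []
    h, w, c = height, width, in_channels
    for spec in specs:
        h = conv_output_size(h, spec)
        w = conv_output_size(w, spec)
        # Each kernel has kernel*kernel*c weights plus one bias term.
        params = spec.channels * (spec.kernel * spec.kernel * c + 1)
        c = spec.channels
        shapes.append((h, w, c, params))
    return shapes


# Hypothetical configurations, NOT taken from the paper.
variable_specs = [ConvSpec(7, 16), ConvSpec(5, 32), ConvSpec(3, 64)]
static_specs = [ConvSpec(3, 32), ConvSpec(3, 32), ConvSpec(3, 32)]

for name, specs in (("variable", variable_specs), ("static", static_specs)):
    print(name, trace_shapes(128, 128, 1, specs))
```

In this sketch, larger early kernels cover a wider time-frequency neighbourhood per activation, while the growing channel count shifts most parameters to the later layers. Whether the paper's architectures follow this particular pattern is not stated in the abstract.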
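The abstract does not spell out how the minibatch stochastic gradient descent optimizer is reformulated, so the sketch below substitutes a generic, well-known adaptive rule (an AdaGrad-style per-parameter learning rate) purely to illustrate what "minibatch SGD with adaptive learning rate parameters" can look like. It fits a one-dimensional linear model on synthetic data; the function names, the data, and all hyperparameters are assumptions made for this example and are not the authors' method.

```python
import math
import random


def mse(xs, ys, w, b):
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def minibatch_adagrad(xs, ys, batch_size=4, base_lr=0.5, epochs=300,
                      eps=1e-8, seed=0):
    """Minibatch SGD where each parameter has its own adaptive step size."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    acc_w, acc_b = 0.0, 0.0  # running sums of squared gradients
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new minibatch partition every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                err = (w * xs[i] + b) - ys[i]
                grad_w += 2.0 * err * xs[i] / len(batch)
                grad_b += 2.0 * err / len(batch)
            # AdaGrad-style scaling: the step shrinks for parameters
            # that have already seen large gradients.
            acc_w += grad_w ** 2
            acc_b += grad_b ** 2
            w -= base_lr / (math.sqrt(acc_w) + eps) * grad_w
            b -= base_lr / (math.sqrt(acc_b) + eps) * grad_b
    return w, b


# Synthetic, noise-free data from y = 3x + 1 (illustrative only).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]

loss_before = mse(xs, ys, 0.0, 0.0)
w_fit, b_fit = minibatch_adagrad(xs, ys)
loss_after = mse(xs, ys, w_fit, b_fit)
print(loss_before, loss_after, w_fit, b_fit)
```

The design point this illustrates is that the step size is no longer a single global constant: it is derived per parameter from the gradient history, which is one common way to adapt the learning rate when layers differ in size and gradient scale.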
About the journal:
Journal of Computer Science aims to publish research articles on the theoretical foundations of information and computation, and on practical techniques for their implementation and application in computer systems. JCS is updated twelve times a year and is a peer-reviewed journal that covers the latest and most compelling research of the time.