Imen Kadri, G. Dauphin, Anissa Zergaïnoh-Mokraoui, Z. Lachiri
{"title":"Stereoscopic Image Coding Using a Global Disparity Estimation Algorithm Optimizing the Compensation Scheme Impact","authors":"Imen Kadri, G. Dauphin, Anissa Zergaïnoh-Mokraoui, Z. Lachiri","doi":"10.23919/spa50552.2020.9241244","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241244","url":null,"abstract":"This paper focuses on the disparity-compensated stereoscopic image coding. This coding scheme is the most implemented technique taking advantage of the typical stereoscopic image redundancy. Namely it first predicts one view using the other view and a disparity map. A compensated view is then computed by coding the remaining difference between the view and its prediction. This paper concerns the computation of the disparity map combining two existing techniques. The first one, more efficient at low bitrate, is an iterative search reducing the bit cost of losslessly storing the disparity map at the expense of a small increase in distortion. The second one, more efficient at high bitrate, selects disparities minimizing the distortions of the compensated view, assuming that JPEG is used in the compensation. This combination appears to be fruitful, achieving increased performance at low- and mid-range bitrate, when tested on a few stereoscopic images and compared to these two techniques.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"204 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116180933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selected methods of parametrization in problem of automatic classification classical music from the Renaissance era against the classical works from other eras","authors":"M. Walczynski, Patryk Grzybała","doi":"10.23919/spa50552.2020.9241302","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241302","url":null,"abstract":"In this article we present the results of our work in the field of automatic classification of classical music pieces. The studied works were compositions of classical music composed in four eras: Renaissance, Baroque, Classicism and Romanticism. In the work we described selected methods of parameterization of music files, so that they emphasize the characteristic of Renaissance. The parameters we use are of a horizontal nature, i.e. they do not penetrate the vertical structure of the piece (e.g. chords progression). We used a base of 571 works of classical music, both secular and religious. The files were stored in MusicXML format and contained 187 Renaissance pieces, 146 Baroque, 119 classics and 119 stylistically belonging to the Romantic era, respectively. The results of the studies were presented using 4, 13 and 113 parameters. An artificial neural network and Support Vector Machine were used to classify the era to which the song belongs.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124209462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved pedestrian detection by adjustment of segmented ROI in thermal night vision","authors":"Karol Piniarski, P. Pawlowski, A. Dabrowski","doi":"10.23919/spa50552.2020.9241295","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241295","url":null,"abstract":"In this work we present an analysis of the region of interest (ROI) generation from the source thermal night vision images through the double thresholding segmentation technique as a part of pedestrian detection procedure. In some cases, pedestrians do not fit into the generated ROIs. To solve the problem we propose to adjust (slightly enlarge) the segmented ROI. Through this, it is possible to reduce miss rate for the aggregated channel feature (ACF) classifier from 29.1% to 24.8% and for the deep convolutional neural network (CNN) classifier from 24.0% to 22.4%, with negligible impact on the processing time.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123639551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mustafa Othman, Ken Chen, Anissa Zergaïnoh-Mokraoui
{"title":"A Study of QoE-Aware Adaptation Mechanism for DASH Video Streaming based on Objective Visual Quality Assessment","authors":"Mustafa Othman, Ken Chen, Anissa Zergaïnoh-Mokraoui","doi":"10.23919/spa50552.2020.9241250","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241250","url":null,"abstract":"Dynamic Adaptive Streaming over HTTP (DASH) is a largely used video streaming technique. One key point is its adaptation mechanism which resides at the client’s side. This allows various context-aware adaptation strategies in order to optimize the overall Quality of Experience (QoE) of the video streaming. In this paper, we present a study on an adaptation mechanism which uses an objective visual quality assessment, namely the Structural Similarity Index Measurement (SSIM) metric, as a key criterion for adaptation. More specifically, the SSIM helps to maximize the effective use of the available bandwidth, in the sense that we adopt a higher bitrate not only because it is allowed by network conditions, but also because it does bring a significant visual quality improvement (measured through SSIM metric). In this way, an upgrade in bandwidth consumption will be allowed only if there is a real contribution to visual quality. This study has been tested through a series of experimental results obtained with several strategies for the choice of the threshold value. Our tests are all based on real mobile-network traffic traces and real video sequences.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132324580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible hardware architectures for robust Cyberphysical systems","authors":"M. Hübner","doi":"10.23919/spa50552.2020.9241278","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241278","url":null,"abstract":"Cyberphysical systems need to handle complex applications in different domains. The challenge is the parameterization and composition of the resources of such systems, which traditionally needs to be done at design time. This requires exploring a large design space and therefore does not lead to an optimized setup. Run-time adaptive systems can find an optimal point of operation at run-time, which is an advantage. This talk shows possible architectures which allow such flexibility and discusses future solutions.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"12 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113967829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High performance multiplier-less pipelined FPGA architecture for 2-D non-separable quaternionic filter banks","authors":"Eugene V. Rybenkov, N. Petrovsky","doi":"10.23919/spa50552.2020.9241273","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241273","url":null,"abstract":"This paper presents a systematic design of the 2-D non-separable quaternionic paraunitary filter banks ($Q$-PUFB) based on the integer-to-integer invertible quaternionic multiplier applied to image processing. In order to achieve higher transform coding gains in the multidimensional domain with a relatively low-complexity implementation, the orthogonal transform 8-channel $Q$-PUFB is factorized into two-dimensional non-separable structures called “64in-64out” (2D NS $Q$-PUFB). The given structures can be mapped directly to a parallel-pipelined processor architecture with a minimal latency time of $4(N+1)$ quaternion multiplication operations, where $N$ is the transform order of the $Q$-PUFB. The latency of parallel-pipelined processing does not depend on the size of the original image. Experimental design results on resource utilization and total throughput are obtained using a Xilinx UltraScale+ FPGA series. The system prototype total throughput varies from 13.8 up to 55 million pixels per second and depends on fixed-point constraints.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133164622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Victory of Orthogonality","authors":"G. Strang","doi":"10.23919/spa50552.2020.9241274","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241274","url":null,"abstract":"The equation $A x=0$ tells us that $x$ is perpendicular to every row of $A$, and therefore to the whole row space of $A$. This is fundamental, but singular vectors $v$ in the row space do more. (1) the $v$’s are orthogonal (2) the vectors $u=A v$ are also orthogonal (in the column space of $A$). Those $v$’s and $u$’s are columns in the singular value decomposition $A V=U \\Sigma$. They are eigenvectors of $A^{T} A$ and $A A^{T}$, perfect for applications. We can list 10 reasons why orthogonal matrices like $U$ and $V$ are best for computation - and also for understanding. Fortunately the product of orthogonal matrices $V_{1} V_{2}$ is also an orthogonal matrix. As long as our measure of length is $\\|v\\|^{2}=v_{1}^{2}+\\ldots+v_{n}^{2}$, orthogonal vectors and orthogonal matrices will win.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"349 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133182236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constructing a Dataset of Speech Recordings with Lombard Effect","authors":"Dawid Weber, Szymon Zaporowski, Daniel Korzekwa","doi":"10.23919/spa50552.2020.9241266","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241266","url":null,"abstract":"The purpose of the recordings was to create a speech corpus based on the ISLE dataset, extended with video and Lombard speech. Selected from a set of 165 sentences, 10, evaluated as having the highest possibility to occur in the context of the Lombard effect, were repeated in the presence of the so-called babble speech to obtain Lombard speech features. Altogether, 15 speakers were recorded, and speech parameters were calculated and analyzed. First, a brief summary of the research related to the Lombard effect is given. Then, the recording studio characteristics and the equipment utilized for recordings are shown. Examples of analyses are included, concerning both non-Lombard and Lombard speech. Finally, a recapitulation of experiments performed along with further research plans is given. The link to the data is also provided.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114195740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of two methods of sound extraction from guitar string video recordings","authors":"Marta Stefaniak, A. Czyżewski","doi":"10.23919/spa50552.2020.9241261","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241261","url":null,"abstract":"A comparison of two sound extraction methods from guitar string video recordings is presented in the paper. A brief overview of high frame rate camera technology and possible applications are included. The method using the image analysis from two such cameras is presented. The cameras are placed at the angle of 90 degrees for recording the image in three planes. The results achieved with the setup proposed by ourselves are compared to the results of recording with a single high frame rate camera used for the Visual Microphone method developed by scientists from MIT. Spectrograms and signal spectra of recordings were compared and discussed, revealing that both methods of sound extraction from video brought the ability to reproduce sound, but with some distortions. Finally, the options for future experiments are considered.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115606769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"1D convolutional context-aware architectures for acoustic sensing and recognition of passing vehicle type","authors":"A. Kurowski, Szymon Zaporowski, A. Czyżewski","doi":"10.23919/spa50552.2020.9241256","DOIUrl":"https://doi.org/10.23919/spa50552.2020.9241256","url":null,"abstract":"A network architecture that may be employed for sensing and recognizing the type of a vehicle on the basis of audio recordings made in the proximity of a road is proposed in the paper. The analyzed road traffic consists of both passenger cars and heavier vehicles. Excerpts from recordings that do not contain sounds of passing vehicles are also taken into account and marked as ones containing silence. The neural network architecture employed for these tasks is a 1D convolutional network. Two types of classifiers are tested: one analyzing only the current audio frame and one analyzing three consecutive audio frames, which allows us to take into account the context in which the middle frame occurs. The neural network is trained on datasets derived for four frame lengths, namely 50 ms, 100 ms, 200 ms, and 400 ms. Results of a statistical analysis of the classification accuracy of both networks are presented. The context-aware variant of the neural network performed better in a statistically significant manner for three out of the four investigated frame lengths.","PeriodicalId":157578,"journal":{"name":"2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116258830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}