{"title":"Machine Learning for Multiscale Video Coding","authors":"M. V. Gashnikov","doi":"10.3103/S1060992X23030037","DOIUrl":null,"url":null,"abstract":"<p>The research concerns the use of machine learning algorithms for multiscale coding of digital video sequences. Based on machine learning, the digital image coder is generalized to the coding of video sequences. To this end, we offer an algorithm that allows for videoframes interdependency by using linear regression. The generalized image coder uses multiscale representation of videoframes, neural network three-dimensional interpolation of multiscale videoframe interpretation levels and generative-adversarial neural net replacement of homogeneous portions of a videoframe by synthetic video data. The method of coding the entire video and method of coding videoframes are exemplified by block diagrams. Formalized description of how videoframe correlation is taken into account is given. Real video sequences are used to carry out numerical experiments. The experimental data allow us to make a conclusion about the promise of using the algorithm in video coding and processing.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"32 3","pages":"189 - 196"},"PeriodicalIF":1.0000,"publicationDate":"2023-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optical Memory and Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.3103/S1060992X23030037","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"OPTICS","Score":null,"Total":0}
引用次数: 0
Abstract
The research concerns the use of machine learning algorithms for multiscale coding of digital video sequences. Based on machine learning, the digital image coder is generalized to the coding of video sequences. To this end, we offer an algorithm that allows for videoframes interdependency by using linear regression. The generalized image coder uses multiscale representation of videoframes, neural network three-dimensional interpolation of multiscale videoframe interpretation levels and generative-adversarial neural net replacement of homogeneous portions of a videoframe by synthetic video data. The method of coding the entire video and method of coding videoframes are exemplified by block diagrams. Formalized description of how videoframe correlation is taken into account is given. Real video sequences are used to carry out numerical experiments. The experimental data allow us to make a conclusion about the promise of using the algorithm in video coding and processing.
期刊介绍:
The journal covers a wide range of issues in information optics such as optical memory, mechanisms for optical data recording and processing, photosensitive materials, optical, optoelectronic and holographic nanostructures, and many other related topics. Papers on memory systems using holographic and biological structures and concepts of brain operation are also included. The journal pays particular attention to research in the field of neural net systems that may lead to a new generation of computional technologies by endowing them with intelligence.