听和看:端到端的声音分类和可视化分类声音

PeerJ preprints Pub Date : 2018-10-15 DOI:10.7287/peerj.preprints.27280v1

Thomas Miano

{"title":"听和看:端到端的声音分类和可视化分类声音","authors":"Thomas Miano","doi":"10.7287/peerj.preprints.27280v1","DOIUrl":null,"url":null,"abstract":"Machine learning is a field of study that uses computational and statistical techniques to enable computers to learn. When machine learning is applied, it functions as an instrument that can solve problems or expand knowledge about the surrounding world. Increasingly, machine learning is also an instrument for artistic expression in digital and non-digital media. While painted art has existed for thousands of years, the oldest digital art is less than a century old. Digital media as an art form is a relatively nascent, and the practice of machine learning in digital art is even more recent. Across all artistic media, a piece is powerful when it can captivate its consumer. Such captivation can be elicited through through a wide variety of methods including but not limited to distinct technique, emotionally evocative communication, and aesthetically pleasing combinations of textures. This work aims to explore how machine learning can be used simultaneously as a scientific instrument for understanding the world and as an artistic instrument for inspiring awe. Specifically, our goal is to build an end-to-end system that uses modern machine learning techniques to accurately recognize sounds in the natural environment and to communicate via visualization those sounds that it has recognized. We validate existing research by finding that convolutional neural networks, when paired with transfer learning using out-of-domain data, can be successful in mapping an image classification task to a sound classification task. Our work offers a novel application where the model used for performant sound classification is also used for visualization in an end-to-end, sound-to-image system.","PeriodicalId":93040,"journal":{"name":"PeerJ preprints","volume":"44 1","pages":"e27280"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Hear and See: End-to-end sound classification and visualization of classified sounds\",\"authors\":\"Thomas Miano\",\"doi\":\"10.7287/peerj.preprints.27280v1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning is a field of study that uses computational and statistical techniques to enable computers to learn. When machine learning is applied, it functions as an instrument that can solve problems or expand knowledge about the surrounding world. Increasingly, machine learning is also an instrument for artistic expression in digital and non-digital media. While painted art has existed for thousands of years, the oldest digital art is less than a century old. Digital media as an art form is a relatively nascent, and the practice of machine learning in digital art is even more recent. Across all artistic media, a piece is powerful when it can captivate its consumer. Such captivation can be elicited through through a wide variety of methods including but not limited to distinct technique, emotionally evocative communication, and aesthetically pleasing combinations of textures. This work aims to explore how machine learning can be used simultaneously as a scientific instrument for understanding the world and as an artistic instrument for inspiring awe. Specifically, our goal is to build an end-to-end system that uses modern machine learning techniques to accurately recognize sounds in the natural environment and to communicate via visualization those sounds that it has recognized. We validate existing research by finding that convolutional neural networks, when paired with transfer learning using out-of-domain data, can be successful in mapping an image classification task to a sound classification task. Our work offers a novel application where the model used for performant sound classification is also used for visualization in an end-to-end, sound-to-image system.\",\"PeriodicalId\":93040,\"journal\":{\"name\":\"PeerJ preprints\",\"volume\":\"44 1\",\"pages\":\"e27280\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-10-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"PeerJ preprints\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.7287/peerj.preprints.27280v1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ preprints","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.7287/peerj.preprints.27280v1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

机器学习是一个研究领域，它使用计算和统计技术使计算机能够学习。当机器学习被应用时，它可以作为一种工具来解决问题或扩展对周围世界的知识。机器学习也越来越多地成为数字和非数字媒体艺术表达的工具。虽然绘画艺术已经存在了数千年，但最古老的数字艺术还不到一个世纪。数字媒体作为一种艺术形式是相对新生的，而机器学习在数字艺术中的实践更是最近才出现的。在所有的艺术媒介中，当一件作品能够吸引消费者时，它就是强大的。这种魅力可以通过各种各样的方法来激发，包括但不限于独特的技术，情感上唤起的交流，以及美学上令人愉悦的纹理组合。这项工作旨在探索如何将机器学习同时用作理解世界的科学工具和激发敬畏的艺术工具。具体来说，我们的目标是建立一个端到端的系统，该系统使用现代机器学习技术来准确识别自然环境中的声音，并通过可视化来交流它所识别的声音。我们通过发现卷积神经网络与使用域外数据的迁移学习配对，可以成功地将图像分类任务映射到声音分类任务，从而验证了现有的研究。我们的工作提供了一种新的应用，其中用于性能声音分类的模型也用于端到端声音到图像系统的可视化。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Hear and See: End-to-end sound classification and visualization of classified sounds

Machine learning is a field of study that uses computational and statistical techniques to enable computers to learn. When machine learning is applied, it functions as an instrument that can solve problems or expand knowledge about the surrounding world. Increasingly, machine learning is also an instrument for artistic expression in digital and non-digital media. While painted art has existed for thousands of years, the oldest digital art is less than a century old. Digital media as an art form is a relatively nascent, and the practice of machine learning in digital art is even more recent. Across all artistic media, a piece is powerful when it can captivate its consumer. Such captivation can be elicited through through a wide variety of methods including but not limited to distinct technique, emotionally evocative communication, and aesthetically pleasing combinations of textures. This work aims to explore how machine learning can be used simultaneously as a scientific instrument for understanding the world and as an artistic instrument for inspiring awe. Specifically, our goal is to build an end-to-end system that uses modern machine learning techniques to accurately recognize sounds in the natural environment and to communicate via visualization those sounds that it has recognized. We validate existing research by finding that convolutional neural networks, when paired with transfer learning using out-of-domain data, can be successful in mapping an image classification task to a sound classification task. Our work offers a novel application where the model used for performant sound classification is also used for visualization in an end-to-end, sound-to-image system.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

PeerJ preprints

自引率

0.00%

发文量