Music Document Layout Analysis through Machine Learning and Human Feedback

2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2017-11-01 DOI:10.1109/ICDAR.2017.259

Jorge Calvo-Zaragoza, Kecheng Zhang, Z. Saleh, Gabriel Vigliensoni, Ichiro Fujinaga

Citations: 4

Abstract

Music documents often include musical symbols as well as other relevant elements such as staff lines, text, and decorations. To detect and separate these constituent elements, we propose a layout analysis framework based on machine learning that focuses on pixel-level classification of the image. For that, we make use of supervised learning classifiers trained to infer the category of each pixel. In addition, our scenario considers a human-aided computing approach in which the user is part of the recognition loop, providing feedback where relevant errors are made.
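The abstract does not say which classifiers or features the authors use, so the following is only a minimal sketch of the general idea under stated assumptions: each pixel is described by the values in a small window around it, a supervised classifier predicts a category per pixel, and user corrections are added to the training set before retraining. The nearest-centroid classifier is a simple stand-in chosen for brevity, and every name here (`patch_features`, `NearestCentroid`, `classify_image`, `retrain_with_feedback`) is hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def patch_features(image, row, col, radius=1):
    """Flatten the (2*radius+1)^2 window around a pixel; out-of-bounds reads as 0."""
    height, width = len(image), len(image[0])
    feats = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < height and 0 <= c < width
            feats.append(image[r][c] if inside else 0)
    return feats


class NearestCentroid:
    """Stand-in supervised classifier (an assumption): one mean vector per class."""

    def fit(self, samples, labels):
        sums, counts = {}, defaultdict(int)
        for x, y in zip(samples, labels):
            if y not in sums:
                sums[y] = [0.0] * len(x)
            sums[y] = [s + v for s, v in zip(sums[y], x)]
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
        return self

    def predict(self, x):
        def squared_distance(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=squared_distance)


def classify_image(model, image, radius=1):
    """Return a label map with one predicted category per pixel."""
    return [[model.predict(patch_features(image, r, c, radius))
             for c in range(len(image[0]))]
            for r in range(len(image))]


def retrain_with_feedback(model, samples, labels, image, corrections, radius=1):
    """Append user corrections {(row, col): label} to the training data and refit.

    Note: `samples` and `labels` are extended in place.
    """
    for (r, c), label in corrections.items():
        samples.append(patch_features(image, r, c, radius))
        labels.append(label)
    return model.fit(samples, labels)


# Toy binary image: a single horizontal "staff line" on an empty background.
toy_image = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
train_samples = [patch_features(toy_image, 1, 2), patch_features(toy_image, 0, 2)]
train_labels = ["staff", "background"]
toy_model = NearestCentroid().fit(train_samples, train_labels)

# The user flags pixel (2, 2) as background; the model is refit with that label.
retrain_with_feedback(toy_model, train_samples, train_labels,
                      toy_image, {(2, 2): "background"})
label_map = classify_image(toy_model, toy_image)
```

In this toy setup the pixel below the line is ambiguous with only two training samples, and a single correction is enough to resolve it. A real system would presumably use larger windows, more categories (symbols, text, decorations), and a stronger classifier.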
