Jorge Calvo-Zaragoza, Kecheng Zhang, Z. Saleh, Gabriel Vigliensoni, Ichiro Fujinaga
{"title":"Music Document Layout Analysis through Machine Learning and Human Feedback","authors":"Jorge Calvo-Zaragoza, Kecheng Zhang, Z. Saleh, Gabriel Vigliensoni, Ichiro Fujinaga","doi":"10.1109/ICDAR.2017.259","DOIUrl":null,"url":null,"abstract":"Music documents often include musical symbols as well as other relevant elements such as staff lines, text, and decorations. To detect and separate these constituent elements, we propose a layout analysis framework based on machine learning that focuses on pixel-level classification of the image. For that, we make use of supervised learning classifiers trained to infer the category of each pixel. In addition, our scenario considers a human-aided computing approach in which the user is part of the recognition loop, providing feedback where relevant errors are made.","PeriodicalId":433676,"journal":{"name":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2017.259","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
Music documents often include musical symbols as well as other relevant elements such as staff lines, text, and decorations. To detect and separate these constituent elements, we propose a layout analysis framework based on machine learning that focuses on pixel-level classification of the image. For that, we make use of supervised learning classifiers trained to infer the category of each pixel. In addition, our scenario considers a human-aided computing approach in which the user is part of the recognition loop, providing feedback where relevant errors are made.