Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation
Elona Shatri, George Fazekas
arXiv - CS - Sound, 2024-08-27
https://doi.org/arxiv-2408.15002
Citations: 0
Abstract
Optical Music Recognition (OMR) automates the transcription of musical
notation from images into machine-readable formats like MusicXML, MEI, or MIDI,
significantly reducing the costs and time of manual transcription. This study
explores knowledge discovery in OMR by applying instance segmentation using
Mask R-CNN to enhance the detection and delineation of musical symbols in sheet
music. Unlike Optical Character Recognition (OCR), OMR must handle the
intricate semantics of Common Western Music Notation (CWMN), where symbol
meanings depend on shape, position, and context. Our approach leverages
instance segmentation to manage the density and overlap of musical symbols,
facilitating more precise information retrieval from music scores. Evaluations
on the DoReMi and MUSCIMA++ datasets demonstrate substantial improvements, with
our method reaching a mean Average Precision (mAP) of up to 59.70% in dense
symbol environments, comparable to the results of object detection.
Furthermore, using traditional computer vision techniques, we add a parallel
step for staff detection to infer the pitch for the recognised symbols. This
study emphasises the role of pixel-wise segmentation in advancing accurate
music symbol recognition, contributing to knowledge discovery in OMR. Our
findings indicate that instance segmentation provides more precise
representations of musical symbols, particularly in densely populated scores,
advancing OMR technology. We make our implementation, pre-processing scripts,
trained models, and evaluation results publicly available to support further
research and development.
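
The staff-detection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which traditional computer vision technique is used, so the horizontal projection profile below is an assumption, as are the function names (`detect_staff_lines`, `mask_centre_y`, `staff_step`, `pitch_in_treble_clef`), the 0.5 ink threshold, and the restriction to a single five-line staff in treble clef. It is not the authors' implementation.

```python
from typing import List, Sequence

Image = Sequence[Sequence[int]]  # binary image: 1 = ink, 0 = background


def detect_staff_lines(image: Image, threshold: float = 0.5) -> List[float]:
    """Return the y-centre of each horizontal line in a binary image.

    Assumption: a row belongs to a staff line when at least `threshold`
    of its pixels are ink. Consecutive qualifying rows form one line.
    """
    width = len(image[0])
    centres: List[float] = []
    run_start = None
    for y, row in enumerate(image):
        is_line = sum(row) / width >= threshold
        if is_line and run_start is None:
            run_start = y
        elif not is_line and run_start is not None:
            centres.append((run_start + y - 1) / 2)
            run_start = None
    if run_start is not None:  # a line touching the bottom edge
        centres.append((run_start + len(image) - 1) / 2)
    return centres


def mask_centre_y(mask: Image) -> float:
    """Vertical centroid of a symbol mask (hypothetical helper)."""
    rows = [y for y, row in enumerate(mask) for v in row if v]
    return sum(rows) / len(rows)


def staff_step(y_centre: float, lines: Sequence[float]) -> int:
    """Diatonic steps above the bottom staff line (0 = bottom line)."""
    if len(lines) != 5:
        raise ValueError("expected exactly five staff lines")
    half_gap = (lines[-1] - lines[0]) / 8  # 4 gaps = 8 line/space steps
    return round((lines[-1] - y_centre) / half_gap)


def pitch_in_treble_clef(step: int) -> str:
    """Map a staff step to a pitch name, assuming a treble clef (E4 on the bottom line)."""
    number = 4 * 7 + 2 + step  # diatonic index of E4 is 30 when C0 is 0
    return "CDEFGAB"[number % 7] + str(number // 7)


# Synthetic staff: 1-pixel lines at rows 10, 14, 18, 22, 26.
staff = [[1 if y in (10, 14, 18, 22, 26) else 0 for _ in range(20)]
         for y in range(50)]
lines = detect_staff_lines(staff)

# A hypothetical notehead mask covering rows 17-19, centred on the middle line.
notehead = [[1 if 17 <= y <= 19 and 5 <= x <= 8 else 0 for x in range(20)]
            for y in range(50)]
print(pitch_in_treble_clef(staff_step(mask_centre_y(notehead), lines)))
```

In this sketch the segmentation mask supplies only the vertical centre of the notehead, while the staff lines come from the separate projection step. This mirrors the parallel arrangement the abstract describes, but real scores would also need clef and key-signature context, multiple staves, and tolerance for skewed or curved lines.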