Transfer Learning Approach for Railway Technical Map (RTM) Component Identification
Obadage Rochana Rumalshan, Pramuka Weerasinghe, Mohamed Shaheer, Prabhath Gunathilake, Erunika Dayaratna
arXiv - CS - Digital Libraries, 2024-05-21. DOI: arxiv-2405.13229
The long-standing popularity of railway transportation makes efficient railway
management systems a necessity around the globe. At present, a large collection
of computer-aided-designed Railway Technical Maps (RTMs) exists, but these maps
are available only in Portable Document Format (PDF). Using deep learning and
optical character recognition (OCR) techniques, this work proposes a generic
system that digitizes the relevant map component data from a given input image
and creates one formatted text file per image. Of the YOLOv3, SSD and
Faster-RCNN object detection models evaluated, Faster-RCNN yields the highest
mean Average Precision (mAP) and the highest F1 score, 0.68 and 0.76
respectively. The results further indicate that OCR output improves when the
text-containing image is first passed through a sophisticated pre-processing
pipeline that removes distortions.
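
To make the reported detection metric concrete, the following is a minimal sketch of how an F1 score can be computed for one image from predicted and ground-truth bounding boxes. It is not the authors' evaluation code. The box format, the greedy matching strategy, the IoU threshold of 0.5, and the names `iou` and `f1_score` are all assumptions made for illustration; the abstract does not state how matches were determined.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(predictions: List[Box],
             ground_truth: List[Box],
             iou_threshold: float = 0.5) -> float:
    """F1 for one image, assuming predictions are sorted by confidence.

    Each prediction is greedily matched to the unmatched ground-truth
    box with the highest IoU; the match counts as a true positive only
    if that IoU reaches the (assumed) threshold.
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            true_positives += 1
            unmatched.remove(best)
    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truth) - true_positives
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical boxes, not data from the paper.
    preds = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    truth = [(0, 0, 10, 10), (20, 20, 30, 31), (100, 100, 110, 110)]
    print(round(f1_score(preds, truth), 4))
```

With the hypothetical boxes above, two of the three predictions match a ground-truth box, one prediction is spurious, and one ground-truth box is missed, so precision and recall are both 2/3 and so is the F1 score. A per-class or dataset-level figure such as the one reported would aggregate these counts over all images before computing the score.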