KerisRDNet: Mask-aware augmentation and residual dilated networks for cultural heritage blade classification

IF 4.9

Machine learning with applications Pub Date : 2026-06-01 Epub Date: 2026-02-05 DOI:10.1016/j.mlwa.2026.100852

Khafiizh Hastuti, Erwin Yudi Hidayat, Abu Salam, Usman Sudibyo

Machine learning with applications, Volume 24, Article 100852. https://www.sciencedirect.com/science/article/pii/S2666827026000174


Abstract

Fine-grained recognition of cultural artifacts remains challenging because of the scarcity of annotated data, subtle intra-class differences, and heterogeneous imaging conditions. This study addresses these issues through a domain-specific deep learning pipeline, demonstrated on Indonesian keris classification across three tasks: pamor (27 classes), dhapur (42), and tangguh (5). The pipeline integrates background homogenization, orientation normalization, and YOLOv8-based blade cropping with mask-aware augmentation restricted to the blade regions. For classification, we propose KerisRDNet, which extends InceptionResNetV2 with Inception-Residual-Dilated (IRD) blocks and squeeze-and-excitation to model the elongated geometries and subtle forging motifs. Experiments show that baseline networks collapse under fine-grained settings, with macro-F1 near zero, whereas the proposed approach achieves 0.268 (pamor), 0.276 (dhapur), and 0.635 (tangguh) with Top-3 accuracy above 0.5 and AUC up to 0.853. Across three stratified resamplings, paired non-parametric tests (Wilcoxon signed-rank) indicated directionally consistent improvements; given the small number of repetitions (n = 3), these results are interpreted conservatively. These results demonstrate the feasibility of practically viable keris recognition as a decision-support tool for cultural heritage curation, while also offering a transferable workflow for low-data fine-grained recognition tasks.
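One way to see why three repetitions call for a conservative reading is to enumerate the null distribution of the Wilcoxon signed-rank statistic directly. The sketch below is illustrative only: it assumes an exact, two-sided test with no tied or zero differences (the abstract does not say which variant was used), and the function name `min_two_sided_p` is hypothetical, not taken from the paper.

```python
from itertools import product


def min_two_sided_p(n: int) -> float:
    """Smallest two-sided p-value an exact Wilcoxon signed-rank test can
    produce for n paired differences (assumes no ties and no zeros)."""
    ranks = range(1, n + 1)
    max_w = n * (n + 1) // 2  # W+ when every difference is positive

    # Under the null hypothesis each rank is equally likely to carry a
    # positive or negative sign, so all 2**n sign patterns are equiprobable.
    w_plus = [
        sum(r for r, positive in zip(ranks, signs) if positive)
        for signs in product((False, True), repeat=n)
    ]

    # The most extreme outcomes are "all positive" (W+ = max_w) and
    # "all negative" (W+ = 0); a two-sided test counts both.
    extreme = sum(1 for w in w_plus if w in (0, max_w))
    return extreme / len(w_plus)


for n in (3, 5, 6):
    print(n, min_two_sided_p(n))
```

Under these assumptions, even when all three resamplings favor the proposed model the two-sided p-value cannot drop below 2/8 = 0.25, and at least six pairs are needed before 0.05 becomes reachable. This is consistent with reporting the improvements as directionally consistent rather than statistically significant.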


Source journal

Machine learning with applications (Management Science and Operations Research, Artificial Intelligence, Computer Science Applications)

Self-citation rate: 0.00%

Review time: 98 days