Title: A hyperspectral unmixing approach for ink mismatch detection in unbalanced clusters
Authors: Faryal Aurooj Nasir, Salman Liaquat, Khurram Khurshid, Nor Muzlifah Mahyuddin
Journal: Journal of Information and Intelligence, 2(2), pp. 177-190
Published: 2024-03-01
DOI: 10.1016/j.jiixd.2024.01.004
URL: https://www.sciencedirect.com/science/article/pii/S2949715924000040
Citations: 0
Abstract
Detecting ink mismatch is a significant challenge in verifying the authenticity of documents, especially when dealing with uneven ink distribution. Conventional imaging methods frequently fail to distinguish visually similar inks. Our study presents a novel hyperspectral unmixing approach to detect ink mismatches in unbalanced clusters. The proposed method identifies the unique spectral characteristics of different inks by employing k-means clustering and Gaussian mixture models (GMMs) to perform color segmentation on different ink types, and it uses elbow estimation and the silhouette coefficient to estimate the number of inks precisely. Because clustering methods generally do not estimate quantity, we employ entropy calculations in the red, green, and blue depth channels for precise abundance estimation of each ink. This combination of basic techniques performs ink unmixing more effectively and provides a real-world document forensic solution, in contrast to current methods that rely on assumptions such as prior knowledge of the inks used in a document and to deep learning-based methods that rely heavily on abundant training data. We evaluate our approach on the iVision handwritten hyperspectral images dataset (iVision HHID), a comprehensive and rich dataset that surpasses the commonly used UWA writing inks hyperspectral images (WIHSI) database in size and diversity. This study accomplishes the unmixing task while addressing three main challenges: unmixing diverse ink spectral signatures (149 spectral bands instead of the 33 bands in the previous dataset), using no prior knowledge or assumptions about the number of inks in the questioned document, and requiring no large training dataset to perform the unmixing.
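The idea of estimating the number of inks with the silhouette coefficient can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only k-means (no GMM and no elbow estimation), a deterministic farthest-point initialization chosen here for reproducibility, and synthetic four-band "spectra" with unbalanced cluster sizes. All names (`kmeans`, `silhouette`, `estimate_num_inks`, `signatures`) and all numeric values are assumptions made for illustration.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def init_centres(points, k):
    """Farthest-point initialization (an assumption, not taken from the paper)."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(dist(p, c) for c in centres)))
    return centres

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations; returns one cluster label per point."""
    centres = init_centres(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centres[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(band) / len(members) for band in zip(*members)])
            else:
                updated.append(centres[c])  # keep the centre of an empty cluster
        if updated == centres:
            break
        centres = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (singletons score 0)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue
        a = sum(dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(sum(dist(p, points[j]) for j in idx) / len(idx)
                for lab, idx in clusters.items() if lab != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def estimate_num_inks(points, k_max=5):
    """Pick the k in 2..k_max with the highest mean silhouette."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, k_max + 1)}
    return max(scores, key=scores.get), scores

# Synthetic, unbalanced data: three hypothetical ink signatures over four
# bands, with 60, 25 and 8 pixels respectively (values are illustrative only).
rng = random.Random(42)
signatures = [([0.2, 0.3, 0.5, 0.7], 60),
              ([0.6, 0.6, 0.4, 0.2], 25),
              ([0.9, 0.2, 0.1, 0.8], 8)]
pixels = [[v + rng.gauss(0, 0.01) for v in sig]
          for sig, count in signatures for _ in range(count)]

best_k, scores = estimate_num_inks(pixels)
print(best_k, scores)
```

On real hyperspectral data each pixel would carry many more bands (149 in iVision HHID), and a library implementation of k-means, GMMs and the silhouette score would normally replace these hand-written helpers.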
Furthermore, the proposed document authentication methodology is more secure against forgeries or manipulations in questioned documents than previous works that rely on known inks and known spectra. Our methodology uses randomization techniques and anomaly detection mechanisms, which make it harder for adversaries to predict and manipulate specific aspects of the input data in questioned documents, thereby enhancing the robustness of our method. The code for this research is available in the GitHub repository.
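The entropy calculation over the red, green, and blue channels mentioned in the abstract can be sketched as a per-channel Shannon entropy. The abstract does not say how the entropy values are turned into ink abundances, so the sketch below stops at the entropy itself. The function names and the toy pixel values are assumptions for illustration, not part of the published method.

```python
import math
from collections import Counter

def channel_entropy(values):
    """Shannon entropy, in bits, of a sequence of integer intensities."""
    counts = Counter(values)
    n = len(values)
    # p * log2(1 / p) summed over the observed intensity levels
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def rgb_entropies(pixels):
    """Per-channel entropy for a list of (r, g, b) tuples."""
    reds, greens, blues = zip(*pixels)
    return {
        "red": channel_entropy(reds),
        "green": channel_entropy(greens),
        "blue": channel_entropy(blues),
    }

# Toy region of four pixels (hypothetical values): two red levels,
# four green levels, one blue level.
region = [(10, 200, 0), (10, 100, 0), (20, 50, 0), (20, 25, 0)]
entropies = rgb_entropies(region)
print(entropies)
```

A channel with a single intensity level has zero entropy, and the entropy grows as the intensities spread over more levels, which is why it can serve as a signal of how varied an ink region is in a given channel.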