An Intelligent Hybrid Model for Identity Document Classification

Khandan, Nouna

doi:10.48550/arxiv.2106.04345

Cited by 1 publication

(4 citation statements)

References 83 publications

(119 reference statements)


“…Once a matching model was found, a more intricate analysis was carried out to determine whether the matched model should be accepted or rejected. Khandan [10] proposed a method that combined SIFT and OCR for identity document classification. The study aimed to develop a method to classify identity documents with a confidence level for each match, which was determined by the number of matches in the SIFT model during each classification task.…”

Section: Related Work (mentioning)

confidence: 99%

“…Some studies have also shown that incorporating visual features can enhance text-based document classification [22]. In the context of identity document classification, visual features often serve as the basis for classification methods [5,6], although some studies explore the combination of multiple types of information such as visual features and textual information [10], or visual features and spatial information [5,9]. CNN has been proposed as a classification method for all three categories examined in our study: general image classification [17], text document classification [19], and identity document classification [11].…”

Section: Related Work (mentioning)

confidence: 99%

“…Regarding datasets, we discovered that the scarcity of research on identity documents compared to other document types can be attributed to the sensitive nature of these documents, which often contain confidential information. In certain studies, researchers addressed this challenge by collecting their own data [6,9] or collaborating with companies [10,11] to obtain datasets. However, this approach restricts access to the dataset by other researchers, limiting reproducibility.…”

Section: Related Work (mentioning)

confidence: 99%

“…For instance, many identity documents share the same layout regardless of their type or issuing country [6]. Visual features can also be used along with textual features; this approach is demonstrated using SIFT to detect visual features and OCR to extract textual information [10]. However, OCR requires prior knowledge of the expected language within the document, as highlighted in the study.…”

Section: Introduction (mentioning)

confidence: 99%


Enhancing Identity Document Classification in KYC Processes: An Evaluation of the Bag-of-Visual-Words Model and Segmentation Impact

Lee, Kartowisastro

2023

ISI


The growth of online services, such as financial services, travel agencies, and e-government, has emphasized the importance of an efficient Know Your Customer (KYC) process. Efficient identity verification and document classification are crucial for KYC, such as ensuring the alignment of submitted identity documents with requirements, categorizing them accurately, and verifying their completeness within the KYC process. This article proposes the utilization of the bag-of-visual-words (BoVW) model, which combines SIFT, k-means, and SVM techniques, to achieve accurate identity document classification without relying on geometry transformations. We observed that while segmentation significantly enhances accuracy during testing by eliminating irrelevant parts, its impact on the training phase appears to result in a drop in the model's performance. This drop in performance might be associated with segmentation during the training phase, where the removal of irrelevant parts might have caused the algorithm to have difficulty in identifying which features to disregard within the samples. This also implies that introducing imperfections such as blurred and low-brightness samples into the training dataset could potentially enhance the classification model. To test the theory, we compiled a dataset consisting of 8,400 samples, divided into 20 classes. This single compiled dataset was then used to generate three different kinds of datasets: USGM (an unsegmented dataset), SGM (a segmented dataset), and SGM2 (a segmented dataset where the subject of interest is clearly visible in the samples, serving as the training dataset). Three testing schemes are used: same-variant, cross-variant, and k-fold cross-validation. Our model demonstrates an average accuracy of up to 97.2%, which remains relatively consistent across different types of testing.
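The bag-of-visual-words idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes local descriptors have already been extracted (toy 2-D points stand in here for 128-dimensional SIFT descriptors), clusters them with a plain k-means to form a visual vocabulary, and turns each image into a normalised word histogram. The function names `nearest`, `kmeans`, and `bovw_histogram`, the strided seeding, and all data values are hypothetical, chosen only for illustration. The SVM stage is omitted; in a full pipeline the histograms would be the feature vectors passed to the classifier.

```python
import math
from collections import Counter


def nearest(vec, centroids):
    """Index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means. Seeds are evenly strided points (an assumption
    made here for determinism, not a recommendation for real data)."""
    step = len(points) // k
    centroids = [list(points[i * step]) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids


def bovw_histogram(descriptors, vocabulary):
    """Fraction of an image's descriptors assigned to each visual word."""
    counts = Counter(nearest(d, vocabulary) for d in descriptors)
    total = len(descriptors)
    return [counts.get(i, 0) / total for i in range(len(vocabulary))]


# Toy 2-D descriptors standing in for SIFT descriptors (made-up values).
image_a = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1)]
image_b = [(5.1, 5.0), (4.9, 5.2), (5.0, 4.9), (0.1, 0.1)]

# Vocabulary of two visual words learned from all descriptors.
vocabulary = kmeans(image_a + image_b, k=2)
hist_a = bovw_histogram(image_a, vocabulary)
hist_b = bovw_histogram(image_b, vocabulary)
print(hist_a, hist_b)
```

Because each histogram only counts how often each visual word occurs, it does not depend on where in the image a descriptor was found, which is consistent with the abstract's point about classifying without geometry transformations.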

