The growth of online services such as financial services, travel agencies, and e-government has emphasized the importance of an efficient Know Your Customer (KYC) process. Efficient identity verification and document classification are crucial for KYC: submitted identity documents must be checked against requirements, categorized accurately, and verified for completeness. This article proposes using the bag-of-visual-words (BoVW) model, which combines SIFT, k-means, and SVM techniques, to achieve accurate identity document classification without relying on geometric transformations. We observed that while segmentation significantly enhances accuracy during testing by eliminating irrelevant parts, applying it during training appears to reduce the model's performance. A possible explanation is that removing irrelevant parts from the training samples makes it harder for the algorithm to learn which features to disregard. This also implies that introducing imperfections, such as blurred and low-brightness samples, into the training dataset could potentially enhance the classification model. To test this hypothesis, we compiled a dataset of 8,400 samples divided into 20 classes. From this single dataset we generated three variants: USGM (an unsegmented dataset), SGM (a segmented dataset), and SGM2 (a segmented dataset in which the subject of interest is clearly visible in the samples, serving as the training dataset). Three types of testing are used: same-variant, cross-variant, and k-fold cross-validation. Our model achieves an average accuracy of up to 97.2%, which remains relatively consistent across the different types of testing.
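
As an illustration of the BoVW idea named above, the following is a minimal sketch of its two middle stages: building a visual vocabulary with k-means and encoding one image as a histogram of visual words. It is not the authors' implementation. It assumes that local descriptors have already been extracted (in the paper's pipeline these would be SIFT descriptors; here toy 2-D vectors stand in for them), and it omits the final SVM classifier, which would take the resulting histograms as input. The function names `build_codebook` and `bovw_histogram` are hypothetical and chosen only for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, codebook):
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) over descriptors pooled from all
    training images. Returns k centroids, the 'visual words'."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct descriptors chosen at random.
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centroids)].append(d)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                centroids[i] = [sum(m[j] for m in members) / len(members)
                                for j in range(dim)]
    return centroids


def bovw_histogram(descriptors, codebook):
    """Encode one image as a normalised histogram of visual-word counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    if total == 0:  # image with no descriptors
        return [0.0] * len(codebook)
    return [c / total for c in counts]


if __name__ == "__main__":
    # Toy descriptors forming two well-separated groups (assumed data).
    pooled = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    codebook = build_codebook(pooled, k=2)
    # Descriptors of a single hypothetical image.
    image = [(0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
    print(bovw_histogram(image, codebook))
```

Because the histogram records only how often each visual word occurs, not where in the image it occurs, the representation does not depend on aligning or warping the document first, which is consistent with the abstract's point about avoiding geometric transformations.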