Indonesia's persistently high crime rate produces a large number of cases each year that must be examined, tried, and decided by the courts, as stipulated in Law No. 48 of 2009 concerning Judicial Power. This study therefore builds a system that predicts the sentences handed down in criminal court decisions in the Republic of Indonesia, with the aim of facilitating the implementation of jurisprudence. The prediction system was built by comparing 6 Bidirectional Encoder Representations from Transformers (BERT) models and a Robustly Optimized BERT Pretraining Approach (RoBERTa) model on 3 proposed architectures: BERT Base,
Hierarchical BERT + Mean Pooling, and Hierarchical BERT + LSTM (Long Short-Term Memory). The compared models are indobert-base-p1, indobert-base-uncased, legalindobert-indonlu, legal-indobert-indolem, indobert-large-p1, and indonesian-roberta-base. These models are also compared against a Support Vector Machine (SVM) with Term Frequency-Inverse Document Frequency (TF-IDF) features as a baseline. The legal-indobert-indolem model with the Hierarchical BERT + Mean Pooling architecture achieved the highest F1-score, 79.8888%, on the 14-class classification task. The resulting model can therefore be used to assist jurisprudence, as it predicts criminal court decisions based on similar previously documented cases.
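
The Hierarchical BERT + Mean Pooling idea named above can be illustrated with a minimal sketch. It assumes the common reading of the term: a long decision text is split into chunks that fit the encoder's input limit, each chunk is encoded separately, and the chunk vectors are averaged into one document vector for the classifier. The abstract does not state the chunk size or the pooling details, so `chunk_size=512`, `split_into_chunks`, `mean_pool`, `encode_document`, and the `toy_encoder` stand-in (used here instead of a real BERT model) are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def split_into_chunks(tokens: Sequence[str], chunk_size: int = 512) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most chunk_size tokens.

    chunk_size=512 is an assumed value, not one taken from the paper.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average the chunk vectors element-wise into a single document vector."""
    if not vectors:
        raise ValueError("need at least one chunk vector")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def encode_document(
    tokens: Sequence[str],
    encode_chunk: Callable[[List[str]], Vector],
    chunk_size: int = 512,
) -> Vector:
    """Encode each chunk independently, then mean-pool the chunk vectors."""
    chunks = split_into_chunks(tokens, chunk_size)
    return mean_pool([encode_chunk(chunk) for chunk in chunks])


def toy_encoder(chunk: List[str]) -> Vector:
    """Hypothetical stand-in for a BERT chunk embedding.

    Returns [chunk length, number of occurrences of the token 'pidana'].
    A real system would return the encoder's chunk representation instead.
    """
    return [float(len(chunk)), float(chunk.count("pidana"))]


if __name__ == "__main__":
    tokens = ["pidana", "x", "x", "x"]
    print(encode_document(tokens, toy_encoder, chunk_size=2))
```

In a real pipeline the pooled document vector would feed a classification layer over the 14 sentence classes; the Hierarchical BERT + LSTM variant would replace `mean_pool` with a recurrent layer over the ordered chunk vectors.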