TRANSFORMER-BASED OPTICAL CHARACTER RECOGNITION APPROACH FOR IDENTIFYING MOTOR VEHICLES WITH OVERDUE TAXES

  • Nabila Dwi Fazira, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Indonesia, Indonesia. https://orcid.org/0009-0002-9409-3123
  • Achmad Fauzan, Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Islam Indonesia, Indonesia. https://orcid.org/0000-0002-0533-5518
Keywords: Five-yearly Tax, Text recognition, TrOCR, Vehicle plate

Abstract

The rapid growth in the number of motor vehicles in Indonesia has drawn attention to traffic administration, particularly vehicle taxation. To support vehicle tax administration, this study detects the five-yearly tax status of motor vehicles in Indonesia using the Transformer-based Optical Character Recognition (TrOCR) model. The study aims to evaluate how well TrOCR recognizes text on Indonesian motor vehicle number plates and to classify plates by whether the five-yearly tax has been paid. The data are primary data: images of motor vehicle number plates taken around the Faculty of Mathematics and Natural Sciences, Universitas Islam Indonesia, selected by purposive sampling with constraints on the representation of each class. Although data collection was limited to this location, Indonesian vehicle plates follow a standardized format, with regional differences primarily in the prefix letters. In addition, the university attracts students from various regions, who often use vehicles registered in their home provinces. The dataset therefore contains a diverse range of number plates and is a reasonable representation of motor vehicle plates across Indonesia. The TrOCR model achieved a Character Error Rate (CER) of 2.9% with 90% of the data used for training and 10% for testing, trained for 8 epochs. The evaluation indicates that text recognition at this error rate is effective for classifying the five-yearly tax status of motor vehicles. Despite some prediction errors, the model performs well overall and provides reliable information on five-yearly vehicle tax status.
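
As an illustration of the metric reported above, CER is conventionally computed as the character-level edit distance between the predicted and reference strings, divided by the number of reference characters. The sketch below assumes that standard definition. The function names and the plate strings are hypothetical and are not taken from the study's code or dataset.

```python
from typing import Sequence


def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions, and
    substitutions needed to turn `hypothesis` into `reference`."""
    # prev[j] holds the distance between the first i-1 reference
    # characters and the first j hypothesis characters.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def character_error_rate(references: Sequence[str],
                         hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edit distance / total reference characters."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(ref) for ref in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(
        levenshtein(ref, hyp) for ref, hyp in zip(references, hypotheses)
    )
    return total_edits / total_chars


# Hypothetical example: one substituted character in an 8-character plate.
example_cer = character_error_rate(["AB1234CD"], ["AB1284CD"])
print(f"CER = {example_cer:.3f}")
```

Under this definition, a CER of 2.9% corresponds to roughly three character edits per hundred reference characters. Whether the study aggregates errors per corpus, as here, or averages them per image is not stated in the abstract.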

Published
2025-07-01
How to Cite
[1]
N. D. Fazira and A. Fauzan, “TRANSFORMER-BASED OPTICAL CHARACTER RECOGNITION APPROACH FOR IDENTIFYING MOTOR VEHICLES WITH OVERDUE TAXES”, BAREKENG: J. Math. & App., vol. 19, no. 3, pp. 1597-1608, Jul. 2025.