Artificial Intelligence in Recordkeeping

A Systematic Review of Machine Learning Applications for Automated Records Classification

Authors

  • Sigit Sumarsono, Universitas Indonesia
  • Muhamad Prabu Wibowo, Universitas Indonesia

DOI:

https://doi.org/10.24252/v13i1a12

Keywords:

Records Management, Artificial Intelligence, Machine Learning, Archival Science

Abstract

This study presents a systematic literature review (SLR) of scholarly research published between 2014 and 2023. It aims to identify prevailing trends, methodological approaches, and contextual factors in the use of Machine Learning (ML) models for records classification within Records Management and Archival Science. Following the PRISMA framework, the review analyzes a curated selection of studies to assess the scope and maturity of ML applications in this domain. The findings show that although ML has been increasingly explored for tasks such as classification and appraisal, its application remains geographically skewed, with most studies originating from Global North countries. The models employed range from probabilistic and regression-based algorithms to decision tree classifiers, reflecting diverse but largely traditional methodological approaches. Adoption of more sophisticated techniques, including deep learning and large language models, remains limited. The study identifies a critical research gap in the implementation of advanced ML models, particularly in Global South institutions, where such technologies could significantly improve recordkeeping efficiency and scalability. The review highlights the need for further empirical studies that develop and evaluate state-of-the-art ML models in diverse archival contexts, in support of more inclusive and globally representative innovation in archival automation.

Published

2025-05-02

How to Cite

Sumarsono, S., & Wibowo, M. P. (2025). Artificial Intelligence in Recordkeeping: A Systematic Review of Machine Learning Applications for Automated Records Classification. Khizanah Al-Hikmah : Jurnal Ilmu Perpustakaan, Informasi, Dan Kearsipan, 13(1), 148–159. https://doi.org/10.24252/v13i1a12

Section

Articles
