Artificial Intelligence in Recordkeeping
A Systematic Review of Machine Learning Applications for Automated Records Classification
DOI:
https://doi.org/10.24252/v13i1a12
Keywords:
Records Management, Artificial Intelligence, Machine Learning, Archival Science
Abstract
This study presents a systematic literature review (SLR) of scholarly research published between 2014 and 2023, with the aim of identifying prevailing trends, methodological approaches, and contextual factors surrounding the use of Machine Learning (ML) models for records classification within Records Management and Archival Science. Following the PRISMA framework, the review analyzes a curated selection of studies to assess the scope and maturity of ML applications in this domain. The findings reveal that although ML is increasingly explored for tasks such as classification and appraisal, its application remains geographically skewed, with the majority of studies originating from Global North countries. The models employed range from probabilistic and regression-based algorithms to decision tree classifiers, reflecting diverse but largely traditional methodological approaches. The adoption of more sophisticated techniques, including deep learning and large language models, remains limited. The study identifies a critical research gap concerning the implementation of advanced ML models, particularly in Global South institutions, where such technologies could significantly enhance recordkeeping efficiency and scalability. The review highlights the need for further empirical studies that develop and evaluate advanced ML models in diverse archival contexts, promoting more inclusive and globally representative innovation in archival automation.
License
Copyright (c) 2025 Sigit Sumarsono, Muhamad Prabu Wibowo

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.