Artificial Intelligence in Recordkeeping
A Systematic Review of Machine Learning Applications for Automated Records Classification
DOI:
https://doi.org/10.24252/v13i1a12
Keywords:
Records Management, Artificial Intelligence, Machine Learning, Archival Science
Abstract
This study presents a systematic literature review (SLR) of scholarly research published between 2014 and 2023. It aims to identify prevailing trends, methodological approaches, and contextual factors surrounding the use of Machine Learning (ML) models for records classification within Records Management and Archival Science. Following the PRISMA framework, the review analyzes a curated selection of studies to assess the scope and maturity of ML applications in this domain. The findings reveal that although ML has been increasingly explored for tasks such as classification and appraisal, its application remains geographically skewed, with the majority of studies originating from Global North countries. The models employed range from probabilistic and regression-based algorithms to decision tree classifiers, reflecting diverse but largely traditional methodological approaches. The adoption of more sophisticated techniques, including deep learning and large language models, remains limited. The study underscores a critical research gap concerning the implementation of advanced ML models, particularly in Global South institutions, where such technologies could significantly enhance recordkeeping efficiency and scalability. The review highlights the need for further empirical studies that develop and evaluate cutting-edge ML models in diverse archival contexts, promoting more inclusive and globally representative innovation in archival automation.
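To make the "probabilistic" family of models mentioned in the abstract concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier applied to record titles. It is an illustration only and is not taken from any of the reviewed studies: the toy records, the class labels ("finance", "personnel"), and all function names are hypothetical, and a real records classifier would need far more training data and proper evaluation.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train_naive_bayes(labeled_records):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_records:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior under Laplace smoothing."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training records carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training carry no evidence
            count = word_counts[label][token]
            # Add-one smoothing keeps unseen class/word pairs from zeroing out.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: record titles with a functional class label.
TRAINING_RECORDS = [
    ("invoice payment due supplier", "finance"),
    ("budget report quarterly expenditure", "finance"),
    ("staff leave request approval", "personnel"),
    ("employee appointment letter salary", "personnel"),
]

model = train_naive_bayes(TRAINING_RECORDS)
predicted = classify("supplier invoice for quarterly payment", model)
print(predicted)
```

The regression-based and decision tree classifiers named in the abstract would take the place of `train_naive_bayes` and `classify` in the same train-then-predict pattern. Only the way evidence from the words is combined would change.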
License
Copyright (c) 2025 Sigit Sumarsono, Muhamad Prabu Wibowo

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.