Does Personalization Matter in Prompting? A Case Study of Classifying Paper Metadata Using Zero-Shot Prompting

Authors

  • Chandra Lesmana, Universitas Indonesia
  • Muhammad Okky Ibrohim, Universitas Indonesia
  • Indra Budi, Universitas Indonesia

DOI:

https://doi.org/10.24252/instek.v10i1.57445

Keywords:

systematic literature review, paper metadata classification, large language model, zero-shot prompting, personalization

Abstract

A Systematic Literature Review (SLR) is one way for researchers to obtain a structured overview of research developments on a topic. This makes SLR a preferred method among researchers, because the process involves systematic, objective analysis and focuses on answering research questions. In general, conducting an SLR involves three stages: planning, implementation, and reporting. However, compiling an SLR takes a long time because each stage must be carried out in turn. To address this problem, automation is needed to speed up the SLR compilation process. Previous studies automated SLR document classification using machine learning models that require large amounts of training data, such as Naïve Bayes, Support Vector Machine, and Logistic Model Tree. In this study, the authors automated the process using open-source Large Language Models (LLMs), namely Mistral-7B-Instruct-v0.2 and Llama-3.1-8B, to classify the titles and abstracts of SLR documents, and compared the effect of adding personalization to zero-shot prompting. With an LLM and zero-shot prompting, the classification process no longer requires training data, eliminating data annotation costs. Experimental results showed that personalization improved classification performance, with the best result, a macro F1 of 0.5538, obtained using the Llama 3.1 model.
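The contrast the abstract draws — zero-shot prompting with and without personalization — can be illustrated with a small sketch. The label set, persona text, and prompt wording below are hypothetical illustrations, not the exact prompts used in the paper; personalization is modeled here as a role description prepended to the instruction, in the spirit of persona-based prompting.

```python
def build_prompt(title, abstract, persona=None):
    """Compose a zero-shot classification prompt for paper metadata.

    If `persona` is given, it is prepended as a role description —
    a simple form of prompt personalization. No training examples
    are included, so the prompt is zero-shot.
    """
    instruction = (
        "Decide whether the following paper is relevant to the review topic. "
        "Answer with exactly one label: 'include' or 'exclude'.\n\n"
        f"Title: {title}\n"
        f"Abstract: {abstract}\n"
        "Label:"
    )
    if persona:
        # Personalized variant: role description precedes the task.
        return f"{persona}\n\n{instruction}"
    return instruction


# Non-personalized (plain zero-shot) prompt.
plain = build_prompt("An Example Paper", "We study an example problem.")

# Personalized variant with an illustrative persona.
personalized = build_prompt(
    "An Example Paper",
    "We study an example problem.",
    persona=(
        "You are an expert researcher screening titles and abstracts "
        "for a systematic literature review."
    ),
)
print(personalized)
```

Either string would then be sent to the chosen model (e.g. Mistral-7B-Instruct-v0.2 or Llama-3.1-8B) and the returned label parsed from the completion; macro F1 averages the per-class F1 scores so that both labels weigh equally regardless of class imbalance.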


References

[1] G. Lame, “Systematic Literature Reviews: An Introduction,” Proc. Des. Soc. Int. Conf. Eng. Des., vol. 1, no. 1, pp. 1633–1642, Jul. 2019, doi: 10.1017/dsi.2019.169.

[2] Y. Xiao and M. Watson, “Guidance on Conducting a Systematic Literature Review,” J. Plan. Educ. Res., vol. 39, no. 1, pp. 93–112, Mar. 2019, doi: 10.1177/0739456X17723971.

[3] A. Chapman et al., “Overcoming challenges in conducting systematic reviews in implementation science: a methods commentary,” Syst. Rev., vol. 12, no. 1, p. 116, Jul. 2023, doi: 10.1186/s13643-023-02285-3.

[4] H. Almeida, M.-J. Meurs, L. Kosseim, and A. Tsang, “Data Sampling and Supervised Learning for HIV Literature Screening,” IEEE Trans. NanoBioscience, vol. 15, no. 4, pp. 354–361, Jun. 2016, doi: 10.1109/TNB.2016.2565481.

[5] A. Bannach-Brown et al., “Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error,” Syst. Rev., vol. 8, no. 1, p. 23, Dec. 2019, doi: 10.1186/s13643-019-0942-7.

[6] H. Naveed et al., “A Comprehensive Overview of Large Language Models,” Nov. 23, 2023, arXiv: arXiv:2307.06435. Accessed: Dec. 25, 2023. [Online]. Available: http://arxiv.org/abs/2307.06435

[7] Y. Chang et al., “A Survey on Evaluation of Large Language Models,” ACM Trans. Intell. Syst. Technol., vol. 15, no. 3, pp. 1–45, Jun. 2024, doi: 10.1145/3641289.

[8] T. B. Brown et al., “Language Models are Few-Shot Learners,” Jul. 22, 2020, arXiv: arXiv:2005.14165. Accessed: May 27, 2024. [Online]. Available: http://arxiv.org/abs/2005.14165

[9] T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa, “Large language models are zero-shot reasoners,” in Proceedings of the 36th International Conference on Neural Information Processing Systems, in NIPS ’22. Red Hook, NY, USA: Curran Associates Inc., 2022.

[10] X. Luo et al., “Potential Roles of Large Language Models in the Production of Systematic Reviews and Meta-Analyses,” J. Med. Internet Res., vol. 26, p. e56780, Jun. 2024, doi: 10.2196/56780.

[11] F. Dennstädt, J. Zink, P. M. Putora, J. Hastings, and N. Cihoric, “Title and abstract screening for literature reviews using large language models: an exploratory study in the biomedical domain,” Syst. Rev., vol. 13, no. 1, p. 158, Jun. 2024, doi: 10.1186/s13643-024-02575-4.

[12] J. Chen et al., “When large language models meet personalization: perspectives of challenges and opportunities,” World Wide Web, vol. 27, no. 4, p. 42, Jul. 2024, doi: 10.1007/s11280-024-01276-1.

[13] B. Clavié, A. Ciceu, F. Naylor, G. Soulié, and T. Brightwell, “Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification,” in Natural Language Processing and Information Systems, E. Métais, F. Meziane, V. Sugumaran, W. Manning, and S. Reiff-Marganiec, Eds., Cham: Springer Nature Switzerland, 2023, pp. 3–17.

[14] A. Salemi, S. Mysore, M. Bendersky, and H. Zamani, “LaMP: When Large Language Models Meet Personalization,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L.-W. Ku, A. Martins, and V. Srikumar, Eds., Bangkok, Thailand: Association for Computational Linguistics, Aug. 2024, pp. 7370–7392. doi: 10.18653/v1/2024.acl-long.399.

[15] M. O. Ibrohim, C. Bosco, and V. Basile, “Sentiment Analysis for the Natural Environment: A Systematic Review,” ACM Comput. Surv., vol. 56, no. 4, pp. 1–37, Apr. 2024, doi: 10.1145/3604605.

[16] M. M. Kebede, C. Le Cornet, and R. T. Fortner, “In-depth evaluation of machine learning methods for semi-automating article screening in a systematic review of mechanistic literature,” Res. Synth. Methods, vol. 14, no. 2, pp. 156–172, 2023, doi: 10.1002/jrsm.1589.

[17] Y. Bao et al., “Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes,” JCO Clin. Cancer Inform., no. 3, pp. 1–9, Dec. 2019, doi: 10.1200/CCI.19.00042.

[18] A. Salemi, S. Mysore, M. Bendersky, and H. Zamani, “LaMP: When Large Language Models Meet Personalization,” Jun. 05, 2024, arXiv: arXiv:2304.11406. doi: 10.48550/arXiv.2304.11406.

[19] A. Dubey et al., “The Llama 3 Herd of Models,” Aug. 15, 2024, arXiv: arXiv:2407.21783. Accessed: Sep. 11, 2024. [Online]. Available: http://arxiv.org/abs/2407.21783

[20] A. Q. Jiang et al., “Mistral 7B,” Oct. 10, 2023, arXiv: arXiv:2310.06825. Accessed: Sep. 04, 2024. [Online]. Available: http://arxiv.org/abs/2310.06825

[21] A. López-Pineda, R. Nouni-García, Á. Carbonell-Soliva, V. F. Gil-Guillén, C. Carratalá-Munuera, and F. Borrás, “Validation of large language models (Llama 3 and ChatGPT-4o mini) for title and abstract screening in biomedical systematic reviews,” Res. Synth. Methods, vol. 16, no. 4, pp. 620–630, Jul. 2025, doi: 10.1017/rsm.2025.15.

[22] K. Takahashi, K. Yamamoto, A. Kuchiba, and T. Koyama, “Confidence interval for micro-averaged F1 and macro-averaged F1 scores,” Appl. Intell., vol. 52, no. 5, pp. 4961–4972, Mar. 2022, doi: 10.1007/s10489-021-02635-5.

[23] M. Q. R. Pembury Smith and G. D. Ruxton, “Effective use of the McNemar test,” Behav. Ecol. Sociobiol., vol. 74, no. 11, Nov. 2020, doi: 10.1007/s00265-020-02916-y.

[24] Z. Gerolemou and J. Scholtes, “Target-Based Sentiment Analysis as a Sequence-Tagging Task,” Nov. 2019.

[25] B. Khanna N, S. Moses J, and N. M, “SoftMax based User Attitude Detection Algorithm for Sentimental Analysis,” Procedia Comput. Sci., vol. 125, pp. 313–320, 2018, doi: 10.1016/j.procs.2017.12.042.

[26] L. L. Benites-Lazaro, L. L. Giatti, W. C. Sousa Junior, and A. Giarolla, “Land-water-food nexus of biofuels: Discourse and policy debates in Brazil,” Environ. Dev., vol. 33, p. 100491, Mar. 2020, doi: 10.1016/j.envdev.2019.100491.

[27] M. Altaweel and C. Bone, “Applying content analysis for investigating the reporting of water issues,” Comput. Environ. Urban Syst., vol. 36, no. 6, pp. 599–613, Nov. 2012, doi: 10.1016/j.compenvurbsys.2012.03.004.


Published

2025-06-30

How to Cite

[1] C. Lesmana, M. O. Ibrohim, and I. Budi, “Does Personalization Matter in Prompting? A Case Study of Classifying Paper Metadata Using Zero-Shot Prompting,” INSTEK, vol. 10, no. 1, pp. 252–262, Jun. 2025.

Issue

Volume 10, Number 1, April 2025