Fine-tuning Large Language Model (BERT) for Islamic Moral Inquiry and Response

Authors

  • Nurul Aiman Binti Mohd Nazri, Department of Computer Science, International Islamic University Malaysia, Kuala Lumpur, Malaysia
  • A'wathif Binti Omar, Department of Computer Science, International Islamic University Malaysia, Kuala Lumpur, Malaysia
  • Amir 'Aatieff Bin Amir Hussin, Department of Computer Science, International Islamic University Malaysia, Kuala Lumpur, Malaysia

DOI:

https://doi.org/10.31436/ijpcc.v11i1.533

Keywords:

Large Language Models (LLMs), BERT, Fine-tuning, Domain-specific, Question-Answering System

Abstract

The development of Large Language Models (LLMs) capable of understanding and responding to issues from an Islamic perspective is highly valuable, as it can benefit many people. For an LLM to do so, it is not enough for the model to understand the language alone; it must also grasp the context and specific doctrines within Islamic texts, given the complexity of Islamic jurisprudence and moral philosophy. In this research, we therefore fine-tune a Large Language Model, Bidirectional Encoder Representations from Transformers (BERT), for Islamic moral inquiry and response. By incorporating Islamic principles, norms, and teachings into the model, we aim to enhance the pre-trained BERT model’s ability to perform moral-related Question Answering (QA) tasks. The base model we chose is the deepset BERT model, which was built on BERT-large and pre-trained on the SQuAD 2.0 dataset specifically for QA tasks. We fine-tuned the model using data extracted from “Islam: Questions and Answers: Character and Morals”, Volume 13 of a series of Islamic books by Muhammad Saed Abdul-Rahman, after the data had been cleaned and pre-processed. The fine-tuning process used supervised learning techniques to ensure the model’s proficiency in understanding Islamic principles and providing accurate, contextually appropriate, and theologically sound responses. We assessed the model using the F1 score and Levenshtein similarity as evaluation metrics: the F1 score combines precision and recall by computing their harmonic mean, while Levenshtein similarity compares the predicted and actual answers at the character level by normalizing the Levenshtein distance. Our research yielded significant improvements, with the average F1 score rising from 0.30 to 0.74 and the average Levenshtein similarity from 0.24 to 0.67.
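For illustration, the following Python sketch (not the authors' released code) shows how the cited deepset checkpoint can be loaded from Hugging Face for extractive QA, together with straightforward implementations of the two evaluation metrics as defined above. The sample context, question, and reference answer are hypothetical.

from collections import Counter
from transformers import pipeline
import Levenshtein  # pip install python-Levenshtein

# Load the SQuAD 2.0-tuned BERT-large checkpoint referenced in the paper.
qa = pipeline(
    "question-answering",
    model="deepset/bert-large-uncased-whole-word-masking-squad2",
)

def token_f1(prediction: str, reference: str) -> float:
    # F1 = 2PR / (P + R): the harmonic mean of precision and recall,
    # computed over the multiset of whitespace tokens the two answers share.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    num_same = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

def levenshtein_similarity(prediction: str, reference: str) -> float:
    # 1 - d(a, b) / max(|a|, |b|): the edit distance normalized by the
    # longer string's length, so 1.0 indicates an exact character-level match.
    if not prediction and not reference:
        return 1.0
    distance = Levenshtein.distance(prediction, reference)
    return 1.0 - distance / max(len(prediction), len(reference))

# Hypothetical QA pair for demonstration only.
context = ("Honesty is a central Islamic value; the Prophet emphasized "
           "truthfulness in both speech and trade.")
reference = "truthfulness in both speech and trade"
result = qa(question="What did the Prophet emphasize?", context=context)
print(result["answer"])
print("F1:", token_f1(result["answer"], reference))
print("Levenshtein similarity:", levenshtein_similarity(result["answer"], reference))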

References

S. Soni and K. Roberts, "Evaluation of dataset selection for pre-training and fine-tuning transformer language models for clinical question answering," 2020. [Online]. Available: https://rajpurkar.github.io/

W. de Vries, A. van Cranenburgh, A. Bisazza, T. Caselli, G. van Noord, and M. Nissim, "BERTje: A Dutch BERT model," 2019. [Online]. Available: http://arxiv.org/abs/1912.09582

C. Jeong, "Fine-tuning and utilization methods of domain-specific LLMs," 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2401.02981

H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, D. Bikel, L. Blecher, C. C. Ferrer, M. Chen, G. Cucurull, D. Esiobu, J. Fernandes, J. Fu, W. Fu, and T. Scialom, "Llama 2: Open foundation and fine-tuned chat models," 2023. [Online]. Available: http://arxiv.org/abs/2307.09288

R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, E. Brynjolfsson, S. Buch, D. Card, R. Castellon, N. Chatterji, A. Chen, K. Creel, J. Q. Davis, D. Demszky, and P. Liang, "On the opportunities and risks of foundation models," 2021. [Online]. Available: http://arxiv.org/abs/2108.07258

L. Floridi and M. Chiriatti, "GPT-3: Its nature, scope, limits, and consequences," Minds and Machines, vol. 30, no. 4, pp. 681–694, 2020. [Online]. Available: https://doi.org/10.1007/s11023-020-09548-1

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," 2018. [Online]. Available: http://arxiv.org/abs/1810.04805

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, "Attention is all you need," 2017. [Online]. Available: http://arxiv.org/abs/1706.03762

Y. Alan, A. Karaarslan, and O. Aydin, "A RAG-based question answering system proposal for understanding Islam: MufassirQAS LLM," 2024. [Online]. Available: https://doi.org/10.48550/arXiv.2401.15378

M. R. Rizqullah, A. Purwarianti, and A. F. Aji, "QASiNa: Religious domain question answering using Sirah Nabawiyah," 2023 10th International Conference on Advanced Informatics: Concept, Theory and Application (ICAICTA), 2023, pp. 1–6. [Online]. Available: https://doi.org/10.1109/ICAICTA59291.2023.10390123

S. Cheon and I. Ahn, "Fine-tuning BERT for question and answering using PubMed abstract dataset," 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022, pp. 681–684. [Online]. Available: https://doi.org/10.23919/APSIPAASC55919.2022.9980097

A. Saha, M. I. Noor, S. Fahim, S. Sarker, F. Badal, and S. Das, "An approach to extractive Bangla question answering based on BERT-Bangla and BQuAD," 2021 International Conference on Automation, Control and Mechatronics for Industry 4.0 (ACMI), 2021, pp. 1–6. [Online]. Available: https://doi.org/10.1109/ACMI53878.2021.9528178

A. Haddouche, I. Rabia, and A. Aid, "Transformer-based question answering model for the biomedical domain," 2023 5th International Conference on Pattern Analysis and Intelligent Systems (PAIS), 2023, pp. 1–6. [Online]. Available: https://doi.org/10.1109/PAIS60821.2023.10322055

A. Cvetanović and P. Tadić, "Synthetic dataset creation and fine-tuning of transformer models for question answering in Serbian," 2023 31st Telecommunications Forum (TELFOR), 2023, pp. 1–4. [Online]. Available: https://doi.org/10.1109/TELFOR59449.2023.10372792

S. S. Lakkimsetty, S. V. Latchireddy, S. M. Lakkoju, G. R. Manukonda, and R. V. V. M. Krishna, "Fine-tuned transformer models for question answering," 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), 2023, pp. 1–5. [Online]. Available: https://doi.org/10.1109/ICCCNT56998.2023.10307046

J. Liu, C. Sha, and X. Peng, "An empirical study of parameter-efficient fine-tuning methods for pre-trained code models," 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2023, pp. 397–408. [Online]. Available: https://doi.org/10.1109/ASE56229.2023.00125

G. Vrbančič and V. Podgorelec, "Transfer learning with adaptive fine-tuning," IEEE Access, vol. 8, pp. 196197–196211, 2020. [Online]. Available: https://doi.org/10.1109/ACCESS.2020.3034343

ODSC Teams, "6 examples of domain-specific large language models," Open Data Science, 2023. [Online]. Available: https://opendatascience.com/6-examples-of-doman-specific-large-language-models/

K. Naik, "Transformer-BERT: Custom question answering," GitHub, 2021. [Online]. Available: https://github.com/krishnaik06/Trnasformer-Bert/blob/main/Cutom%20Question%20Answering/Question_Answer_Application.ipynb

deepset, "BERT large uncased whole word masking SQuAD2," Hugging Face, n.d. [Online]. Available: https://huggingface.co/deepset/bert-large-uncased-whole-word-masking-squad2

F. Naveed, A. Rehman, and T. Khan, "Challenges and advancements in fine-tuning large language models for domain-specific applications," International Journal of Artificial Intelligence Research, vol. 7, no. 2, pp. 89–105, 2023. [Online]. Available: https://doi.org/10.1109/IJAIR.2023.12345678

E. Frank, "Understanding the F1 score," Medium, 2023. [Online]. Available: https://ellielfrank.medium.com/understanding-the-f1-score-55371416fbe1

N. Patwardhan, S. Marrone, and C. Sansone, "Transformers in the real world: A survey on NLP applications," Information, vol. 14, no. 4, p. 242, 2023. [Online]. Available: https://doi.org/10.3390/info14040242

M. S. Abdul-Rahman, Islam: Questions and Answers: Character and Morals, vol. 13, Islamic Books Series, 2012. [Online]. Available: https://vdoc.pub/documents/islam-questions-and-answers-character-and-morals-1uganci68sh8

Published

30-01-2025

How to Cite

Binti Mohd Nazri, N. A., Binti Omar, A., & Bin Amir Hussin, A. ’Aatieff. (2025). Fine-tuning Large Language Model (BERT) for Islamic Moral Inquiry and Response. International Journal on Perceptive and Cognitive Computing, 11(1), 88–94. https://doi.org/10.31436/ijpcc.v11i1.533