Dialogue Systems for Informal Malagasy: A Comparative Evaluation of LLaMA 3.2 and Mistral 7B

Francis Rakotomalala; Aimé Richard Hajalalaina; Ndaohialy Manda Vy Ravonimanantsoa

doi:10.52846/stccj.2025.5.1.69

Authors

Francis Rakotomalala, University of Fianarantsoa
Aimé Richard Hajalalaina, University of Fianarantsoa
Ndaohialy Manda Vy Ravonimanantsoa, University of Antananarivo

Keywords:

Chatbot, Informal Malagasy Language, LLaMA, Mistral, NLP

Abstract

In the context of low-resource language modeling, this article focuses on adapting the pretrained language models LLaMA and Mistral for the development of a dialogue system in informal Malagasy. Malagasy, being a rich but underrepresented language in data corpora, presents unique challenges in terms of vocabulary, syntactic structure, and informal variations. The aim of this research is to demonstrate the ability of modern language processing models to overcome these challenges and generate relevant responses in this language. The results show a notable improvement in model performance following a specific adaptation phase. A significant reduction in loss and perplexity was observed, indicating the models’ effectiveness in learning and adjusting to the unique characteristics of informal Malagasy. The training of conversational agents also helped maintain good fluency and coherence in dialogues, although further adjustments are needed to improve lexical and syntactic alignment, which are essential for smooth and natural interaction. When comparing the performance of LLaMA 3.2 3B and Mistral 7B, Mistral stands out for its ability to generate more natural and fluid dialogues, while LLaMA excels in tasks requiring strict and precise content matching. These findings highlight the effectiveness of these models for developing dialogue systems tailored to the specificities of informal Malagasy, while also underlining the potential for ongoing improvement.
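The abstract reports loss and perplexity together. Under the usual definition, perplexity is the exponential of the mean per-token cross-entropy loss, so a drop in one implies a drop in the other. The following is a minimal sketch of that relationship, not the authors' evaluation code; the function name `perplexity` and the per-token loss values are hypothetical and chosen only for illustration.

```python
import math


def perplexity(token_nlls):
    """Return perplexity given per-token negative log-likelihoods (natural log)."""
    if not token_nlls:
        raise ValueError("need at least one token loss")
    # Mean cross-entropy loss over the sequence.
    mean_loss = sum(token_nlls) / len(token_nlls)
    # Perplexity is the exponential of the mean loss.
    return math.exp(mean_loss)


# Hypothetical per-token losses (illustrative values, not results from the paper).
before_adaptation = [3.2, 2.9, 3.5, 3.1]
after_adaptation = [1.4, 1.1, 1.6, 1.3]

print(perplexity(before_adaptation), perplexity(after_adaptation))
```

Because the mapping is exponential, a modest reduction in mean loss corresponds to a proportionally larger reduction in perplexity.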
