Local Arabic Legal Chatbot (RAG + LLM) – Need Advice

Posted by Maleficent-Town8242 | LocalLLaMA

Hi everyone,

I’m building a fully local AI chatbot for a government-related use case focused on data protection (DPO support).

The goal is to create a chatbot that can answer questions about legal texts, regulations, and personal data protection laws, mainly in Arabic. Because of the sensitive nature of the data, everything must run locally (no external APIs).

Current approach:

What I need help with:

  1. What’s the best local LLM for Arabic legal content right now?
  2. Any feedback on using bge-m3 as the embedding model for Arabic RAG?
  3. Should I consider fine-tuning, or is RAG enough for this use case?
  4. Any real-world examples of government / legal chatbots running fully local?
  5. Tips to reduce hallucinations in legal answers?

Thanks in advance!