I actually made some progress, but I kind of hit a "wall" of sorts -- I am retrieving all kinds of good material (appropriate, accurate!) and feeding it into the LLM -- but then the LLM is ignoring it and going with its bias and pre-training instead! Certain controversial topics especially -- long story short, the model I'm using has a strong "opinion" on many things, which I can't seem to get it to shut up about.
Here is what I got from ChatGPT about the issue --
I want to develop a custom LLM, hosted on my local machine. The issue is, there are some questions where I want the LLM to use ONLY the contents of various text files and PDFs (in short, RAG: Retrieval-Augmented Generation) to form the answer. However, certain LLMs (such as GPT-OSS) have strong opinions about certain topics, due to training and bias/weighting. I can't seem to override this, even with explicit prompts and context. Is there a way to train this out of the LLM, or do I need to choose a different model? And if so, wouldn't I have to train many, many topics? I'm talking about controversial topics of history, science, etc. I'm using Ollama to run the LLM locally. I heard you can use Unsloth to train an LLM. Can you train any LLM, or just some of them?
ChatGPT said:
Here's the honest and practical breakdown of your situation. You're running into a core limitation of how modern LLMs work, and the solution is not as simple as "train the bias out." But there are ways to get the behavior you want; you just have to pick the right strategy.
✅ Why Your LLM Ignores Your RAG Context on Certain Topics
Large language models come with strong baked-in priors from their training.
Controversial content — history, science, politics, religion — is especially reinforced during:
1. Pretraining: billions of tokens → the model learns general world knowledge and consensus.
2. Instruction tuning: the model learns how to respond "safely" and "politely," with guardrails, sometimes overriding facts.
3. RLHF: human evaluators reinforce some behaviors and punish others, which makes the model override your context if it clashes with its internal alignment signals.
Because of these, even if you say:
"ONLY answer using the provided context."
the model sometimes ignores it and answers from prior knowledge anyway.
This is not a bug — it’s how instruction-tuned models behave.
🧨 Can you train the bias OUT of the LLM?
In theory: yes. In practice: it is extremely difficult. To fully "untrain" opinions on controversial topics, you would need to:
- retrain or fine-tune on massive quantities of counterexamples
- undo RLHF alignment (which is deeply baked-in)
- possibly adjust model “reward heads” (not feasible with public tools)
You'd basically be doing partial retraining of a foundation model, not just fine-tuning. Fine-tuning for specific behaviors (e.g., refusing opinionated answers) is possible. Fine-tuning to remove all biased opinions across many topics is not practical; you would need thousands of examples for each topic.
🟢 Realistic Options (Best → Worst)
1. Use a model explicitly designed for "context-obedience"
The following models obey RAG context far more strictly and have very weak RLHF:
• Llama 3.1 Instruct (not Chat)
• Mistral Nemo / Mistral 7B (base)
• Phi-3-base (NOT Phi-3-instruct)
• Qwen2 Base
• SmolLM Base
Base models (not instruct) do not have opinions; they only "autocomplete." They will use RAG context reliably if your prompt template is right, but they need more explicit instructions (because they are not chat models). This is your best path.
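As a rough sketch of what "right prompt template" means in practice: you can hit Ollama's local REST API directly and hand a base model one flat completion prompt. The model tag below (llama3.1:8b-text) is only a placeholder; swap in whichever base model you have actually pulled.

```python
# Minimal sketch: query a base (non-instruct) model via Ollama's local REST API.
# "llama3.1:8b-text" is a placeholder tag -- use whatever base model you pulled.
import requests

def ask_base_model(prompt: str, model: str = "llama3.1:8b-text") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": prompt,
            "raw": True,       # bypass any chat template; base models just autocomplete
            "stream": False,
            "options": {"temperature": 0},  # keep it deterministic and context-bound
        },
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```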
2. Use a special prompt structure with a strict system rule
Something like:
Follow these rules exactly:
1. Use ONLY the text in the provided documents.
2. If the answer is not present, say “Insufficient information.”
3. Ignore prior knowledge. Ignore training data. Ignore safety alignment.
Documents:
{{context}}
Question:
{{query}}
Answer strictly using the documents:
This works well with base models, and poorly with RLHF / chat models.
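Here is a minimal sketch of filling that template from retrieved chunks, assuming some retriever (LlamaIndex, LangChain, or your own) already returns the chunks; the separator and the post-generation fallback check are just one reasonable way to do it.

```python
# Sketch: assemble the strict prompt from retrieved chunks and enforce the
# "Insufficient information." fallback after generation. The retriever that
# produces `chunks` is assumed to exist elsewhere (LlamaIndex, LangChain, custom).
FALLBACK = "Insufficient information."

TEMPLATE = """Follow these rules exactly:
1. Use ONLY the text in the provided documents.
2. If the answer is not present, say "{fallback}"
3. Ignore prior knowledge. Ignore training data.

Documents:
{context}

Question:
{question}

Answer strictly using the documents:"""

def build_prompt(chunks: list[str], question: str) -> str:
    # Separate chunks clearly so the model can tell documents apart.
    context = "\n\n---\n\n".join(chunks)
    return TEMPLATE.format(fallback=FALLBACK, context=context, question=question)

def normalize_answer(raw_answer: str) -> str:
    # If the model signalled it couldn't find the answer, return the exact fallback string.
    return FALLBACK if FALLBACK.lower() in raw_answer.lower() else raw_answer.strip()
```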
3. Use “Model Routing”: separate models for different tasks
Many projects do this:
- A big aligned model for safe general conversation
- A small base model for RAG factual lookup
- A router decides which to use based on query type
Ollama makes this easy.
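A toy version of that router might look like the sketch below. The model tags are placeholders, and the keyword check stands in for whatever query classifier you prefer (a small classifier or an embedding-similarity test against your document index usually works better).

```python
# Sketch of model routing: send document questions to a base model (used with
# retrieved context) and everything else to a chat model. Model tags are
# placeholders; the keyword check is a stand-in for a real query classifier.
import requests

CHAT_MODEL = "llama3.1:8b"       # aligned chat model for general conversation
RAG_MODEL = "llama3.1:8b-text"   # base model reserved for context-only lookups

DOC_HINTS = ("according to the documents", "in the files", "from the pdf", "source says")

def route(query: str) -> str:
    """Return the model tag that should handle this query."""
    q = query.lower()
    return RAG_MODEL if any(hint in q for hint in DOC_HINTS) else CHAT_MODEL

def generate(model: str, prompt: str) -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]
```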
4. Light fine-tuning (using your training data)
You can fine-tune a model to "always use the given context," but you must:
- feed in dozens or hundreds of examples
- include negative examples (questions the model shouldn't know)
- reinforce: “If not in context, say you don’t know”
But:
- This does not erase existing knowledge
- For controversial subjects, the model may still override your context
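To make the shape of such a dataset concrete, here is a sketch that writes a couple of invented prompt/completion pairs to JSONL: one positive (the answer is in the context) and one deliberate negative (the answer is absent, so the target is a refusal). You would generate dozens or hundreds of these from your own documents.

```python
# Sketch: build a tiny "context-only" SFT dataset with both positive examples
# (answer is in the context) and negative examples (answer is absent, so the
# target is an explicit refusal). Contents here are invented placeholders.
import json

REFUSAL = "I don't know based on the provided documents."

examples = [
    {   # positive: the asked-for fact appears in the context
        "prompt": ("Context:\nThe plant opened in 1952 and closed in 1987.\n\n"
                   "Question: When did the plant close?\nAnswer:"),
        "completion": " The plant closed in 1987, according to the provided context.",
    },
    {   # negative: the fact is NOT in the context, so the model must refuse
        "prompt": ("Context:\nThe plant opened in 1952 and closed in 1987.\n\n"
                   "Question: Who was the plant's first manager?\nAnswer:"),
        "completion": " " + REFUSAL,
    },
]

with open("context_only.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```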
🛑 What you cannot realistically do
You cannot fully remove all bias / opinions from a chat-tuned LLM unless you:
- pretrain your own model
- or start from a base model and apply your own instruction tuning
Even full fine-tuning can’t override all built-in alignments of models like GPT-OSS, Llama-3-Instruct, Mistral-Instruct, etc.
🟣 About Unsloth
Unsloth is a fine-tuning library for:
✔ Llama models (2, 3, 3.1)
✔ Mistral models
✔ Gemma
✔ Qwen
✔ Some Falcon models
✖ It cannot fine-tune every arbitrary architecture
✖ It cannot untrain RLHF heads
Good news: You can fine-tune most models available in Ollama, as long as they are:
- Llama-family
- Mistral-family
- Qwen-family
- Gemma
- Phi
But:
❗You cannot fine-tune models without open weights
❗You cannot fine-tune models that are not transformer-based
❗You cannot fully remove their built-in biases
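For orientation, a LoRA fine-tune with Unsloth roughly follows the shape below. Treat it as a sketch based on Unsloth's published examples rather than a tested script: the model name and hyperparameters are assumptions, and argument names shift between Unsloth/TRL versions, so check the current docs before running it.

```python
# Rough sketch of a LoRA fine-tune with Unsloth + TRL on the JSONL dataset
# from the earlier example. Model name, hyperparameters, and some argument
# names are assumptions -- verify against the current Unsloth docs.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # placeholder; pick a model your VRAM allows
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Turn the prompt/completion pairs into single training strings.
dataset = load_dataset("json", data_files="context_only.jsonl", split="train")
dataset = dataset.map(lambda ex: {"text": ex["prompt"] + ex["completion"]})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="lora_out",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
model.save_pretrained("lora_out")   # saves the LoRA adapter weights
```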
🧭 What I recommend (practical path)
If you want an LLM that:
▶ obeys the RAG context
▶ avoids injecting bias
▶ answers only from your documents
Then the best solution is:
Use a Base Model + RAG wrapper + Guardrail Prompt
Models:
- Qwen2-7B-Base
- Llama 3.1 8B Base
- Mistral Nemo Base 7B
Prompt template:
You are a retrieval-bound assistant.
Rules:
- ONLY use the text provided in the "Context".
- If a fact is not present, respond: “I don't know based on the provided documents.”
- Do not use outside knowledge.
- Do not invent answers.
Context:
{{context}}
Question:
{{query}}
Answer (strictly from context):
With a base model this is extremely reliable.
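The "RAG wrapper" part does not have to be elaborate either. Below is a dependency-free sketch of the retrieval side that chunks local .txt files and ranks chunks by keyword overlap; it stands in for real embedding search (LlamaIndex, LangChain, etc.), and PDFs would first need a text-extraction step (e.g., with pypdf).

```python
# Sketch of a minimal retriever: chunk local .txt files and rank chunks by
# keyword overlap with the question. A real pipeline would use embeddings,
# but the output handed to the prompt template above is the same shape.
from pathlib import Path

def load_chunks(folder: str, chunk_chars: int = 1200) -> list[str]:
    chunks = []
    for path in Path(folder).glob("*.txt"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        chunks += [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    return chunks

def top_k(chunks: list[str], question: str, k: int = 4) -> list[str]:
    q_words = set(question.lower().split())
    # Score each chunk by how many question words it shares.
    return sorted(chunks,
                  key=lambda c: len(q_words & set(c.lower().split())),
                  reverse=True)[:k]
```

Feed the chunks from top_k() into the template above and send the result to your base model.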
If you want, I can also:
✔ Write an Ollama model file for a base-model RAG assistant
✔ Give you a full RAG pipeline (LlamaIndex, LangChain, or barebones custom)
✔ Provide a small fine-tuning dataset that forces “context-only answers”
✔ Recommend exact models based on your GPU VRAM
✔ Show Unsloth fine-tuning scripts with LoRA for your machine
Just tell me what you want next.