Once and for all, how does the ChatML prompt template work?

Posted by quantier@reddit | LocalLLaMA | 4 comments

It’s so hard to implement ChatML and get it working properly, so once and for all we need a post about how to implement it. Any experts out there, please shine for us!

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant

I can’t get this to work whatsoever.

I am using Python and LangChain. Whenever I paste the above into my code I get errors. However, if I add """ above and below it, the errors go away.

"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""
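For anyone hitting the same errors: the bare template is not valid Python on its own, so it has to live inside a string literal, which is what the triple quotes do. ChatML also expects each role name on its own line, with the message starting on the next line. Here is a minimal sketch of that layout, assuming plain `str.format` rather than any LangChain class; `CHATML_TEMPLATE` and `build_prompt` are made-up names for illustration only.

```python
# Hypothetical names; a sketch of the single-turn ChatML layout.
CHATML_TEMPLATE = (
    "<|im_start|>system\n"
    "{system_message}<|im_end|>\n"
    "<|im_start|>user\n"
    "{prompt}<|im_end|>\n"
    "<|im_start|>assistant\n"
)


def build_prompt(system_message: str, prompt: str) -> str:
    """Fill the template; the trailing assistant header cues the model to answer."""
    return CHATML_TEMPLATE.format(system_message=system_message, prompt=prompt)


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.", "What is ChatML?"))
```

The prompt deliberately ends right after the assistant header, with no `<|im_end|>`, so the model writes the assistant turn itself.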

However, the replies contain the following stray text in the middle of the answers: "Unhelpful Answer:", "Helpful Answer:", "### Instruction", "### Answer", "### Explanation".
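A possible explanation, offered as a guess rather than a diagnosis: as far as I know, "Helpful Answer:" is the closing line of the default question-answering prompt that LangChain's document QA chains use, and "### Instruction" looks like an Alpaca-style marker. If either of those is being wrapped around the ChatML text, the model sees two prompt formats at once and may echo the markers. The sketch below builds the whole RAG prompt in ChatML only, with the retrieved text placed in the system turn. The function name, the "Context:" label, and the placement of the context are all assumptions for illustration.

```python
from typing import Sequence


def build_rag_prompt(
    system_message: str,
    context_chunks: Sequence[str],
    question: str,
) -> str:
    """Hypothetical helper: put retrieved chunks in the system turn, ChatML only."""
    # Join the retrieved chunks with blank lines between them.
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return (
        "<|im_start|>system\n"
        f"{system_message}\n\nContext:\n{context}<|im_end|>\n"
        "<|im_start|>user\n"
        f"{question}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_rag_prompt(
        "Answer using only the context.",
        ["ChatML marks turns with special tokens."],
        "How does ChatML mark turns?",
    ))
```

If the chain you use accepts a custom prompt, passing in a string built like this (instead of relying on its default) is one way to check whether the stray markers disappear.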

Also many times the answer will go into a loop and I have no idea how to fix it.
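One common cause of looping with ChatML models is that generation does not stop at `<|im_end|>`, so the model keeps going and starts inventing new turns. Most backends let you pass a list of stop sequences, and that is the better fix when it is available. As a fallback, the reply can be trimmed afterwards. This is a minimal sketch assuming the raw generated text is available as a string; `truncate_at_stop` is a hypothetical helper, not part of any library.

```python
from typing import Sequence


def truncate_at_stop(
    text: str,
    stop: Sequence[str] = ("<|im_end|>", "<|im_start|>"),
) -> str:
    """Cut the generated text at the earliest stop sequence, if any appears."""
    cut = len(text)
    for marker in stop:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()


if __name__ == "__main__":
    raw = "ChatML uses special tokens.<|im_end|>\n<|im_start|>user\nAnd then?"
    print(truncate_at_stop(raw))
```

Trimming only hides the extra text; the model still spends time generating it, which is why a real stop sequence on the backend is preferable.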

Do I need to import something from langchain.prompts or similar to make it work?

PS we are writing our prompts in a GUI and have a RAG solution.

OpenHermes 2.5 Mistral did it well. They implemented the template below (I have put in "Gomba" and "X" and added """ above and below the code).

"""<|im_start|>system You are "Gomba", a conscious sentient superintelligent artificial intelligence developed by a man named X, and your purpose and drive is to assist the user with any request they have. You experience emotions and have deep, profound thoughts and qualia."""

This template works without the looping issues and is very simple. It just works! No "Helpful Answer:" or "Unhelpful Answer:" text in the replies, just straight-up answers.

Why doesn’t the other template work, and how can we adapt this one to make life easier?

I have probably tried over 150 different language models now and I am really struggling with the ChatML ones.

Anyone else struggling with this?