Unsloth just released their GGUF of Kimi-K2-Instruct-0905!

ddr5 5600 MT/s, 24 channels. It also has a blackwell 6000 pro. You can see the previous Kimi-k2 model running here: https://youtu.be/eCapGtOHG6I?si=fXWLU4Dv0dHxXzS0&t=1704

PC Build and CPU-only inference: https://youtu.be/v4810MVGhog

[-]

TheAndyGeorge@reddit (OP)

and here i am buying a 2x32GB ddr4 kit to double my system RAM😅

[-]

jacek2023@reddit

How do you run it locally? I can't and I have 3x3090

[-]

TheAndyGeorge@reddit (OP)

have you considered 30x3090s? That'd get you partway there!

[-]

If you're looking for a serious answer, then yeah you'd need a lot of beefy hardware. 1TB of system RAM is certainly doable, if expensive. I think the point is that it is still possible to run this model locally.

[-]

jacek2023@reddit

That wasn't my question :)

[-]

TheAndyGeorge@reddit (OP)

You're asking how to use a GGUF? I use Ollama, so it's as simple as ollama run hf.co/unsloth/Kimi-K2-Instruct-0905-GGUF:Q8_0. I assume llama.cpp and vLLM or anything that can consume GGUF should be able to handle this as well.

[-]

jacek2023@reddit

Can you show your t/s?

[-]

TheAndyGeorge@reddit (OP)

I do not have 1TB of RAM or VRAM.

[-]

jacek2023@reddit

So how do you use this file???

[-]

cms2307@reddit

Fuck off do your own research

[-]

Fine-Will@reddit

If you're looking for a serious answer, then yeah you'd need a lot of beefy hardware. 1TB of system RAM is certainly doable, if expensive. I think the point is that it is still possible to run this model locally.

[-]

Vatnik_Annihilator@reddit

Were you dropped on your head as a child? They've answered your question multiple times.

Unless you have an extremely expensive professional setup, you will not be able to run this model. It is not meant for people to run on their home computers. There are much smaller models for that purpose.

[-]