Perhaps a helpful YouTube video on local optimisation?
Posted by klippers@reddit | LocalLLaMA | 4 comments
I just came across this YouTube channel and watched one video. It seemed interesting, and it might be just as interesting to the rest of you.
https://youtu.be/8F_5pdcD3HY?si=03Vg6q4pF4B5ZBb-
No affiliation or anything but wanted to share
Equivalent_Bit_461@reddit
Pure AI slop channel. The user asked ChatGPT, Claude, and DeepSeek for optimisation flags and just copy-pasted them into the video. How do I know that? That's exactly the method I used myself. Next, the AI speech: the entire script is AI slop, all of it, word for word. My pattern recognition has never been triggered like it was today. It's nothing but a slop channel for a quick buck. Not approved.
yensteel@reddit
His tips on mlock for Docker were really useful!
I think he spent too much time explaining MoE, and the video could be shorter. However, he clearly explained what each parameter does, including what didn't work for his model but is still worth trying.
It's an excellent video for introducing optimizations in llama.cpp.
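For reference (my own setup, not taken from the video): `--mlock` tells llama.cpp to pin the model weights in RAM so the OS can't swap them out, but inside Docker the container's memlock ulimit must be raised or the mlock call fails. A sketch, where the image tag and model path are placeholders to adapt:

```shell
# --mlock pins model weights in RAM (prevents swapping to disk).
# Docker containers default to a small memlock ulimit, so raise it
# to unlimited with --ulimit memlock=-1:-1 or mlock will fail.
# Image tag and /models path are placeholders; check the llama.cpp docs.
docker run --ulimit memlock=-1:-1 \
  -v /models:/models -p 8080:8080 \
  ghcr.io/ggml-org/llama.cpp:server \
  -m /models/model.gguf --mlock
```

The same `--mlock` flag works outside Docker; the ulimit part is only needed because containers inherit a restrictive default.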
JustLookingForNothin@reddit
He provided a good explanation AND visualization of the MoE principle. Newcomers will appreciate the details, and experts can skip ahead using the chapters he provided in the description. Good video.
MelodicRecognition7@reddit
Good production and useful tips, but there is one thing I do not recommend: a 4-bit KV cache. You really should not go below 8 bits.
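For context, the cache in question is llama.cpp's quantized KV cache. A sketch of the relevant flags with the 8-bit setting recommended above (model path is a placeholder):

```shell
# KV-cache quantization in llama.cpp: --cache-type-k / --cache-type-v
# (short forms -ctk / -ctv). q8_0 roughly halves KV-cache memory vs f16
# with little quality loss; q4_0 halves it again, which is the setting
# being advised against here. Note that quantizing the V cache generally
# requires flash attention (-fa) to be enabled.
llama-server -m /models/model.gguf -fa \
  --cache-type-k q8_0 --cache-type-v q8_0
```

Leaving both at the default f16 is the safe choice when memory allows; q8_0 is the usual compromise when it doesn't.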