How do I turn off CPU for llama.cpp?

Posted by ClimateBoss@reddit | LocalLLaMA

Using ik_llama, llama.cpp like this:

```
./llama-server
--numa numactl
--threads 0 // cpu turned off?
-ngl 9999
--cont-batching
--parallel 1
-fa on
--no-mmap
-sm graph
-cuda fusion=1
-khad
-sas
-gr
-smgs
-ger
-mla 3 // whatever this does
--mlock
-mg 0
-ts 1,1 // dual gpu
```

### 800% CPU usage ???? 100% GPU ??? 2x P40 (Pascal), no NVLink