How do I turn off CPU for llama.cpp?
Posted by ClimateBoss@reddit | LocalLLaMA | 4 comments
I'm running ik_llama.cpp / llama.cpp like this:
```
# Arguments kept in a bash array so the inline notes can stay as comments.
args=(
  --numa numactl
  --threads 0        # cpu turned off?
  -ngl 9999
  --cont-batching
  --parallel 1
  -fa on
  --no-mmap
  -sm graph -cuda fusion=1
  -khad -sas -gr -smgs -ger -mla 3   # whatever this does
  --mlock
  -mg 0 -ts 1,1      # dual gpu
)
./llama-server "${args[@]}"
```
### 800% CPU usage while the GPU is at 100%?
Hardware: 2x P40 (Pascal), no NVLink.
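
For context on the number itself: if the 800% figure comes from a tool like `top` in its default mode, it is counted per core, so it corresponds to roughly eight fully busy cores rather than one overloaded core. Below is a minimal sketch of that arithmetic plus a per-thread view of the server. It assumes Linux with procps `top` and `pgrep` installed and a process named `llama-server`; the helper name `busy_cores` is hypothetical and only for illustration.

```shell
#!/usr/bin/env bash
# Convert a top-style %CPU reading into an approximate number of fully busy cores.
# busy_cores is a hypothetical helper, not part of llama.cpp or ik_llama.cpp.
busy_cores() {
  local pct="$1"
  echo $(( pct / 100 ))
}

busy_cores 800   # 800% is about 8 busy cores

# Per-thread snapshot of the server, only if it is running.
# Assumes the process is named exactly "llama-server".
pid="$(pgrep -o -x llama-server || true)"
if [ -n "$pid" ]; then
  top -H -b -n 1 -p "$pid" | head -n 20
fi
```

The per-thread view may help show whether the load comes from a handful of busy threads or from many lightly loaded ones, which is useful detail to include when asking about it.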