Curious how AMD (Radeon) GPUs can handle LLMs
Posted by siegevjorn@reddit | LocalLLaMA | 34 comments
Hey folks,
With the GPU craze going on, I've been eyeing what's available right now:
RX 6800 and RX 7600 XT.
Both offer a decent price-to-VRAM ratio with 16 GB.
But my concern is whether VRAM on AMD translates well to VRAM on Nvidia. For instance, will the 16 GB of an RX 6800 load the same model size as 16 GB on an Nvidia GPU? For those of you who have both AMD and Nvidia GPUs (e.g. a 3090 and a 7900 XTX), what was your experience? Were you able to load the same model size on the 7900 XTX that you used to load onto the 3090? If AMD VRAM is less efficient, by how much? Is it 20% less efficient, or 30%?
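To make "same model size" concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone need. It rests on assumptions: the bits-per-weight values are illustrative, and `overhead_gib` (KV cache, compute buffers, desktop usage) is a guessed placeholder, not a measured number. The arithmetic depends only on the model and its quantization; it does not capture whether runtime overhead differs between ROCm and CUDA, which is the part I'm asking about.

```python
def estimate_weight_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(
    n_params_billion: float,
    bits_per_weight: float,
    vram_gib: float = 16.0,      # assumption: usable VRAM on a "16 GB" card
    overhead_gib: float = 2.0,   # assumption: KV cache + buffers, a pure guess
) -> bool:
    """True if weights plus the assumed overhead fit in the given VRAM."""
    needed = estimate_weight_gib(n_params_billion, bits_per_weight) + overhead_gib
    return needed <= vram_gib


# Illustrative model sizes and quantization levels (hypothetical examples).
for params_b, bits in [(8, 16), (13, 4.5), (32, 4.5)]:
    size = estimate_weight_gib(params_b, bits)
    verdict = "fits" if fits_in_vram(params_b, bits) else "does not fit"
    print(f"{params_b}B @ {bits} bits/weight: ~{size:.1f} GiB weights, {verdict} in 16 GiB")
```

If real-world numbers on an RX 6800 come out noticeably different from this estimate, that difference is the "inefficiency" I'm trying to quantify.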
Another question is about ROCm support. The llama.cpp docs say that any GPU with HIP support can offload layers:
https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md#hip
According to AMD's site, that includes the RX 6800:
https://rocm.docs.amd.com/projects/install-on-windows/en/latest/reference/system-requirements.html
So can I safely assume that anything using llama.cpp as its backend will run LLMs out of the box on RDNA2 (RX 6800)?
Does the same apply to vLLM? vLLM lists only the 7900 series as supported:
https://docs.vllm.ai/en/v0.6.5/getting_started/amd-installation.html
But does it support other 7000-series GPUs (RDNA3)?
I ask because it seems like AMD has expanded its ML support to all RDNA3 GPUs:
https://rocm.docs.amd.com/projects/radeon/en/latest/
If running vLLM with tensor parallelism is possible, the $300 price of the 7600 XT sounds quite attractive.
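The reason two cheap 16 GB cards look attractive can be sketched as simple arithmetic. This assumes tensor parallelism shards the weights evenly across GPUs and ignores anything replicated on every card or used for inter-GPU communication; `overhead_per_gpu_gib` is a guessed placeholder. It says nothing about whether vLLM actually runs on a 7600 XT, which is still my open question.

```python
def per_gpu_weight_gib(total_weight_gib: float, tp_size: int) -> float:
    """Weights held by each GPU, assuming an even split across tp_size GPUs."""
    if tp_size < 1:
        raise ValueError("tp_size must be at least 1")
    return total_weight_gib / tp_size


def fits_tensor_parallel(
    total_weight_gib: float,
    tp_size: int,
    vram_per_gpu_gib: float = 16.0,     # assumption: usable VRAM per card
    overhead_per_gpu_gib: float = 2.0,  # assumption: KV cache + buffers, a guess
) -> bool:
    """True if each GPU's share plus the assumed overhead fits in its VRAM."""
    share = per_gpu_weight_gib(total_weight_gib, tp_size)
    return share + overhead_per_gpu_gib <= vram_per_gpu_gib


# Hypothetical 20 GiB model: one card versus two cards.
for gpus in (1, 2):
    share = per_gpu_weight_gib(20.0, gpus)
    ok = fits_tensor_parallel(20.0, gpus)
    print(f"{gpus} GPU(s): ~{share:.1f} GiB weights per GPU, fits: {ok}")
```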