Can you run actually useful LLMs on anything less than a 3090?

Posted by Relevant-Pie475@reddit | LocalLLaMA | 34 comments

I started my LLM self-hosting journey with a 1660 Ti (bad choice, I know)

I wanted to get started quickly, and this was the first GPU I could buy without breaking the bank

However, I soon realized that it was extremely underpowered, so I started looking for a GPU with more VRAM. I came across the 3060, which seemed to me a good balance between raw GPU performance & cost

Afterwards, I reached out to a colleague who is also very active in self-hosting LLMs. I told him that I got a 3060, and his first response was that it sucks. He runs his setup on a 3090 and is planning to get another one

Honestly, I don't consider myself an AI power user. I'm mostly self-hosting for my family, to give them a more ethical way to use AI compared to commercial offerings, and also because of data & privacy concerns

But my main question for you LLM experts: is it possible to host a relatively useful LLM on a GPU with 12 GB of VRAM? I did some research before buying, and it seemed like a good balance on the cost-to-performance ratio. But honestly, hearing my colleague's take on its performance shook my confidence in the setup, and I started questioning whether I'll be able to self-host LLMs without dropping $1000 on hardware
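For what it's worth, here's the rough back-of-the-envelope math I've seen people use to judge whether a model fits in a given amount of VRAM — just a sketch, assuming the model's weight memory is roughly parameters × bits-per-weight ÷ 8, plus a gigabyte or two of headroom for the KV cache and context buffers; the exact numbers depend on the quantization format and runtime:

```python
# Rough sketch: estimate weight memory for a quantized model and check it
# against a 12 GB card. Assumes memory ≈ params * bits_per_weight / 8,
# plus ~2 GB of headroom for KV cache / buffers (a ballpark assumption,
# not an exact figure).

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate GB needed just for the model weights."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

VRAM_GB = 12
HEADROOM_GB = 2  # assumed KV cache / overhead budget

for params in (7, 8, 13):
    for bits in (4, 8):
        gb = weight_gb(params, bits)
        fits = "fits" if gb + HEADROOM_GB <= VRAM_GB else "tight / doesn't fit"
        print(f"{params}B @ {bits}-bit: ~{gb:.1f} GB weights -> {fits} in {VRAM_GB} GB")
```

By that estimate, 7B-8B models at 4-bit (and even 13B at 4-bit) should sit comfortably inside 12 GB, which is part of why the 3060 looked reasonable to me.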

I understand it doesn't matter much, but I plugged the GPU into an HP workstation with an Intel Xeon & 32 GB of DDR3 RAM. I haven't had a chance to run benchmarks, but overall the performance seemed good enough for my personal use case

So I'd like you all to share your experiences hosting LLMs on anything under a 3090!