AI model for 12 GB RAM / 3 GB VRAM GTX 1050
Posted by Ok-Type-7663@reddit | LocalLLaMA | 17 comments
[gemini] [chatgpt] [claude]
Old models = worst thing ever. Any good model for 12 GB RAM, 3 GB VRAM GTX 1050, Linux Mint 22.2?
WhoRoger@reddit
Granite 4 H 7B is perfect for this. Or SmolLM3 3B.
One-Pain6799@reddit
You can use Qwen3.5 2B.
Healthy-Nebula-3603@reddit
...any
Indigas11@reddit
I run Qwen3.6 35B-A3B IQ3_XXS on a laptop (i7 8th gen, 16 GB RAM + GTX 1050 4 GB VRAM).
pp 15 t/s and tg 7 t/s (approx.) with 96,000 ctx (KV cache quantized with -ctk and -ctv q4_0).
If your workflow is to give it a plan and come back later, then it is the right choice.
You can try Qwen3.5 9B, but with that I get pp 38 t/s and tg 7 t/s.
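For reference, a minimal llama-cpp-python sketch of that kind of setup (quantized KV cache plus partial GPU offload); the model filename, layer count, and prompt are placeholders, not values confirmed by the comment above:

```python
import llama_cpp
from llama_cpp import Llama

# The -ctk/-ctv q4_0 flags from the comment map to type_k/type_v here.
llm = Llama(
    model_path="qwen3.6-35b-a3b-iq3_xxs.gguf",  # placeholder filename
    n_ctx=96000,                                # the 96k context from the comment
    n_gpu_layers=8,                             # partial offload; tune for 3-4 GB VRAM
    type_k=llama_cpp.GGML_TYPE_Q4_0,            # quantized K cache (-ctk q4_0)
    type_v=llama_cpp.GGML_TYPE_Q4_0,            # quantized V cache (-ctv q4_0)
    flash_attn=True,  # llama.cpp needs flash attention for a quantized V cache
)

# "Give it a plan and come back later"-style usage.
out = llm("Write a step-by-step plan for refactoring a small Python project.",
          max_tokens=256)
print(out["choices"][0]["text"])
```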
ML-Future@reddit
For your setup I think Qwen3.5 2B IQ4_NL (1.21 GB) would be the best.
Or maybe Qwen3.5 4B IQ4_NL (2.58 GB).
HellomyfriendNine@reddit
Qwen3.5 4B is the best small model I have ever used (still lacks coding), but it's great for general reasoning and math.
sagiroth@reddit
For anything sensible you need a bare minimum of 8 GB VRAM and 32 GB RAM tbh, and that's only with MoE models, sadly.
tomByrer@reddit
A new $500 cell phone would be a better AI server than that computer....
sagiroth@reddit
On that setup I ran Qwen 35B-A3B with CPU offload to RAM at 80 t/s and 64k context. Don't think a $500 phone can do that.
MotokoAGI@reddit
If you have a DDR4 system, then Qwen3.6-36B at Q4 with the cmoe (CPU-MoE offload) option; see the sketch below.
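If "cmoe" here means llama.cpp's --cpu-moe flag (my assumption), a sketch of launching llama-server with the MoE expert tensors pinned to system RAM; the model filename and context size are placeholders:

```python
import subprocess

# Assumes a llama.cpp build whose llama-server supports --cpu-moe:
# expert tensors stay in system RAM, so only the attention/dense
# layers compete for the GTX 1050's 3 GB of VRAM.
subprocess.run([
    "llama-server",
    "-m", "qwen3.6-36b-q4_k_m.gguf",  # placeholder filename
    "--cpu-moe",                      # experts on CPU, rest offloadable
    "-ngl", "99",                     # offload all non-expert layers to GPU
    "-c", "16384",                    # placeholder context size
])
```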
knselektor@reddit
you can use https://github.com/AlexsJones/llmfit to select a few models to test for your use case
dreamai87@reddit
Bro, for you just go with Qwen3 2507 4B Instruct at Q4.
Endlesscrysis@reddit
Literally just prompt it to web-search the latest leaderboards and benchmarks. If you don't explicitly point it towards how to find recent information, it will take the lazy route and go from memory/training, which is obviously outdated.
OsmanthusBloom@reddit
I would try Gemma4 E2B, possibly even E4B. You should be able to fit these if you use llama.cpp, Q4 quants, quantized context (q8_0 or possibly q4_0 if you dare), and either skip mmproj entirely (no image input support then) or at least don't offload it to VRAM.
These are far from the best available models but probably the best you can use with your very limited hardware. Also Qwen3.5 4B might work, or some of the LiquidAI LFM models.
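To make "should be able to fit these" concrete, a back-of-the-envelope VRAM estimator: weight bytes at roughly 4.5 bits/param for Q4 quants, plus KV-cache bytes derived from GGML block sizes. The architecture numbers in the example are placeholders, not real Gemma or Qwen specs:

```python
# Rough VRAM budgeting for llama.cpp: weight bytes + KV-cache bytes.

# Bytes per element from GGML block layouts:
# q8_0 stores 32 elems in 34 bytes, q4_0 stores 32 elems in 18 bytes.
BYTES_PER_ELEM = {"f16": 2.0, "q8_0": 34 / 32, "q4_0": 18 / 32}

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, cache_type="q8_0"):
    # K and V each hold n_layers * n_ctx * n_kv_heads * head_dim elements.
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * BYTES_PER_ELEM[cache_type]

def q4_weight_bytes(n_params):
    # Q4-class quants average roughly 4.5 bits per parameter.
    return n_params * 4.5 / 8

# Hypothetical 4B-param model: 36 layers, 8 KV heads, head_dim 128.
weights = q4_weight_bytes(4e9)
kv = kv_cache_bytes(36, 8, 128, n_ctx=8192, cache_type="q4_0")
print(f"weights ~{weights / 1e9:.2f} GB, KV cache ~{kv / 1e9:.2f} GB")
```

The point of the exercise: on 3 GB of VRAM the weights alone eat most of the budget, which is why quantized context and skipping mmproj matter.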
The 1-bit Bonsai models are another option. I've successfully run the 8B model on just 2 GB VRAM; see here: https://www.reddit.com/r/LocalLLaMA/comments/1sbnf8y/running_1bit_bonsai_8b_on_2gb_vram_mx150_mobile/
NigaTroubles@reddit
Qwen3 is old ??
ABLPHA@reddit
Yes, we have Qwen3.5 and 3.6 now, and Qwen3 isn't even close to them.
1998marcom@reddit
gpt-oss 20b? Or Qwen3.5 4B (maybe with some offload), or Gemma4 E4B?