BoogerheadCult

google/gemma-4-12B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 171 comments

BoogerheadCult@reddit

Whatever you say. LLM models are use-case specific. I guess with your intelligence you work on low-level projects where such a bad model is good enough for you. Bet you don't even know how to eval models properly to learn their limitations. 🤡

Microsoft Aion 1.0 Instruct and Aion 1.0 Plan models!

Posted by Mysterious_Finish543@reddit | LocalLLaMA | View on Reddit | 100 comments

Would you consider getting an NVIDIA RTX Spark laptop?

Posted by gamblingapocalypse@reddit | LocalLLaMA | View on Reddit | 157 comments

BoogerheadCult@reddit

Hell no, get yourself some used 3090s and build the rest yourself; it is a lot cheaper. Not to mention it is a laptop, so running it plugged in will cause very fast battery degradation, and you will have a throttled, useless paperweight after a few years once the battery is dead.

RTX Spark does not have 600GB/s Bandwidth

Posted by rpiguy9907@reddit | LocalLLaMA | View on Reddit | 191 comments

BoogerheadCult@reddit

Looks like how the unemployment rate is computed: it is so easy to manipulate the numbers. Here in the US we have thousands of layoffs every other week, so why would anybody with some common sense think the number has changed so little? Look at the Layoffs and cscareerquestions subs, lots of people are being cooked. It is not so bad yet because the stock market is at an all-time high due to AI hype; if the stock market crashes, a lot more are gonna feel the pain. If you want to buy this overpriced POS, then go for it. Many here don't see the appeal, so they said no thanks. Sounds like you are just trying to convince yourself about a bad buy. For example, 5k for a 5090 that was 2k MSRP: sure, I understand prices went up a bit, but not 2.5x; that makes no sense for a 32GB VRAM card. You NoVidia shills have just lost your god darn minds.

Added an old 2070 Super to my rig and I can't go back...worse, now I need more

Posted by PferdOne@reddit | LocalLLaMA | View on Reddit | 46 comments

BoogerheadCult@reddit

Need to do three things: 1) Use the last NVIDIA driver that works with both cards. 2) Use an older cuda-toolkit. 3) Compile llama.cpp for multiple NVIDIA architectures. If that is a pain, dust off an old PC, put the older cards in it, then use an RPC connection.
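
The compile step could look roughly like the sketch below. It assumes a CMake build of llama.cpp with the CUDA backend; the helper name `cuda_build_command` is made up for illustration, and the compute capabilities (7.5 for the 2070 Super, 8.6 for an assumed Ampere card such as a 3090) need to be checked against the cards actually installed.

```python
import shlex


def cuda_build_command(compute_capabilities, build_dir="build"):
    """Return a cmake configure command that targets several NVIDIA architectures.

    compute_capabilities: iterable of strings such as "7.5" or "8.6".
    This helper is hypothetical; it only assembles the argument list.
    """
    # CMake expects capabilities without the dot, separated by semicolons.
    archs = sorted({cc.replace(".", "") for cc in compute_capabilities}, key=int)
    return [
        "cmake", "-B", build_dir,
        "-DGGML_CUDA=ON",
        "-DCMAKE_CUDA_ARCHITECTURES=" + ";".join(archs),
    ]


# Assumed pairing: 2070 Super (7.5) plus an Ampere card (8.6).
cmd = cuda_build_command(["8.6", "7.5"])
print(shlex.join(cmd))  # shell-quoted, ready to paste into a terminal
```

Building for every architecture in the box means one binary can drive both cards, at the cost of a longer compile.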

Speed difference between Windows 11 and Linux with llama.cpp: a myth when using medium and large MoE models

Posted by Far-Usual5771@reddit | LocalLLaMA | View on Reddit | 63 comments

BoogerheadCult@reddit

Dumb tests: you are not pushing the system hard enough, so there is not much difference. On resource-constrained systems, such as when running very large models while trying to squeeze every last drop out of your RAM and VRAM, Linux always shines. It is no coincidence that you can allocate more VRAM and run larger models on Ryzen Max in Linux but can't do it on Windows. Sorry OP, you are not technical enough for this kind of comparison, just stay with Windows, it is for low IQ people anyway.

I have 2x PC's. One with a 5090 and one with a 4080. Is there an easy way to use both together networked?

Posted by F0UR_TWENTY@reddit | LocalLLaMA | View on Reddit | 34 comments

BoogerheadCult@reddit

One is an old NUC I have lying around. I should upgrade that one soon, but at least it has USB 3.0, so I bought a 2.5G USB Ethernet adapter and use that to link them together.

I have 2x PC's. One with a 5090 and one with a 4080. Is there an easy way to use both together networked?

Posted by F0UR_TWENTY@reddit | LocalLLaMA | View on Reddit | 34 comments

BoogerheadCult@reddit

Look up using llama.cpp over RPC. I got Claude to help me set it up and it's pretty sweet. I'm hooking up 2x3090 with 2xR9700 across two machines over 2.5G Ethernet. Gonna upgrade that to 10G soon.
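
A rough sketch of what such a setup involves, assuming both machines have llama.cpp built with `-DGGML_RPC=ON`. The helper names, IP address, and model path are placeholders made up for illustration; `rpc-server -H/-p` and the `--rpc` and `-ngl` options are the llama.cpp pieces being assembled.

```python
import shlex


def rpc_server_command(port=50052, host="0.0.0.0"):
    """Command to run on each worker machine.

    Binding to 0.0.0.0 exposes the server to the whole network, so this
    is only reasonable on a trusted LAN.
    """
    return ["rpc-server", "-H", host, "-p", str(port)]


def main_host_command(model_path, workers, gpu_layers=99):
    """Command for the main machine, which spreads layers across the workers.

    workers: list of (host, port) tuples, one per rpc-server instance.
    """
    endpoints = ",".join(f"{host}:{port}" for host, port in workers)
    return ["llama-server", "-m", model_path,
            "--rpc", endpoints, "-ngl", str(gpu_layers)]


# Placeholder address and model file; substitute your own.
print(shlex.join(rpc_server_command()))
print(shlex.join(main_host_command("model.gguf", [("192.168.1.20", 50052)])))
```

Weights are pushed to the workers over the link when the model loads, which is why going from 2.5G to 10G Ethernet mostly helps load time.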

Uploaded my Qwen3.6 27B based fine tune, after two years of experience fine tuning models

Posted by de4dee@reddit | LocalLLaMA | View on Reddit | 14 comments

Why is there no community project for training your own LLM from scratch on consumer hardware?

Posted by tevlon@reddit | LocalLLaMA | View on Reddit | 69 comments

BoogerheadCult@reddit

The people who can do it probably already work at, or were poached by, big AI firms, making 500k to seven figures. Why would they want to do all that for free? Not to mention the hardware required: you need a huge hardware farm that only FAANGs can afford. Not even small to mid-size companies can do that from scratch. So if you just follow the money, you will get the answer.

Granite 4.1 Architecture Changes?

Posted by the-salami@reddit | LocalLLaMA | View on Reddit | 6 comments

What would you do? 2x5060ti for $800, 2x5070ti for $1400 or 5090 for $4000?

Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 121 comments

BoogerheadCult@reddit

Bullshit, I got my R9700 configured and it outperformed all of those except the 5090. Skill issue. But seeing that you've got more money than IQ, go ahead, be another NVIDIA sheep to the slaughter.

What would you do? 2x5060ti for $800, 2x5070ti for $1400 or 5090 for $4000?

Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 121 comments

BoogerheadCult@reddit

None of them. For the amount of VRAM you get, they are extremely bad deals; blame it on the NVIDIA fanboys. Just have Claude help you install drivers and figure out how to get alternative cards working, it is way more cost effective.

Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so.

Posted by Napster3301@reddit | LocalLLaMA | View on Reddit | 88 comments

BoogerheadCult@reddit

Cheaper because your online tokens are currently VC-subsidized. Once you run your own model, you will see that it costs a ton just to run anything decent. So technically it's still cheaper: hosting your own model shows the true cost, and the token prices you get with Gemini or Claude are fake. They don't reflect the true cost.

Is NVIDIA still the default best choice for local LLMs in 2026?

Posted by pmv143@reddit | LocalLLaMA | View on Reddit | 267 comments

BoogerheadCult@reddit

Nope, the NVIDIA fanboys are pushing prices to very unreasonable levels. $/VRAM is almost 2x to 3x that of competitors, so I am looking at alternatives now. The breaking point for me was seeing used 3090s, which sellers didn't even hesitate to disclose were used for mining, now being sold for $1200. You gotta be kidding me.
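
To put a number on $/VRAM, a quick sketch using only prices quoted in these threads (a used 3090 at $1200, a 5090 at $4000 or $5000 against a $2000 MSRP). The 24 GB figure for the 3090 is its spec; no competitor prices are included because none are quoted here, so plug in your own.

```python
def dollars_per_gb(price_usd, vram_gb):
    """Price divided by VRAM capacity, in USD per GB."""
    return price_usd / vram_gb


# (price in USD, VRAM in GB); prices are the ones quoted in the threads.
cards = {
    "used RTX 3090": (1200, 24),
    "RTX 5090 at MSRP": (2000, 32),
    "RTX 5090 at $4000": (4000, 32),
    "RTX 5090 at $5000": (5000, 32),
}

for name, (price, vram) in cards.items():
    print(f"{name}: ${dollars_per_gb(price, vram):.2f}/GB")
```

Adding a row for any alternative card at its current street price makes the comparison direct.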

Comparison of Qwen 3.6 and Gemma4 (MoE and Dense models, Q4_K_M), generating a moderately complex MySQL query, only one produced acceptable results

Posted by bgravato@reddit | LocalLLaMA | View on Reddit | 43 comments

BoogerheadCult@reddit

Qwen has a different set of defaults for different tasks. Sorry, but I think you are just an idiot who doesn't know what they are doing, which renders your little "experiment" useless.

Comparison of Qwen 3.6 and Gemma4 (MoE and Dense models, Q4_K_M), generating a moderately complex MySQL query, only one produced acceptable results

Posted by bgravato@reddit | LocalLLaMA | View on Reddit | 43 comments

BoogerheadCult@reddit

It is the llama.cpp launch parameters; it has nothing to do with your script. I'm starting to think you are either a Google shill or just an idiot who doesn't know what they are talking about.

Meet the Fleet of BlackBeard

Posted by BlackBeardAI@reddit | LocalLLaMA | View on Reddit | 70 comments

BoogerheadCult@reddit

If you spend 10k on hardware to run some of the most capable models out there right now, such as Qwen3.6 27B, sorry to say, you are an idiot. I spent no more than 2k and I can get it to run very smoothly. It has already reduced my reliance on Claude, which let me drop my subscription down to the $20/month plan. Used 3090s go for around 1k, and the remaining 1k can go to the other hardware, which is doable.

Meet the Fleet of BlackBeard

Posted by BlackBeardAI@reddit | LocalLLaMA | View on Reddit | 70 comments

BoogerheadCult@reddit

Funny how you think your Claude subscription will stay the same. Right now most of your tokens are VC-subsidized. Try running these yourself and you will realize the cost of running an LLM far exceeds what they are charging you. When all the LLM providers increase their prices, there will be a huge surge in hardware demand and prices are gonna shoot up. Good luck getting a rig at that point. Or keep paying $500 to keep your Claude subscription.

Openclaw is trending down and will disappear soon

Posted by rm-rf-rm@reddit | LocalLLaMA | View on Reddit | 338 comments