BoogerheadCult

google/gemma-4-12B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 171 comments

BoogerheadCult@reddit

Whatever you say. LLMs are use-case specific. I guess with your intelligence you work on low-level projects where such a bad model is good enough for you. Bet you don't even know how to eval models properly to learn their limitations. 🤡

google/gemma-4-12B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 171 comments

BoogerheadCult@reddit

Hot garbage. I tried them, and none of these models has impressed me so far.

Microsoft Aion 1.0 Instruct and Aion 1.0 Plan models!

Posted by Mysterious_Finish543@reddit | LocalLLaMA | View on Reddit | 100 comments

BoogerheadCult@reddit

Microslop can't even get their shit AI on Windoze done right; these are gonna be another disaster in the making.

Would you consider getting an NVIDIA RTX Spark laptop?

Posted by gamblingapocalypse@reddit | LocalLLaMA | View on Reddit | 157 comments

BoogerheadCult@reddit

Hell no. Getting yourself a used 3090 and building the rest around it is a lot cheaper. Not to mention it is a laptop, so running it plugged in will cause very fast battery degradation, and you will have a throttled, useless paperweight after a few years once the battery is dead.

RTX Spark does not have 600GB/s Bandwith

Posted by rpiguy9907@reddit | LocalLLaMA | View on Reddit | 191 comments

BoogerheadCult@reddit

Nobody is using a Mac laptop to run LLMs; it is bad, between the battery and thermal throttling. The Mac Studio is priced more in line with the specs you mentioned.

RTX Spark does not have 600GB/s Bandwith

Posted by rpiguy9907@reddit | LocalLLaMA | View on Reddit | 191 comments

BoogerheadCult@reddit

Looks like how the unemployment rate is computed: it is so easy to manipulate the numbers. Here in the US we have thousands of layoffs every other week, so why would anybody with some common sense think the number has changed so little? Look at the Layoffs and cscareerquestions subs; lots of people are cooked. It is not so bad yet because the stock market is at an ATH due to AI hype; if the stock market crashes, lots more are gonna feel the pain.

If you want to buy this overpriced POS, then go for it. Many here don't see the appeal, so they said no thanks. Sounds like you are just trying to convince yourself about a bad buy. For example, $5k for a 5090 that was $2k MSRP: sure, I understand prices went up a bit, but not 2.5x. That makes no sense for a 32GB VRAM card. You NoVidia shills have just lost your god darn minds.

RTX Spark does not have 600GB/s Bandwith

Posted by rpiguy9907@reddit | LocalLLaMA | View on Reddit | 191 comments

BoogerheadCult@reddit

Everybody in the US knows the guy in the White House cooked up the stats. The true unemployment rate is somewhere around 30%.

RTX Spark does not have 600GB/s Bandwith

Posted by rpiguy9907@reddit | LocalLLaMA | View on Reddit | 191 comments

BoogerheadCult@reddit

Right? Better resale value too. All these Nvidia shills and people buying this overpriced garbage have lost their god darn minds.

RTX Spark does not have 600GB/s Bandwith

Posted by rpiguy9907@reddit | LocalLLaMA | View on Reddit | 191 comments

BoogerheadCult@reddit

Forgot the $3000 price tag. They think everybody is gonna rush out and buy this shit. LOL.

Added an old 2070 Super to my rig and I can't go back...worse, now I need more

Posted by PferdOne@reddit | LocalLLaMA | View on Reddit | 46 comments

BoogerheadCult@reddit

You need to do three things: 1) use the last NVIDIA driver that works with both cards; 2) use an older CUDA toolkit; 3) compile llama.cpp for multiple NVIDIA architectures. If that is a pain, dust off an old PC, put the older cards in it, then use an RPC connection.
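
A minimal sketch of step 3 and the RPC fallback, assuming the second card is an RTX 3090 (the thread only names the 2070 Super). `GGML_CUDA`, `GGML_RPC` and `CMAKE_CUDA_ARCHITECTURES` are the CMake options as I understand them from llama.cpp's build docs, so check them against the version you are building. The helper name `cmake_configure_args` is made up for illustration.

```python
import shlex

# CUDA compute capabilities, written the way CMake expects them.
# Pairing the 2070 Super with a 3090 is an assumption for this example.
COMPUTE_CAPABILITY = {
    "RTX 2070 Super": "75",  # Turing
    "RTX 3090": "86",        # Ampere
}


def cmake_configure_args(cards, build_dir="build", enable_rpc=False):
    """Build a cmake configure command that targets every listed card."""
    # De-duplicate and sort so the same set of cards always yields the same command.
    archs = sorted({COMPUTE_CAPABILITY[card] for card in cards})
    args = [
        "cmake", "-B", build_dir,
        "-DGGML_CUDA=ON",
        "-DCMAKE_CUDA_ARCHITECTURES=" + ";".join(archs),
    ]
    if enable_rpc:
        # Only needed for the "old PC over RPC" route.
        args.append("-DGGML_RPC=ON")
    return args


if __name__ == "__main__":
    # Print a shell-safe command line; nothing is executed here.
    print(shlex.join(cmake_configure_args(["RTX 2070 Super", "RTX 3090"], enable_rpc=True)))
```

The script only prints the configure command, so you can review it before running it yourself. Listing both architectures makes the binary carry kernels for each card, at the cost of a longer compile.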

Speed difference between Windows 11 and Linux with llama.cpp: a myth when using medium and large MoE models

Posted by Far-Usual5771@reddit | LocalLLaMA | View on Reddit | 63 comments

BoogerheadCult@reddit

Dumb tests. You are not pushing the system hard enough, so there is not much of a difference. On resource-constrained systems, such as when running very large models while trying to squeeze every last drop out of your RAM and VRAM, Linux always shines. It is no coincidence that you can allocate more VRAM and run larger models on Ryzen Max in Linux but can't do it on Windows. Sorry OP, you are not technical enough for this kind of comparison; just stay with Windows, it is for low IQ people anyway.

I have 2x PC's. One with a 5090 and one with a 4080. Is there an easy way to use both together networked?

Posted by F0UR_TWENTY@reddit | LocalLLaMA | View on Reddit | 34 comments

BoogerheadCult@reddit

One is an old NUC I have lying around; I should upgrade that one soon. At least it has USB 3.0, so I bought a 2.5G USB Ethernet adapter and use that to link them together.

I have 2x PC's. One with a 5090 and one with a 4080. Is there an easy way to use both together networked?

Posted by F0UR_TWENTY@reddit | LocalLLaMA | View on Reddit | 34 comments

BoogerheadCult@reddit

It helps with initial layer loading, and I am obsessed with maxing out whatever I can.

I have 2x PC's. One with a 5090 and one with a 4080. Is there an easy way to use both together networked?

Posted by F0UR_TWENTY@reddit | LocalLLaMA | View on Reddit | 34 comments

BoogerheadCult@reddit

There's no such thing. Claude already makes this extremely easy to set up; if that's still a challenge then it's not for you, just give up on the idea.

I have 2x PC's. One with a 5090 and one with a 4080. Is there an easy way to use both together networked?

Posted by F0UR_TWENTY@reddit | LocalLLaMA | View on Reddit | 34 comments

BoogerheadCult@reddit

Look up using llama.cpp over RPC. I got Claude to help me set it up and it's pretty sweet. I'm hooking up 2x3090 with 2xR9700 across two machines over 2.5G Ethernet. Gonna upgrade that to 10G soon.
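
Rough arithmetic for what the 10G upgrade buys when weights have to cross the wire to the remote machine at load time. This is only a sketch: the 20 GB payload is a made-up figure, and it assumes the link runs at line rate unless you pass a lower `efficiency`, which real setups will need.

```python
def transfer_seconds(size_gb, link_gbit_per_s, efficiency=1.0):
    """Seconds needed to move size_gb gigabytes over a link rated in gigabits per second."""
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    if link_gbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    # 1 byte = 8 bits, so convert gigabytes to gigabits before dividing.
    return size_gb * 8 / (link_gbit_per_s * efficiency)


if __name__ == "__main__":
    payload_gb = 20  # hypothetical share of the model sent to the remote box
    for speed in (2.5, 10):
        print(f"{speed}G link: {transfer_seconds(payload_gb, speed):.0f} s")
```

With those assumed numbers the load drops from about 64 seconds to about 16. The sketch only covers the one-time transfer, not generation speed.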

Uploaded my Qwen3.6 27B based fine tune, after two years of experience fine tuning models

Posted by de4dee@reddit | LocalLLaMA | View on Reddit | 14 comments

BoogerheadCult@reddit

https://preview.redd.it/qwg5k2jsm54h1.jpeg?width=1080&format=pjpg&auto=webp&s=bc7922bce027facc113d3e4315807216d2b6050f

Uploaded my Qwen3.6 27B based fine tune, after two years of experience fine tuning models

Posted by de4dee@reddit | LocalLLaMA | View on Reddit | 14 comments

BoogerheadCult@reddit

So basically a dumbed-down Qwen to shill for Ponzi Buttcoin? Count me in, I am so excited for this garbage /s

Why is there no community project for training your own LLM from scratch on consumer hardware?

Posted by tevlon@reddit | LocalLLaMA | View on Reddit | 69 comments

BoogerheadCult@reddit

The people who can do it probably already work at, or have been poached by, big AI firms, making 500k to 7 figures. Why would they want to do all that for free? Not to mention the hardware required: you need a huge hardware farm that only FAANGs can afford. Not even small to mid-size companies can do that from scratch. So if you just follow the money, you will get the answers.

Granite 4.1 Architecture Changes?

Posted by the-salami@reddit | LocalLLaMA | View on Reddit | 6 comments

BoogerheadCult@reddit

Who cares, IBM is full of diversity hires, they are not going to deliver anything worth noting anyway.

What would you do? 2x5060ti for $800, 2x5070ti for $1400 or 5090 for $4000?

Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 121 comments

BoogerheadCult@reddit

Bullshit. I got my R9700 configured and it outperformed all of those except the 5090. Skill issue. But seeing that you have more money than IQ, go ahead, be another NVIDIA sheep to the slaughter.

What would you do? 2x5060ti for $800, 2x5070ti for $1400 or 5090 for $4000?

Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 121 comments

BoogerheadCult@reddit

None of them. For the amount of VRAM you get, each is an extremely bad deal; blame it on the NVIDIA fanboys. Just have Claude help you install drivers and figure out how to get alternative cards working. Way more cost effective.
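
To put numbers on "the amount of VRAM you get", here is a small price-per-GB sketch. The prices come from the thread title; the VRAM sizes are my assumption (16 GB per 5060 Ti and 5070 Ti, 32 GB for the 5090), so adjust them if you are looking at a different variant.

```python
# (price in USD from the thread title, total VRAM in GB).
# VRAM sizes are assumptions, not stated in the thread.
OPTIONS = {
    "2x 5060 Ti": (800, 2 * 16),
    "2x 5070 Ti": (1400, 2 * 16),
    "1x 5090": (4000, 32),
}


def usd_per_gb(price_usd, vram_gb):
    """Price per gigabyte of VRAM."""
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    return price_usd / vram_gb


def rank_by_value(options):
    """Return (name, usd_per_gb) pairs, cheapest per GB first."""
    pairs = ((name, usd_per_gb(price, vram)) for name, (price, vram) in options.items())
    return sorted(pairs, key=lambda item: item[1])


if __name__ == "__main__":
    for name, cost in rank_by_value(OPTIONS):
        print(f"{name}: ${cost:.2f}/GB")
```

Under those assumptions the three options come out at $25, $43.75 and $125 per GB. To compare an alternative card, add it to `OPTIONS` with its own price and VRAM; the sketch says nothing about speed or software support.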

Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so.

Posted by Napster3301@reddit | LocalLLaMA | View on Reddit | 88 comments

BoogerheadCult@reddit

Online only looks cheaper because your tokens are currently VC-subsidized. Once you run your own model, you will see that it costs a ton just to run anything decent. So technically self-hosting is still cheaper: hosting your own model shows the true cost, and the token prices you get from Gemini or Claude are fake. They don't reflect the true cost.

Is NVIDIA still the default best choice for local LLMs in 2026?

Posted by pmv143@reddit | LocalLLaMA | View on Reddit | 267 comments

BoogerheadCult@reddit

Nope. The NVIDIA fanboys are pushing prices to very unreasonable levels: $ per GB of VRAM is almost 2x to 3x that of competitors, so I am looking at alternatives now. The breaking point for me was seeing badly used 3090s, which sellers didn't even hesitate to disclose had been used for mining, now being sold for $1200. You gotta be kidding me.

Comparison of Qwen 3.6 and Gemma4 (MoE and Dense models, Q4_K_M), generating a moderately complex MySQL query, only one produced acceptable results

Posted by bgravato@reddit | LocalLLaMA | View on Reddit | 43 comments

BoogerheadCult@reddit

Qwen has a different set of defaults for different tasks. Sorry, but I think you are just an idiot who doesn't know what they are doing, which renders your little "experiment" useless.

Comparison of Qwen 3.6 and Gemma4 (MoE and Dense models, Q4_K_M), generating a moderately complex MySQL query, only one produced acceptable results

Posted by bgravato@reddit | LocalLLaMA | View on Reddit | 43 comments

BoogerheadCult@reddit

It is the llama.cpp launch parameters; it has nothing to do with your script. I am starting to think you are either a Google shill or just an idiot who doesn't know what they are talking about.

Meet the Fleet of BlackBeard

Posted by BlackBeardAI@reddit | LocalLLaMA | View on Reddit | 70 comments

BoogerheadCult@reddit

If you spend 10k on hardware to run some of the most capable models out there right now, such as Qwen3.6 27B, sorry to say, you are an idiot. I spent no more than 2k and it runs very smoothly. It has already reduced my reliance on Claude, which let me drop my subscription to the $20/month plan. A used 3090 goes for around 1k, and the remaining 1k can go to the other hardware, which is doable.
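
A back-of-the-envelope way to check a claim like this is a break-even calculation. The $2k build and the $20/month plan are from the comment; the $100/month starting plan and the optional power cost are hypothetical placeholders, since the comment does not give them.

```python
import math


def breakeven_months(hardware_usd, old_plan_usd, new_plan_usd, power_usd_per_month=0.0):
    """Whole months until subscription savings cover the hardware, or None if they never do."""
    monthly_saving = old_plan_usd - new_plan_usd - power_usd_per_month
    if monthly_saving <= 0:
        # The rig costs more per month than it saves, so it never pays for itself.
        return None
    return math.ceil(hardware_usd / monthly_saving)


if __name__ == "__main__":
    # $2000 rig and $20 plan from the comment; $100 old plan is an assumed figure.
    print(breakeven_months(2000, old_plan_usd=100, new_plan_usd=20))
    # Same case with a hypothetical $30/month electricity bill for the rig.
    print(breakeven_months(2000, old_plan_usd=100, new_plan_usd=20, power_usd_per_month=30))
```

With the assumed $100 plan the rig pays for itself in 25 months, or 40 months once the hypothetical power bill is included. The result is very sensitive to what the old plan cost, which is exactly the number being argued about here.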

Meet the Fleet of BlackBeard

Posted by BlackBeardAI@reddit | LocalLLaMA | View on Reddit | 70 comments

BoogerheadCult@reddit

Funny how you think your Claude subscription will stay the same. Right now most of your tokens are VC-subsidized. Try running these models yourself and you will realize the cost of running an LLM far exceeds what they are charging you. When all the LLM providers increase their prices, there will be a huge surge in hardware demand and prices are gonna shoot up. Good luck getting a rig at that point. Or keep paying $500 to keep your Claude subscription.

Openclaw ia trending down and will disappear soon

Posted by rm-rf-rm@reddit | LocalLLaMA | View on Reddit | 338 comments

BoogerheadCult@reddit

Biggest fad ever, and so many stupid idiots installed this security risk on their personal devices. Almost a kind of social experiment.