HyPyke

Ahhhhh....I can breathe again....So long 4090....Join my 5080 and 3080 on the ebay someday shelf

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 15 comments

Ahhhhh....I can breathe again....So long 4090....Join my 5080 and 3080 on the ebay someday shelf

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 15 comments

HyPyke@reddit (OP)

Of course dual cards are allowed. The rule is that VRAM counts, nothing else. "Oh, your MAC has 256 of unified memory? Don't care. Keep walking."

Ahhhhh....I can breathe again....So long 4090....Join my 5080 and 3080 on the ebay someday shelf

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 15 comments

Ahhhhh....I can breathe again....So long 4090....Join my 5080 and 3080 on the ebay someday shelf

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 15 comments

HyPyke@reddit (OP)

Well to be fair, you could probably get an ok to quite decent used car where I live for what it costs. It was a splurge and I am privileged to have the money to do it. No wife, no kids, so....😁

Ahhhhh....I can breathe again....So long 4090....Join my 5080 and 3080 on the ebay someday shelf

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 15 comments

HyPyke@reddit (OP)

But seriously Qwen-Coder-Next is my go to right now. I'll swap in a big ass Gemma-4 from time to time when it's acting up.

Ahhhhh....I can breathe again....So long 4090....Join my 5080 and 3080 on the ebay someday shelf

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 15 comments

HyPyke@reddit (OP)

All of them. All the models. https://preview.redd.it/3u3zgi30xl3h1.png?width=320&format=png&auto=webp&s=44b47ed8704f9c39a810235b9d1cdd9d19121150

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

HyPyke@reddit (OP)

Server is passive, no fan, 300w to 600w, but much lower mem bandwidth 1597 GB/s vs 1792 GB/s Which is important for AI. I know clock are higher on server as well but cooling is a real issue unless you have a case you can tune for that. Mine has a 24 drive HD cage in front and a 5 fan bard behind that before you get to the video card so it's already warm air. I think the 300w MaxQ is prob the best for my situation.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

HyPyke@reddit (OP)

HASS has a lot of AI features now. I have a Voice Preview Edition and can do voice control 100% local. You can plug AI into automations. [https://www.home-assistant.io/blog/2025/09/11/ai-in-home-assistant/](https://www.home-assistant.io/blog/2025/09/11/ai-in-home-assistant/)

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

HyPyke@reddit (OP)

Yeah, well I started on my 5080 I use for gaming. Then got a 4090 and now I'm thinking of dropping 10K. My server DOES need an upgrade bad. Epyc and ECC mem and all the trimmings. Prob is mem prices right now.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

HyPyke@reddit (OP)

Large models relative to what I can do with my 4090. I have a dedicated server in a rack already holding an old 3080 for video encode. The RTX 6000 would replace that. It's a decent enough system with 32G of ram and fast storage.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

HyPyke@reddit (OP)

Wellllll...This card is going in a rack in a rack mount case. The blower cooler might be the better way to go considering back to front airflow....

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

HyPyke@reddit (OP)

I had no idea. I assumed they were like 15k and hard to get. I obv need more research...Microcenter not too far actually....

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

Hard freakin' decision..Blackwell 96G or Mac Studio 256G

Posted by HyPyke@reddit | LocalLLaMA | View on Reddit | 212 comments

How to disable thinking/reasoning in Gemma 4 E2B on Ollama? (1st time local user)

Posted by WatercressLarge2323@reddit | LocalLLaMA | View on Reddit | 20 comments

HyPyke@reddit

Nobody really responded to op's specific situation so here is some info that I dug up. You can't do it. (Yet) You can at the ollama console, as other's have pointed out. You can in OpenWebUI by changing the Advanced Parameters and creating a Custom Parameter named 'think' and setting it to 'false'. You can do it in the api calls but you can't serve a pre-configed non-thinking gemma 4 model in ollama. (Yet) Read these threads: [https://github.com/ollama/ollama/issues/10961](https://github.com/ollama/ollama/issues/10961) [https://github.com/ollama/ollama/pull/14108](https://github.com/ollama/ollama/pull/14108) Problem is, and I'm not a github guru, it LOOKS like the change was already made in Feb but it's not active/working yet. root@d66fbecc427e:~/.ollama# ollama create gemma4nothink -f Modelfile.gemma4nothink gathering model components Error: unknown parameter 'think' root@d66fbecc427e:~/.ollama# ollama --version ollama version is 0.21.0 That modelfile I used is bog standard except for `PARAMETER think false` So, I am not sure what the deal is. I can't find a dev or beta version of ollama to try either.