ChopSticksPlease

Qwen3.5 vs GLM-4.7 vs Qwen3-235B-Thinking

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 39 comments
AI created this app in 12hrs. Used open models, mostly local LLMs.

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 6 comments
MiniMax-M2.1 vs GLM-4.5-Air: is the bigger really the better (coding)?

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 18 comments
How to run Qwen3-next 80b when you are poor

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 28 comments
Hardware for a new AI diy server build

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 4 comments
Surprised you can run SOTA models on 10+ year old (cheap) workstation with usable tps, no need to break the bank.

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 18 comments
Seed OSS 36b made me reconsider my life choices.

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 80 comments
Spent weekend tuning LLM server to hone my nerdism so you don't have to.

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 6 comments
Quad Radeon 9700 XFX 32GB vs RTX 6000 PRO

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 4 comments
Nvidia power spike and PSU issues

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 7 comments
Llama.cpp and VRAM vs context size vs cache quant

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 2 comments
How to properly run gpt-oss-120b on multiple GPUs with llama.cpp?

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 15 comments
Best model to run on dual 3090 (48GB vram)

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 12 comments
Nvidia DGX Spark (or alike) vs dual RTX 3090

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 12 comments
NVMe for local LLM is too slow. Any ideas?

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 27 comments
Joined the 48GB VRAM Dual Hairdryer club. Frankly a bit of a disappointment: deepseek-r1:70b works fine, qwen2.5:72b still seems to be too big. The 32b models apparently provide almost the same code quality, and for general questions the big online LLMs are better. Meh.

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 117 comments
Dual 3090 vs Quad 3060 for local LLM

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 7 comments
Created a gist on how to set up Ollama with Open WebUI in Docker on an Ubuntu Server VM with an Nvidia GPU on Proxmox; perhaps someone here finds it useful.

Posted by ChopSticksPlease@reddit | LocalLLaMA | View on Reddit | 1 comment