ClimateBoss

Terrible speeds with LM Studio? (Is LM Studio bad?)

Posted by HugoCortell@reddit | LocalLLaMA | View on Reddit | 81 comments

ClimateBoss@reddit

Every LMStudio thread or comment has tons of upvotes. Bot farming much? How can anyone "confirm" it's not spyware when it's against their TOS to decompile their proprietary software? Their privacy policy says they collect "usage data", whatever that is. You have no clue, therefore it's spyware, malware, whatever you call it. Also, 38mb of "usage data" uploaded when "checking for updates" is excessive and clearly not what their privacy policy says.

Terrible speeds with LM Studio? (Is LM Studio bad?)

Posted by HugoCortell@reddit | LocalLLaMA | View on Reddit | 81 comments

ClimateBoss@reddit

Dude, LMStudio isn't open source and literally collects anything you do with it, just read their privacy policy. Also read [https://www.gnu.org/proprietary/](https://www.gnu.org/proprietary/)

Replacing $200/mo Cursor subscription with local Ollama + Claude API. Does this hybrid Mac/Windows setup make sense?

Posted by grohmaaan@reddit | LocalLLaMA | View on Reddit | 27 comments

ClimateBoss@reddit

Can never figure out what $200 of usage even means, anyone know how that compares to local llm? qwen3 coder 30b MXFP4 is not great but can do FIM on 16g vram.

Terrible speeds with LM Studio? (Is LM Studio bad?)

Posted by HugoCortell@reddit | LocalLLaMA | View on Reddit | 81 comments

ClimateBoss@reddit

must be all the **proprietary spyware** built in

RTX 6000 build / drive and fan questions

Posted by Direct_Bodybuilder63@reddit | LocalLLaMA | View on Reddit | 47 comments

ClimateBoss@reddit

what do u do bruh? 4 RTX Pro 6000, this build is crazy. ai researcher or what? do u work at like meta?

RTX 6000 build / drive and fan questions

Posted by Direct_Bodybuilder63@reddit | LocalLLaMA | View on Reddit | 47 comments

ClimateBoss@reddit

do you have more pics around the side? dude how did u do that, looks epic

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

Posted by joelinho95@reddit | LocalLLaMA | View on Reddit | 10 comments

ClimateBoss@reddit

is this a joke? They wrote 4 prompts and called it gamechanging research? "Rewrite the document as a clear, step-by-step tutorial or instructional guide. Use numbered steps or bullet points where appropriate to enhance clarity. Preserve all essential information while ensuring the style feels didactic and easy to follow. Output only the tutorial, nothing else."

High school student seeking advice: Found an architectural breakthrough that scales a 17.6B model down to 417M?

Posted by Appropriate-Scar3116@reddit | LocalLLaMA | View on Reddit | 210 comments

ClimateBoss@reddit

Enlighten us what that means.

High school student seeking advice: Found an architectural breakthrough that scales a 17.6B model down to 417M?

Posted by Appropriate-Scar3116@reddit | LocalLLaMA | View on Reddit | 210 comments

ClimateBoss@reddit

# "The specific mathematical structure remains strictly classified to prevent unauthorized use."

Provide the github or no one is going to believe this.

High school student seeking advice: Found an architectural breakthrough that scales a 17.6B model down to 417M?

Posted by Appropriate-Scar3116@reddit | LocalLLaMA | View on Reddit | 210 comments

ClimateBoss@reddit

This is the way, scientific research. Everyone here saying to profit is literally running LLMs published by researchers.

Which multi GPU for local training? v100, MI50, RTX 2080 22gb?

Posted by ClimateBoss@reddit | LocalLLaMA | View on Reddit | 6 comments

ClimateBoss@reddit (OP)

How much of a difference does SXM over PCI-e make for fine-tuning? and what if it's PCI-e but has to go through the CPU?

How to do Batching in Llama.cpp ? Speed goes down LOL?

Posted by ClimateBoss@reddit | LocalLLaMA | View on Reddit | 8 comments

ClimateBoss@reddit (OP)

no, u have to use vllm. \--parallel 2 will reduce context by half but it's also not faster
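
To illustrate the "reduce context by half" point: a rough sketch, assuming the server splits one total context budget evenly across its parallel slots (how llama.cpp's `--parallel` has commonly behaved; newer builds may handle this differently). The function name `per_slot_context` and the 32768-token budget are made up for illustration.

```python
def per_slot_context(total_ctx: int, n_parallel: int) -> int:
    """Tokens of context each slot gets if the total is divided evenly.

    Assumption: an even split of one shared budget; not taken from
    any specific llama.cpp version.
    """
    if n_parallel < 1:
        raise ValueError("n_parallel must be >= 1")
    return total_ctx // n_parallel


if __name__ == "__main__":
    total = 32768  # hypothetical total context size
    for slots in (1, 2, 4):
        print(f"parallel={slots}: {per_slot_context(total, slots)} tokens per slot")
```

Under that assumption, two slots each see half the context, so the total context setting would have to be doubled to keep the same per-request window.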

Dual Tesla M40 12GiB Qwen 3.5 results (Ollama Ubuntu)

Posted by Ok-Internal9317@reddit | LocalLLaMA | View on Reddit | 3 comments

ClimateBoss@reddit

why are you using ollama crap, use ik\_llama.cpp

MLX vs GGUF (Unsloth) - Qwen3.5 122b-10b

Posted by waescher@reddit | LocalLLaMA | View on Reddit | 37 comments

ClimateBoss@reddit

so it's basically faster gguf? can i use that on llama.cpp?

Mac Studio 512GB RAM Option Disappears Amid Global DRAM Shortage

Posted by fairydreaming@reddit | LocalLLaMA | View on Reddit | 6 comments

ClimateBoss@reddit

who pays $8,000 for 512gb bolted on ram anyway? 8k EXTRA LMAO

Lads, time to recompile llama.cpp

Posted by muxxington@reddit | LocalLLaMA | View on Reddit | 56 comments

ClimateBoss@reddit

still waiting for tensor parallelism

MLX vs GGUF (Unsloth) - Qwen3.5 122b-10b

Posted by waescher@reddit | LocalLLaMA | View on Reddit | 37 comments

ClimateBoss@reddit

Is MLX only for Mac or can you do that on a Linux PC?

How do I figure out -b batch size to increase token speed?

Posted by ClimateBoss@reddit | LocalLLaMA | View on Reddit | 4 comments

ClimateBoss@reddit (OP)

failed to load model Qwen3.5-35b unsloth q8\_k\_xl.gguf, also tried qwen3 coder next. built from github main

Which model to use for coding: qwen3.5 or qwen2.5-coder?

Posted by Mashic@reddit | LocalLLaMA | View on Reddit | 25 comments

ClimateBoss@reddit

Qwen3 Coder Next 80b better than qwen3.5 35b and 122b IMO

[totally not an ad] combine 2x MCIO into 1x PCIe x16 adapter

Posted by MelodicRecognition7@reddit | LocalLLaMA | View on Reddit | 28 comments

ClimateBoss@reddit

is this only for AMD boards? i don't even see MCIO plugs on xeon v4

Axe - a precision agentic coder. large codebases. zero bloat. terminal-native. precise retrieval. powerful inference. open-sourced.

Posted by EmbarrassedAsk2887@reddit | LocalLLaMA | View on Reddit | 14 comments

ClimateBoss@reddit

i still dont get what this is

Current state of Qwen3.5-122B-A10B

Posted by kevin_1994@reddit | LocalLLaMA | View on Reddit | 37 comments

ClimateBoss@reddit

# mxfp4 = noctrex

ik_llama.cpp Reasoning not working with GLM Models

Posted by KulangetaPestControl@reddit | LocalLLaMA | View on Reddit | 12 comments

ClimateBoss@reddit

GLM 4.5 Air works

VibeHQ, Orchestrate multiple Claude Code / Codex / Gemini CLI agents collaborate like a real company team. 7 agents built a hospital system from one prompt.

Posted by GGwithRabbit@reddit | LocalLLaMA | View on Reddit | 9 comments

ClimateBoss@reddit

# how do i use this on llama.cpp ?

LongCat-Flash-Lite 68.5B maybe a relatively good choice for a pure instruct model within the 24GB GPU VRAM constraint.

Posted by Sad-Pickle4282@reddit | LocalLLaMA | View on Reddit | 10 comments

ClimateBoss@reddit

compare to Qwen3.5 for **coding** ?

Ubuntu or Debian? Speed difference on llama.cpp tokens?

Posted by ClimateBoss@reddit | LocalLLaMA | View on Reddit | 8 comments

ClimateBoss@reddit (OP)

okay cause Canonical uploads whatever you type in Ubuntu Desktop search bar

How to generate songs using CofmyUi rtx 5060ti 16gb Tutorial

Posted by Legion10008@reddit | LocalLLaMA | View on Reddit | 2 comments

ClimateBoss@reddit

how do i do it without web ui bruh?

Is microsoft going to train LLM on this? Github is clearly getting destroyed.

Posted by FPham@reddit | LocalLLaMA | View on Reddit | 106 comments

ClimateBoss@reddit

how do i get me some github stars bruh ?

Recommendations for a affordable prebuilt PC to run 120B LLM locally?

Posted by TechnologyLumpy5937@reddit | LocalLLaMA | View on Reddit | 20 comments

ClimateBoss@reddit

examples of what ?? unusable for coding

LightMem (ICLR 2026): Lightweight and Efficient Memory-Augmented Generation — 10×+ gains with 100× lower cost

Posted by zxlzr@reddit | LocalLLaMA | View on Reddit | 15 comments

ClimateBoss@reddit

How do i use this in llama.cpp?

Recommendations for a affordable prebuilt PC to run 120B LLM locally?

Posted by TechnologyLumpy5937@reddit | LocalLLaMA | View on Reddit | 20 comments

ClimateBoss@reddit

what are you supposed to do with 2 tk/s ?

Vellium v0.4 — alternative simplified UI, updated writing mode and multi-char improvements

Posted by Possible_Statement84@reddit | LocalLLaMA | View on Reddit | 20 comments

ClimateBoss@reddit

does it run on terminal or ui only ?

Completed my 64GB VRAM rig - dual MI50 build + custom shroud

Posted by roackim@reddit | LocalLLaMA | View on Reddit | 49 comments

ClimateBoss@reddit

wanna sell them ? hahaha!

Completed my 64GB VRAM rig - dual MI50 build + custom shroud

Posted by roackim@reddit | LocalLLaMA | View on Reddit | 49 comments

ClimateBoss@reddit

yeah dude dont u need to compile vLLM from scratch for that?

Completed my 64GB VRAM rig - dual MI50 build + custom shroud

Posted by roackim@reddit | LocalLLaMA | View on Reddit | 49 comments

ClimateBoss@reddit

only 155w? crazy what are the temps?

Completed my 64GB VRAM rig - dual MI50 build + custom shroud

Posted by roackim@reddit | LocalLLaMA | View on Reddit | 49 comments

ClimateBoss@reddit

how much were the MI50s and how is the tk/s on llama.cpp? have you tried ik\_llama.cpp?

Best Qwen3.5-35B-A3B GGUF for 24GB VRAM?!

Posted by VoidAlchemy@reddit | LocalLLaMA | View on Reddit | 83 comments

ClimateBoss@reddit

okay but what's the point of the chart? almost the same perplexity as q8\_0.gguf, but IRL 4 bit quants are way dumber, so why even bother with 4 bit?

Best Qwen3.5-35B-A3B GGUF for 24GB VRAM?!

Posted by VoidAlchemy@reddit | LocalLLaMA | View on Reddit | 83 comments

ClimateBoss@reddit

I don't believe you that 4-bit quants are somehow almost like Q8\_0 gguf.

What language large models can I run on a 5060 laptop with 32GB of RAM?

Posted by Smart-Cap-2216@reddit | LocalLLaMA | View on Reddit | 4 comments

ClimateBoss@reddit

ya, qwen3.5 35b in MXFP4 mode, that should fit 8gb vram and maybe 16gb ddr5. download LMStudio if you're new
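
A back-of-envelope check of that fit, as a sketch only. It assumes every weight is stored in MXFP4 at about 4.25 bits (4-bit values plus one shared 8-bit scale per block of 32), which real GGUF files do not do for all tensors, and it ignores KV cache and runtime overhead. The helper name `weights_gb` is made up for illustration.

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: MXFP4 = 4-bit elements + one 8-bit scale shared by 32 elements.
MXFP4_BITS = 4 + 8 / 32

size_gb = weights_gb(35e9, MXFP4_BITS)  # 35B parameters, as in the comment
budget_gb = 8 + 16                      # 8 GB VRAM + 16 GB system RAM

if __name__ == "__main__":
    print(f"~{size_gb:.1f} GB of weights vs {budget_gb} GB combined budget")
```

By this estimate the weights alone take most of the combined 24 GB, so the fit is tight once context and the OS are counted.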

4xP100 in NVlink how to get the most out of them?

Posted by Simple_Library_2700@reddit | LocalLLaMA | View on Reddit | 3 comments

ClimateBoss@reddit

how much did u get it for ? also looking at c4130 do u recommend?

What language large models can I run on a 5060 laptop with 32GB of RAM?

Posted by Smart-Cap-2216@reddit | LocalLLaMA | View on Reddit | 4 comments

ClimateBoss@reddit

qwen3 coder 30b, deepseek r1 8b

Qwen 3 coder next ud-q8-xl F16 filling up the two orin rpc mesh!

Posted by braydon125@reddit | LocalLLaMA | View on Reddit | 10 comments

ClimateBoss@reddit

is 1gb ethernet good enough or do you need more? is rpc-server compiled with llama-server or is it some other installation?

Qwen 3 coder next ud-q8-xl F16 filling up the two orin rpc mesh!

Posted by braydon125@reddit | LocalLLaMA | View on Reddit | 10 comments

ClimateBoss@reddit

whats the llama.cpp command for RPC on 2 computers?

I created yet another coding agent - Its tiny and fun (atleast for me), hope the community finds it useful

Posted by Weird_Search_4723@reddit | LocalLLaMA | View on Reddit | 42 comments

ClimateBoss@reddit

gpt oss 120b and qwen3 coder next

I created yet another coding agent - Its tiny and fun (atleast for me), hope the community finds it useful

Posted by Weird_Search_4723@reddit | LocalLLaMA | View on Reddit | 42 comments

ClimateBoss@reddit

u/Weird_Search_4723 it messes up on c++, has issues with } etc. can u check?

How to Prompt Caching with llama.cpp?

Posted by ClimateBoss@reddit | LocalLLaMA | View on Reddit | 13 comments

ClimateBoss@reddit (OP)

\--ctx-checkpoints 69 is not a real fix but sometimes reduces prompt processing. doesn't work on **ik\_llama.cpp**