jacek2023

it's a valid question but you should ask it on X 😉 [https://x.com/osanseviero/status/2062205174785921438](https://x.com/osanseviero/status/2062205174785921438)

google/gemma-4-12B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 171 comments

jacek2023@reddit (OP)

https://preview.redd.it/a6o74t43d35h1.png?width=1197&format=png&auto=webp&s=4083c13d60da34255094761ca134a50d26097022

ggml-org/gemma-4-12b-it-GGUF · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 13 comments

jacek2023@reddit (OP)

[https://www.reddit.com/r/LocalLLaMA/comments/1tvtn6m/googlegemma412b\_hugging\_face/](https://www.reddit.com/r/LocalLLaMA/comments/1tvtn6m/googlegemma412b_hugging_face/)

google/gemma-4-12B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 171 comments

jacek2023@reddit (OP)

https://preview.redd.it/8tsvau0hb35h1.png?width=1163&format=png&auto=webp&s=231a022a3a8e2dbbdf6d9ee6ff4214421f2ffd7f

google/gemma-4-12B · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 171 comments

jacek2023@reddit (OP)

https://preview.redd.it/nqasdrqeb35h1.png?width=1217&format=png&auto=webp&s=144767b7483ed8a34f89311baea3f01497b713d8

ggml-org/gemma-4-12b-it-GGUF · Hugging Face

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 13 comments

jacek2023@reddit (OP)

https://preview.redd.it/40ds11on935h1.png?width=740&format=png&auto=webp&s=9fd2e0e78c6b50e055077d0fdb9b9e8be19ed52c 15-12=3

Qwen 3.7 Plus just briefly appeared and then disappeared on OpenRouter.

Posted by ihatebeinganonymous@reddit | LocalLLaMA | View on Reddit | 28 comments

How does the new abliteration tool Apostate compare with others? - Abliterlitics

Posted by nathandreamfast@reddit | LocalLLaMA | View on Reddit | 12 comments

jacek2023@reddit

Your HF link is 404. Upload some ready-to-use models.

Calling it now Microsoft is buying Unsloth.

Posted by Wrong_Mushroom_7350@reddit | LocalLLaMA | View on Reddit | 290 comments

jacek2023@reddit

Is there a model you use daily? What's your use case?

Macbook M5 Pro 24GB or 48GB

Posted by Resident_Bell_4457@reddit | LocalLLaMA | View on Reddit | 66 comments

jacek2023@reddit

I'm afraid 24GB is very little memory for LLMs.

New Microsoft models are not open, right?

Posted by ihatebeinganonymous@reddit | LocalLLaMA | View on Reddit | 7 comments

jacek2023@reddit

They wrote "locally" https://preview.redd.it/rvwzohoj415h1.png?width=605&format=png&auto=webp&s=9293f7f04f846eb88d442cf52636dc30fcac30a9

Calling it now Microsoft is buying Unsloth.

Posted by Wrong_Mushroom_7350@reddit | LocalLLaMA | View on Reddit | 290 comments

jacek2023@reddit

Unsloth runs the llama.cpp converter to turn safetensors files into GGUF files. You can do it too.
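
For anyone who wants to try the conversion themselves, here is a minimal sketch. It assumes a local llama.cpp checkout that contains the `convert_hf_to_gguf.py` script; the helper names `build_convert_command` and `run_conversion` and all paths are made up for illustration, and this is not a description of what Unsloth actually runs.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(llama_cpp_dir, model_dir, out_file, out_type="f16"):
    """Build the argv list for llama.cpp's convert_hf_to_gguf.py script.

    llama_cpp_dir: path to a llama.cpp checkout (assumed to exist locally)
    model_dir:     directory holding the Hugging Face safetensors model
    out_file:      where the GGUF file should be written
    out_type:      output precision passed to --outtype (f16 by default here)
    """
    script = Path(llama_cpp_dir) / "convert_hf_to_gguf.py"
    return [
        sys.executable, str(script), str(model_dir),
        "--outfile", str(out_file),
        "--outtype", out_type,
    ]


def run_conversion(cmd):
    """Run the conversion and raise if the script exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; replace them with your own before running.
    cmd = build_convert_command("llama.cpp", "models/my-model", "my-model-f16.gguf")
    print(" ".join(cmd))
    # run_conversion(cmd)  # uncomment once the paths point at real files
```

The resulting f16 file can then be quantized to a smaller format with llama.cpp's `llama-quantize` tool.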

Would you consider getting an NVIDIA RTX Spark laptop?

Posted by gamblingapocalypse@reddit | LocalLLaMA | View on Reddit | 157 comments

jacek2023@reddit

Yes if Linux and llama.cpp :) No if Windows only. No if closed-source software only.

StepFun 3.5 MTP by pwilkin · Pull Request #23274 · ggml-org/llama.cpp

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 21 comments

jacek2023@reddit (OP)

I hope it will work with both 3.5 and 3.7, because I preferred 3.5 in my local tests.

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

I am able to produce a short video at the lowest resolution (about 100 frames) in about a minute on a 5070 and a 3090. Is it slower than that?

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

Do you have any benchmarks for Wan or LTX?

I know… I know… But how to replace ChatGPT locally?

Posted by Thin_Pollution8843@reddit | LocalLLaMA | View on Reddit | 11 comments

Ignoring benchmarks, how do the newest local models (gemma 4 31B, 26BA4B, Qwen 3.6) “feel” to you? What do you think they compare to?

Posted by opoot_@reddit | LocalLLaMA | View on Reddit | 42 comments

jacek2023@reddit

Qwen 27B is good for agentic coding, Gemma 31B is good for creative tasks, but I also use 120B models.

Dual rtx 3090 build

Posted by Sufficient_Phone_242@reddit | LocalLLaMA | View on Reddit | 68 comments

jacek2023@reddit

Without an open frame there will always be some noise.

Is agenting usage increasing CPU usage for you?

Posted by superloser48@reddit | LocalLLaMA | View on Reddit | 10 comments

jacek2023@reddit

Who is "everyone"? Could you share some links?

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

You commented on my post, I shared someone’s benchmarks for people considering B70s. I replied to your comment saying that buying B70s might be easier than buying four 3090s. I already have three 3090s.

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

Why do you think I am considering four B70s?

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

I use -sm tensor with my 3 cards

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

I’ve been trying to buy a fourth 3090 for a long time, but prices are rising and availability is very low. At this point, I think buying four B70s would be easier than finding 3090s.

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

People on the Internet always ask "is it worth buying..." and I still don't understand what they expect.

Intel Arc Pro B70 llama.cpp benchmarks posted

Posted by jacek2023@reddit | LocalLLaMA | View on Reddit | 48 comments

jacek2023@reddit (OP)

I have no idea, but I see SYCL pull requests in llama.cpp, so I assume the backend is still being improved. These benchmarks at least establish a baseline. The GPU works and it’s much more affordable than a 5090 (to run big models you need VRAM first, and speed is often less crucial).
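
To illustrate the "VRAM first" point, here is a rough back-of-the-envelope sketch. It only estimates memory for the weights (parameters times bits per weight) plus a flat overhead; the function names and the 2 GiB overhead default are assumptions for illustration, and real usage also depends on context length, KV cache, and backend.

```python
def weight_memory_gib(params_billion, bits_per_weight):
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(params_billion, bits_per_weight, vram_gib, overhead_gib=2.0):
    """Rough check: weights plus an assumed flat overhead versus total VRAM.

    overhead_gib is a placeholder for KV cache and buffers, not a measured value.
    """
    needed = weight_memory_gib(params_billion, bits_per_weight) + overhead_gib
    return needed <= vram_gib


if __name__ == "__main__":
    # Hypothetical example: a 120B model at about 4.5 bits per weight.
    print(round(weight_memory_gib(120, 4.5), 1), "GiB for weights alone")
    print("Fits in 24 GiB:", fits_in_vram(120, 4.5, 24))
    print("Fits in 96 GiB:", fits_in_vram(120, 4.5, 96))
```

If the model does not fit, a faster card does not help much, which is why total VRAM across cards usually matters before raw speed.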

I hate to be this guy but: Any good, recent CODING models in the 70-80B range?

Posted by ParaboloidalCrest@reddit | LocalLLaMA | View on Reddit | 112 comments

jacek2023@reddit

You are allowed to use local models forever; it's not the cloud.

I hate to be this guy but: Any good, recent CODING models in the 70-80B range?

Posted by ParaboloidalCrest@reddit | LocalLLaMA | View on Reddit | 112 comments

jacek2023@reddit

I have multiple finetunes of GLM Air but I don't use it for coding. Currently I use Qwen 3.6 27B with pi and I have zero problems with it - actually Claude Code (with Opus) started to annoy me with its speed.