KattleLaughter

Claude code source code has been leaked via a map file in their npm registry

Posted by Nunki08@reddit | LocalLLaMA | View on Reddit | 805 comments

KattleLaughter@reddit

Apparently Claude Code already uses axios so...

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

I looked into the llama.cpp source code: the min-token and max-token arguments translate to target minimum/maximum pixel counts for the input image. If the image's total pixel count is below the minimum, the image is scaled up; if it exceeds the maximum, it is scaled down.
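
A minimal sketch of the resizing rule described above, assuming an aspect-preserving rescale. This is not llama.cpp's actual code: the function name `fit_to_pixel_budget`, the `PIXELS_PER_TOKEN` constant (one image token per 32x32 pixel area), and the rounding choices are all assumptions made for illustration.

```python
import math

# Assumption: one image token covers a 32x32 pixel area (hypothetical value,
# not taken from the thread or from llama.cpp).
PIXELS_PER_TOKEN = 32 * 32


def fit_to_pixel_budget(width, height, min_tokens, max_tokens):
    """Return (width, height) rescaled so the pixel count fits the token budget.

    The aspect ratio is preserved. Images already inside the budget are
    returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    min_pixels = min_tokens * PIXELS_PER_TOKEN
    max_pixels = max_tokens * PIXELS_PER_TOKEN
    area = width * height

    if area < min_pixels:
        # Too small: scale up until the area reaches the minimum.
        scale = math.sqrt(min_pixels / area)
        return math.ceil(width * scale), math.ceil(height * scale)

    if area > max_pixels:
        # Too large: scale down until the area drops to the maximum.
        scale = math.sqrt(max_pixels / area)
        return max(1, math.floor(width * scale)), max(1, math.floor(height * scale))

    return width, height
```

Under these assumptions, `fit_to_pixel_budget(256, 256, 1024, 2048)` scales a small image up to `(1024, 1024)`, while `fit_to_pixel_budget(4096, 2048, 1024, 2048)` scales a large one down to `(2048, 1024)`.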

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

Thank you. I also found a Qwen team member saying the same thing in a llama.cpp GitHub issue. Apparently 1024-2048 image tokens is the sweet spot for OCR tasks. https://preview.redd.it/t5tiq7efch0g1.png?width=1856&format=png&auto=webp&s=550a941c2e8cbda4a86e24b6a61288a6d5ca98d5

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

https://preview.redd.it/5hsgotrm1h0g1.jpeg?width=3072&format=pjpg&auto=webp&s=86fa375e3d08689c58004c756c312c5dee8d5459 I have used your output to draw on the image. The bbox looks slightly off, but it is still impressive. The bigger issue is the "there are none" problem. I have tested vLLM and llama.cpp, with full precision (vLLM), Unsloth BF16 (llama.cpp) and UD-Q4_K_XL (llama.cpp). All of them have this issue, so it should be unrelated to quantization. Does anyone have insight into why this might happen?
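
When drawing model-reported boxes on the original image, the coordinates may first need to be mapped back to pixel space. The sketch below assumes the model reports boxes on a normalized 0-1000 grid. That convention is an assumption and is not confirmed anywhere in this thread, and `to_pixel_bbox` is a hypothetical helper name.

```python
def to_pixel_bbox(bbox, image_width, image_height, grid=1000):
    """Map (x1, y1, x2, y2) from a grid x grid normalized space to pixels.

    Assumption: the model output uses a square normalized grid (default
    1000). If the model already returns absolute pixel coordinates, this
    conversion must be skipped.
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * image_width / grid),
        round(y1 * image_height / grid),
        round(x2 * image_width / grid),
        round(y2 * image_height / grid),
    )
```

If boxes drawn this way are still offset, it is worth checking whether the coordinates refer to the resized image the model actually saw rather than to the original file.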

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

I have been using the Hugging Face unquantized 8B with nightly vLLM (e5e9067e61600eedd4e75bd1c512ec52872916aa). It keeps telling me "There is none" with the same prompt. In fact, Qwen3-VL coordinate output under vLLM has been very spotty for me. Did they fix something with the GGUF or llama.cpp?

Qwen's VLM is strong!

Posted by dulldata@reddit | LocalLLaMA | View on Reddit | 33 comments

KattleLaughter@reddit

How many times do we need to tell them "Don't use publicly available data for benchmarks"?

Got the DGX Spark - ask me anything

Posted by sotech117@reddit | LocalLLaMA | View on Reddit | 626 comments

KattleLaughter@reddit

Qwen 3 32B at Q8 decoding at 4 tps is just horrendous lol

Here we go again

Posted by Namra_7@reddit | LocalLLaMA | View on Reddit | 81 comments

KattleLaughter@reddit

Taking 2 months (nearly full time) for a third party to hack in a novel architecture is going to hurt llama.cpp a lot, which is sad because I love llama.cpp.

Qwen3-VL-30B-A3B-Instruct & Thinking (Now Hidden)

Posted by TKGaming_11@reddit | LocalLLaMA | View on Reddit | 51 comments

KattleLaughter@reddit

I think that for word-for-word OCR tasks, being too verbose tends to degrade accuracy: "thinking too much" actually prevents the model from giving a straight answer to what could otherwise be an intuitive task. But for tasks like parsing tables, which require more involved spatial and logical understanding, thinking mode tends to do better.

Oh my God, what a monster is this?

Posted by NearbyBig3383@reddit | LocalLLaMA | View on Reddit | 148 comments

KattleLaughter@reddit

regression test

Hilarious chart from GPT-5 Reveal

Posted by lyceras@reddit | LocalLLaMA | View on Reddit | 241 comments

KattleLaughter@reddit

Validation of what? I wanted to see how an LLM would read/react to something like this (jokingly, mind you). Honestly, I don't care either way. You are making a lot of assumptions here.

Hilarious chart from GPT-5 Reveal

Posted by lyceras@reddit | LocalLLaMA | View on Reddit | 241 comments

KattleLaughter@reddit

https://preview.redd.it/qn8ayn161nhf1.png?width=1344&format=png&auto=webp&s=fd616e5a37a80b97e3aacc24cd54af0007ecfc9f

Hilarious chart from GPT-5 Reveal

Posted by lyceras@reddit | LocalLLaMA | View on Reddit | 241 comments

KattleLaughter@reddit

I will just leave this here https://preview.redd.it/q4eu0ga21nhf1.png?width=1620&format=png&auto=webp&s=021bfe6f7f85a8e4a3ffb8e301b3fb6182543f0b

Hilarious chart from GPT-5 Reveal

Posted by lyceras@reddit | LocalLLaMA | View on Reddit | 241 comments

KattleLaughter@reddit

lmao, somebody please tell me it was a typo

OpenAI, I don't feel SAFE ENOUGH

Posted by Final_Wheel_7486@reddit | LocalLLaMA | View on Reddit | 184 comments

KattleLaughter@reddit

But I felt SAFE from the harm of the truth.

Heads up if you're using Gemma 3 vision

Posted by Admirable-Star7088@reddit | LocalLLaMA | View on Reddit | 40 comments

KattleLaughter@reddit

Heads up: the LM Studio default Q4 performs notably worse than Q8. Do you happen to use the unquantized version when running it directly, and Q4 with LM Studio?

Openweb UI, LM Studio or which interface is your favorite .... and why? (Apple users)

Posted by EmergencyLetter135@reddit | LocalLLaMA | View on Reddit | 36 comments

KattleLaughter@reddit

I have Open WebUI connected to OpenRouter for trying out bigger models, and also to LM Studio for local models. All my conversations are in the same place. Creating accounts for friends and sharing links to conversations about some magic prompt with them is great too.

Zonos, the easy to use, 1.6B, open weight, text-to-speech model that creates new speech or clones voices from 10 second clips

Posted by SoundHole@reddit | LocalLLaMA | View on Reddit | 130 comments

KattleLaughter@reddit

If you are using Docker Desktop on Windows with WSL enabled, remember to disable host network mode in Docker Compose and map the port instead. Host network mode does not work with WSL.

Trump announces a $500 billion AI infrastructure investment in the US

Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 371 comments

KattleLaughter@reddit

I don't know. AI is like the steam engine or the diesel engine in that it is a piece of technology that is going to replace a lot of manual labour, and no one is able to stop its adoption. It is also very probable that, unlike the steam/diesel engine, it is going to replace more jobs than it creates, unfortunately. But the alternative is not investing in the infra in the US, and the same thing still happens, just shifted overseas where you have even less control over it. I can't think of a way more palatable than how Trump put it. Then of course you could argue we should not speed up the process by actively investing in it.

Deepseek is overthinking

Posted by Mr_Jericho@reddit | LocalLLaMA | View on Reddit | 209 comments

KattleLaughter@reddit

You mean large-parameter models are autistic!?

Visual Basic 6 rebuilt in C# - complete with form designer and IDE, runs directly in browser (WASM)

Posted by AvaloniaUI-Mike@reddit | programming | View on Reddit | 124 comments

KattleLaughter@reddit

Yeah, brings back so many memories from middle school

Don't Overplan, Do Prototype

Posted by yektadev@reddit | programming | View on Reddit | 87 comments

KattleLaughter@reddit

I mean, it is literally the entire reason why agile is preferred over waterfall. Do people have no idea why the industry adopted agile in the first place? Waterfall is the ideal way of planning in a perfect world, but agile is a realistic way of actually getting things done. You could look at SpaceX vs NASA if you want more examples. Don't get me wrong, NASA was doing amazing work given the political constraints. They could not afford to have rockets exploding on launch pads, and everything needed to go as planned. SpaceX's Starship has more prototype explosions in a year than any other space agency. Ultimately, as a project manager, if your stakeholders understand why a rocket exploding on a launch pad is a good thing in the long run, you are doing one hella good job.

Farewell MongoDB: 5 reasons why you only need PostgreSQL

Posted by PauseDistinct2044@reddit | programming | View on Reddit | 107 comments

KattleLaughter@reddit

I always go for a flat-head instead of a Phillips screwdriver because everyone knows flat-head is better.

New X.Com Payment Feature page spotted

Posted by yuvalsteuer@reddit | programming | View on Reddit | 44 comments

KattleLaughter@reddit

I have zero idea why Elon Musk would think having everything in the same app, like WeChat, would be a good idea to begin with. The Western market, unlike the Chinese one, is not a walled garden. There is always a better option (one that is not trying to do everything at once) due to the competition. And even if he were to be remotely successful, Apple / Google could just push out better products with half the effort due to the inherent advantages of being the platform owner. It just makes no business sense at all.