KattleLaughter

Claude code source code has been leaked via a map file in their npm registry

Posted by Nunki08@reddit | LocalLLaMA | View on Reddit | 805 comments

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

I looked into the llama.cpp source code: the min-token and max-token arguments translate to target minimum/maximum pixel counts for the input image. If the image's total pixel count is below the minimum, it is scaled up; if it exceeds the maximum, it is scaled down.
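A minimal sketch of that scaling rule, not llama.cpp's actual code. The function name `target_size`, the 16-pixel patch size, and the 2x2 patch merge (so one image token covers a 32x32 pixel area) are assumptions for illustration; check the model's own config for the real values.

```python
import math

def target_size(width, height, min_tokens, max_tokens,
                patch_size=16, merge_size=2):
    """Return (new_width, new_height) that fits the image-token budget.

    Hypothetical helper: patch_size and merge_size are assumed values,
    not read from any real model config.
    """
    side = patch_size * merge_size        # pixels per token along one axis (assumed)
    min_pixels = min_tokens * side * side
    max_pixels = max_tokens * side * side
    pixels = width * height

    if pixels < min_pixels:
        # Too small: scale up, rounding up so we stay above the minimum.
        scale, snap = math.sqrt(min_pixels / pixels), math.ceil
    elif pixels > max_pixels:
        # Too large: scale down, rounding down so we stay below the maximum.
        scale, snap = math.sqrt(max_pixels / pixels), math.floor
    else:
        # Already inside the budget: only align to the token grid.
        scale, snap = 1.0, round

    new_w = max(side, snap(width * scale / side) * side)
    new_h = max(side, snap(height * scale / side) * side)
    return new_w, new_h

if __name__ == "__main__":
    # A 4096x3072 photo with a 1024-2048 token budget gets scaled down.
    print(target_size(4096, 3072, 1024, 2048))
```

Scaling both sides by the square root of the pixel ratio keeps the aspect ratio roughly intact, and snapping to a multiple of `side` means the result maps to a whole number of tokens.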

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

Thank you. I also found a Qwen team member saying the same thing in a llama.cpp GitHub issue. Apparently 1024-2048 image tokens is the sweet spot for OCR tasks. https://preview.redd.it/t5tiq7efch0g1.png?width=1856&format=png&auto=webp&s=550a941c2e8cbda4a86e24b6a61288a6d5ca98d5

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

https://preview.redd.it/5hsgotrm1h0g1.jpeg?width=3072&format=pjpg&auto=webp&s=86fa375e3d08689c58004c756c312c5dee8d5459 I used your output to draw the boxes on the image. The bbox looks slightly off, but it is still impressive. The bigger issue is the "there are none" problem. I have tested both vLLM and llama.cpp: full precision (vLLM), unsloth BF16 (llama.cpp), and UD-Q4\_K\_XL (llama.cpp). All of them have this issue, so it should be unrelated to quantization. Does anyone have insight into why this might happen?
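For anyone drawing model output on an image the same way, here is a rough sketch of the coordinate mapping. It assumes the model returns boxes as `[x1, y1, x2, y2]` on a 0-1000 normalized grid; that convention, the function name `to_pixel_bbox`, and the example image height are assumptions, so verify them against the model's documentation. Using the wrong convention (normalized vs. absolute pixels, or the resized vs. the original image size) is one possible reason a box ends up slightly off.

```python
def to_pixel_bbox(bbox, image_width, image_height, scale=1000):
    """Map a normalized [x1, y1, x2, y2] box to pixel coordinates.

    Hypothetical helper: assumes coordinates are given on a 0..scale grid
    relative to the full image, which may not match every model or runtime.
    """
    x1, y1, x2, y2 = bbox
    return (
        round(x1 * image_width / scale),
        round(y1 * image_height / scale),
        round(x2 * image_width / scale),
        round(y2 * image_height / scale),
    )

if __name__ == "__main__":
    # Example: a box on a 3072x2048 image (the height here is made up).
    print(to_pixel_bbox((250, 500, 750, 1000), 3072, 2048))
```

The resulting tuple can be passed to any drawing routine that takes pixel corners, using the dimensions of the image you are actually drawing on.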

Qwen3-VL's perceptiveness is incredible.

Posted by Trypocopris@reddit | LocalLLaMA | View on Reddit | 102 comments

KattleLaughter@reddit

I have been using the unquantized 8B from Hugging Face with nightly vLLM (e5e9067e61600eedd4e75bd1c512ec52872916aa). It keeps telling me "There is none" with the same prompt. In fact, Qwen3-VL coordinate output on vLLM has been very spotty for me. Did they fix something in the GGUF or in llama.cpp?

Qwen's VLM is strong!

Posted by dulldata@reddit | LocalLLaMA | View on Reddit | 33 comments

Got the DGX Spark - ask me anything

Posted by sotech117@reddit | LocalLLaMA | View on Reddit | 626 comments

Here we go again

Posted by Namra_7@reddit | LocalLLaMA | View on Reddit | 81 comments

Qwen3-VL-30B-A3B-Instruct & Thinking (Now Hidden)

Posted by TKGaming_11@reddit | LocalLLaMA | View on Reddit | 51 comments

KattleLaughter@reddit

I think for word-for-word OCR tasks, thinking mode's verbosity tends to degrade accuracy: the model "thinks too much," which keeps it from giving a straight answer to what would otherwise be an intuitive task. But for tasks like parsing tables, which require more involved spatial and logical understanding, thinking mode tends to do better.

Oh my God, what a monster is this?

Posted by NearbyBig3383@reddit | LocalLLaMA | View on Reddit | 148 comments

Hilarious chart from GPT-5 Reveal

Posted by lyceras@reddit | LocalLLaMA | View on Reddit | 241 comments

KattleLaughter@reddit

Validation of what? I wanted to see how an LLM would read/react to something like this (jokingly, mind you). Honestly, I don’t care either way. You are making a lot of assumptions here.

OpenAI, I don't feel SAFE ENOUGH

Posted by Final_Wheel_7486@reddit | LocalLLaMA | View on Reddit | 184 comments

Heads up if you're using Gemma 3 vision

Posted by Admirable-Star7088@reddit | LocalLLaMA | View on Reddit | 40 comments

KattleLaughter@reddit

Heads up: the LM Studio default Q4 performs notably worse than Q8. Were you perhaps using the unquantized version when running the model directly and Q4 with LM Studio?

Openweb UI, LM Studio or which interface is your favorite .... and why? (Apple users)

Posted by EmergencyLetter135@reddit | LocalLLaMA | View on Reddit | 36 comments

KattleLaughter@reddit

I have Open WebUI connected to OpenRouter for trying out bigger models, and to LM Studio for local models. All my conversations are in the same place. Creating accounts for friends and sharing links to conversations about some magic prompt with them is great too.

Zonos, the easy to use, 1.6B, open weight, text-to-speech model that creates new speech or clones voices from 10 second clips

Posted by SoundHole@reddit | LocalLLaMA | View on Reddit | 130 comments

KattleLaughter@reddit

If you are using Docker Desktop on Windows with WSL enabled, remember to disable host network mode in Docker Compose and map the port instead. Host network mode does not work with WSL.

Trump announces a $500 billion AI infrastructure investment in the US

Posted by fallingdowndizzyvr@reddit | LocalLLaMA | View on Reddit | 371 comments

KattleLaughter@reddit

I don't know. AI is like the steam engine or the diesel engine in that it is a piece of technology that is going to replace a lot of manual labour, and no one is able to stop its adoption. It is also very probable that, unlike the steam or diesel engine, it is going to replace more jobs than it creates, unfortunately. But the alternative is not investing in the infrastructure in the US, in which case the same thing still happens, only shifted overseas, and you have even less control over it. I can't think of a more palatable way to put it than how Trump did. Then of course you could argue that we should not speed up the process by actively investing in it.

Deepseek is overthinking

Posted by Mr_Jericho@reddit | LocalLLaMA | View on Reddit | 209 comments

Visual Basic 6 rebuilt in C# - complete with form designer and IDE, runs directly in browser (WASM)

Posted by AvaloniaUI-Mike@reddit | programming | View on Reddit | 124 comments

Don't Overplan, Do Prototype

Posted by yektadev@reddit | programming | View on Reddit | 87 comments

KattleLaughter@reddit

I mean, it is literally the entire reason why agile is preferred over waterfall. Do people have no idea why the industry adopted agile in the first place? Waterfall is the ideal way of planning in a perfect world, but agile is a realistic way of actually getting things done. You could look at SpaceX vs. NASA if you want more examples. Don't get me wrong, NASA has done amazing work given the political constraints. They could not afford to have rockets exploding on launch pads, and everything needed to go as planned. SpaceX's Starship has had more prototype explosions in a year than any other space agency. Ultimately, as a project manager, if your stakeholders understand why a rocket exploding on a launch pad is a good thing in the long run, you are doing one hella good job.

Farewell MongoDB: 5 reasons why you only need PostgreSQL

Posted by PauseDistinct2044@reddit | programming | View on Reddit | 107 comments

New X.Com Payment Feature page spotted

Posted by yuvalsteuer@reddit | programming | View on Reddit | 44 comments

KattleLaughter@reddit

I have zero idea why Elon Musk would think having everything in the same app, like WeChat, would be a good idea to begin with. The Western market, unlike the Chinese one, is not a walled garden. Because of the competition, there is always a better option (one that is not trying to do everything at once). And even if he were to be remotely successful, Apple / Google could just push out better products with half the effort thanks to the inherent advantages of being the platform owner. It just makes no business sense at all.