Ok_Warning2146

Trump signs narrower executive order on AI oversight after industry objections

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 46 comments
China Expands Travel Curbs to Top AI Talent at Private Firms

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 53 comments
unsloth vs bartowski MTP ggufs

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 13 comments
Chinese CXMT unveils DDR5-8000 RAM

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 137 comments
How do I make MTP work in llama-server?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 28 comments
Simple table to compare 3090, 4090 and 5090

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 45 comments
$300k DGX B300 is actually a better deal than buying 24 RTX 6000s

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 37 comments
The exact KV cache usage of DeepSeek V4

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 57 comments
Running llama.cpp on Snapdragon Hexagon NPU seems promising

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 6 comments
Japan's Rakuten is going to release a 700B open weight model in Spring 2026

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 48 comments
How much will you pay for a PCIe Nvidia B100, B150?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 14 comments
Top 10 open weight models in lmarena

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 21 comments
The current state of the Chinese LLMs scene

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 107 comments
How to convert my fine tuning from adamw to muon in pytorch?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 1 comment
Nvidia B100 is essentially H100 w/ HBM3E + Key Perf metrics of B200/B300

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 2 comments
Kimi Linear 30% gain in pp and higher context merged to llama.cpp

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 13 comments
Gemini Pro 3.1 preview is catching up to Opus 4.6 slightly in coding

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 4 comments
How to get the most from llama.cpp's iSWA support

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 19 comments
Top 10 non-Chinese models at lmarena.

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 6 comments
llama.cpp llama-server running SSM models VRAM fix merged

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 1 comment
llama.cpp Kimi Linear llama-server bug fix

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 0 comments
Kimi-Linear support is merged to llama.cpp

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 22 comments
What happened to Zhuiyi Tech (the inventor of RoPE)?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 8 comments
Anyone running llm on their 16GB android phone?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 47 comments
llama.cpp MLA KV cache support for KimiLinear-48B-A3B

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 38 comments
R200 and RTX 6000 Rubin speculation

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 16 comments
Backend agnostic llama.cpp support for Kimi-Linear-48B-A3B

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
Ten former Samsung employees arrested for tech leak to CXMT

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
The mystery of Apple M3 Ultra GPU performance

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 31 comments
Does llm software debugging heavily depends on long context performance?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 6 comments
In depth analysis of Nvidia's Jet Nemotron models

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 2 comments
Figured out why my 3090 is so slow in inference

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 14 comments
How come my 3090 is just as fast as my 3050 for Qwen3-1.7B?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 20 comments
Energy efficiency of 5090 is slightly worse than 4090

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 51 comments
What is the point of Nvidia's Jet-Nemotron-2B?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 13 comments
What happens to GGUF converted from LLM that requires trust_remote_code=True?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
How to make PocketPal inference faster on android?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 4 comments
Qwen3-30B-A3B 2507 Instruct vs Thinking

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
RTX PRO 5000 Laptop 24GB GDDR7 10496 cores 175W

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 43 comments
pytorch 2.7.x no longer supports Pascal architecture?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 18 comments
M5 Ultra can do well for LLM, video gen and training

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 16 comments
What is the best way to fine tune an abliterated model?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 1 comment
Increase in context size causes run time to explode?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 11 comments
Intel Granite Rapids CPU on sale at Newegg up to 65% off MSRP

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 24 comments
What does gradient_checkpointing do when using HF transformer for full training?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 3 comments
Intel 6944P the most cost effective CPU solution for llm

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 69 comments
What will happen to an llm when you double the RoPE scaling factor?

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
Kimi-K2 is a DeepSeek V3 with more experts

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 38 comments
Architecture Review of the new MoE models

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 27 comments
M3 Ultra is a slightly weakened 3090 w/ 512GB

Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 272 comments