Ok_Warning2146
-
Trump signs narrower executive order on AI oversight after industry objections
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 46 comments
-
China Expands Travel Curbs to Top AI Talent at Private Firms
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 53 comments
-
unsloth vs bartowski MTP ggufs
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Chinese CXMT unveils DDR5-8000 RAM
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 137 comments
-
How do I make MTP work in llama-server?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 28 comments
-
Simple table to compare 3090, 4090 and 5090
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 45 comments
-
$300k DGX B300 is actually a better deal than buying 24 RTX 6000s
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 37 comments
-
The exact KV cache usage of DeepSeek V4
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 57 comments
-
Running llama.cpp on Snapdragon Hexagon NPU seems promising
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Japan's Rakuten is going to release a 700B open weight model in Spring 2026
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 48 comments
-
How much will you pay for a PCIe Nvidia B100, B150?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 14 comments
-
Top 10 open weight models in lmarena
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 21 comments
-
The current state of the Chinese LLMs scene
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 107 comments
-
How to convert my fine tuning from adamw to muon in pytorch?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Nvidia B100 is essentially H100 w/ HBM3E + Key Perf metrics of B200/B300
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Kimi Linear 30% gain in pp and higher context merged to llama.cpp
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 13 comments
-
Gemini Pro 3.1 preview is catching up to Opus 4.6 slightly in coding
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 4 comments
-
How to get the most from llama.cpp's iSWA support
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 19 comments
-
Top 10 non-Chinese models at lmarena.
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 6 comments
-
llama.cpp llama-server running SSM models VRAM fix merged
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 1 comment
-
llama.cpp Kimi Linear llama-server bug fix
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 0 comments
-
Kimi-Linear support is merged to llama.cpp
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 22 comments
-
What happened to Zhuiyi Tech (the inventor of RoPE)?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 8 comments
-
Anyone running llm on their 16GB android phone?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 47 comments
-
llama.cpp MLA KV cache support for KimiLinear-48B-A3B
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 38 comments
-
R200 and RTX 6000 Rubin speculation
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 16 comments
-
Backend agnostic llama.cpp support for Kimi-Linear-48B-A3B
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Ten former Samsung employees arrested for tech leak to CXMT
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
-
The mystery of Apple M3 Ultra GPU performance
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 31 comments
-
Does llm software debugging heavily depend on long context performance?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 6 comments
-
In depth analysis of Nvidia's Jet Nemotron models
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 2 comments
-
Figured out why my 3090 is so slow in inference
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 14 comments
-
How come my 3090 is just as fast as my 3050 for Qwen3-1.7B?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 20 comments
-
Energy efficiency of 5090 is slightly worse than 4090
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 51 comments
-
What is the point of Nvidia's Jet-Nemotron-2B?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 13 comments
-
What happens to GGUF converted from LLM that requires trust_remote_code=True?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
-
How to make PocketPal inference faster on android?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 4 comments
-
Qwen3-30B-A3B 2507 Instruct vs Thinking
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
-
RTX PRO 5000 Laptop 24GB GDDR7 10496 cores 175W
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 43 comments
-
pytorch 2.7.x no longer supports Pascal architecture?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 18 comments
-
M5 Ultra can do well for LLM, video gen and training
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 16 comments
-
What is the best way to fine tune an abliterated model?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 1 comment
-
Increase in context size causes run time to explode?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 11 comments
-
Intel Granite Rapids CPU on sale at Newegg up to 65% off MSRP
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 24 comments
-
What does gradient_checkpointing do when using HF transformer for full training?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 3 comments
-
Intel 6944P the most cost effective CPU solution for llm
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 69 comments
-
What will happen to an llm when you double the RoPE scaling factor?
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 7 comments
-
Kimi-K2 is a DeepSeek V3 with more experts
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 38 comments
-
Architecture Review of the new MoE models
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 27 comments
-
M3 Ultra is a slightly weakened 3090 w/ 512GB
Posted by Ok_Warning2146@reddit | LocalLLaMA | View on Reddit | 272 comments