kryptkpr

ehartford/dolphin-2.5-mixtral-8x7b has a very persuasive system prompt

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 6 comments
I Generated 1 Billion Tokens (So You Don't Have To): Introducing ReasonScape

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 27 comments
Guide to serving Ring-mini-2.0 with VLLM (and a quick eval)

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 3 comments
Anyone test two DGX Sparks linked via their ConnectX yet?

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 9 comments
ReasonScape Evaluation: AI21 Jamba Reasoning vs Qwen3 4B vs Qwen3 4B 2507

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 30 comments
A local llama in her native habitat

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 149 comments
The Titan 18U AI Homelab Build Log and Lessons Learned

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 30 comments
It's Mamba time: Comparing Nemotron Nano v2 vs Falcon-H1 vs Qwen (og) vs Qwen (2507)

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 40 comments
PSA: GPU Host Interface board power cables can melt, too.

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 0 comments
Ruminate: From All-or-Nothing to Just-Right Reasoning in LLMs

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 8 comments
3060 [x16 PCIe riser] vs 3060 [x1 USB extension]: A quantitative comparison of eGPU prompt and text generation performance across multiple inference engines

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 74 comments
Petals: decentralized inference and finetuning of LLMs

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 2 comments
toe2toe: If LLMs could play Tic Tac Toe, would Llama or NeMo win?

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 2 comments
Jank can be beautiful | 2x3060+2xP100 open-air LLM rig with 2-stage cooling

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 39 comments
AlteredWorlds: History re-imagined by command_r_plus_08_2024, illustrated by flux.1-schnell

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 14 comments
[Model] Meta Llama 3.1 Know Issues & FAQ · Issue #6689 · vllm-project/vllm

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 0 comments
Introducing Tcurtsni: The Reverse-Instruct LLM Chat App

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 32 comments
llama.ttf - a font which is also an LLM

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 9 comments
Presenting "The Muse" - a logit sampler that makes LLMs more creative

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 4 comments
The LLooM - a highly experimental (local) AI workflow to visualize and "weave" stories out of underlying logit probabilities

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 60 comments
The correct answer to all A100/A6000 and other "production" setup questions

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 1 comment
BiLLM achieving for the first time high-accuracy inference (e.g. 8.41 perplexity on LLaMA2-70B) with only 1.08-bit weights across various LLMs families and evaluation metrics, outperforms SOTA quantization methods of LLM by significant

Posted by kryptkpr@reddit | LocalLLaMA | View on Reddit | 4 comments