RMCPhoto

Renting GPU time (Vast.ai) is much more expensive than APIs (OpenAI, m, anth)

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 33 comments
Abstracting the Prompt and Context

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 2 comments
Llama 4 system message on Whatsapp

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 6 comments
Structured outputs with Ollama - what's your recipe for success?

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 11 comments
Do you think that Mistral worked to develop Saba due to fewer AI Act restrictions and regulatory pressures? How does this apply to emergent efforts in the EU?

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 10 comments
How is it that Google's Gemini Pro 2.0 Experimental 02-05 Tops the LLM Arena Charts, but seems to perform badly in real world testing?

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 64 comments
Optimizing prompts and prompt templates for the new wave of reasoning models?

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 0 comments
VQA / VLLM for identifying highlights in sporting event videos? (Semantic search)

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 8 comments
Compression Aware prompting for quantized models

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 1 comment
Recommend a basic open-source chat interface front-end to build from (FastAPI?, React?)

Posted by RMCPhoto@reddit | LocalLLaMA | View on Reddit | 4 comments