LiquidGunay
-
Best way to add "Memory" to LLMs?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 62 comments
-
Chatkit-js with LangGraph Agents?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Fastest way to serve Llama 3 8B
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 33 comments
-
Is the TPU really an ASIC?
Posted by LiquidGunay@reddit | hardware | View on Reddit | 56 comments
-
No AWQ for Gemma 3?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 26 comments
-
GRPO for VLMs?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 0 comments
-
The real use case for DIGITS is SLM training
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 16 comments
-
How to make Coding LMs more creative?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 8 comments
-
Is Mamba inference faster than Transformers? (in practice)
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 8 comments
-
What happened to the Nvidia VLM?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 6 comments
-
Is serving a quantized model faster?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 5 comments
-
Hard RAG benchmarks?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 5 comments
-
Task specific fine-tuning using distillation?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 2 comments
-
What architectural changes would be required to make an omni model?
Posted by LiquidGunay@reddit | LocalLLaMA | View on Reddit | 11 comments
-
If companies were waifus
Posted by LiquidGunay@reddit | Jokes | View on Reddit | 2 comments