qnixsynapse

Are there any models that use sparsemax instead of the usual soft-argmax/softmax?

Posted by qnixsynapse@reddit | LocalLLaMA | View on Reddit | 13 comments
Implemented LLaMA 3.1 8B's function calling from scratch, some challenges and feedback!

Posted by qnixsynapse@reddit | LocalLLaMA | View on Reddit | 29 comments
Local LLaMA-3-8B on an Intel GPU and GPT-4o/mini: a speed comparison.

Posted by qnixsynapse@reddit | LocalLLaMA | View on Reddit | 13 comments
This "upcoming-gpt-mini" from OpenAI also fails this river crossing "puzzle".

Posted by qnixsynapse@reddit | LocalLLaMA | View on Reddit | 13 comments
Language Models can't reason, so I asked them to reason for this famous problem.

Posted by qnixsynapse@reddit | LocalLLaMA | View on Reddit | 1 comment