Open-Source Apple Silicon Local LLM Benchmarking Software. Would love some feedback!
Posted by peppaz@reddit | LocalLLaMA | 9 comments
DifficultyFit1895@reddit
Have you thought about incorporating some quality benchmarks? This would be great for comparing quality of results vs speed for different models and quantizations.
Here are a couple that I have used:
https://github.com/EleutherAI/lm-evaluation-harness
https://github.com/LiveCodeBench/LiveCodeBench
MediaToolPro@reddit
Nice work! As someone who works with MLX models on Apple Silicon, a few feature requests that would make this really useful:
- powermetrics — tok/s per watt is increasingly important for efficiency comparisons

Looks clean and well-designed. Will give it a spin on my M4!
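For readers unfamiliar with the metric being requested: tok/s per watt is just generation throughput divided by average power draw. A minimal sketch, assuming you already have a throughput measurement and an average wattage sampled from macOS's `powermetrics` tool (which requires sudo); the function name and numbers here are illustrative, not part of the project:

```python
# Hypothetical sketch: efficiency as throughput per watt.
# Assumes avg_watts was sampled externally (e.g. via macOS `powermetrics`).

def tokens_per_watt(tokens_generated: int, elapsed_s: float, avg_watts: float) -> float:
    """Generation efficiency: (tok/s) / average power draw in watts."""
    if elapsed_s <= 0 or avg_watts <= 0:
        raise ValueError("elapsed time and power must be positive")
    return (tokens_generated / elapsed_s) / avg_watts

# Example: 512 tokens in 10 s at an average 8 W draw -> 6.4 tok/s per watt
print(tokens_per_watt(512, 10.0, 8.0))
```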
peppaz@reddit (OP)
You should read the readme or check the screenshots; it has all of those things! haha
MediaToolPro@reddit
Ha fair enough, that's what I get for jumping straight to the comment box. Just gave it a proper look — really thorough. The powermetrics integration is exactly what I was hoping for. Excited to run some comparisons on my M4!
peppaz@reddit (OP)
What make and model are you running? I was not able to try it on anything other than a base M4 with 24GB.
peppaz@reddit (OP)
Working on an Ollama/OpenAI API compatible LLM benchmarker with low overhead and exportable benchmark graphics. It is pretty feature-rich, and I would love to get some users. It is signed with an Apple developer certificate and has a binary release available on GitHub. I did extensive testing and development over the last few months, and I'm hoping local LLM and Apple Silicon enthusiasts find it useful and give feedback so I can make it better. Thanks for checking it out!
https://github.com/Cyberpunk69420/anubis-oss/ https://imgur.com/a/X64WsWY
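For context on the kind of measurement an OpenAI-compatible benchmarker performs, here is a minimal sketch: time a chat completion against an OpenAI-compatible endpoint (Ollama serves one at `http://localhost:11434/v1` by default) and compute tok/s from the server-reported token count. The URL, model name, and helper names are assumptions for illustration; anubis-oss's actual internals may differ.

```python
# Hypothetical sketch of measuring generation throughput (tok/s) against
# an OpenAI-compatible chat completions endpoint, stdlib only.
import json
import time
import urllib.request

def throughput(completion_tokens: int, elapsed_s: float) -> float:
    """Tokens generated per second of wall-clock time."""
    return completion_tokens / elapsed_s

def measure_tok_per_s(base_url: str, model: str, prompt: str) -> float:
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    elapsed = time.perf_counter() - start
    # OpenAI-compatible servers report token counts in the `usage` object.
    return throughput(body["usage"]["completion_tokens"], elapsed)

# Example (requires a running Ollama server; model name is an assumption):
# measure_tok_per_s("http://localhost:11434/v1", "llama3.2", "Say hi")
```

Note this measures end-to-end wall-clock throughput, which includes prompt processing; dedicated benchmarkers usually separate prefill from generation speed.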
Total-Context64@reddit
SAM dev here, I'll try and poke around with it when I have the chance. This is pretty interesting to me.
peppaz@reddit (OP)
Awesome, your project looks great too. I'm gonna check it out.
Total-Context64@reddit
🤘