Has anyone run gemma 4 or Bonsai 8B models on Orange pi 5?
Posted by bhakt_chungus@reddit | LocalLLaMA | View on Reddit | 7 comments
I am extremely new to this and am wondering if I can run a very small model with decently fast throughput on one of these chips. If anyone has done so successfully, that would be helpful to know.
honuvo@reddit
Hi, not on an Orange Pi, but a Raspberry Pi 5 16GB. I posted a few days ago and am currently benchmarking again. I've already tested gemma 4 E4B, so here's a sneak peek:
Whether that's fast enough for you, I don't know. The E2B is of course even faster.
honuvo@reddit
I got around to compiling the llama.cpp fork for Bonsai 8B and tested that. Maybe I did something wrong, maybe the calculations aren't really optimized for ARM CPUs, I don't know. Not interested in looking into that model more, but here are the results:

| model | size | params | backend | threads | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| Bonsai 8B Q1_0 | 1.07 GiB | 8.19 B | CPU | 4 | pp512 | 3.27 ± 0.00 |
| Bonsai 8B Q1_0 | 1.07 GiB | 8.19 B | CPU | 4 | tg128 | 2.77 ± 0.00 |
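For anyone wanting to reproduce numbers like these on their own board, a typical llama.cpp benchmark run looks roughly like the sketch below. Note the assumptions: the thread doesn't give the Bonsai fork's URL, so this uses mainline llama.cpp, and `model.gguf` is a placeholder path you'd replace with your downloaded model file.

```shell
# Build mainline llama.cpp on the board itself (substitute the Bonsai
# fork's repo URL here if you have it -- it isn't given in the thread)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j4

# llama-bench's default tests are pp512 (prompt processing, 512 tokens)
# and tg128 (token generation, 128 tokens); -t 4 pins it to 4 threads.
# model.gguf is a placeholder for your quantized model file.
./build/bin/llama-bench -m model.gguf -t 4
```

The `pp512` and `tg128` rows in the output correspond to the prompt-processing and generation speeds (t/s) shown in the tables above.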
H_NK@reddit
Confused by the tables, I just see model size specs and no actual performance metrics. Did you paste something wrong by accident?
honuvo@reddit
Are you on mobile? You have to swipe horizontally on the tables to see the right end. The headers are model, size, params, backend, threads, mmap, test, and t/s. The test and t/s (tokens/sec) columns are what you're after, right?
H_NK@reddit
That's it, my bad. I had no idea that was a feature.
H_NK@reddit
!RemindMe 1 day
RemindMeBot@reddit
I will be messaging you in 1 day on 2026-04-04 22:43:43 UTC to remind you of this link