Local AI with Gemma 4 and OpenWebUI

Posted by jumper556@reddit | LocalLLaMA

Good day, everyone.

I'm probably missing something, but is it still really this difficult to run a local LLM with memory and basic tool calling?

I spent a couple of hours testing Gemma 4 with OpenWebUI running in Pinokio. I have an RTX 5090 and 64 GB of RAM, hence I chose the 31b version.
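
As a sanity check (my own back-of-the-envelope math, not an official sizing guide), the weights of a ~31b model at a ~5-bit quant should fit comfortably in the 5090's 32 GB of VRAM:

```python
# Back-of-the-envelope VRAM estimate for quantized model weights only.
# KV cache and runtime overhead come on top of this number.
def quantized_model_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# ~31B parameters at ~4.8 bits/weight (roughly a Q4_K_M-style quant):
print(f"{quantized_model_gib(31, 4.8):.1f} GiB")  # ≈ 17.3 GiB
```

So raw model size shouldn't be the bottleneck here; the slowness seems to come from the tooling around the model.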

For web search I used Tavily, and I enabled the memory features within OpenWebUI.
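
For reference, the web-search wiring looks roughly like this (a sketch based on the OpenWebUI environment variables as I understand them; names may differ between versions, and Pinokio may set these through its own UI instead):

```shell
# Hedged sketch: OpenWebUI web-search settings via environment variables.
# Check Admin Settings > Web Search in the UI for your version's exact options.
export ENABLE_RAG_WEB_SEARCH=true
export RAG_WEB_SEARCH_ENGINE=tavily
export TAVILY_API_KEY=tvly-xxxxxxxx   # placeholder key
```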

It all seems slow, and the memory feature is not reliable. At the same time, a local TTS integration is not that easy to set up. Even basic questions seem slow; just saying "hi" triggers a "web search" that ends with "no search performed" before responding.

What I'm hoping for:

- Full local AI setup

- Web search if not enough information is present

- Reliable memory for facts from past conversations that builds up knowledge about me over time

- Optional TTS so I can speak with my model
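
On the TTS point, one route I've seen suggested is pointing OpenWebUI's audio settings at a local OpenAI-compatible speech server (e.g. openedai-speech). A sketch, assuming the standard OpenWebUI audio variables (names may differ by version, and the URL/port are placeholders):

```shell
# Hedged sketch: route OpenWebUI's TTS to a local OpenAI-compatible endpoint.
# Verify the exact variable names under Admin Settings > Audio for your version.
export AUDIO_TTS_ENGINE=openai
export AUDIO_TTS_OPENAI_API_BASE_URL=http://localhost:8000/v1  # placeholder
export AUDIO_TTS_OPENAI_API_KEY=sk-local                       # dummy key
```

But I haven't gotten this working smoothly myself, which is part of why I'm asking.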

I did not try to set up OpenClaw because it seems to have too much uncontrolled access to my system. Or should I be taking that route instead?

Am I missing something? Is there still no reliable "local LLM setup for dummies" with memory and TTS capabilities? I want to share health, income, and all kinds of other personal information with a local LLM, not a cloud AI solution.