How I got MCP working in the llama-server web UI (A brief guide for noobs)
Posted by arcanemachined@reddit | LocalLLaMA | View on Reddit | 34 comments
Intro
I heard about the recent addition of MCP support to llama-server and I was interested in getting it working.
I have only briefly toyed with MCP, so I'm not super familiar with the ins and outs of it.
I spent a while screwing around getting it working, so I am offering this brief guide for my fellow noobs so they can spend less time spinning their wheels, and more time playing with the new feature.
Guide
- First, ensure that uv is installed: https://docs.astral.sh/uv/getting-started/installation/
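If you don't already have it, the standalone installer from that page is, at the time of writing, a one-liner on Linux/macOS (see the linked docs for Windows and other install methods):

```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
```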
- Create a config file in the directory of your choice with some MCP servers:
config.json
{
  "mcpServers": {
    "time": {
      "command": "uvx",
      "args": ["mcp-server-time", "--local-timezone=America/Chicago"]
    },
    "fetch": {
      "command": "uvx",
      "args": ["mcp-server-fetch"]
    },
    "ddg-search": {
      "command": "uvx",
      "args": ["duckduckgo-mcp-server"]
    }
  }
}
- From the same directory, run this command:
uvx mcp-proxy --named-server-config config.json --allow-origin "*" --port 8001 --stateless
- When you run this, it will list the name of each MCP server. You will need to replace the `sse` at the end of each URL with `mcp` for a given MCP server to work in the llama-server web UI config.
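For example, with the server names from the config above and the proxy port used in the command, the URLs change like this:

```
http://127.0.0.1:8001/servers/time/sse        ->  http://127.0.0.1:8001/servers/time/mcp
http://127.0.0.1:8001/servers/fetch/sse       ->  http://127.0.0.1:8001/servers/fetch/mcp
http://127.0.0.1:8001/servers/ddg-search/sse  ->  http://127.0.0.1:8001/servers/ddg-search/mcp
```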
- In the llama-server web UI, go to Settings -> MCP -> Add New Server, and add each server in your config. For example: `http://127.0.0.1:8001/servers/ddg-search/mcp`
- Click `Add` to finish adding the server, then check the toggle to activate it.
The configured MCP servers should now work in the llama-server web UI!
Hopefully this is helpful to someone else!
webitube@reddit
This worked for me! Thank you!
WinterElfeas@reddit
Is the MCP feature global or only for Web UI? Like if I use the llama server API directly (from another app), will it be aware it can query the web through MCP
arcanemachined@reddit (OP)
The MCP server in my guide can serve the web UI, or any other application that can query it. The web client is just another "consumer" of the MCP server.
But no, the "other app" would have to contact the MCP server itself. It wouldn't go through llama-server.
WinterElfeas@reddit
Ah, sad. I was hoping to configure MCP, point VSCode's OpenAI-compatible chat at my local llama.cpp, and gain web search capability that way.
Superuser2051@reddit
I found this page after an hour. thank you so much.
I have started my own test MCP server using the fastmcp library, but I'm getting a network connection error when adding it to llama.cpp.
Anyone got success?
akaTLG@reddit
Click the Edit button. You should see a toggle to turn on "Use llama-server proxy".
Superuser2051@reddit
Amazing. Thanks a lot :)
Superuser2051@reddit
Code:
BrightRestaurant5401@reddit
But shouldn't the tools also have a description so the LLM can understand what they do?
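For what it's worth, with the fastmcp library the tool's docstring is typically what gets exposed as the description. A minimal test server sketch (names are illustrative, not the commenter's code, and the exact run() arguments may differ between fastmcp versions):

```python
# Minimal fastmcp test server sketch -- illustrative only.
from fastmcp import FastMCP

mcp = FastMCP("test-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers."""  # the docstring is exposed as the tool description
    return a + b

if __name__ == "__main__":
    # Serve over HTTP so the web UI (or mcp-proxy) can reach it.
    # Transport name and host/port arguments may vary by fastmcp version.
    mcp.run(transport="http", host="127.0.0.1", port=8002)
```

A server like this would then be added in the web UI at something like http://127.0.0.1:8002/mcp (the exact path depends on the fastmcp version), with the "Use llama-server proxy" toggle mentioned above if needed.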
iLoveWaffle5@reddit
Thanks for this! I learned a lot about uvx :)
I am noticing that the `ddg-search` tool often returns:
"No results were found for your search query. This could be due to DuckDuckGo's bot detection or the query returned no matches. Please try rephrasing your search or try again in a few minutes"
Did you happen to find any other alternatives that do not require an API key? What do you use today for web search?
arcanemachined@reddit (OP)
You could try Brave, they give like 1000 searches per month for free.
You could also try running an instance of SearXNG (I have a simple repo that implements a container service that runs it; ask an agent for help if you use it and encounter any issues).
You have to be careful with SearXNG though... I hear that Google will blacklist you temporarily if they detect that you are using it (I have Google disabled in my config if you use that).
Easiest move is probably just go with Brave's free tier though.
Feel free to ask elsewhere also; I'm not an expert, and this thread's not guaranteed to get a lot of activity. It would probably make a good /r/LocalLLaMA post on its own :)
No-Statistician-374@reddit
Or... and I had to check this with Gemini myself... you add the link with the API key (I chose Tavily) that you get after logging in there, and then find that it doesn't fetch. You then need to add "--webui-mcp-proxy" to your llama-server launch command, and turn on "Use llama-server proxy" for the web-search MCP. It will now work :)
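Roughly, the launch command with that flag would look like this (the model path and port are placeholders; the flag name is taken from the comment above):

```sh
llama-server -m /path/to/model.gguf --port 8080 --webui-mcp-proxy
```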
c64z86@reddit
That doesn't work for me. The "use llama-server proxy" setting doesn't show up anywhere.
No-Statistician-374@reddit
Go into settings in the webUI, MCP, manage servers (where you should have already added the MCP server of your choice), then click the 'pen' icon to edit it, and there you will find the 'Use llama-server proxy' switch. Turn it on, update.
c64z86@reddit
Ah it's working now, thank you!!
c64z86@reddit
Thank you, but it's not showing :/
c64z86@reddit
Where exactly is the "use llama-server proxy"? I can't find it anywhere.
tongwuumn@reddit
thanks! works perfectly
arcanemachined@reddit (OP)
Ah, sorry about that! I totally forgot to mention it. Thanks for the correction!
ea_nasir_official_@reddit
Make sure you paste the server URLs as OP placed them, people from the future. Don't paste them the same as the ```mcp-proxy``` output or you will be burned. Just throwing that out there for those coming from Google like me.
arcanemachined@reddit (OP)
Indeed, the URLs need to be manually tweaked. That was fun to figure out.
alfpacino2020@reddit
Excellent, it works! Now I can read my folders and files from the LLM. Thanks!
iamapizza@reddit
In the other thread, the OP was able to display images. Any idea how that was done?
https://www.reddit.com/r/LocalLLaMA/comments/1rrycc6/llamacpp_brave_search_mcp_not_gonna_lie_it_is/
arcanemachined@reddit (OP)
Not a clue, sorry. If you can find out, please report back so I can add it to the guide. :)
iamapizza@reddit
Note: when using this via Docker / Docker Compose, even if mcp-proxy is running in Docker, the URL needs to be localhost, not the container name! The search seems to happen in your browser rather than on the llama-server side.
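To make that concrete: assuming mcp-proxy runs in a container (hypothetically named `mcp-proxy`) with port 8001 published to the host, the browser is what actually calls the MCP endpoint, so the web UI entry should use the host-mapped address:

```
# Works (the browser can reach the published port on the host):
http://localhost:8001/servers/ddg-search/mcp

# Does not work (the container name only resolves inside the Docker network):
http://mcp-proxy:8001/servers/ddg-search/mcp
```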
BerryGloomy4215@reddit
why would you need the fetch mcp? it seems that ddg already provides both search and fetch.
https://github.com/nickclyde/duckduckgo-mcp-server?tab=readme-ov-file#available-tools
arcanemachined@reddit (OP)
I'm not married to DDG, so I'll leave it in there.
gaps_ar@reddit
Don't know if it's just me but it doesn't work with the uvx run mcp-server-time, I have to remove the run argument.
arcanemachined@reddit (OP)
Damn it! I will fix that. Thanks.
Positive-Stock6444@reddit
worked a dream, thanks for sharing! I had to change uv to uvx for the time mcp, but otherwise, perfect.
arcanemachined@reddit (OP)
Thanks for the feedback. I changed that command in the guide to use `uvx`, just in case.
ProfessionalSpend589@reddit
That's really cool.
I'm new to python and I'm trying to figure out how to check what MCP servers can be trusted. How do people tackle this problem?
Kahvana@reddit
The best way is to read through the source code, and build the mcp servers yourself.
No_Swimming6548@reddit
Thanks, definitely useful. Not everyone who likes local models is a dev.