PSA: Make sure your API ports aren't exposed to the open internet
Posted by nooclear@reddit | LocalLLaMA | View on Reddit | 67 comments
There are about 1,100 exposed Ollama servers out there according to this blog post:
https://blogs.cisco.com/security/detecting-exposed-llm-servers-shodan-case-study-on-ollama
Also, if you see the prompt "What is 2+2?" in your logs, it was Cisco.
offlinesir@reddit
If anyone still remembers from a bit ago, https://www.freeleakhub.com/ (website now taken offline, but it was showcased on this sub) was a way to find servers accidentally left open and use them for chats / API requests. A lot of the servers had really small models, but some had the full DeepSeek R1 or Qwen 3, behind no paywall at all. Anyway, I'm surprised so many servers are still accidentally left open; you'd think most people would have caught it by now.
CharmingRogue851@reddit
I was surprised the creator shared this with the public in the first place tbh
daHaus@reddit
they do it to increase awareness
dodiyeztr@reddit
It's mostly UPnP leaving them vulnerable in some routers
daHaus@reddit
That's always the first thing I disable on devices
ForsookComparison@reddit
I use Llama-CPP, and I was one of these.
I saw the internet background static - logs of requests looking for Active Directory installs and other Microsoft products known to have vulnerabilities.
Then someone identified that I had a Llama-CPP server up and running. They asked a few trivia questions (I remember the logs had a few about the French Revolution?) and that was it.
I'm not as reckless anymore, and it was on a server with no critical information, but it definitely put the fear of open ports in me.
zipperlein@reddit
Didn't GPT4free do something similar? It's still up on GitHub, though.
Maleficent_Celery_55@reddit
GPT4Free uses purposefully open servers, not accidentally opened ones.
Trilogix@reddit
Oh yeah, for all those who spread ignorance saying "running in server/http mode instead of the CLI is totally safe and offline"...
98% of local AI apps produce outbound API or TCP/IP traffic. That says it all.
ambassadortim@reddit
Are you making up numbers, or do you have sources to reference?
Geekenstein@reddit
98% of statistics are made up on the spot.
Trilogix@reddit
You are right, I did round it, but still incorrectly. I have tried all the available local AI apps out there myself (30+ apps), and all of them call back home, and most use HTTP or other TCP/IP protocols.
98% was me being nice and allowing for the chance that I missed something. (Let me know what I missed.)
Not interested in trolling; I already created the solution, HugstonOne, which isolates the models and data for totally safe/secure inference.
XiRw@reddit
If you know how to properly use a firewall you wouldn’t need to go through all this.
Trilogix@reddit
It's because I know how to use a firewall that I went through all this.
My firewalls are the third layer LOL.
XiRw@reddit
Which one are you using?
Trilogix@reddit
Simplewall is my preferred one, and I had to rebuild BlackICE Defender. What's yours?
XiRw@reddit
My main 3 were always ESET, Bitdefender, and Kaspersky. I liked using paranoid mode, which would flag any incoming or outgoing traffic so I could control it. I don't want to pay for subscriptions anymore, so now I just write .bat scripts that block everything from accessing the internet except a few programs.
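For anyone curious, the scripts boil down to a couple of netsh rules; roughly something like this (the program paths and rule names are just examples):

```bat
:: Run from an elevated prompt. Default-deny all traffic, in and out:
netsh advfirewall set allprofiles firewallpolicy blockinbound,blockoutbound

:: Then allow only the handful of programs that should reach the internet:
netsh advfirewall firewall add rule name="Allow Firefox" dir=out action=allow program="C:\Program Files\Mozilla Firefox\firefox.exe" enable=yes
```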
Trilogix@reddit
Now you are speaking my language. Being in control of the traffic on your machine should be the norm. Most don't care, but that's not my business. I tried, but I can't be like most :)
I saw you mentioned your work at Malwarebytes. Was it any fun, or just checking logs all day, more boring than accounting?
gigaflops_@reddit
I'll tell you how-
Somebody tries to expose OpenWebUI to the internet for personal use because it's theoretically secure with its email/password authentication.
They end up getting confused and expose the Ollama port instead of, or in addition to, the OWUI port.
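The fix is to publish only the OWUI port and leave Ollama on the internal network. A minimal docker-compose sketch of that setup (image tags and the host port are just examples):

```yaml
services:
  ollama:
    image: ollama/ollama
    # no "ports:" section -- only reachable by other containers on this network

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    ports:
      - "3000:8080"   # the authenticated UI is the only thing published
```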
epyctime@reddit
Nah people are legit just dumb https://www.reddit.com/r/LocalLLaMA/comments/1n8n9u6/comment/nciiiim/?context=1
BumbleSlob@reddit
I'm struggling to figure out how someone can accidentally have ports exposed externally in the modern era. Is this a "PC plugged into a switch plugged directly into a modem" phenomenon?
With a router you would have to actively try to do this. Maybe misconfigurations? 🤔
epyctime@reddit
See: https://www.reddit.com/r/LocalLLaMA/comments/1n8n9u6/comment/nciiiim/?context=1
Pixelmixer@reddit
Imagine someone googling how to open their network NAT for Call of Duty multiplayer (or any other game). One suggestion is to set their device as the DMZ host to avoid setting up specific routes.
The crowd with just enough tech savvy to set this up overlaps a lot with the crowd that is tech savvy enough to start up an ollama server locally.
Mickenfox@reddit
It's completely insane how we're all OK with this status quo.
Imagine if your phone couldn't receive calls, only make them, because your operator ran out of phone numbers and had to implement a NAT and randomly switch them around, so in order to communicate you had to go through a centralized system or do all kinds of technical nonsense.
Pixelmixer@reddit
Luckily I don't think it's actually the status quo. MOST people just get their COD fix via UPnP and never have to concern themselves with port forwarding these days.
wolttam@reddit
I think not everybody is running ollama behind a SOHO router. Some people will be running it on a rented server with a public IP
TheLexoPlexx@reddit
And Ubuntu minimal comes with ufw disabled (or not even installed) and no other pre-config.
ForsookComparison@reddit
I didn't believe this until I checked a few weeks ago; I was so used to my servers having it already enabled.
TheLexoPlexx@reddit
Yeah, I was kind of confused as well.
58696384896898676493@reddit
Yeah this is my guess too. I'm generally not too worried about accidental open ports in my homelab as it's behind a router with no ports opened or forwarded. But I need to remind myself that my dedicated server and VPS are not behind a router, so I always make sure UFW is active on those machines and take a lot more precautions about security on them.
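For anyone setting up a fresh VPS, the whole routine is a few commands (adjust the allow rules to whatever you actually run):

```sh
sudo ufw status                 # reports "inactive" on a stock Ubuntu minimal install
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh              # keep your own way in before enabling
sudo ufw enable
```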
ForsookComparison@reddit
Someone makes a game server with friends following a simple tutorial. Leaves a fat port-range open. Forgets about it for years (very common). Runs Llama-Server or Ollama on 0.0.0.0. Port happens to be in this range.
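The 0.0.0.0 part is the avoidable bit; both servers are happy on loopback (the model path is a placeholder):

```sh
# Ollama defaults to 127.0.0.1:11434 anyway; just don't override it with 0.0.0.0
OLLAMA_HOST=127.0.0.1:11434 ollama serve

# llama.cpp's server, bound to loopback explicitly
llama-server --host 127.0.0.1 --port 8080 -m ./model.gguf
```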
a_beautiful_rhind@reddit
Probably more likely on hosted servers, where the user needs the port open. I'm behind NAT, so I need something like gradio share to expose anything.
jeffbagwell6222@reddit
They more than likely have uPNP enabled. So when they activate the port on their OS their router automatically opens the port.
mikael110@reddit
Luckily, that's not how UPnP works. The router does not automatically open a port just because a client has it open. An application has to explicitly make a UPnP IGD request to open a specific port on the router. So unless Ollama is designed to open ports via UPnP, which I'm 99% sure it is not, this is not the issue.
If UPnP opened ports just because they were in use, this kind of issue would be significantly more widespread than it is. Personally, I suspect a lot of the open APIs come from people running Ollama on rented GPU servers - think RunPod, Vast.ai and the like. Those services provide you with an IP address and the ability to open ports directly to the internet by default, and doing so is common.
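To make that concrete: getting a router to map a port over UPnP takes an explicit IGD request, e.g. via miniupnpc's CLI (the LAN address and ports here are examples). Nothing issues this for you just because a port is listening:

```sh
# Ask the router (IGD) to forward external port 11434 to this LAN host
upnpc -a 192.168.1.50 11434 11434 TCP
```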
jeffbagwell6222@reddit
Interesting. I assumed ollama used upnp. Thanks for clarification.
claythearc@reddit
That's probably some of it. 1,100 sounds like a lot, but even the massive models get like 200k+ downloads.
My guess is there’s another group of people who want to vibe code on their phone or something and don’t think through the ramifications
smayonak@reddit
Also they might have ngrok set up to expose Ollama over API. If someone is getting all their setup instructions from ChatGPT, they're going to have some chatgproblems
kmouratidis@reddit
Unless they're paying for ngrok, the tunnel kills itself every day or so and you need to start it again, and I think it comes up with a new URL every time.
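And if someone insists on tunneling anyway, ngrok can at least bolt basic auth onto the tunnel; something like this, going by the v3 CLI (treat the exact flag as an assumption and check the docs):

```sh
ngrok http 11434 --basic-auth "user:strongpassword"
```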
No_Swimming6548@reddit
Someone who learns stuff from YouTube.
actual_account_dont@reddit
With a reverse proxy you only need one port open, then do host-based routing. I see bots trying common subdomains (www, ssh, ftp, etc.) looking for something interesting. It sounds like they were scanning for ports, though.
GTHell@reddit
Uhhh, cough, docker -p cough
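For the unfamiliar: docker's port publishing binds 0.0.0.0 by default, and its iptables rules bypass ufw entirely, so the difference is one prefix:

```sh
# Published on every interface AND punches straight through ufw:
docker run -d -p 11434:11434 ollama/ollama

# Bound to loopback only -- unreachable from outside the host:
docker run -d -p 127.0.0.1:11434:11434 ollama/ollama
```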
SportEffective7350@reddit
Who the heck leaves all those random ports open? Isn't the default on all devices to start up with all ports closed, so you have to open them manually?
I do have an open port for AI, but it's a very high, non-standard one.
On the other hand, this is giving me a silly idea: add a basic scripted soft IP check, since I always connect from the same networks, and if the check fails, alter the prompt/context to make the AI act hostile or something.
Imagine the Cisco worker sending "What is 2+2?" and getting back "INTRUDER!!! YOU SHOULD NOT BE HERE!". It'd be funny.
alamacra@reddit
So, I asked Kimi why people act so aggressively about this, and unfortunately your suggestion, as funny as it is, wouldn't be safe to implement, since some of the llama.cpp bugs are overflow errors, i.e. the attacker can write directly to the stack, log into your PC, and then do whatever he likes.
Honestly, I'm not at all sure now how to share the LLM with friends safely.
SportEffective7350@reddit
Oh I have proper security layers, but I just want to spook Cisco employees, haha.
Dreadedsemi@reddit
I remember googling something about Stable Diffusion, and an IP address with a full Stable Diffusion WebUI showed up in the results. Even if intentional, it's still a bad idea.
bishakhghosh_@reddit
One simple way to prevent this is to use a tunneling tool such as pinggy that supports IP whitelisting and basic auth. Apply both to be safe, I guess?
Robbbbbbbbb@reddit
I've been thinking about making this post for a while lol
There are many, many more exposed than 1,100. Try nearly 15,000.
Do they all work? Nah. But when I checked Shodan the other week, nearly every one I picked at random allowed me to pass a test "tell me a story" query.
Mickenfox@reddit
Applications should absolutely not have to manage network security. That should nearly always be the responsibility of the OS. Feel free to complain to Microsoft.
Robbbbbbbbb@reddit
By that logic, no web application should have a logon page lol
If the web interface is exposed on TCP/80 or 443, that's one thing. But the API has zero ability to be protected with a token/key, meaning there's no native way to secure it.
There's a reason why all the larger LLM providers (GPT, Claude, Grok, etc.) natively support authentication on their respective APIs.
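And "no native way to secure it" is literal: an exposed instance answers any POST, no key, no handshake (the IP and model name below are placeholders):

```sh
curl http://203.0.113.10:11434/api/generate \
  -d '{"model": "llama3", "prompt": "What is 2+2?"}'
```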
Artistic_Mulberry745@reddit
Hey guys, I'm on macOS with Ollama installed through Homebrew. In my .ollama folder I don't have a logs folder to check; the only folder in there is "models". Do logs get deleted on restart or something? I always stop the Ollama service when not in use, could it be that?
Pro-editor-1105@reddit
Out of interest after this post, I went to shodan.io, found a random server in Brazil, made a request, and got a response. Honestly, if it's such a massive GitHub project and shit, they should pay much better attention to security than this, especially with the ability to run models like this.
biblecrumble@reddit
Feel free to go ahead and put in a PR to improve the security of this 100% free, open source, community-maintained project. In the meantime, I think dumbasses should probably stop exposing random services to the public internet.
vibjelo@reddit
I'm not sure if you're confusing the parent's reply with someone else's, but Ollama is mostly maintained by a paid team who work at Ollama, which is a for-profit endeavor. It's not exactly community-maintained by the typical definition of that concept.
I do agree dumbasses should stop exposing stuff on the public internet though :)
Pro-editor-1105@reddit
Somewhat but not everything is open source.
tiffanytrashcan@reddit
The newest Behemoth, 123B, released days ago (X v2), at 32k context. A good WebUI... I don't want to expose the service, but it's one that Cisco didn't look for. Hosted in South Korea.
Of the 3 I quickly checked, only 1 asked for authentication (using this server; most run way smaller models). I didn't test the output of Behemoth, since that has an actual real-world cost and I would feel bad. That beast has measurable energy usage. (Not a small quant either; this is serious hardware just chilling on the open internet.)
SkyFeistyLlama8@reddit
And that's how Wintermute started.
vibjelo@reddit
I'm not in their Discord anymore, but when I used to lurk there from time to time, it was filled with people who had never opened a terminal before, asking how to deploy Ollama on the public internet so they could use it from their phone and anywhere else. It's great that people are learning new things, I love it too! But once you start deploying stuff to the public internet, you need some baseline level of knowledge to not be hacked within hours, and the software needs to have some sensible defaults. I saw a lot of advice going around that basically boiled down to "bind to 0.0.0.0 and then access the IP" with no disclaimers or warnings about what that actually means.
If only someone had tried to warn Ollama about not being sufficiently clear about what kind of experience and skills they expect their users to have before deploying software on the internet cough https://github.com/ollama/ollama/issues/7116 cough
Color me surprised the internet is now filled with unauthenticated and vulnerable servers running Ollama.
Arkonias@reddit
Haha thanks ollama
terminoid_@reddit
cries in CGNAT
johnerp@reddit
Is there a way to proxy Ollama so I can add an API token, whitelist source IPs, etc.?
mtmttuan@reddit
Easiest would probably be an Ollama container running on an internal network, plus a proxy container that exposes its own port and connects the Ollama container to the outside.
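With nginx as the proxy container, a minimal sketch of a token check plus source-IP allowlist (the hostname, token, and CIDR are placeholders):

```nginx
server {
    listen 443 ssl;
    server_name ollama.example.com;
    # ssl_certificate / ssl_certificate_key omitted for brevity

    location / {
        allow 203.0.113.0/24;                 # your known source IPs
        deny  all;

        if ($http_authorization != "Bearer change-me") {
            return 401;                       # crude API-token check
        }

        proxy_pass http://ollama:11434;       # container name on the internal network
    }
}
```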
johnerp@reddit
I have ollama in a container, I have nginx, I’ll have a play!
Pixelmixer@reddit
This is doable with Cloudflare Tunnels. You can whitelist IPs and set up JWT access tokens.
LinkSea8324@reddit
Send them a bill for 5 cents telling them they used your API and that's the fee for it.
zipperlein@reddit
The same is true for OpenAI proxies. GPT4free is still a thing...
Beneficial_Key8745@reddit
I haven't looked much at router config interfaces, but from what I know, it takes effort to open a port to the internet. No router will just have a port open by default.