What’s a low memory way to run a Python http endpoint?
Posted by alexp702@reddit | Python | View on Reddit | 72 comments
I have a simple process with a single endpoint that needs exposing over http. Nothing fancy, but it needs to run in a container using minimal memory. Currently running with uvicorn, which needs ~600Mb of RAM on startup. This seems crazy.
I have also tried Granian, which seems to have similar usage.
For perspective a Nodejs container uses 128mb, and a full phpmyadmin uses 20!
I realise you shouldn’t compare but a 30x increase in memory is not a trivial matter with current ram pricing!
fiskfisk@reddit
uvicorn should not use 600MB by itself. Are you allocating memory in your application to handle requests?
Bjoern is commonly mentioned as a low memory use http server for Python:
https://github.com/jonashaag/bjoern
I'd just evaluate bottle.py and the built-in http server as well. Not sure about gunicorn's requirements.
alexp702@reddit (OP)
bjoern seems very old - 4 years since update!
fiskfisk@reddit
If it works, it works. It's not like the basic http protocol has changed in 25+ years.
ra_men@reddit
And this is why we have mass supply chain vulnerabilities.
fiskfisk@reddit
Something can be stable without security issues. Software that doesn't evolve tends to be more stable and have fewer issues, but of course you should always check whether there are any known issues.
ra_men@reddit
Before this year I’d agree with you. Now we’re seeing so many 10-20 year old pieces of software have security gaps due to ai pen testing that everything feels up for grabs.
Fabulous-Possible758@reddit
Sure, age isn’t a perfectly reliable indicator of security. But it’s the same with any bug: new code can introduce bugs, regardless of who wrote it, and up to now, people using the software and reviewing the code over time were the only things that could catch them.
So far AI has managed to catch a couple holes which no one noticed before, and this is a pretty recent development. It looks like most of the people doing the research with it are doing the responsible thing and letting developers know, and over time AI pen testing will just be part of the development process for secure software.
ra_men@reddit
Again, I agree with you, but that doesn’t address the main concern here, which is the lack of updates. 4 years without updates for an http server is a quick no-go in any professional environment; that’s hundreds of potentially relevant CVEs. Obviously, for OP’s case here, I doubt that matters.
alexp702@reddit (OP)
It does matter to me. We’re building a system for the future, and whilst this component is not large or high-frequency in use, it is important.
ra_men@reddit
I’m not sure what people’s issue is with the concern over lack of updates, it’s definitely a legitimate concern.
murdoc1024@reddit
I don't get why you're being downvoted, your viewpoint is perfectly valid.
unkz@reddit
Bottle's security surface is pretty damn small.
ra_men@reddit
Bjoern (if that’s what you meant) is pretty small, but that doesn’t mean the surface is small. It’s actually fairly broad, as are all http servers.
alexp702@reddit (OP)
Yes, but bjoern uses the older V2 WSGI protocol which is now (apparently according to my AI) WSGI V3. Personally I don't go for stuff that's not obviously maintained and relatively active - it causes problems down the line.
yerfatma@reddit
But not this one.
kaszak696@reddit
Your AI is hallucinating to you, and you'd know that if you'd spent a minute reading the original spec instead of just lazily having a robot lie to you.
alexp702@reddit (OP)
Not doing a full implementation, just rapidly prototyping a solution to see the memory usage. If the AI can get it up and running in 10 minutes, warts and all, that’s good enough for this purpose. I’m impressed: in about 2hrs it has let me test almost every proposal here. I did just spot in the output that the 4-years-unmaintained problem still stands.
UsefulOwl2719@reddit
With this approach, you will always be disappointed in the long run. I look for deps that I would be happy using if they never received another commit. Recent churn indicates a project is still in development or is trying to design itself around a moving target.
axonxorz@reddit
If you're concerned about having up-to-date implementations, your LLM has failed you. You should be targeting ASGI, even if you aren't using its concurrency advantages.
japherwocky@reddit
they're running it in a Docker container; it's Docker, not uvicorn
Huge-Habit-6201@reddit
How about cherrypy?
jkh911208@reddit
Use Go
nggit@reddit
Try https://github.com/nggit/tremolo . In my Docker setup it never eats > 50MB for simple usage. It was written with minimal cyclic references, so memory stays low over long runs. But it's not the fastest or best-known server.
thegoz@reddit
have you considered not using Python? if it’s a simple process it might be worth it. I don’t know if you have any AI at your disposal, but rewriting it into a different language is kinda doable
WJMazepas@reddit
It does seem crazy, because I have dev servers with 512MB of RAM and a medium FastAPI application uses 300MB on startup
Is this Node comparison running a similar program?
LoreBadTime@reddit
I was greasy, and used half the AWS free SSD as swap. Not that I need to use the server that much
tRfalcore@reddit
is that proncounced like greddy like bread
alexp702@reddit (OP)
I am running on a Mac - Arm64 image for all. Node seems to have a baseline of 128MB - seems a well documented thing to do with the garbage collector. You can reduce it with some command line flags, but it then starts to become unstable. My actual python program is using about 70Mb on start up (possibly due to libraries) - which I can live with. My surprise is how hard it seems to be to serve this without eating up RAM.
We have a bunch of 16Gb Macs, developing a docker based system. Most code based containers are Node, with a Python one stuffed in there. I want to make sure the team has as much ram left as possible, which began this investigation. Disk isn't a problem - just ram as we grow the number of services.
hstarnaud@reddit
I noticed Docker for Mac isn't really good at freeing up memory. It will request memory when it needs it, but then the container keeps that memory allocated until it shuts down, even if it's not using it anymore.
So maybe it's not your memory at server start-up that's at play here, but rather your peak memory usage within a request.
WJMazepas@reddit
Hmmm, maybe it's running Docker on macOS that is using all that memory.
Is the node code also running in Docker?
alexp702@reddit (OP)
Yes, everything is pretty similar to a Windows PC. Side note: we were using Amd64 images on Arm Macs; they use about 30% more RAM to emulate.
Will pick up tomorrow I think!
makinggrace@reddit
Is that RAM usage the containerized usage? How much of it is just starting the container?
alexp702@reddit (OP)
I am using uv run in the container - I think it may be part of the problem, as no matter what I try, it stubbornly wants 512MB.
makinggrace@reddit
Oh you're running it in Docker. Docker is not a lightweight container. Can you use something like Podman?
corey_sheerer@reddit
UV should not be installed in your runtime. You should create a multi-stage build and copy the installs into your runtime step. Search for recommended container builds for UV and you should find something to improve upon what you have.
nickN42@reddit
Which base image are you using?
alexp702@reddit (OP)
Currently python:3.13-slim
yerfatma@reddit
Do you have to use Docker? That may be a lot of your overhead. What is this single endpoint doing?
AreWeNotDoinPhrasing@reddit
They are using Docker on Arm Macs, emulating amd64 in the containers lol. They have no idea what they’re doing and won’t listen to reason (based on the other comment chains they are replying to).
nickN42@reddit
I think you need to look into your service first. I just tried to run a hello-world server with uvicorn (using `uv run`) on a slim Python image -- although I went with the uv-provided one, based on trixie-slim -- and according to `podman stats` it takes about 30 megs of RAM running.
JohnDisinformation@reddit
Have you trimmed it down?
Stick that in your docker file.
cheerycheshire@reddit
OP has a problem with ram usage, not image size...
Also, for python, it's better to use multi-stage builds - use non-slim version to build libs (can apt install everything needed, no need to uninstall and purge stuff later), then copy built libs over to a slim image of same OS (where you don't use apt at all, unless of course you need some system tool) and your project files.
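A sketch of that multi-stage pattern, assuming a uv-managed project (the app name, lockfile layout, and uvicorn command here are placeholders, not OP's actual setup):

```dockerfile
# Builder stage: uv exists only here, never in the runtime image.
FROM python:3.13 AS builder
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# Runtime stage: slim base, just the venv and project files copied in.
FROM python:3.13-slim
WORKDIR /app
COPY --from=builder /app/.venv /app/.venv
COPY . .
ENV PATH="/app/.venv/bin:$PATH"
# Run uvicorn directly from the venv instead of through `uv run`.
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```

Launching the server binary from the copied venv (rather than `uv run`) also removes uv itself from the process tree at runtime.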
geravalas@reddit
That won't reduce memory usage or image size unless you squash.
Feeling_Ad_2729@reddit
600MB is almost certainly your application's imports, not uvicorn itself. Uvicorn alone uses maybe 30-40MB. The usual culprits: numpy/pandas at import time, heavy ML libraries, pydantic v1 vs v2 (v2 is much leaner).
Profile what's actually using memory first:
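A minimal stdlib-only sketch of that: start tracemalloc before your application imports, then look at which lines allocated the most. The two imports below are hypothetical stand-ins for whatever the app actually pulls in.

```python
import tracemalloc

# Start tracing *before* the imports you want to measure.
tracemalloc.start()

# Hypothetical stand-ins for the application's real imports:
import json          # noqa: F401
import http.client   # noqa: F401

snapshot = tracemalloc.take_snapshot()
# Top allocation sites by file:line, largest first.
for stat in snapshot.statistics("lineno")[:5]:
    print(stat)
```

If the top entries are library import machinery rather than request handling, swapping HTTP servers won't move the needle.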
For genuinely lightweight Python HTTP:
- bottle — single file, zero dependencies, runs on the stdlib WSGI server or gunicorn
- falcon — built for low-overhead APIs, much lighter than FastAPI
- aiohttp — if you need async but not FastAPI's ecosystem
- flask + waitress — simple, predictable memory
If you genuinely need minimal footprint (IoT, serverless cold starts), bottle + gunicorn in a slim Docker image usually lands around 50-80MB total. But fix the import problem first — swapping the HTTP framework won't help if you're importing pandas in your handler.
alexp702@reddit (OP)
Thanks - I will check that as some of that may be happening.
lanupijeko@reddit
how did you check it's python?
If you are checking container memory usage, could it be the container's context?
alexp702@reddit (OP)
Htop shows the usage all in the “uv run uvicorn” process.
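For anyone reproducing the measurement without htop, per-process resident memory can be checked inside the container like this (assumes a Linux container with procps available):

```shell
# RSS is reported in kilobytes; sort descending and show the top consumers.
ps -eo rss,args --sort=-rss | head -n 5
```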
CatolicQuotes@reddit
Nothing fancy? What exactly is this "nothing fancy"? What packages does it use? Python loads everything into memory.
paperlantern-ai@reddit
This is almost certainly not uvicorn itself - a bare uvicorn app should sit around 30-40MB. The fact that you're seeing 512MB+ regardless of which server you try points to something else in your container setup. Since you mentioned using `uv run` inside the container, that's likely a big contributor - uv should only be in your build stage, not your runtime. Try a multi-stage Dockerfile: build/install deps with uv in the first stage, then copy just the venv into a clean `python:3.13-slim` final stage. You'll probably land around 80-100MB total.
Mr-Cas@reddit
600MB is crazy. That must be a misconfiguration. My full fledged feature rich full stack software projects with large Flask APIs, hosted using waitress, consume about 40-60MB.
jvlomax@reddit
Have you tried `python -m http.server`?
aisingiorix@reddit
Not recommended for production: https://docs.python.org/3/library/http.server.html#http-server-security
jvlomax@reddit
Sure, I wouldn't use it for critical production infrastructure. But I don't know what OP wants so it may be an option
ra_men@reddit
This is similar to the recent Axios compromise fyi
VEMODMASKINEN@reddit
Use Go, it's easier to build and the image will be less than 10mb.
alexp702@reddit (OP)
I need python - various libraries on the end point require it, unless there is some trick to Go <-> python?
WJMazepas@reddit
Nah, it won't be a good integration between them.
If it was an endpoint just checking some stuff and returning a value, then yeah, Go would be fantastic for that.
If you need a lot of Python stuff, then stick with Python.
Go also has a lot of libraries similar to Python's, but porting Python code to Go can be a good amount of work.
thekicked@reddit
You may want to use memray to see which parts of the server are taking up a lot of memory
https://bloomberg.github.io/memray/
AlexMTBDude@reddit
You need to make up your mind about the units that you use: is it millibits (mb), megabits (Mb) or megabytes (MB)?
ReadyAndSalted@reddit
What in the world is a millibit? mb means megabyte.
npisnotp@reddit
I think he meant "mib" which is the abbreviation for "mebibyte".
"Megabyte" is computed as 10⁶, while "mebibyte" is computed as 2²⁰; as you can imagine, the curve for "megabyte" rises slower than for "mebibyte".
This is the reason why when you buy a 500 "gigabyte" disk it only has 465: it really has 500000000000 bytes, not 536870912000. Unfortunately the correct unit terms are not widely used.
cheerycheshire@reddit
No, the person above is just being pedantic - SI prefixes go m for milli (1e-3) and M for mega (1e6).
Also, it would be MiB - again, mebi is Mi, byte is B (small b is bit). mi prefix doesn't exist, and Mib would be mebibit, not mebibyte.
AlexMTBDude@reddit
Nope: https://en.wikipedia.org/wiki/Megabyte
Ever taken a comp sci course?
duskhat@reddit
Have you? In what world is it realistic to ask this question for anything other than Megabytes?
jmelloy@reddit
Absolutely nobody read his message and thought that “mb” meant millibits.
alexp702@reddit (OP)
Sorry all Megabytes. All the others are irrelevant to me ;-)
corey_sheerer@reddit
I'm not sure, but I'd assume 600mb isn't a requirement, as I have deployed a few FastAPI services to Kubernetes with quota limits at 500mb. I haven't tried to lower it, but it seems plausible to get it down to maybe 200mb. It seems like something else may be at play. How did you install FastAPI into your image?
sys_exit_0@reddit
I will
UpsetCryptographer49@reddit
You can do it with Apache HTTP Server and mod_wsgi. This needs about 20mb for Python + 20mb for wsgi + your app. Apache needs about 10mb for the master process + 25mb per worker process.
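For reference, the app side of a mod_wsgi deployment is just a plain WSGI callable; mod_wsgi looks for a module-level `application` by default. A minimal sketch (the `/status` route and payload are made up):

```python
import json


def application(environ, start_response):
    """Minimal WSGI callable served by mod_wsgi (or any WSGI server)."""
    if environ.get("PATH_INFO") == "/status":
        body = json.dumps({"ok": True}).encode()
        status = "200 OK"
        ctype = "application/json"
    else:
        body = b"not found"
        status = "404 Not Found"
        ctype = "text/plain"
    start_response(status, [
        ("Content-Type", ctype),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The same callable also runs unchanged under waitress or gunicorn, so it is easy to benchmark the memory of each server against the same app.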
Particular-Plan1951@reddit
Another option is using FastAPI with careful dependency control.
In some setups the memory footprint is actually pretty small.
But the container base image can also make a big difference.
kaszak696@reddit
Do you really need a full-blown production-grade web server for your use case? The Python standard library has a very basic module for simple http serving. Hard to say whether it's suitable for you without knowing what exactly are your needs.
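That stdlib module can serve a single endpoint in a few lines with no third-party dependencies. A sketch, with the caveat the docs themselves give about production use (route, port, and payload are placeholders):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/status":
            body = json.dumps({"ok": True}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
        else:
            body = b"not found"
            self.send_response(404)
            self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    # Blocks forever; single-threaded, one request at a time.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

As noted elsewhere in the thread, the Python docs explicitly warn this server implements only basic security checks, so it fits internal or low-stakes endpoints rather than internet-facing ones.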
TheMagicTorch@reddit
FastAPI? Flask?