Rewriting 4,000 lines of Python to migrate to Quart (async Flask)

[-]

databot_@reddit

What's the main benefit of migrating? Performance?

[-]

Performance is nice. But the main motivation was to be able to use the most recent Python features. This is especially true for async / await but also what's to come with things like free-threaded Python. My older framework hasn't really been getting updates for years now. I wanted something that's living and active and that would evolve with Python and the ecosystem.

That said, perf is improved, so yay for that.

[-]

DootDootWootWoot@reddit

I really wish I could do this for our APIs, except we're talking tens if not hundreds of thousands of lines that would need to be looked at to support a transition from sync to async code. It's just not tenable for an application of sufficient size, and it's one of the reasons I think you'll see folks struggle in adopting native Python async/await patterns.

We've taken some small steps to improve concurrency in very specific cases. For instance, Flask allows you to rewrite pre/post request handlers in async while the main handler is not. You can also thread specific operations when you need to, if there's a situation where you know you'd like to manually await execution.
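As a rough illustration of the "thread specific operations" idea, here is a minimal sketch using only the standard library. The names `fetch_profile`, `fetch_orders`, and `handle_request` are made up for this example, the `time.sleep` calls stand in for blocking I/O, and the pool size is arbitrary; none of it is taken from the commenter's codebase.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Shared pool, created once at startup (size of 4 is an arbitrary assumption).
_pool = ThreadPoolExecutor(max_workers=4)


def fetch_profile(user_id: int) -> dict:
    time.sleep(0.2)  # stand-in for a blocking DB or HTTP call
    return {"user_id": user_id, "name": "example"}


def fetch_orders(user_id: int) -> list:
    time.sleep(0.2)  # stand-in for a second, independent blocking call
    return [{"order_id": 1, "user_id": user_id}]


def handle_request(user_id: int) -> dict:
    """A sync handler that overlaps two independent blocking calls."""
    profile_future = _pool.submit(fetch_profile, user_id)
    orders_future = _pool.submit(fetch_orders, user_id)
    # .result() blocks until each call finishes: the manual "await".
    return {
        "profile": profile_future.result(),
        "orders": orders_future.result(),
    }
```

The handler itself stays synchronous, so nothing else in the call chain has to change. Only the two calls that are known to be independent run side by side.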

I feel like your reasoning to avoid fast api didn't make a ton of sense to me. In pydantic you can be as constraining or not as you wish, but it's super nice to define these schemas upfront. I can't imagine that you truly don't want validation for the majority of your API inputs. I'm not familiar with your application though.

One question regarding your mongo use. How do you manage your schemas as they change? Do you really not care if you have N schemas in mongo that all need to be supported at some future date? Seems like a maintenance nightmare. Db migrations for sql or no sql alike make the application code so much simpler.

[-]

mikeckennedy@reddit (OP)

Part 2 :)

> I feel like your reasoning to avoid fast api didn't make a ton of sense to me. In pydantic you can be as constraining or not as you wish, but it's super nice to define these schemas upfront.

I know how nice Pydantic and FastAPI are. I have some APIs written in FastAPI and love it. As noted, I also use Pydantic extensively since it's also the class basis for our data access layer via Beanie.

But let me give you an example of why the juice isn't worth the squeeze here.

I have a web form like this:

Name: ________
Email: ________
Age: ________

I can model it with Pydantic as:

    class CreatePerson(BaseModel):
        name: str
        email: str
        age: int

But this will not do for a web form. If the user forgets to fill out one field, I want to say "oops, please fill out age", not HTTP 400 bad request.

Ok, then we can make age: int optional via age: int|None. Cool.

What if they swap email and age and write a string there? Well, I still don't want HTTP 400 bad request. So now age has to be age: int|str|None and of course name and email have to be optional as well.

Now they can submit the form! In my code though I still have to parse the darn age:

    try:
        model.age = int(model.age)
    except ValueError:
        return "form with error message that age isn't valid as a number"

And on and on it goes. So yes, we can make Pydantic bend over to adapt. But at this point, just handle it differently. It's cleaner and simpler. That's what I'm talking about in the challenges of using FastAPI for primarily server side HTML sites.

[-]

riksi@reddit

> If the user forgets to fill out one field, I want to say "oops, please fill out age", not HTTP 400 bad request.

You can use a form that submits to a json endpoint, that returns correct errors, and use the error response to paste errors into the form in html.

> Well, still don't want HTTP 400 bad request.

See above.

[-]

mikeckennedy@reddit (OP)

Can I do this without JavaScript? I think no. And why would I want to write JavaScript when there is perfectly good server-side processing? Adding a whole set of FastAPI APIs just to do validation before submitting a form via HTTP POST to another endpoint doesn't seem that practical.

[-]

DootDootWootWoot@reddit

Yes, modern HTML form validation exists without JS. You can specify valid values and ranges. And the reason to use it is so you don't hit the server unnecessarily. It's a better UX if the simple stuff is kept client side.

[-]

mikeckennedy@reddit (OP)

It's great, and we use it all the time for our forms. However, you cannot assume that every request that hits your site is going through HTML validation. There are bots, other automation, hackers, all sorts of things directly submitting the form without running or respecting HTML client-side validation.

[-]

riksi@reddit

Because you can re-use the same rest-api that you provide to clients.

[-]

mikeckennedy@reddit (OP)

This website does not have a REST API for clients.

[-]

poppy_92@reddit

Just because you can do this doesn't mean you should. Why would you needlessly add another slow network call? FastAPI is already slow enough as is.

[-]

riksi@reddit

You already need to make http requests to validate a normal form.

[-]

DootDootWootWoot@reddit

I don't understand. Shouldn't your frontend also be ensuring you only send valid ages, in addition to your API? Or are you just not doing any client-side validation? If you really wanted to accept any value, then I wouldn't bother putting int on age and would instead call it a string, which is sufficiently loose to respond as you wish.

[-]

mikeckennedy@reddit (OP)

Hey! Let me respond in two subthreads here since the two topics you bring up are pretty unrelated (but both interesting):

> I really wish I could do this for our APIs except we're talking tens if not hundreds of thousands of lines that would need to be looked at to support a transition from sync to async code.

I think there are tools and techniques you can apply. Here are 3 ideas:

1. mypy supports checking for mistakes around using async functions. See https://mypy.readthedocs.io/en/stable/error_code_list2.html#check-that-awaitable-return-value-is-used-unused-awaitable You could employ tools like mypy (and ideally Ruff in the future) to catch what would have otherwise been missed. This would make the "looking over" part less risky.
2. Switch to a framework like Quart or FastAPI that supports both sync and async code. Take sections that would benefit a lot from async and rewrite them in async and leave the rest alone. This would mean a little duplication. For example, if your data layer has a def get_report(), you'd also need an async def get_report_async() or whatever you call its async twin. This is a hassle but also a way to build in lower-level async support that is tested, and you can later delete the sync version if you go full async.
3. Use some code that adapts from sync -> async -> sync. I had this running on our site for a few years (!) because I wanted to use Beanie, which is only async, but the framework, Pyramid, was only sync and I needed to bridge the gap. It was excellent actually for what you could expect of it.
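The second and third ideas above can be sketched with the standard library alone. This is only an illustration under stated assumptions: `get_report` and `get_report_async` follow the naming used above, while `run_sync`, `sync_view`, and the background-loop approach are hypothetical choices for this example and not the adapter code that actually ran on the site.

```python
import asyncio
import threading
import time
from typing import Any, Coroutine


# Second idea: a sync function and its async twin, side by side.
def get_report() -> dict:
    time.sleep(0.1)  # stand-in for a blocking query
    return {"rows": 3}


async def get_report_async() -> dict:
    await asyncio.sleep(0.1)  # stand-in for a non-blocking query
    return {"rows": 3}


# Third idea: call async code from sync code.
# One event loop runs forever on a background daemon thread.
_loop = asyncio.new_event_loop()
threading.Thread(target=_loop.run_forever, daemon=True).start()


def run_sync(coro: Coroutine[Any, Any, Any]) -> Any:
    """Submit a coroutine to the background loop and block for its result."""
    future = asyncio.run_coroutine_threadsafe(coro, _loop)
    return future.result()


def sync_view() -> dict:
    # A sync view function calling an async-only data layer.
    return run_sync(get_report_async())
```

Two caveats with this sketch: `run_sync` must not be called from code already running on the background loop, since it would block that loop while waiting on itself, and a bare `get_report_async()` statement without `await` or `run_sync` creates a coroutine that never runs, which is the kind of mistake the mypy check linked above is meant to flag.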

[-]

riksi@reddit

You could use gevent which should be easier to integrate on a huge project.

[-]

DootDootWootWoot@reddit

Even gevent you need to ensure all your code is coroutine friendly. But yeah there are several options. It's been a while since I've really dug into this.

[-]

jojurajan@reddit

Great writeup. Thanks for breaking it down into smaller steps of conversions. As you mentioned, it is the small things that take most of the initial development time while switching: reading from cookies, setting up users in the request cycle, and sending back proper responses and error messages.

In the end, you mentioned there were unit tests too. Was any rewrite required with the 2nd change, i.e., from sync to async?