Handling multiple Alembic migrations with a full team of developers?

Posted by Big-Information3242@reddit | Python | View on Reddit | 24 comments

This has been frustration at its best. We have a team of 10 developers all working on the same codebase. When one person updates or adds a column to their local database we get a revision. However if multiple do so we have multiple revisions so which one is the HEAD? this is costly, time consuming and a bunch of mess.

How would you or are you handling this type of use case? I get it Alembic works good if its a sole developer handing it off to another developer and its a one off, but with multiple devs all checking in code this is a headache.

Back in the days of SQl we had normal SQL scripts with table updates that would just be appended to. No need for Heads or revisions. It just worked

[-]

dusktreader@reddit

I made this ages ago to deal with this exact issue. Basically it topologically sorts your migrations and produces a flat graph:

https://github.com/dusktreader/ambix

It might still work, or provide inspiration. I haven't touched it in a few years.

[-]

alexpenev@reddit

Your CI/CD should do an alembic heads check as well as all it's usual unit tests. Any PR that does not correctly chain will then fail CI.

[-]

GraphicH@reddit

Yeah, in CICD we actually run all alembic migrations from 0 -> latest against a PSQL instance, and we also run the alembic autogenerate against it to make sure its "clean" for autogeneration or the build fails. This is a process issue related to bad build practices.

[-]

rogersaintjames@reddit

Also write a test that runs up and down the migration ladder, you don't want to to find out that your down is broken when you need it.

[-]

Darwinmate@reddit

I don't understand the autogen step. What's the purpose?

[-]

GraphicH@reddit

We had situations where some people would generate the alembic for some model changes, then make more changes in the course of they're work, and forget to regenerate the update. Our test fixtures for general functionality don't run all the alembics from zero, because that would be slow, those we populate with the SQLAlchemy models as-is with the functionality SQLAlchemy provides for that, so our tests with their second changes would be fine but the code would be broke in qa/prod if it got deployed. There are also some devs (who for awhile, until I beat it out of them) basically would not use the auto-generate, write the upgrades by hand, and miss shit. So we just put the test in for the "autogenerate" to basically make sure that after all alembics were applied to an empty DB, there were still no "changes" that alembic thought need to be made based on the model definitions. This has actually saved me a few times from a broken deploy, so it was worth figuring out to do in my mind.

[-]

Darwinmate@reddit

Ah very clever. Thank you for taking the time to write this out.

Sounds like you're a team lead of some kind. Wish I had you as a boss.

[-]

GraphicH@reddit

Ha! Thanks, yeah, I've been kind of lead of 3 teams for a few years now. I find that giving people the "whys" of something I want us to start doing is usually a much easier way of getting people on board with it.

[-]

alexpenev@reddit

Yup we do this and dont have issues. CI catches it, and it takes maybe five seconds to fix the divergence and push an edit. Personally I prefer to edit the alembic hex string to fix the sequence order, as it avoids making a blank merge file, but both work. After a few hundred migrations all those empty files get annoying, and you can just squash everything into a one migration and "start over" from one file.

[-]

s-to-the-am@reddit

This is such a great idea, I might implement this myself.

[-]

GraphicH@reddit

It's not bullet proof, if you have "data migrations" that move data between tables and stuff; its harder for the tests to actually do anything in those cases, but we at least make sure all the schema changes apply cleanly starting from an empty database. This is so developers can also just blow away their local env completely, start with an empty psql server, and just have the service start up. No need to fetch db snapshots, or ship them around, to we do have support for getting them for testing purposes.

[-]

IrrerPolterer@reddit

This. Add a check to CI and when i tfails simply update the previous revision pointer(s)

[-]

Challseus@reddit

I would first start with this general rule:

If Person A is about to make a schema change, they make sure the team knows in advance so that anyone else in the same boat can hold off on pushing their changes until Person A pushes theirs, gets it merged, and everyone rebases off of main/master to get the change.

Note: I have not had much success with the merge command. I basically had to force everyone to communicate to make things work

[-]

Chroiche@reddit

This isn't really practical in bigger teams.

[-]

Challseus@reddit

That's fair.

[-]

MichaelEvo@reddit

This is what I’m doing, but my team is only 3 people right now.

Ultimately, if I get enough people, I’ll probably migrate us to using GraphQL or something massive overkill. Something that’s closer to no schema. Or we’ll stop using SQLAlchemy.

[-]

shaylh@reddit

We keep a history file with migrations order, something like:

d (head) c > d b > c a > b

This file is automatically generated by the same command that creates a new migration file. Also there's a test that verifies that migrations match the current code.

Then, if two devs create a new migration from the same head, the second one to try and merge to main will have a conflict, forcing them to rebase and properly order the new migration.

Source: company with ~200 devs working on ~30 DBs, each with their own history and migrations.

[-]

benjaminbrownie@reddit

Run migrations up & down as part of your CI, this will make CI fail if you have multiple people changing the DB.

Whoever merges first gets to be the next head, and those behind them need to fix their down down_revision.

Is your app fairly new? the DB probably will change less over time

[-]

GraphicH@reddit

I do generally have gripes with how alembic approached upgrades; mainly: they tried to re-invent a code revision system. I have one already, it's called git. I've always preferred flyway's approach: just sequential scripts that run in an deterministic order based on their names and a simple version string. That said alembic's autogeneration for SQLAlechmy model changes is just too nice a feature to give up and my team has put build tooling in place so that the annoyances around alembic that I have are basically non-issues.

[-]

alexpenev@reddit

If two devs in parallel both name their migrations "002_foo.sql", doesn't one of them need to rename theirs? I guess it's the same problem as alembic: being explicit about which is the upgrade, which is the downgrade, and being able to freely move between them.

[-]

GraphicH@reddit

Yeah but the difference is git's not going to let me merge that, I can happily merge the alembic scenario and it breaks when the upgrades actually run. Its just not great, that's why we actually have tooling to test the alembics in CICD, though I'd have the same for Flyway tbf. I've used both in production scenarios, Flyway had less nonsense on top of it.

[-]

o0ower0o@reddit

I haven't checked alambic built in features for merges, but in our case it is enough to have a file populated on precommit with the latest version/sql change

This means that you won't be able to merge your pr as there will be a conflict on that file. Still annoying that you have to do manually but we don't do that many changes to the tables that we have to look for a smarter solution

[-]

ElectricSpice@reddit

Is this not what merge is for? https://alembic.sqlalchemy.org/en/latest/branches.html

[-]

rghosthero@reddit

The real question would be why do you have many migrations to begin with? It seems like a process problem not a tool problem TBH. If multiple people are doing some changes at the same time and it's a frequent enough issue to post about it then there's something wrong going on.

I don't know how to help you with this problem technically, but I would suggest trying to solve the root cause ( many changes with many developers), not just trying to solve the symptom.