Best devops practices to introduce for a data engineering team with no dev environment?
Posted by andalooooooongjacket@reddit | ExperiencedDevs | View on Reddit | 10 comments
I just joined a team with hundreds of Airflow DAGs in production and no development environment other than localhost. All of the DAGs are in prod with no config variables, and local Airflow deployments test against production using our AWS credentials. When we're spinning up dev and QA Airflow environments, what best practices should I introduce ASAP to reduce the risk and start migrating over to our new way of working with dev?
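Since the DAGs currently have no config variables, one early win is a single per-environment config map that DAGs read instead of hardcoding prod resources. A minimal sketch, assuming the environment is selected via an `ENVIRONMENT` env var (the bucket names and connection IDs below are hypothetical placeholders); note it defaults to dev so nothing accidentally touches production:

```python
import os

# Hypothetical per-environment settings; in practice these might live in
# Airflow Variables or a deployed config file rather than inline.
ENV_CONFIG = {
    "dev":  {"s3_bucket": "my-data-dev",  "aws_conn_id": "aws_dev"},
    "qa":   {"s3_bucket": "my-data-qa",   "aws_conn_id": "aws_qa"},
    "prod": {"s3_bucket": "my-data-prod", "aws_conn_id": "aws_prod"},
}

def get_config(env=None):
    """Return settings for the current environment.

    Falls back to the ENVIRONMENT variable, then to "dev" -- the safe
    default is deliberately *not* prod.
    """
    env = env or os.environ.get("ENVIRONMENT", "dev")
    if env not in ENV_CONFIG:
        raise ValueError(f"Unknown environment: {env}")
    return ENV_CONFIG[env]
```

DAGs then call `get_config()` once at parse time and reference `config["s3_bucket"]` etc., so the same DAG code runs unchanged in dev, QA, and prod.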
Zulban@reddit
Why do you want to do this? Do you have support from management?
andalooooooongjacket@reddit (OP)
I need to do this to test some critical pipelines outside of prod before migration.
My manager and management more generally want to improve devops practices; they are just pretty far from that goal given where their infra is at the moment, and they'll need my support during implementation. I'm asking here for more specific Airflow/devops advice I can use, beyond what I've been thinking about so far.
DualityEnigma@reddit
I'd recommend looking at Terraform. You can take the existing stack and describe it in Terraform syntax. Once it's in Terraform you can configure dev/staging/prod environments that are easy to stand up (and tear down while testing).
Once you have this new, clean environment (separate from current prod) you can plan to migrate current prod to the new environment and deprecate the old one (or import it into Terraform).
When improving devops you often need to start clean and plan a migration of current prod. AWS makes this pretty easy.
WanderingStoner@reddit
good answer. it should also use modules so that the code is not duplicated for each env
jcpj1@reddit
Have you considered SAM sync (`sam sync`) from AWS? Since your team is already testing on AWS, it may be a nice incremental improvement where you can run tests on isolated, ephemeral cloud envs
valence_engineer@reddit
Whatever you do talk to the people running and making these DAGs. Loop them in. No, loop them in more than you think you need to. Have sessions, hand hold teams, embed in teams to migrate, etc, etc.
I have seen tons of these data migration projects die a fiery death when the data engineers get blocked, business goals fall behind, the new processes are blamed, and the heads of those trying to improve things roll.
Cell-i-Zenit@reddit
i would try to move as much as possible into terraform. Make the terraform files environment independent, so you can easily spawn a dev + prod environment from the same files
titpetric@reddit
Localhost is a dev env.
nfigo@reddit
If they flub a deployment in production, what happens? Corrupted data? Can they recover quickly? Does anyone notice?
Lay out all the things that can go wrong. How much data do you lose? How long is the system unavailable? Start by addressing the biggest risks.
AshamedDuck4329@reddit
it's crucial to set up a version control system and implement ci/cd pipelines. also, consider using infrastructure as code. containerization with docker can help too.