If you can describe the setup in enough detail in documentation to reproduce it, you can declare the setup using IaC tooling.
Yes documentation is necessary whether you use IaC or manual processes, but with IaC it’s way easier (cheaper) to maintain and keep up to date.
Proper IaC is its own documentation (up to a point).
And if you put some effort into it, the detailed documentation of the current and up to date infrastructure setup can easily be generated from the IaC code.
Add to that GitOps way of working with infrastructure and you get full history of configuration with full fidelity audit trail of changes over time.
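The "documentation generated from the IaC code" idea can be sketched concretely. This is a minimal, hypothetical example that walks a Terraform JSON-syntax config (Terraform natively reads `*.tf.json` files) and renders a markdown inventory; the resource names and attributes are invented for illustration:

```python
import json

def inventory_markdown(tf_json: str) -> str:
    """Render a simple markdown inventory table from Terraform JSON-syntax config."""
    doc = json.loads(tf_json)
    lines = ["| Type | Name | Arguments |", "|---|---|---|"]
    # Terraform JSON syntax nests resources as resource -> type -> name -> args.
    for rtype, instances in sorted(doc.get("resource", {}).items()):
        for name, args in sorted(instances.items()):
            lines.append(f"| {rtype} | {name} | {', '.join(sorted(args))} |")
    return "\n".join(lines)

# Hypothetical config: one S3 bucket resource.
sample = json.dumps({
    "resource": {
        "aws_s3_bucket": {"logs": {"bucket": "my-logs", "tags": {}}}
    }
})
print(inventory_markdown(sample))
```

Run from a scheduled job or a CI step, something like this keeps the "what is actually deployed" page from rotting the way a hand-edited wiki does.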
The main problem with tf is that it attempts to be idempotent while existing only declaratively, and with no mechanism to reconcile partial state. And because of that it must also be procedural without being imperative! You get the worst bits of every paradigm.
If you want to recreate an environment where you've created a cyclical dependency over time (imho this should be an error), you have to replay old state to fix it. Or, rewrite it on the fly. It happened to me on a brownfield project where rancher shit the bed and deleted our node pools, and it took 4 engineers 20 hours to fix. I should know, I drove that shitstorm until 4am on a Saturday. Terraform state got fucked and started acting like HAL: "I'm sorry devs, I'm afraid I can't do that."
In practice it's not hard to avoid that pattern, if you're well aware of it and structure the project like that from the start.
Anyway, pulumi is probably better since it allows you to operate it imperatively. Crossplane is... Interesting. I mean k8s at least has a good partial state + reconciliation loop, so, that part of it makes sense - but you've still got the rest of the k8s baggage holding you back.
I'm writing a manifesto about exactly this; declarative configuration. It really gets me heated.
I think it's also really a problem of cloud provider APIs being imperative. Kubernetes really showed the world how to structure a relatively sane infrastructure API.
I think they mean that its API is very desired-state oriented, and everything works through objects as APIs, which is mind-blowing once you see the power there.
But it's no walk in the park until you get comfortable with the ecosystem.
My favorite thing about Terraform is how it occasionally decides that my prod service bus instance should be destroyed because it failed to read the resource somehow.
The biggest issue with it is the tfstate file which is absolute shit design and has no good reason for existing. The current state exists on the provider. The future state exists in code. There is absolutely no good reason to have an intermediary map file that gets corrupted every time a fly farts.
Terraform bills itself as a write-once, deploy-everywhere system, as though you can build resources on Azure and then move them all to AWS by flipping a switch. Bullshit. While the different cloud providers may offer similar tooling, they're completely different architectures with resource definitions that simply don't map to each other at all.
Further, the monorepo pattern recommended by hashicorp is asinine. I don’t want separate code files for each environment. I want them all built exactly the same (with the minor exception of things like instance counts) and I want them all built from the same piece of code. I absolutely DO NOT want to promote infrastructure by copying files from a “dev” folder to a “test” folder (which is our process for creating new topics/subscriptions) where they’ll invariably become out of sync.
Terraform is fine if you want to create something simple like a function app with a storage account and keyvault, but for shared resources at the enterprise level, it’s absolute garbage. I have never dealt with a terraform project that wasn’t a nightmare in some way.
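The "built exactly the same, with minor exceptions like instance counts" approach can be sketched in a few lines: one base definition plus tiny per-environment overrides, instead of whole copied files per folder. All names and values below are invented for illustration:

```python
# Hypothetical sketch: one base definition, small per-environment overrides,
# instead of copying whole files between dev/ and test/ folders.
BASE = {
    "instance_count": 1,
    "sku": "Standard",
    "diagnostics_enabled": True,
}

OVERRIDES = {
    "dev":  {},                                      # dev runs the base as-is
    "test": {"instance_count": 2},
    "prod": {"instance_count": 6, "sku": "Premium"},
}

def render(env: str) -> dict:
    """Merge the base config with one environment's overrides."""
    if env not in OVERRIDES:
        raise KeyError(f"unknown environment: {env}")
    return {**BASE, **OVERRIDES[env]}  # base first; env-specific keys win

print(render("prod"))
```

Because every environment is rendered from the same base, a change lands everywhere by construction; the overrides file is the only place environments are allowed to diverge, so drift is visible at a glance.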
> "I absolutely DO NOT want to promote infrastructure by copying files from a “dev” folder to a “test” folder (which is our process for creating new topics/subscriptions) where they’ll invariably become out of sync."
I think this is why terragrunt is popular, but it's still just a hack on top of everything else with poor editor support to boot.
base terraform has solution for that in form of workspaces, but it's annoying to use. other solutions include separating config files, but it's also a pain. terragrunt technically works on second aspect with separation of tfstates of first.
Could you really even call Terraform "code"? It kinda feels at best like a serialization format where you have to memorize every detail about all the objects and write the serialization file by hand. Admittedly I don't have a huge amount of experience with it, and I kind of want to keep it that way.
While I was using it I wanted exactly what you want, a declarative format I can iteratively test, and verify my syntax without having to try to stand up infrastructure in the process. Just give me a library of Python objects that I can build up a structure with, validate offline that my structure at least makes some sort of sense and that I can just initiate standing up infrastructure from once I'm comfortable with it all.
Since I'm currently unemployed, I'm spending my copious spare time trying to build a bunch of tools that I would want to use and that I can release as open source. Terraform is pretty far down that list right now, but it is something that I eye every once in a while and wonder if I couldn't come up with a better approach. I have (surprisingly) a lot of Lisp in my background and I think a Lisp-ish solution might be what's called for here.
Just my irritable 2 cents -- I'm not volunteering for anything this year heh heh.
> "While I was using it I wanted exactly what you want, a declarative format I can iteratively test, and verify my syntax without having to try to stand up infrastructure in the process."
Of course you can do that in terraform!
"The terraform validate command validates the configuration files in a directory. It does not validate remote services, such as remote state or provider APIs."
So, Infrastructure as Code really means "as Encoding", whether it's code or data (**insert Lisp joke here**). This is in contradistinction to doing things by hand.
Now, if you wanted that Python library, there's no reason you can't write it yourself on top of Terraform. Write a class for every syntactic concept, using object composition just as the syntax does. You'll treat that as a serialization layer (like a responsible engineer!) and write your preferred abstraction on top.
Heck, I'm getting the willies just thinking about it. PM me (but not your willie!)
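The "class for every syntactic concept, treated as a serialization layer" suggestion can be sketched. This is a toy, not a real library: the resource names and arguments are made up, but the output format is genuine, since Terraform natively reads this JSON variant of HCL from `*.tf.json` files:

```python
import json
from dataclasses import dataclass, field

@dataclass
class Resource:
    # One Terraform resource block, e.g. resource "aws_s3_bucket" "logs" { ... }
    type: str
    name: str
    args: dict = field(default_factory=dict)

@dataclass
class Config:
    resources: list = field(default_factory=list)

    def add(self, resource: Resource) -> "Config":
        self.resources.append(resource)
        return self

    def validate(self) -> None:
        # Offline sanity checks only -- no provider APIs are touched.
        seen = set()
        for r in self.resources:
            if not r.type or not r.name:
                raise ValueError("resource type and name must be non-empty")
            addr = f"{r.type}.{r.name}"
            if addr in seen:
                raise ValueError(f"duplicate resource address: {addr}")
            seen.add(addr)

    def to_tf_json(self) -> str:
        # Terraform JSON syntax nests resources as resource -> type -> name -> args.
        doc: dict = {"resource": {}}
        for r in self.resources:
            doc["resource"].setdefault(r.type, {})[r.name] = r.args
        return json.dumps(doc, indent=2)

cfg = Config().add(Resource("aws_s3_bucket", "logs", {"bucket": "my-log-bucket"}))
cfg.validate()
print(cfg.to_tf_json())
```

Write the result to `main.tf.json` and Terraform treats it like any hand-written config; the Python layer is purely a builder-and-validator on top.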
Funnily, I think my approach would be to write the objects out in C++ and build a Terraform serializer for Cereal. It's easy to build a Python API on top of that using nanobind, and have the C++ code use a dependency graph to ensure all the required objects get defined for the infrastructure that needs to get set up. I'm kinda building a dependency graph for a requirements manager I'm working on in my copious spare time. They're not particularly hard to build, but setting up all the rules for how objects interact is kind of time consuming. And for every one you create, you always realize you need two more.
CDK kinda feels like what you want. It's nice to be able to run pdb and step through the code. The downside is that it's just creating CloudFormation, and it can get itself into a partial rollout state where the only solution I ever found was to delete the state. Take that with a grain of salt, I haven't used it in a few years.
Yeah that does look like what I was wanting when I worked with Terraform. I'll have to poke at that a bit when I have a moment. Most of what I want to do with AWS is pretty simple anyway. For things much more complex than that, most projects will bring in a real devops guy anyway.
If you can write it down, it’s code. Code just means information that has been encoded. We shouldn’t be confused by code also being used as a short way to refer to programming language code, there is also object code, byte code, encoding, decoding, codecs, etc.
To me a useful way to read IaC is “infrastructure as source code”. So it’s not any old encoding, but readable code that can be managed in a source code control system like git.
> "Anyway, pulumi is probably better since it allows you to operate it imperatively."
Mark against pulumi, they keep removing APIs in breaking major changes and finding documentation or reasoning about it is impossible. Only really worth using if you're paying for their cloud or TFE.
I realize I was being unclear. I was trying to migrate away from terraform, so I had a load of existing state I needed to integrate, but Pulumi removed their APIs for reading tfstate from S3, only supporting their own cloud or TFE. The "solution" was using the aws sdk and reading the JSON blobs myself.
Ah gotcha. When we encountered this need it was also a PITA. We addressed it by importing the existing resources into the new Pulumi code by ID (AWS in our case) through ResourceOptions, after extracting those IDs from the TF state (in what sounds like a similar fashion to you).
Fiddly, and this means technically you have a window where both TF and Pulumi act on the same actual resources, so you have to be able to freeze the TF (at least in parts) while doing the migration.
I use Pulumi with a GCP bucket backed state. Haven't had issues. Their full cloud platform is useful if you want to take advantage of some of their tooling they've built around it (mainly around RBAC, and/or secrets management). But if you just want to write code that can consistently deploy a stack of resources in a cloud, you can totally get by with DIY-managed state.
I'm somewhat compelled by smarter config languages like KCL and Pkl. In a similar space is Cue/Dhall/Nickel, but, for various reasons those don't quite appeal to me.
I've heard a lot of praise about Cue, tried it a bit, but didn't love it. KCL is what really shines imo, and if you look at Pkl I've filed a number of the early issues. What KCL is missing is a specialized registry that isn't artifacthub + github repos ; both of which aren't great for discoverability. Something like crates.io / npm.
I'm starting to believe that if you want to do IaC right, you need to apply it to your dev machines too. You want to write IaC as early as possible in your dev cycle.
Kinda like how you don't want to write unit tests AFTER you've written the implementation, but BEFORE. Right?
Are there docker images which host entire full stack web based dev environments? That's what I want :)
Terraform isn't "code". A JSON file also isn't code. Just because something is kept in git or is consumed by CI/CD doesn't mean it's code, or even a good idea.
How do I know this? It's in the name: HashiCorp CONFIGURATION Language. TF is fine for certain things. The problems arise when people try to shove too much into its "programming" model, which had basic things like for loops bolted on like a fifth wheel on a car.
People also try to do strange things with TF. Like storing or executing their company's business logic. Or creating layers of abstraction over regular Terraform modules that provide 20% of the features of the underlying module.
Then there is TF-CDK, which is real code. But at that point, you might as well use the same Go libraries that TF uses underneath?
But the main issue with TF is that it deviates from the "operator" API pattern that Kubernetes uses, because of its state file. You end up with three potential sources of truth: the cloud provider, the state file, and your TF config in git. With k8s, something constantly monitors your deployments, pods, replicas and other k8s objects; the source of truth is what Kubernetes sees and monitors. Extend that to buckets, DBs and any other cloud service with operators and you don't need TF.
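The operator-style model described above can be reduced to a toy reconciliation loop: diff the desired state against what is actually observed on the provider, with no intermediate state file in between. This is an illustrative sketch, not how any particular controller is implemented; resource names are invented:

```python
def reconcile(desired: dict, observed: dict) -> list:
    """Compare desired vs observed resources (name -> spec) and emit actions.

    Operator-style model: only two sources of truth. The controller recomputes
    this diff on every pass instead of trusting a cached state file.
    """
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name, spec))
        elif observed[name] != spec:
            actions.append(("update", name, spec))
    for name, spec in observed.items():
        if name not in desired:
            actions.append(("delete", name, spec))
    return actions

desired = {"bucket-a": {"size": 10}, "bucket-b": {"size": 5}}
observed = {"bucket-a": {"size": 20}, "bucket-c": {"size": 1}}
print(reconcile(desired, observed))
```

Because the diff is recomputed against live observations every time, drift (someone clicking in the UI, a failed partial apply) is corrected on the next pass rather than silently accumulating in a third file.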
Infrastructure as code is not the same as infrastructure in code. It's about treating the infrastructure the same as your code: source control, deployment pipelines, auditability and rollback. It could be a .ini file, but if it's committed to git, and only applied as part of a pipeline, then it's IaC, IMO.
Unpopular opinion: I think as your organization grows, this is going to tend towards Turing-completeness, and it's better to bite the bullet early and make sure that gets sandboxed, instead of letting it grow organically.
Because the organic solution is going to be you start with static stuff like YAML (or even ini!) and then start having scripts generate a tiny piece of one, and then someone starts using a templating language that was built for HTML instead of config, so now you live with the worst of all worlds: The template stuff has made the config harder to read and yet not much easier to script, yet the scripts have escaped containment and you now can't evaluate a template without those scripts hitting a bunch of network endpoints.
I know it's an unpopular opinion because I haven't been able to sell a single other person on an approach like Jsonnet. We have somehow landed on "No one ever got fired for using YAML"
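The "sandboxed Turing-completeness" alternative to string-templating YAML can be sketched: generate the config from a small, contained program and serialize it, so the logic lives in a real language while the output stays purely declarative. A hypothetical example (service names, ports and the schema are all invented; a real setup might emit YAML or Kubernetes manifests instead of JSON):

```python
import json

def service(name: str, replicas: int, port: int) -> dict:
    """Build one service entry; plain functions replace template macros."""
    return {
        "name": name,
        "replicas": replicas,
        "ports": [{"containerPort": port}],
    }

# Looping in a real language instead of a string-template for-loop.
config = {
    "services": [service(f"web-{i}", replicas=2, port=8000 + i) for i in range(3)]
}
print(json.dumps(config, indent=2))
```

This is roughly what Jsonnet gives you natively: the generator can be reviewed and unit-tested like any code, it has no network access unless you add some, and the emitted artifact is dumb data that any tool can consume.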
Still, there are some people who try to write code in SQL, and recently even in Terraform or CSS. My office is making us write tons of .tf files with all these fancy modules, and it's painful AF.
terraform examples would be better as opentofu examples - platform configuration DSLs are a godsend for complex infrastructure environments.
re k8s operators vs tf providers … lol if you aren’t using iac to define your k8s deployments. so k8s has HTTP API - are you making curl requests? (real coders write assembly)
Can't open that page.
Doesn't really matter if it is tf, cdk, pulumi or ansible or cfn. Click ops is the mark of the incompetent.
Have you tested your disaster recovery? Click ops would be a god damn nightmare in that case.
Have you refactored a running infrastructure?
I feel people complaining about Terraform state problems could benefit from running the errors through an AI; it can help you quickly.
Looking at people struggling with Terraform, I feel just like in the early days of Git almost two decades ago, when the concepts were new and people had not learned them yet. These can be taught, and the benefits are incredible.
IaC also demands knowledge of CI systems and excellent version control skills; these go hand in hand.
Isn't this what .NET Aspire sets out to solve? It allows applications to include the infrastructure they need to function alongside the application code / management interface. Wouldn't it make more sense for each language to take the same approach rather than tying everything down to a single vendor, aka Terraform?
IaC is great, but maintaining linked IaC-stacks can be a pain if you have hard dependencies between them. It's been a while, but last time I did AWS stuff I made sure to avoid such dependency problems unless it simply wasn't possible.
It's all about the IaC tooling you use, and how you refer to your dependencies. Using raw CloudFormation is going to drive you up a wall. But that's not IaC's problem; it's because the tool was just not written for people. Even when management demanded that we used it, we ended up spending money on tooling to provide real, reasonable pre-execution validators to make things manageable.
At the very minimum, something like terragrunt ends up being more reliable and actually saves time to run hundreds of different little modules that can have reasonable references to each other
I've mainly used AWS CDK, it's been fine and it just transpiles the typescript stacks into CloudFormation JSON. Also did some simple stuff with CloudFormation alone, which wasn't too bad but as you said it obviously isn't that good for making anything complex manually.
XandrousMoriarty@reddit
Yes, Puppet and Ansible have been godsends at my job.
shockputs@reddit
Are you using puppet because you didn't want to pay for ansible's built-in tool for managing multiple server configuration replication?
XandrousMoriarty@reddit
Nope. We had a lot of customization work done before we made the choice to deploy Ansible. We do have a RHEL Satellite subscription. Currently managing about 17,740 servers - physical and VMs
Spike_Ra@reddit
Did you take any classes for Puppet? I use it a little at work and I feel like I could be better.
XandrousMoriarty@reddit
I was/am a programmer/dev ops person for a long time, so part of the learning curve regarding Puppet wasn't as harsh since I understand the hows and whys. Plus, having Ruby as the basis for creating new facts, coupled with my knowing Ruby, made it even better.
I have been maintaining the infrastructure where I work with Puppet for about fifteen months now. I picked up a book off of Amazon and started with that. I am a visual learner, so I went with what worked best for me.
It wasn't all fun and games though. I definitely made some mistakes along the way. Also my environment and code base when I inherited it was made up of Puppet 2=>Puppet 7 machines, so there were some interesting uses of the inline functionality to compensate for a lack of features along the way. Only recently have we migrated the majority of the servers to Puppet 8, so a lot of the older cruft was able to be cleaned up. In fact these code refactors/rewrites probably helped me the most in learning some of the more in-depth concepts.
Hope this answers your question. Let me know if you want to know more, or if I can clarify something.
ignat980@reddit
What is Puppet and Ansible in relation to IaC? Sorry, I haven't used them and I'd rather hear from a human directly who had experience with them
XandrousMoriarty@reddit
Well both are tools designed to help perform software installations and configuration management. They both contain configuration files that are structured like code (Puppet has manifest files, Ansible has playbooks) that describe the overall result of how a system should "look". A major difference between the two is that Puppet requires a software agent to be running on the host to perform tasks, whereas Ansible uses either SSH or Windows Remote Management in conjunction with a Python installation to perform tasks.
For additional high-level info, I suggest starting with the respective Wikipedia pages for both tools, then read the links that are referenced in each of the articles.
DeanTimeHoodie@reddit
As a dev working for Puppet, this warms my heart. Now, I’m kinda tempted to advertise my team’s product lol
BigHandLittleSlap@reddit
"Yes, it'll take a developer a month to develop a template for that VM that you asked for. That's normal."
"Oh, you have a stateful server? Sss... that's not so easy to change after the fact with IaC! Can't you just blow away your database server? What do you mean transactions?"
"Oops... turns out the cloud provider doesn't properly handle scale-set sizes in an idempotent way. We redeployed and now everything scaled back down to the minimum/default! I'm sure that's fine."
"Shit... the Terraform statefile got corrupted again and now we can't make any changes anywhere."
"We need to spend the next six months reinventing the cloud's RBAC system... in Git. Badly. Why? Otherwise everyone is God and can wipe out our whole enterprise with a Git push!"
Etc...
There are real downsides to IaC, and this article mentioned none of them.
Loves_Poetry@reddit
I've used IaC for a lot of projects and I've experienced a lot of these downsides as well. Too often I find that IaC advocates completely dismiss the negatives, as well as the learning curve that comes with it
My main problem with IaC is that it's slow AF. It requires you to make a code change first, then commit that to source control, then run a CI tool to deploy it to the cloud. After 10 minutes you find out that you missed a property and now you have to repeat that entire cycle. This then happens another 4-5 times until it works. Alternatively, I could create a resource through the UI and have it working in a few minutes
Cruuncher@reddit
You need an environment you can push to frequently, without bottlenecks, to test.
thoeoe@reddit
My team owns a cli tool people in the company can use to deploy cfn to lower envs
serpix@reddit
May god have mercy on the soul of a custom CLI builder when there are existing solutions like cdk.
ignat980@reddit
cdk is AWS only. What if your infra is on OVHCloud?
gyroda@reddit
Or one you can manually tweak and then export the IaC for.
_mkd_@reddit
Why not throw in a pony as well?
Ok-Willow-2810@reddit
I hear what you’re saying. The only problem I have with creating it in the UI is that what if it’s three months later and you don’t remember the exact steps you took to create it, and you need to create a new version, or someone else accidentally deleted it?
I feel like there’s a nice stability to infrastructure as code. It serves as documentation of the system as well that anyone can read (as long as the code is readable enough). In my experience when coordinating across multiple people in a team, it can be tough if everyone’s performing click ops. It can feel like building on top of sand, instead of a solid foundation.
Loves_Poetry@reddit
I work with Azure and they have a function to create an IaC template from an existing resource. This lets you create a working version through the UI and then have it in code for future modifications. I've been using that method to keep my IaC code in line with my cloud environment
Worth_Trust_3825@reddit
You don't need a CI tool and source control to run IaC workflows. You can run them just fine from your local machine. I wouldn't want T-Mobile's or Comcast's production credentials on my local machine, though.
bongoscout@reddit
It is usually pretty easy to create a resource using the UI and import it into your TF state.
serpix@reddit
That does not grant you powers to recreate or modify the resource.
hibikir_40k@reddit
You don't need to be that crazy.
I work in a very large system you probably use. My changes to low environments are done directly by running the IaC tools locally, and on projects more than small enough that an attempt is a 2 minute process for most things. Missing properties blow up very early, because the tooling is actually decent (as opposed to, say cloud formation). After my changes work in a low environment, and I tested them there, I push the changes up to prod. It's not significantly slower than doing it by hand, especially when you would need to make the very same change across 30+ datacenters by hand in the UI, and then hope I didn't mistype something in a certain region somewhere.
DaRadioman@reddit
Exactly, anyone advocating for click ops must really have a tiny fleet/presence. Sure if you have one instance for all it might be ok (might!)
I can't imagine the inconsistencies across our fleet if we tried that crap. You aren't hand setting something across 100 stamps.
And how are you ensuring test and prod are the same? Hopes and Dreams?
Luolong@reddit
All that is true, but then again, IaC is way better than the alternative that is “oh, John is the only one who knows how this infra is set up, because he did it once. Over the past seven years. Oh, and there is the cluster that no one dares to breathe upon, because Matt left the company a year ago and we are screwed if anyone needs to ssh into that one, because nobody has the admin key.”
Oh, and what configuration are we running on? There’s a wiki that has not been updated for two years since Jessica quit. Some of the stuff might even be up to date.
non3type@reddit
That pretty much exists with IaC as well, it’s just easier for devs to grok.
Gaboik@reddit
Do devs use Grok?
non3type@reddit
You’re making me feel really old if that’s not a joke.
Gaboik@reddit
I mean... for real, I don't know of a single dev that uses Grok to vibe code. I thought everyone used either ChatGPT, Gemini or Claude, but this is only anecdotal. And now that I think of it, I haven't tried Grok myself for coding, so maybe it's good, idk
non3type@reddit
The word grok predates Twitter's usage of it.
Gaboik@reddit
Wtf for real ? My bad lmao, not my first language 🤣
You have to admit tho, it does not look like an actual word does it ?
non3type@reddit
It’s a made up word from a science fiction book so you’re not wrong 😆.
arcanemachined@reddit
All words are made-up. :(
defnotthrown@reddit
Pre-dates Twitter itself or the world wide web for that matter.
grauenwolf@reddit
To summarize the below thread:
Note the capitalization of the 'G'.
dijalektikator@reddit
My company uses IaC and we still have a "John" who's the only one that knows how all that crap works. I'd have better luck figuring the deployment out as a dev if it were an old school deployment with plain old Dockerfiles and bash scripts.
Chii@reddit
so just ignorant devs? Coz why can't the requirement be that they know terraform (or whatever flavour of the month tool)?
dijalektikator@reddit
Exactly because it's "flavor of the month". I want to focus on doing work on the actual project not wrangling some clunky tools that are supposed to help me actually deploy it but always seem to just do the opposite.
erinaceus_@reddit
The answer to that question probably depends on whether it's possible to make spaghetti code in terraform. If so, then it wouldn't matter if the other devs know terraform, it would still be a titanic effort to understand and reliably modify the code.
Luolong@reddit
Well, at least there is code that someone can take a look at and curse their way to high heaven before coming to grips with what it all does.
orygin@reddit
Yep, still better than guessing what/how it has been deployed, or going through the employee's shell history like a detective on a murder trail...
PurpleYoshiEgg@reddit
The solution to that isn't necessarily IaC. It's documentation, and it should exist, with or without IaC. Get John to write and refine the documentation until someone else can follow it and get a replacement up and running. John doesn't do it? Too much on his plate? Clear it. John still doesn't? Get someone else to write and refine it and then pull John in for a long hard talk about why he wasn't able to get around to it and steps forward.
IaC may cope better with incomplete documentation than a rigid manual process, but either way, you should fix that incomplete documentation so that anyone can follow the process. Sometimes, just sometimes, a manual process is okay with enough documentation.
Luolong@reddit
If you can describe the setup in enough detail using documentation to reproduce it, you can declare the setup using IaC tooling.
Yes documentation is necessary whether you use IaC or manual processes, but with IaC it’s way easier (cheaper) to maintain and keep up to date.
Proper IaC is its own documentation (up to a point).
And if you put some effort into it, the detailed documentation of the current and up to date infrastructure setup can easily be generated from the IaC code.
Add to that GitOps way of working with infrastructure and you get full history of configuration with full fidelity audit trail of changes over time.
loozerr@reddit
Yes there's only IaC and whatever the mess you described there is 🙂
Hdmoney@reddit
Huge L takes on terraform.
The main problem with tf is that it attempts to be idempotent while existing only declaratively, and with no mechanism to reconcile partial state. And because of that it must also be procedural without being imperative! You get the worst bits of every paradigm.
If you want to recreate an environment where you've created a cyclical dependency over time (imho this should be an error), you have to replay old state to fix it. Or, rewrite it on the fly. It happened to me on a brownfield project where rancher shit the bed and deleted our node pools, and it took 4 engineers 20 hours to fix. I should know, I drove that shitstorm until 4am on a Saturday. Terraform state got fucked and started acting like HAL: "I'm sorry devs, I'm afraid I can't do that."
In practice it's not hard to avoid that pattern, if you're well aware of it and structure the project like that from the start.
Anyway, pulumi is probably better since it allows you to operate it imperatively. Crossplane is... Interesting. I mean k8s at least has a good partial state + reconciliation loop, so, that part of it makes sense - but you've still got the rest of the k8s baggage holding you back.
I'm writing a manifesto about exactly this; declarative configuration. It really gets me heated.
morricone42@reddit
I think it's also really a problem of cloud provider APIs being imperative. Kubernetes really showed the world how to structure a relatively sane infrastructure API.
SquirrelOtherwise723@reddit
Sane?
K8s API is really hard. The cli isn't easy either.
Worth_Trust_3825@reddit
I'm genuinely sure that if k8s didn't use YAML it would be much easier.
elidepa@reddit
Sane and easy aren’t synonymous. If you need easy for a simple solution, then k8s is the wrong solution to use.
DaRadioman@reddit
I think they mean because its API is very desired-state, and everything works through objects as APIs, which is mind-blowing once you grasp the power there.
But it's no walk in the park until you get comfortable with the ecosystem.
tequilajinx@reddit
My favorite thing about Terraform is how it occasionally decides that my prod service bus instance should be destroyed because it failed to read the resource somehow.
The biggest issue with it is the tfstate file which is absolute shit design and has no good reason for existing. The current state exists on the provider. The future state exists in code. There is absolutely no good reason to have an intermediary map file that gets corrupted every time a fly farts.
Terraform bills itself as a write-once, deploy-everywhere system, as though you can build resources on Azure and then move them all to AWS by flipping a switch. Bullshit. While the different cloud providers may offer similar tooling, they're completely different architectures with resource definitions that simply don't map to each other at all.
Further, the monorepo pattern recommended by hashicorp is asinine. I don’t want separate code files for each environment. I want them all built exactly the same (with the minor exception of things like instance counts) and I want them all built from the same piece of code. I absolutely DO NOT want to promote infrastructure by copying files from a “dev” folder to a “test” folder (which is our process for creating new topics/subscriptions) where they’ll invariably become out of sync.
Terraform is fine if you want to create something simple like a function app with a storage account and keyvault, but for shared resources at the enterprise level, it’s absolute garbage. I have never dealt with a terraform project that wasn’t a nightmare in some way.
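The "one definition, parametrized per environment" approach complained about above can be sketched in plain Python. This is only an illustrative sketch; the resource shape, environment names, and knobs are all hypothetical, not any real provider's schema.

```python
# Hypothetical sketch: one infrastructure definition, parametrized per
# environment, instead of copying files between "dev" and "test" folders.
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvConfig:
    name: str
    instance_count: int
    sku: str


# The only thing that differs per environment is this small table of knobs.
ENVIRONMENTS = {
    "dev":  EnvConfig("dev",  instance_count=1, sku="Basic"),
    "test": EnvConfig("test", instance_count=2, sku="Standard"),
    "prod": EnvConfig("prod", instance_count=4, sku="Premium"),
}


def build_service_bus(env: EnvConfig) -> dict:
    """Return the same resource shape for every environment;
    only the values pulled from EnvConfig differ."""
    return {
        "name": f"sb-{env.name}",
        "sku": env.sku,
        "capacity": env.instance_count,
    }


# Every environment is built from the same piece of code, so they
# cannot drift apart the way copied per-environment folders do.
resources = {name: build_service_bus(cfg) for name, cfg in ENVIRONMENTS.items()}
```

Because every environment flows through `build_service_bus`, promoting infrastructure is a config-table change, not a file copy that can silently diverge.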
Halkcyon@reddit
I think this is why terragrunt is popular, but it's still just a hack on top of everything else with poor editor support to boot.
Worth_Trust_3825@reddit
Base Terraform has a solution for that in the form of workspaces, but it's annoying to use. Other solutions involve separating config files, but that's also a pain. Terragrunt technically addresses the second aspect, along with separating the tfstates of the first.
FlyingRhenquest@reddit
Could you really even call Terraform "code"? It kinda feels at best like a serialization format where you have to memorize every detail about all the objects and write the serialization file by hand. Admittedly I don't have a huge amount of experience with it, and I kind of want to keep it that way.
While I was using it I wanted exactly what you want, a declarative format I can iteratively test, and verify my syntax without having to try to stand up infrastructure in the process. Just give me a library of Python objects that I can build up a structure with, validate offline that my structure at least makes some sort of sense and that I can just initiate standing up infrastructure from once I'm comfortable with it all.
Since I'm currently unemployed I'm spending my copious spare time trying to build a bunch of tools that I would want to use and that I can release as open source. Terraform is pretty far down that list right now, but it is something that I eye every once in a while, wondering if I couldn't come up with a better approach. I have a surprising amount of Lisp in my background and I think a Lisp-ish solution might be what's called for here.
Just my irritable 2 cents -- I'm not volunteering for anything this year heh heh.
RustaceanNation@reddit
> "While I was using it I wanted exactly what you want, a declarative format I can iteratively test, and verify my syntax without having to try to stand up infrastructure in the process."
Of course you can do that in terraform!
"The
terraform validatecommand validates the configuration files in a directory. It does not validate remote services, such as remote state or provider APIs."So, Infrastructure as Code really means "as Encoding", whether it's code or data (**insert Lisp joke here**). This is in contradistinction to doing things by hand.
Now, if you wanted that Python library, there's no reason you can't write it yourself on top of Terraform. Write a class for every syntactic concept, using object composition just as the syntax does. You'll treat that as a serialization layer (like a responsible engineer!) and write your preferred abstraction on top.
Heck, I'm getting the willies just thinking about it. PM me (but not your willie!)
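A minimal sketch of that serialization-layer idea: model each resource as a plain Python object, validate offline, then serialize to Terraform's JSON syntax (`.tf.json`). The class names and the validation rule here are illustrative, not a real library.

```python
# Hedged sketch of "Python objects on top of Terraform": classes mirror the
# syntax, the whole thing is treated as a serialization layer, and the output
# is Terraform JSON that `terraform validate` can check without touching a cloud.
import json


class Resource:
    def __init__(self, type_: str, name: str, **attrs):
        # Offline validation example: catch a bad resource label before
        # ever handing anything to Terraform.
        if not name.isidentifier():
            raise ValueError(f"invalid resource name: {name!r}")
        self.type, self.name, self.attrs = type_, name, attrs


class Config:
    def __init__(self):
        self.resources: list[Resource] = []

    def add(self, resource: Resource) -> Resource:
        self.resources.append(resource)
        return resource

    def to_tf_json(self) -> str:
        """Serialize to Terraform's JSON configuration syntax."""
        out: dict = {"resource": {}}
        for r in self.resources:
            out["resource"].setdefault(r.type, {})[r.name] = r.attrs
        return json.dumps(out, indent=2)


cfg = Config()
cfg.add(Resource("aws_s3_bucket", "logs", bucket="example-logs"))
tf_json = cfg.to_tf_json()  # write to main.tf.json, then `terraform validate`
```

The point is the layering: the objects are your abstraction, and the `.tf.json` file is just an artifact you generate, never edit.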
FlyingRhenquest@reddit
Funnily enough, I think my approach would be to write the objects out in C++ and build a Terraform serializer for Cereal. It's easy to build a Python API on top of that using nanobind, and have the C++ code use a dependency graph to ensure all the required objects get defined for the infrastructure that needs to get set up. I'm kinda building a dependency graph for a requirements manager I'm working on in my copious spare time. They're not particularly hard to build, but setting up all the rules for how objects interact is kind of time consuming. And for every one you create, you always realize you need two more.
OrdinaryTension@reddit
CDK kinda feels like what you want. It's nice to be able to run pdb and step through the code. The downside is that it's just creating CloudFormation and can get itself into a partial rollout state when the only solution I ever found is to delete the state. Take that with a grain of salt, I haven't used it in a few years.
fumar@reddit
As someone that does a lot of TF work, CDK is ass and has never been production ready imo.
FlyingRhenquest@reddit
Yeah that does look like what I was wanting when I worked with Terraform. I'll have to poke at that a bit when I have a moment. Most of what I want to do with AWS is pretty simple anyway. For things much more complex than that, most projects will bring in a real devops guy anyway.
schplat@reddit
This is what irks me about people always conflating TF and IaC. TF is IaDSL (at best, but yes, more accurately as serialization).
drschreber@reddit
Configuration is code, it may not be in a Turing complete language. But I’d argue it’s still code.
diroussel@reddit
If you can write it down, it’s code. Code just means information that has been encoded. We shouldn’t be confused by code also being used as a short way to refer to programming language code, there is also object code, byte code, encoding, decoding, codecs, etc.
To me a useful way to read IaC is “infrastructure as source code”. So it’s not any old encoding, but readable code that can be managed in a source code control system like git.
Halkcyon@reddit
A mark against Pulumi: they keep removing APIs in breaking major changes, and finding documentation or reasoning about it is impossible. Only really worth using if you're paying for their cloud or TFE.
Captator@reddit
Could you expand your last bracketed point? I might be misunderstanding, but there are multiple remote state options supported by Pulumi, not only S3.
Halkcyon@reddit
I realize I was being unclear. I was trying to migrate away from terraform, so I had a load of existing state I needed to integrate, but Pulumi removed their APIs for reading tfstate from S3, only supporting their own cloud or TFE. The "solution" was using the aws sdk and reading the JSON blobs myself.
Captator@reddit
Ah gotcha. When we encountered this need it was also a PITA. We addressed it by importing the existing resources into the new Pulumi code by ID (AWS in our case) through ResourceOptions, after extracting those IDs from the TF state (in what sounds like a similar fashion to you).
Fiddly, and this means technically you have a window where both TF and Pulumi act on the same actual resources, so you have to be able to freeze the TF (at least in parts) while doing the migration.
schplat@reddit
I use Pulumi with a GCP bucket backed state. Haven't had issues. Their full cloud platform is useful if you want to take advantage of some of their tooling they've built around it (mainly around RBAC, and/or secrets management). But if you just want to write code that can consistently deploy a stack of resources in a cloud, you can totally get by with DIY-managed state.
WeeklyCustomer4516@reddit
Your own bucket on someone else's cloud; it works without paying extra.
klekpl@reddit
The most interesting thing in this space I found so far (but haven't really used it as it is very niche) is: https://propellor.branchable.com
The idea of using a real programming language with a very strong type system enabling creation of embedded DSL (such as Haskell) is really compelling.
Hdmoney@reddit
I'm somewhat compelled by smarter config languages like KCL and Pkl. In a similar space is Cue/Dhall/Nickel, but, for various reasons those don't quite appeal to me.
I've heard a lot of praise about Cue, tried it a bit, but didn't love it. KCL is what really shines imo, and if you look at Pkl, I've filed a number of the early issues. What KCL is missing is a specialized registry that isn't ArtifactHub + GitHub repos, neither of which is great for discoverability. Something like crates.io / npm.
Rezistik@reddit
Pulumi seems like the right move in my opinion. Way easier to parse and figure out, and it's familiar.
MiigPT@reddit
I get what you mean; that's why AWS CDK is the best IaC tool for me. Sad that there isn't a cloud-provider-agnostic tool that works as flawlessly as CDK.
svix_ftw@reddit
one reason AWS CDK works is probably because it's specific to a single cloud
kevin_home_alone@reddit
No it isn’t.
popiazaza@reddit
Terraform is so painful to work with, but it's too popular to ignore it.
Pulumi is a great middle ground, but it hasn't gained enough popularity to justify it.
.NET Aspire is the hill I will die on. Azure got first-class support, and AWS is already hopping on the train. Maybe not now, but soon.
seweso@reddit
I'm starting to believe that if you want to do IaC right, you need to also apply it to your dev machines. You want to write IaC as soon as possible in your dev cycle.
Kinda like you don't want to write unit tests AFTER you write the implementation but BEFORE. Right?
Are there docker images which host entire full stack web based dev environments? That's what I want :)
Ok_Hovercraft_1690@reddit
Terraform isn't "code". A JSON file also isn't code. Just because something is kept in git or is consumed by CI/CD doesn't mean it's code, or even a good idea.
How do I know this? It's in the name: HashiCorp CONFIGURATION Language. TF is fine for certain things. The problems arise when people try to shove too much into its "programming" model, which had basic things like for loops bolted on like a fifth wheel on a car.
People also try to do strange things with TF. Like storing or executing their company's business logic. Or creating layers of abstraction over regular terraform modules that provide 20% of the features of the underlying module.
Then there is TF-CDK, which is real code. But at that point, you might as well use the same Go libraries that TF uses underneath?
But the main issue with TF is that it deviates from the "operator" API pattern that Kubernetes uses, because of its state file. You end up with 3 potential sources of truth: the cloud provider, the state file, and your TF config in git. With k8s, controllers constantly monitor your deployments, pods, replicas and other k8s objects; the source of truth is what Kubernetes sees and monitors. Extend that to buckets, DBs and any other cloud service with operators and you don't need TF.
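The operator pattern being contrasted with Terraform's state file can be sketched as a reconciliation function: it takes only two inputs, the desired state (from git) and the observed state (from the provider's API), with no intermediate state file to corrupt. The resource shapes below are made up for illustration.

```python
# Minimal sketch of an operator-style reconciliation loop: diff desired state
# (what's in git) against observed state (what the provider reports) and
# derive the actions needed to converge, with no third source of truth.
def reconcile(desired: dict, observed: dict) -> dict:
    """Compute create/update/delete actions to converge observed onto desired."""
    to_create = {k: v for k, v in desired.items() if k not in observed}
    to_delete = [k for k in observed if k not in desired]
    to_update = {k: v for k, v in desired.items()
                 if k in observed and observed[k] != v}
    return {"create": to_create, "update": to_update, "delete": to_delete}


# Hypothetical example: two buckets declared, one drifted, one orphaned.
desired = {"bucket-a": {"versioning": True}, "bucket-b": {"versioning": False}}
observed = {"bucket-a": {"versioning": False}, "bucket-c": {"versioning": True}}
plan = reconcile(desired, observed)
```

A real operator would run this in a loop against the live API, so drift is corrected continuously instead of only when someone remembers to run `plan`/`apply`.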
BeakerAU@reddit
Infrastructure as code is not the same as infrastructure in code. It's about treating the infrastructure the same as your code: source control, deployment pipelines, auditability and rollback. It could be a .ini file, but if it's committed to git, and only applied as part of a pipeline, then it's IaC, IMO.
SanityInAnarchy@reddit
Unpopular opinion: I think as your organization grows, this is going to tend towards Turing-completeness, and it's better to bite the bullet early and make sure that gets sandboxed, instead of letting it grow organically.
Because the organic solution is going to be you start with static stuff like YAML (or even ini!) and then start having scripts generate a tiny piece of one, and then someone starts using a templating language that was built for HTML instead of config, so now you live with the worst of all worlds: The template stuff has made the config harder to read and yet not much easier to script, yet the scripts have escaped containment and you now can't evaluate a template without those scripts hitting a bunch of network endpoints.
I know it's an unpopular opinion because I haven't been able to sell a single other person on an approach like Jsonnet. We have somehow landed on "No one ever got fired for using YAML"
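The "generate config as data, don't string-template it" argument above can be sketched in a few lines: build the structure in a real language and serialize at the end, so an invalid shape fails before anything is rendered or deployed. The manifest fields follow the familiar Kubernetes Deployment shape, but this is only an illustration, not a real client.

```python
# Sketch of generating config as data instead of templating text: validation
# happens in the language, and serialization is a single final step.
import json


def service_config(name: str, replicas: int) -> dict:
    """Build a Deployment-shaped config as a plain data structure."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {"replicas": replicas},
    }


# Serialize once, at the edge. A YAML emitter could replace json.dumps here;
# the point is that the structure is never assembled by string interpolation.
rendered = json.dumps(service_config("web", 3), indent=2)
```

Compare this with an HTML-style template: there, `replicas: {{ count }}` happily renders `replicas: -1` or `replicas: banana`, and you find out at deploy time.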
rer1@reddit
I love this observation. The term makes so much more sense now.
nezeta@reddit
Still, there are some people who try to write code in SQL, and recently even in Terraform or CSS. My office is making us write tons of .tf files with all these fancy modules, and it's painful AF.
NimirasLupur@reddit
Cries in ancient saltstack yaml code …
BCarlet@reddit
How are you enjoying the Broadcom changes?
daltorak@reddit
Powershell Desired State Configuration waves and says hello to your saltstack.
NimirasLupur@reddit
Shared pain is halved pain … as we say in Germany
Halkcyon@reddit
DSC was sadly nothing more than a toy and never properly supported from Microsoft.
eggsby@reddit
terraform examples would be better as opentofu examples - platform configuration DSLs are a godsend for complex infrastructure environments.
re k8s operators vs tf providers … lol if you aren’t using iac to define your k8s deployments. so k8s has HTTP API - are you making curl requests? (real coders write assembly)
serpix@reddit
Can't open that page. Doesn't really matter if it is tf, cdk, pulumi or ansible or cfn. Click ops is the mark of the incompetent. Have you tested your disaster recovery? Click ops would be a god damn nightmare in that case.
Have you refactored a running infrastructure? I feel people complaining about terraform state problems could benefit from running the errors through AI; it can help you quickly.
Looking at people struggling with terraform i feel just like the early days of Git almost two decades ago, where the concepts were new and people had not learned them yet. These can be taught and the benefits are incredible.
IaC also mandates knowledge of CI systems and excellent version control skills; these go hand in hand.
ComfortableTackle479@reddit
And then every junior uses terraform or kubernetes for a landing page.
Ravun@reddit
Isn't this what .NET Aspire sets out to solve? It allows applications to include the infrastructure they need to function alongside the application code / management interface. Wouldn't it make more sense for each language to take the same approach rather than tying everything down to a single vendor, aka Terraform?
Rayner_Vanguard@reddit
Can't open the page
SesbianLex96@reddit
Kid named NixOS
Harha@reddit
IaC is great, but maintaining linked IaC-stacks can be a pain if you have hard dependencies between them. It's been a while, but last time I did AWS stuff I made sure to avoid such dependency problems unless it simply wasn't possible.
hibikir_40k@reddit
It's all about the IaC tooling you use, and how you refer to your dependencies. Using raw CloudFormation is going to drive you up a wall. But that's not IaC's problem; it's because the tool was just not written for people. Even when management demanded that we used it, we ended up spending money on tooling to provide real, reasonable pre-execution validators to make things manageable.
At the very minimum, something like Terragrunt ends up being more reliable, and actually saves time when running hundreds of different little modules that can have reasonable references to each other.
Harha@reddit
I've mainly used AWS CDK, it's been fine and it just transpiles the typescript stacks into CloudFormation JSON. Also did some simple stuff with CloudFormation alone, which wasn't too bad but as you said it obviously isn't that good for making anything complex manually.