Monorepos vs. many repos: is there a good answer?
Posted by bitter-cognac@reddit | programming | View on Reddit | 335 comments
pico8lispr@reddit
Both are terrible but in different ways. I am doomed to switch back and forth for all eternity, or until the next tech layoff finally puts me to rest.
TheWix@reddit
Monorepos that are worked on by multiple teams and contain multiple domains suck. Single team, single domain monorepos are fine.
The idea that so many things can share so much code, and that shared code is changing so frequently that it is too cumbersome to put them in different repos is wild to me.
daishi55@reddit
Meta has (pretty much) one giant monorepo for literally thousands of projects and it’s the best development experience I’ve ever had
Randommook@reddit
Except when you need to do integration testing in which case jest-e2e deems everything an "infra failure" making your integration tests completely useless.
light24bulbs@reddit
So does Google, so does Microsoft increasingly. These folks don't know what they're about
daishi55@reddit
my mind was blown when i got there. "you mean i can just import this function from 3 teams over and it just works?" the idea that any code from anywhere in the codebase can be part of my project with no hassle is insane.
verrius@reddit
The problem is "no hassles" isn't really true. I think both Google and Meta essentially wrote their own source control to handle things, because most source control doesn't handle repos as big as theirs, with as many users as they have. Which means if you're used to having any sort of standard tooling on your source control, you can get fucked.
light24bulbs@reddit
I realized a while ago, when I was trying to tool up an enterprise for a monorepo, that those tools are actually the real secret sauce behind those big companies, and you will very rarely find them sharing their secret sauce. Google will shovel dog~~shit~~food like Angular all day long, but the tools they use to actually build massive technologies and succeed at scale are proprietary.
mistaekNot@reddit
angular is good?
Due_Emergency_6171@reddit
Lot better than react to be honest
The_Hegemon@reddit
Any sufficiently complicated React app tends to be a poorly implemented version of Angular anyway.
light24bulbs@reddit
Question mark being the operative punctuation there
valarauca14@reddit
Yeah stuff like G's internal ABI, C++ compiler, and JVM is stuff you rarely hear discussed. Because despite being (originally) boring projects (modifications of existing OSS) the technical decisions (and reasons behind them) they make are fascinating.
light24bulbs@reddit
It sounds boring until you try to do it yourself then you realize it's fucking difficult and interesting and you wish someone else had done it for you
khumps@reddit
Meta ironically is trying to open source more and more of it. Turns out being able to find new developers in the wild who already know how to use your “secret sauce” is really good for scaling up your dev team (some of these are much more popular than others):
- unified API: GraphQL
- unified/modular frontend: React
- unified build system: Buck2
- source control for large orgs (server open-sourcing still WIP): Sapling
- documentation as code: Docusaurus
light24bulbs@reddit
Haven't looked at sapling! That would be the most relevant one to this discussion. Any good?
xmsxms@reddit
Until they change the interface and you can't choose which version of the component to use as you need to always be compatible with @HEAD.
enzoperezatajando@reddit
usually it's the other way around: the team supporting the library has to make sure they are not breaking anything. more often than not, it literally won't let you land the changes.
possibilistic@reddit
Insanely awesome.
Good monorepo cultures tend to construct shared libraries. Teams construct library bindings for calling their services and other teams can directly interface. Don't go poking inside another service to pull things out, but do sometimes help write code for the other team if they don't have roadmap time for you, assuming they okay it.
Monorepos are all about good culture.
i860@reddit
Everything you just described is an inherent requirement of using separate repos. Once you break everything down to the root reasons, you'll find that monorepos are used because those things have taken a back seat for a given team using it.
There are almost no legitimate technical reasons to use one other than “well I can clone everything at once and that’s convenient.”
95% of the use cases of them are entirely about convenience. Convenience does not necessarily mean good.
OrphisFlo@reddit
It depends. Quite often, teams will create visibility rules to ensure their internal bits are not accessed from the outside, and ensure people are only using the supported API.
So while you cannot import literally anything in your project, you get to import lots of good first-party supported APIs instead, which is probably what most people want.
There's hassle if you then ask the team to open up some internal bits. It's not the end of the world and is usually a rare enough occurrence not to be a deterrent for monorepo (they're great!).
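In Bazel/Buck2-style build systems, visibility rules like these are declared per target. A minimal sketch, with invented package and target names (the exact rule names vary by build system):

```starlark
# BUILD file for a team's package: only the supported API is exported.
cc_library(
    name = "client",          # the first-party supported API
    srcs = ["client.cc"],
    hdrs = ["client.h"],
    visibility = ["//visibility:public"],
)

cc_library(
    name = "internal",        # implementation detail
    srcs = ["internal.cc"],
    # Only targets under //payments may depend on this one.
    visibility = ["//payments:__subpackages__"],
)
```

A build that tries to depend on `:internal` from outside the team's tree fails at analysis time, which is what makes the "supported API only" convention enforceable rather than aspirational.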
i860@reddit
You’d be amazed at the garbage and technical debt this “ease of use” results in.
light24bulbs@reddit
Exactly, dude. And you should still be careful, for sure. You should still enforce relationships and responsibilities and have as well-defined boundaries as you can.
But what you don't have is a bunch of hurdles and roadblocks fucking you up when things NEED to interconnect.
i860@reddit
If you have tightly integrated code and docs spread across repos you’re already doing it wrong. By no means does that mean throw the baby out and combine everything into one giant repo because the culture has a pathological approach to engineering. It means you separate things out where it makes sense and uncouple things where possible.
“Too hard!”
KevinCarbonara@reddit
Microsoft doesn't have a monorepo at all. ADO just makes it look like one in certain cases.
SanityInAnarchy@reddit
I've now seen a couple of these, and like many things, it depends entirely on execution.
The best thing about a monorepo is the common infrastructure. Want to keep your third-party dependencies upgraded? You can make that one person's job, and now nobody else has to notice or care which version of the Postgres drivers you have installed. Or, at a larger scale, don't like how long it takes IDEs to crawl your entire tree? Maybe spin up a team to build a giant code search engine, and build a language server on top of that, so things stay fast even when the codebase no longer fits on a single machine.
Github absolutely does not do all of that for you, though. And if you either aren't quite large enough to justify that investment, or you haven't convinced management to give you those core teams, or if you don't at least have a culture of cleaning up after yourself, then it can be so much worse. Want to upgrade a third-party dependency? Good luck, half the stuff that depends on it doesn't have tests, you'll be blamed if you break something... are you sure you don't want to just reimplement that function by hand, instead of upgrading to the version of the library that has it? Don't you want to get your tasks actually finished, instead of having to justify how you spent half the sprint making the codebase better?
light24bulbs@reddit
I see what you're saying. I think there is a real midsize range: the average company doesn't hit these monorepo problems until they have 50 or 100 devs on the repo at once. I was saying that GitHub has it solved for the medium-size case. They drop you off a fucking cliff for the large case, no doubt about it. For company-wide monorepos at enterprise level, you are fucked; I don't have a clue what the vendor offering is for that.
Green0Photon@reddit
That's because they have additional tooling to make monorepos good.
If your average company set up a monorepo, it wouldn't be good. Even worse, a mid size monorepo within a company.
Only a monorepo for a single team, or for the company with special tooling. No in between.
chamomile-crumbs@reddit
I work at a teeny company with only a few devs, and the monorepo kicks ass. Do they get much more annoying when you add a lot of contributors?
I guess you’d end up with a shit ton of branches and releases and stuff for projects that are somewhat unrelated? Like there’d be a lot of noise for no benefit?
touristtam@reddit
It does get a bit tedious to create and maintain scripts/rules that trigger only in specific cases and for specific targets.
i860@reddit
Imagine if there were some kind of alien technology we could use to keep these things separate so you don’t have to do any of that.
daishi55@reddit
for sure, it's not just a miracle of monorepos. but buck2 is open source
idontchooseanid@reddit
Not just buck2, I guess. It's also the code search, review tooling and many other solutions to enable modularity. A culture that can accept raw commits / master-branch-is-the-only-version-we-use as versions too.
Smaller companies have to stick to certain releases, and to codebases / languages that don't play well with multiple versions of the same library. They simply don't have big enough teams, or the raw power of having dozens of principal / thousands of senior engineers who can grok the complexity of the build systems.
touristtam@reddit
Companies look for off-the-shelf solutions. As long as the big repo hosting solutions (GitHub, GitLab, Bitbucket, etc.) don't provide this, or provide it only very parsimoniously, the adoption of a single company-wide monorepo will not happen.
i860@reddit
Yeah, right. Meta’s monorepo is so large that they have tooling just to check out only parts of it because it’s so unwieldy.
Literally regressive and badly reinventing the wheel.
Individual_Laugh1335@reddit
The caveat to this is they also have many teams that essentially support this (hack, multiple CI/CD, devX) not to mention every lower level infra team optimizes for a monorepo (caching, db, logging). A lot of companies don’t have this luxury.
Sessaine@reddit
ding ding ding ding
ive dealt with too many people that tried to force mini monorepos everywhere, because the FAANGs do it... and they very quickly find out the company doesn't invest in the infra teams making it tick like the FAANGs do.
Elmepo@reddit
I mean, I think the fact that it's Meta, and not your 100-person engineering org, is important to note here lol
TheWix@reddit
How much custom tooling did they write for this?
ivancea@reddit
I've worked in a big front&back monorepo, with dozens of domains for dozens of teams, +100 devs. And it worked very well.
Not sure what your problem with it is. Monorepo doesn't mean "not separating modules". It just means that: a single repo.
Rincho@reddit
I don't want to see history, branches and shit unrelated to my work
valarauca14@reddit
Then do `git log $target` instead of a blind `git log` across the whole repo?
i860@reddit
And at no point is anyone realizing that if you have to fight the revision control system that perhaps you’re holding it wrong?
ric2b@reddit
`git log $target`
vs
`cd ../target && git log`
Not that different and the first one actually seems cleaner to me.
i860@reddit
That’s not the issue at all. This isn’t about `git log` accepting a target. We use that all the time and it’s fine. The issue is one of requiring it: because the repo being used is a monorepo involving multiple other projects, `git log` without a target becomes relatively useless.
ric2b@reddit
Right, but I imagine you'd just alias `git log` to `git log $my_teams_project` and get on with your work.
i860@reddit
There’s no reason to have to do this if you don’t embark on the monorepo rathole in the first place. This is absurd that we’re making recommendations on how to work around intuitive features of the tooling because certain people insist on abusing it for a use case it was never designed for.
ric2b@reddit
The benefits from having continuous integration and less repos to maintain and keep dependencies up to date across hundreds of repos are huge, that's why people go through the trouble.
i860@reddit
You cannot just willy-nilly update dependencies for hundreds of repos that depend on them unless they’re all using them in exactly the same way, and at that point why does it even need to be in the same repo if it isn’t even the same software?
As I’ve commented before, monorepos are about convenience - convenience that breeds technical debt by shortcutting best practices.
ric2b@reddit
Who said anything about willy nilly? What exactly can't you test in a monorepo that you can in a multi-repo environment?
For me they're about improving integration test quality and reducing unnecessary busywork.
valarauca14@reddit
> using a standard feature of the tool
> using it wrong
K
i860@reddit
Nice reductive fallacy. I do `git log some-sub-dir` all day. That doesn’t mean it’s the friggin’ answer for filtering out entirely unrelated commits because your repo is badly designed.
valarauca14@reddit
If people are making commits into your sub-dir/module, yeah the repo is badly designed.
ivancea@reddit
Well, or maybe they're just... Contributing. There are codeowners and reviews for those things, so no problem there either
i860@reddit
His point was that in a monorepo you’re not going to have much choice in the matter unless the entire thing is totally submodule clean and at that point it’s not really a monorepo.
ivancea@reddit
Saying that "it's badly designed" again and again won't make that assertion true. You're falling into the "only what I like is right" sentiment, which is no good in technology. Seriously
ivancea@reddit
... So you can't work in a company, basically
Rincho@reddit
How is that?
ivancea@reddit
In any team, you'll have branches "unrelated to your work" (Whatever that means)
Rincho@reddit
Again, how is that? In my repos there is only code I might work on in the future. If I'm not a DBA, then there is no DDL code in my repo and no DBA work happens in this repo.
ivancea@reddit
I suppose you've never worked on a service big enough to have multiple domains or parts, with multiple teams or specialists. There are such things.
And no, dividing in repos is not a solution, it doesn't always make sense, at all
Rincho@reddit
I can't imagine such a scenario. Do you have an example?
ivancea@reddit
Any monolith with multiple teams working on it. For example, an HR app with employee, finance, and contracts domains, with a team on each. There are dozens of domains in an app like that, and it doesn't need to be split into microservices or anything like that.
Rincho@reddit
So if this is one app in one language, then it is related to my work. A service with its own domain is still just a service. You can just get into it without changing position.
If this is a different language, for example a low-level high-performance library that my service uses, then it should be in a different repository, because again there is no business between our teams. We don't collaborate. Our team uses the library and it should be shipped just like any other library: via a package manager.
ivancea@reddit
I don't know why you bring the language here. A monorepo may have services in multiple languages (e.g. back and front), and it may all be your business. Or not.
It's not. It may, or may not.
Wild supposition! But anyway, it has nothing to do with monorepos or not monorepos. Let alone languages.
I don't know what's your point here really. There may be many things in a monorepo, in many languages (???), and of many domains. Handled by different teams, because you won't touch their domains.
You're suggesting the separation of modules is about languages. It's about business domains or technical pieces (which usually represent a domain...). They all may be in the same repository. For good reasons
Rincho@reddit
I am bringing it up as an example because this is a real reason on which you can base your decision. "Good reasons" is not. Again, in any repo there should be only code related to each developer's work. If it's not related, it should be in a separate repo. Any other "good reasons" should be resolved by another tool/process.
ivancea@reddit
I can just repeat the obvious: the fact that you want "your work" to be alone in a repo is no reason to split a monolith.
"Good reasons" means many, many reasons not to do so. Many of them are, again, somewhat obvious. You can talk with other architects working on monoliths and they can give you more examples.
At this point, it's just you saying "this had to be this way because I didn't see this organization you propose before and it can surely be solved in the way I like" (Quick summary). And that's a nasty argument, I won't enter into. If you don't like and want to ignore what I said, feel free to do so
lIIllIIlllIIllIIl@reddit
How would that even negatively affect your day to day work?
You can just open your code editor on your team's project and tag your pull-request on Github / Gitlab / Azure DevOps, so that your team gets notified but not other people. You can ignore the rest.
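One concrete mechanism for "your team gets notified but not other people" is a CODEOWNERS file, which GitHub and GitLab use to route review requests by path. A sketch, with invented paths and team names:

```
# .github/CODEOWNERS - review requests are routed by matching path
/services/payments/   @acme/payments-team
/services/search/     @acme/search-team
/libs/design-system/  @acme/frontend-platform
```

A pull request touching only `/services/payments/` pings only that team; nobody else is notified.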
Rincho@reddit
Why do I need to do that if I can just have a repo for my team
lIIllIIlllIIllIIl@reddit
Why have a repo for your team if you can just copy the files on a USB stick? Because it makes collaboration easier.
Rincho@reddit
Why do I need their code? If it's changing fast and we must have it, why it is not my team who works on it? If the changes are rare, why it is not a package?
TheWix@reddit
The problems with monorepos aren't entirely technical, though there are technical hurdles; it's the organizational requirements and discipline. Most shops don't have the discipline or time to invest in maintaining a good monorepo.
ivancea@reddit
Absolutely. We invested some time in DX: limiting modules public interfaces, clear codeowners for reviewers, etc etc. The earlier it's done, the easier it is
TheWix@reddit
DX is so critically important. The company I am at now has serious DX issues and it's a symptom of poor organizational issues with the engineering department.
ivancea@reddit
Yeah, probably the worst. That said, it's the engineers who should start working on DX by themselves. Eventually, a specialized team may be created, but that's for later, for bigger companies.
That is, of course, unless there're micromanagement issues that don't let devs work in those things
Tiquortoo@reddit
Then they sure as shit don't have it for properly managing micro repos. Just going to get fucked differently in the workflow.
i860@reddit
We should put all files on a hard drive in a single directory and then build elaborate tooling after the fact to only show certain parts of the directory when needed.
Directories and filesystems are just too hard.
lIIllIIlllIIllIIl@reddit
You have this problem but worse when using multiple repositories.
If you don't know how to divide up your project into different folders, how are you expected to know how to divide it up into different repositories?
i860@reddit
Because I do know how to do this and if I don't I spend ample amounts of time to determine the appropriate separation of concerns. I don't optimize my repo around reorganizing its layout every week. I optimize for modularity.
Monorepos == shitty engineering. It's that simple.
ivancea@reddit
There's practically no difference for a dev between having a monorepo with 100 projects/modules vs having 100 repos, apart from having to clone and update 100 repos in the second case. Most if not all of your daily tasks and workflows remain the same.
Monorepos are an organizational, slightly more devops-based thing. They allow you to run, for example, full-project CI/tests knowing everything in that version has to work. You of course limit the tests to changed modules and their dependents.
lIIllIIlllIIllIIl@reddit
And as we all know, requirements in software never change, so taking the time to do things properly and preventing any future change is the best way to develop good software. /s
I'm joking. Half-joking.
I understand your point, but I also feel that it might be slightly fallacious. What prevents you from taking the time to think about the right separations of concerns properly in a monorepo? What makes you think people spend more time thinking about the right separations of concerns in multiple repos?
In my experience, monorepos vs. multirepos changes absolutely nothing about how much time people spend thinking about these things, but it absolutely does change how easy it is to refactor the separations after we gain more insight into the problem, or the product changes, or teams change, etc.
i860@reddit
My main gripe about monorepos is that they do not want to think about things like SoC. They want to be able to change a bunch of stuff all at once across multiple sub projects so as to not worry about backwards compatibility or proper interfacing. The whole thing turns into an inherently coupled mess as a result.
All the things that are hard about software engineering and that take lots of internal thinking and planning on how to do it right are basically ignored because the use of a monorepo side steps multiple parts of that. The cost however is that the software involved becomes less independent and less modular. Abstraction suffers.
LIGHTNINGBOLT23@reddit
If anything, having one file system is the equivalent to a monorepo. Having multiple different file systems spread out is the opposite.
Asttarotina@reddit
So don't look at them, duh
nsjr@reddit
I never worked on a monorepo really big.
Real question:
1 - Do teams import / use functions from other teams / modules? Or is it expressly prohibited, like, you have to copy and paste a function into your own module?
2 - If you can import and use methods / classes / functions from another module, how does integration tests work?
Currently, in the company I work at, we have microservices, and if a service grows too much, the integration tests take a lot of time to run, like 5 minutes or more to run everything, and that's the point where we start to think about breaking stuff into smaller ones, because we make thousands of merges every day.
In a monorepo, how does the CI/CD work? Because if you don't test "everything", maybe the code that you changed breaks something in another module that imports it. If you test everything, it would take hours to run.
OrphisFlo@reddit
1- Usually anything that's a public API is fair game to import. Using anything internal is frowned upon as the team owning the shared code loses the ability to update their code without having to fix yours at the same time.
2- Test sharding. You just run the tests in parallel on as many nodes as you can. You don't have to test everything all the time, but you could with the right test granularity. Also, when you have a large test suite, 5m is nothing. It might be hours of waiting time, and you then learn to work in a different way. You should not be blocked on a test run in your CI to start the next task.
3- Since you have a complete explicit dependency graph in your build system, you know what targets depend on the targets that got updated by looking at the change. So you can infer a subset of targets that are impacted, and you don't have to rebuild and test everything.
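Points 2 and 3 can be sketched in a few lines: walk the reverse dependency graph to find the targets affected by a change, then hash-shard the affected tests across CI workers. This is an illustrative toy, not any real build system's API; the target names are invented.

```python
# Toy change-based test selection and sharding for a monorepo CI.
from collections import defaultdict, deque
import zlib

def affected_targets(deps: dict[str, set[str]], changed: set[str]) -> set[str]:
    """Return changed targets plus everything that transitively depends on them."""
    rdeps = defaultdict(set)              # reverse edges: dep -> dependents
    for target, direct in deps.items():
        for dep in direct:
            rdeps[dep].add(target)
    seen, queue = set(changed), deque(changed)
    while queue:                          # BFS up the reverse graph
        for dependent in rdeps[queue.popleft()]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

def shard(tests: list[str], n_shards: int) -> list[list[str]]:
    """Deterministically spread tests over n_shards parallel workers."""
    shards = [[] for _ in range(n_shards)]
    for test in sorted(tests):
        # crc32 is stable across runs, unlike Python's builtin hash() for str
        shards[zlib.crc32(test.encode()) % n_shards].append(test)
    return shards
```

Real build systems (Bazel, Buck2) do the same thing at target granularity, with remote caching layered on top so unaffected targets are never rebuilt at all.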
ric2b@reddit
This is awful, at that point someone needs to setup parallel test running with multiple workers to bring it down to something reasonable.
OrphisFlo@reddit
Even then, you might still have tens of thousands of tests, sharding will work but the cost / roi ratio can be optimized to reduce the cost. You could pay for 10k machines/cores to run all the tests under 30s at all times and they'll end up with a <1% utilization rate for a huge cost.
Each group needs to decide what wait time is realistic and aim for less than that (because it'll grow as the software gets bigger). And sometimes it is realistic not to require everyone to run all the tests "just in case" locally. You run a few, and CI will run the rest and let you know later when it's all done (and hopefully merge your change automatically if it's been favorably reviewed).
ric2b@reddit
Obviously you don't pay for them all the time if they're idle 95% of the time, you reserve them when needed.
ivancea@reddit
The other comment already answered most of this. I'll just comment a bit on some details:
We used a lib to control that. Limit the public APIs, and any non-public usage was "marked". It's a very hard thing to do when the repo already exists and it's already tangled, so having a file with those misuses was enough: if a PR changed it, it was reviewed and we usually pushed back on the change. Unless it was really complex in some way.
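A minimal sketch of that "file with the misuses" ratchet, assuming the simplest possible shape (the module names and the violation format are invented): the committed baseline grandfathers existing internal-API usages, and the check flags only newly introduced ones for review.

```python
# Ratchet check: compare this PR's internal-API usages to a committed baseline.
def diff_violations(current: set[str], baseline: set[str]) -> tuple[set[str], set[str]]:
    """Return (new_violations, fixed_violations).

    New violations get pushed back on in review; fixed ones mean the
    baseline file can shrink, so the debt only ever ratchets downward.
    """
    return current - baseline, baseline - current

# Example: one grandfathered misuse, one new one introduced by the PR.
baseline = {"billing -> hr/_internal/payroll"}
current = {"billing -> hr/_internal/payroll", "search -> ads/_internal/ranker"}
new, fixed = diff_violations(current, baseline)
# new contains only the freshly introduced misuse; fixed is empty.
```

The PR fails (or gets flagged) only when `new` is non-empty, which matches the "if a PR changed the file, it was reviewed and usually pushed back on" workflow described above.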
We built a dependency graph between modules, and then ran only the tests for the changed files (in PRs) and the modules that depended on them. Initially, pretty much everything ran. Eventually, by removing those dependencies, it got quite lean.
That last point also answers your last question about breaking things. We also had E2E tests that, I believe, were always launched.
The suite could take between 30m and 1h, even with just some dependencies. It was slow, but slow for multiple reasons, not specifically because of the dependencies or the number of modules, but other internal optimization things. So having this test graph I mentioned was very important in our case.
TheRealToLazyToThink@reddit
My current project the dev ops suck. So they are forcing us to split our repo arguing that mono repos are bad.
It's a back end and a front end for the same damn app. Worked on a by a single team. I'd be fighting back more against the stupid, but it's been months and we're still waiting on a proper dev/staging env.
KevinCarbonara@reddit
This is pretty much what git submodules were made for. Submodules are not implemented all that well, though.
TheRealToLazyToThink@reddit
No it is not!!! What is this nonsense. There are cases where you can justify multiple repos. Mine is not one of those situations. Sub-modules would be just if not more stupid than splitting the repo.
I feel like I'm visiting an insane asylum with this thread. Has every one taken stupid pills??????
Bunch of fucking Astronaut Architects.
KevinCarbonara@reddit
Good lord. "just if not more stupid"
FatStoic@reddit
Fire them and hire me, I'm devops and monorepo fucking rocks
Select-Dream-6380@reddit
This is where docker is hugely powerful for development. You may be able to spin up all of your app's dependencies locally, minimizing the need for a "proper" (which I interpret as shared) dev environment. I've worked at one place where our dev environment was hardly used because local development and automated testing like this was so effective. The shared dev environment was basically only used to develop infrastructure and automated deployment changes, and we got to the point where we questioned if we really needed a dev environment at all.
i860@reddit
It’s actually totally healthy to separate those because it makes coupling harder. Coupling in software engineering breaks abstraction and is just downright bad.
The reason many folks think this isn’t a problem is that they’re simply mediocre engineers.
TheRealToLazyToThink@reddit
If you need separate repos to properly decouple your software, I'd argue you are the mediocre engineer.
i860@reddit
I don’t need separate repos for that at all. It’s not an a->b ergo b~>a scenario. I’m saying separate repos keep the process honest.
If you try and argue “well you can keep yourself honest in a monorepo too” then it’s a simple logical question of “then why do you need the monorepo in the first place?”
The answer to that question almost always reveals some pathology in approach and it’s usually one of “because this is just easier and less work!”
TheRealToLazyToThink@reddit
I'm working on a single app with a single team. Splitting it up is creating more work for absolutely zero benefit, besides saving some overworked devops person from figuring out how to configure Sonar to scan a .NET and an Angular project at the same time. Or scan them separately, I don't really care; I just don't see that as a good justification for making my job harder.
janyk@reddit
Sure, now decouple all your classes by putting them in their own repos, too
i860@reddit
Yes let’s totally throw the baby out with the bath water of course. People use monorepos because they don’t actually care about coupling. They make it someone else’s problem at the end of the day.
lIIllIIlllIIllIIl@reddit
Coupling is a very vague word that doesn't really mean anything.
Look at these two abstractions for writing a text file. One of them has very low coupling, another one has very high coupling.
Low coupling:
High coupling
Coupling doesn't inherently make for bad abstractions. Some stuff is easier when it's all together. Separating things too much can increase complexity.
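The two snippets didn't survive the transcript; a hedged Python guess at the kind of contrast being drawn, using "write a text file" as the task:

```python
from pathlib import Path

# Low coupling: depends only on its arguments; trivially reusable and testable.
def write_report(path: Path, text: str) -> None:
    path.write_text(text)

# High coupling: reaches into shared config, a fixed path layout, and
# mutates object state as a side effect of writing.
class App:
    def __init__(self) -> None:
        self.config = {"out_dir": "/tmp/reports", "user": "alice"}
        self.last_report: Path | None = None

    def write_report(self) -> None:
        out = Path(self.config["out_dir"]) / f"{self.config['user']}.txt"
        out.parent.mkdir(parents=True, exist_ok=True)
        out.write_text("report")   # content, path, and state all entangled
        self.last_report = out
```

The second version isn't automatically wrong, which is the commenter's point: some things are easier when kept together, and coupling is a trade-off rather than a defect by definition.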
TheWix@reddit
Oof, I'd keep the backend and frontend together in the same repo.
look@reddit
Entirely depends on the org/history/processes.
When you’re dealing with an old monorepo containing a giant knot of tightly coupled code, finding any seams to even start refactoring can be a struggle.
One of the first changes I made was splitting the frontend out to a separate repo, mostly just to force engineers to have to think about interface boundaries.
TheWix@reddit
I interpreted the comment to mean this was a backend for a specific frontend which means they're tightly coupled to begin with where a change in one will very likely necessitate a change in the other. If that is the case I wouldn't introduce a hard boundary and keep them versioned together.
If they are likely to change independently then I could see splitting them.
What issues did you have keeping them in the same repo as distinct projects?
TheRealToLazyToThink@reddit
It’s a modern web app; there’s already a well-defined boundary. This nonsense just means 80% of stories will need 2 branches, and the environments will end up broken any time the CI for one end finishes before the other.
i860@reddit
It’s called backwards compatibility. You can do it.
TheRealToLazyToThink@reddit
I've done that in the past. Used to work on a proper fat client. We had users we didn't even know about scattered about the enterprise. At one point we were running 3 versions of our service serving around 10 versions of the fat client.
Proper backwards compatibility takes a lot of work, produces a lot of technical debt, and demands constant vigilance.
That's worth it when dealing with 3rd parties, or when you have a fat client and can't fully control when your users update. It's a complete waste of time and effort when you are talking about the front end and backend of a web site talking only to each other.
lIIllIIlllIIllIIl@reddit
Are you my colleague?
The architects at my job also argued for splitting the front-end and back-end into different repositories. It's honestly one of the dumbest decisions I've ever experienced in my career. We haven't even launched the product, and basic features are already taking months to develop because of how frustrating developing the whole thing is.
And yes, we are also still waiting for proper dev/staging environments since mid-April.
KevinCarbonara@reddit
I believe the technical term for this is called a "repo"
TheWix@reddit
No, I could have an API with a few related deployables in a single repository and I'd call that a monorepo. I could stick each deployable codebase in its own repo, but it might make more sense to just stick them all together, especially if they are likely to change together.
One domain can have many deployables.
KevinCarbonara@reddit
Sure, but that's not what literally anyone means by monorepo these days.
i860@reddit
If they are likely to change together then what’s going on there? If you change the API and then change every downstream project dependent on that at the same time then I’d argue you don’t really have an API. You have the veneer of one.
TheWix@reddit
I can have small apps that listen on an event bus and perform some domain functionality, for example. It doesn't have to call a web API. Both the API and the service could reference the same package for domain logic.
Another example: I may have a REST API and a GraphQL API for the same domain. It's possible they are two deployables housed in the same repo.
i860@reddit
Your API code should be able to change (internally) without the interfacing layer changing. There's no reason all users of the API have to be updated at the same time. If that's a requirement then it isn't actually an API, it's just a fake glue layer.
JonDowd762@reddit
The term "monorepo" covers two very different situations.
If you have a team that maintains five related npm packages and they all share the same repository that's a monorepo. If all the MS Office applications are in a single repository, that's a monorepo. If the company's entire codebase is in a single repository (e.g. Google, Meta), that's also a monorepo.
TheWix@reddit
Yea, I think of a monorepo as any repo containing more than one deployable.
JonDowd762@reddit
That's generally what I go with too. I do most of my work in a monorepo like this. But it's one of hundreds of repos in the company, and nothing like what Google does. I wish there were a better term for a single company-wide repository.
TheWix@reddit
A Mega-Monorepo!
catch_dot_dot_dot@reddit
I don't agree with this. Monorepos are the best experiences I've had. In my current job we have like 100 repos and there's always a lot going on and I often have to touch multiple repos in a week.
TheWix@reddit
I've worked in monorepos most of my career (17 years). Only worked at one place where it wasn't bad. The rest were awful. The reason why I don't like them is because they require time, effort, and discipline to maintain well.
If they aren't maintained well then they become a headache and add more communication overhead.
lIIllIIlllIIllIIl@reddit
I'm curious. What communication overhead does it add? Were the monorepos just one big disgusting monolith? What prevented you from just putting the different pieces in different folders and calling it a day?
TheWix@reddit
Thankfully, several weren't one big monolith. The issues were around things like changing core dependencies. The downstream projects need good enough tests that you know if you broke something when the breaking change isn't caught by the compiler. I've had issues where a core library changed without me knowing, and several months after the change I found out because my app broke in production after a bug-fix release.
WenYuGe@reddit
It's possible to build really scalable Monorepos like Google, Uber, and many other shops. It's also possible to build really consistent experiences across many micro repos.
Good experiences in both require you to adopt the right tools and work with best practices from day one.
Many micro-repos are a little easier to start with; most tools are built with setups like that in mind. The problem is you'll have to set up tooling for all the new repos and find ways to make them consistent, without creating weird little silos where transitioning across repos in your own org becomes a challenge. With monorepos, you can often implement the tooling once, and the return on that initial investment applies to the rest of your code, not just a single micro-repo.
Another issue with micro-repos is pulling in a bunch of components to develop features across services. Testing is also a pretty big pain, where you need to tag and version-match your own repos. Imagine landing 5 PRs at once across 5 repos, where if 1 of the 5 doesn't merge, the whole set of changes is invalid.
Monorepos, on the other hand, require specific tools like Nx or Bazel for managing many build targets. You'll need something to lint the many languages, and only on the lines that changed (imagine linting all 5 million lines of a monorepo). You'll also run into situations where it's impossible to stay rebased on main, because 50-60 PRs might go into the repo a week (or a day). That leads to dangerous situations where you're not always testing your changes on top of main, which can cause logical merge conflicts.
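The "only lint what changed" point can be sketched with plain git. Everything below is a toy setup (the final `echo` stands in for piping the file list into a real linter, and `git init -b` assumes git 2.28+):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# Toy repo: one base commit on main, one change on a feature branch.
git init -q -b main repo && cd repo
printf 'a\n' > a.txt
printf 'b\n' > b.txt
git add . && git -c user.email=ci@example.com -c user.name=ci commit -q -m base

git switch -q -c feature
printf 'changed\n' > b.txt
git add b.txt && git -c user.email=ci@example.com -c user.name=ci commit -q -m change

# Three-dot diff: files added/copied/modified since the merge-base with main.
# In CI you would pipe this list into your linter instead of echoing it.
changed=$(git diff --name-only --diff-filter=ACM main...HEAD)
echo "$changed"
```

In a real monorepo you'd diff against `origin/main` and feed the list to the linter via something like `xargs -r`, so a 5-million-line repo only lints the handful of files a PR actually touches.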
Lechowski@reddit
Monorepo until the build process starts hindering productivity. Then split.
Slsyyy@reddit
IMO it is more like monorepo -> many repos -> monorepo.
First stage: having everything in one repo is convenient. You don't care about the size of the repository or about slow CI, because everything works fine at a small scale
Second stage: CI is slow, your code is often broken by folks from other teams. It is normal that you want a separation
Third stage: monorepo is the only solution for increasing complexity of the source code
Notice that the first stage monorepo does not use any fancy monorepo-oriented tools like code searches, fancy CI and graph oriented build systems.
Hacnar@reddit
Something similar happened at my previous job. Monorepo broken down into smaller repos, which people then wanted to bring back into a monorepo.
SanityInAnarchy@reddit
Bazel isn't bad. But actually getting people onto a build process like that, and properly optimizing it, is a fair amount of effort.
FlyingRhenquest@reddit
It feels like no one on the planet is working on build instrumentation. The best ones are cancer. They go downhill from there. There are tons of companies whose builds and development processes are preventing them from making as much money as they should be. You'd think there'd be some money in solving those problems.
light24bulbs@reddit
Even then it's very easy to keep it modular. If you're writing code in a properly modular way, which you should be doing anyway (if you have a big enough project to be asking this question in the first place), then GitHub Actions makes it trivial to only re-run certain jobs based on what changed in which folders. It's pretty dang easy. The rest can usually be solved with parallelization.
Any problem that is tricky because of complex dependency chains will be made much worse by splitting into multiple repos. Truuuust me on that one, I've seen some dark dark times
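The folder-based job filtering mentioned above uses GitHub Actions' `on.push.paths` filter. A minimal hypothetical workflow (the repo layout and `make` target are made up):

```yaml
# .github/workflows/backend.yml
# Runs only when files under backend/ or shared/ change.
name: backend-ci
on:
  push:
    paths:
      - "backend/**"
      - "shared/**"
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make -C backend test  # stand-in for the real build/test command
```

The same `paths` filter works on `pull_request` triggers, and independent jobs already run in parallel by default.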
bwainfweeze@reddit
Mono repo, separate compilation units works pretty well.
myringotomy@reddit
Maybe if we had better version control systems this wouldn't be such a problem.
sanblch@reddit
I wonder if there are any significant advantages of many repos. Because with proper CI even non-crossing projects can co-live in a single repo.
Canthros@reddit
It probably depends on your toolchain, your org, and a bunch of other stuff. From working in a place where some projects were broken out to separate repos and some were not:
If nothing else, it makes some things you have to manage by convention in a monorepo, like file paths for organizing solutions, automatic or unimportant. You can handle all those things with the proper tooling, but that's not the same as them being equally easy or requiring equally limited expertise. And determining which approach is better for you is probably going to depend on a bunch of things that are specific to your situation.
Probably the best answer would be to stay consistent within your ecosystem. If you work at a place that likes monorepo(s), go that route and follow their standard and conventions, etc. If you work somewhere that's oriented around many repos, then try to fit your stuff into that approach, instead. As much as possible, try to go with the (local) flow.
edgmnt_net@reddit
Plenty of open source projects, including some of the largest such as the Linux kernel, are essentially monorepos and that works fine. They almost never really run into scale-related issues.
The more important issue is whether you can split your stuff into robust components with some reasonably-stable API boundaries that can be developed independently. Otherwise you'll end up with more, non-standard tooling just to manage a manyrepo that's more or less a pseudo-monorepo in fact. Many enterprise apps, if that's what this is intended for, do not seem in the right mindset for such an undertaking. You won't be able to split the frontend from the backend nicely in most cases, because they are not really independent. Good luck coordinating changes due to cross cutting concerns across a dozen repos with a complex dependency graph.
The issues you mentioned seem to be self-inflicted to a large degree. Many companies think they know better and reinvent fairly standard practice that's known to scale by doing stuff like: one big repo anyone can write to instead of forking, insufficient reviewing, lack of (dedicated) maintainers, people keep pushing untested changes to the CI due to architectural or mindset issues, no commit hygiene, Git host just squashes PRs into huge commits and so on. Yeah, Git is a bit scary to do properly, but maybe, just maybe... people can learn?
All this also relates to the debate regarding microservices, by the way.
idontchooseanid@reddit
Linux is not a mono-repo. It's just the kernel. Yes, it has many subsystems, but those are not API boundaries. Linux is very strict about not making anything internal to the kernel an API boundary. The monorepos at tech giants cross many API boundaries.
edgmnt_net@reddit
Indeed, Linux as a whole is not a monorepo, but it's useful to compare even the Linux kernel alone to enterprise projects due to its size and complexity. And if we look at the kernel and userland API boundaries, they tend to be much more stable, robust and generally useful (even the `cp` command copies files for a large variety of purposes; it isn't just ad-hoc glue for some specific functionality). Kernel maintainers are quite strict about accepting ad-hoc additions to public interfaces, aim to make them generally useful, and the ecosystem doesn't really depend on prompt merging of this stuff.
The question is how many of those API boundaries are actually necessary when it comes to enterprise projects. Are they essential or just self-inflicted pain? I've seen plenty of examples where some architect thought it was a good idea to have something along the lines of an auth service, a shopping cart service, an orders service and so on, along with just about any feature one can think of in its own service. And soon, any medium-sized app has tens to hundreds of repos and microservices, though it could have conceivably been done as a cohesive project and probably been much smaller. Another important factor is that many of these projects prefer to iterate very quickly and do not think about design sufficiently ahead of time, so the APIs are rarely enough to support new functionality, requiring more changes and more version bumps as things evolve.
The kernel could have also been one subsystem or even one driver per repo, but what would have been the point? Being able to share code and change internal APIs easily are the main points of a monorepo and a monolith.
Although, yes, as far as I heard, Google monorepos tend to shove a bunch of rather separate applications together and they're less about a unified codebase.
i860@reddit
All of them.
It doesn’t matter if you’re writing some “enterprise app” and not the Linux kernel. You should still approach this cleanly and not cut corners because doing so produces terrible technical debt and bad design.
We need to banish this thinking that just because something is written for non public use that all the tenets of good engineering and design get to be thrown out the window and a monolithic wall of garbage is acceptable.
edgmnt_net@reddit
What I meant was that the Linux kernel has no internal API boundaries, no stable internal APIs, since version 2.6 was released many years ago. But those enterprise projects often make tens to hundreds of internal services, each with its own APIs, and (perhaps unsurprisingly, given what I said) they still change often, and that change is a pain to coordinate. I do agree that public versus non-public does not matter.
i860@reddit
The reason the Linux kernel doesn’t have this stability internally is because it’s being maintained by a core group of engineers who are responsible for it. I’d argue they should have some semblance of a contract, even internally (and they likely do - it’s just not overtly stated) but regardless it’s still maintained collectively by the same working group.
Within a company (not a fan of "enterprise") there are almost always separate teams responsible for different parts of the organization and the components used within it. Those teams wanting to write and maintain per-project APIs, so as to promote healthy abstraction, encapsulation, and separation of concerns, is a good thing. The fact that it's painful due to having so many of them is simply a byproduct of having so many of them. Placing it all in a monorepo in some kind of attempt to shortcut this process is not the solution. The process exists for a reason.
edgmnt_net@reddit
What's stopping companies from doing the same thing, though? They also have fairly stable positions, at least considering engineers higher up in the hierarchy. Also, it's not like Linux doesn't get a lot of drive-by contributions, there are plenty of non-core devs working on it at any given moment (thousands [1]), including teams of employees from companies which intend to merge stuff upstream.
Frankly, I think it's more of a business vision and talent skill issue. If it's "yet another CRUD" built by massively scaling out dev work to contractors and juniors isolated in team silos, then I kinda get why it's a hard sell. But people learn and I know I've been on both better and worse projects. Building up walls makes learning even less likely to happen. And looking at the success rate in the wild, it doesn't seem good lately.
[1] https://lwn.net/Articles/936113/
i860@reddit
What’s stopping companies from having everyone use the same repo and be cross-concerned with the inevitable massive scope of a shared platform? The fact that it absolutely does not scale for anything non-trivial and that a “platform” usually involves multiple disjoint projects written in a variety of languages and implementations.
The reason it “works” in the Linux case is that the scope is kernel, subsystems, and drivers. The core maintainers perform a lot of herding to ensure “outside” commits are shepherded appropriately and not every commit involves changing an internal API at all.
edgmnt_net@reddit
I'm not really suggesting keeping separate projects together in the same repo. Multiple repos are fine for that, it's just that despite widespread use of microservices and manyrepos, many typical SaaS platforms just aren't a collection of separate projects, they're all cogs in the same system and are highly coupled. No less coupled than drivers in the Linux kernel to a common driver abstraction and involving various cross-cutting concerns and shared code. Once projects go down crazier paths like putting individual components like auth or orders or shopping carts into separate repos, doing anything becomes extremely involved. They need to think and carefully consider which API boundaries they can afford to stabilize and support before any split can occur.
That being said, if Google keeps protobuf tooling, an open source RDBMS fork, a message broker and some VM management tool as separate projects, that's fine and expected. It's probably not a good idea to put them together, in fact it's downright counterproductive just to simplify checking out the repo.
On the other hand, I find it rather unsettling when people split even frontends and backends into separate repos. In most cases, these things are very tightly coupled and should remain in sync, especially if you want to iterate rapidly and not care too much about future-proofing the design upfront. The fact that they're written in separate languages or that you have separate teams really doesn't matter all that much. A monorepo and appropriate technology/tooling can make refactoring easy, even on a large scale, without coordinating PR merging and bumping versions across a bunch of repos.
Sure, if you're willing to design your stuff upfront and have them evolve totally separate, you can do multiple repos, but I find most companies are unwilling to put in the required effort and cope with the friction. They'd really have to consider them as separate projects, the same way you don't go making changes to open source libraries or remote proprietary services you're using every day for every feature.
i860@reddit
I think for the highly-intertwined, innately coupled case it's more fine than not. However, most people are arguing for monorepos containing totally unrelated code, but code which is a dependency, such that they don't have to bother with release management or separate CI for the parent projects they depend on to implement lower-layer functionality.
And in the case of FAANG companies they really are throwing the entire kitchen sink in monorepos. I know it firsthand.
BenE@reddit
Not only that, but there's a lot of relevant history behind the choice of Linux architecture. Linux is based on Unix, and Unix was an effort to take Multics, a much more modular approach to OSes, and re-integrate the good parts into a more unified, monolithic whole. Even though there were some benefits to modularity (apparently you could unload and replace hardware in Multics servers without a reboot, which was unheard of at the time), Multics had been deemed over-engineered and too difficult to work with. Brian Kernighan said Unix was designed as "one of" whatever Multics was multi of.
The debate didn't end there. The Gnu Hurd project was dreamed up as an attempt at creating something like Linux with a more modular architecture. Overly breaking things into pieces seems to be a common trap for engineers.
It's Unix and Linux that everyone carries in their pockets nowadays, not Multics and Hurd.
snarkhunter@reddit
Just do both
bjeanes@reddit
Not as impossible as it sounds: https://github.com/josh-project/josh
SoulsBloodSausage@reddit
Whatever you do, don’t use git submodules.
BasicDesignAdvice@reddit
Never used submodules but why are they so bad?
I currently have an initiative to create a monorepo for our protobuf files (just those files). An engineer brought up submodules and others were wary, but we didn't reach consensus.
SoulsBloodSausage@reddit
Just think of it this way. Most devs never bother to learn more than push, pull, and occasionally merge. For good reason. They’re relatively simple and easy to manage.
Submodules are pretty much the opposite. Not simple at all. Meaning it'd be hard to get right.
Not saying that’s necessarily a good reason not to use submodules but I’d rather err on the side of caution
Tiquortoo@reddit
The submodule CLI API is crap too. Why do you init a submodule that exists, but init a repo that doesn't? From the start the interaction is awkward.
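For anyone who hasn't hit it: the classic init-after-clone gotcha looks roughly like this (toy local repos; `git init -b` assumes git 2.28+, and modern git additionally requires `protocol.file.allow=always` for local-path submodules):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# A "library" repo that will become the submodule.
git init -q -b main lib && cd lib
echo hi > README
git add README && git -c user.email=a@example.com -c user.name=a commit -q -m "init lib"
cd ..

# A parent repo that vendors it as a submodule.
git init -q -b main app && cd app
git -c user.email=a@example.com -c user.name=a commit -q --allow-empty -m "init app"
git -c protocol.file.allow=always submodule add "$tmp/lib" lib
git -c user.email=a@example.com -c user.name=a commit -q -m "add submodule"
cd ..

# A fresh clone leaves lib/ EMPTY until you remember the extra step.
git clone -q app app2 && cd app2
test ! -e lib/README
git -c protocol.file.allow=always submodule update --init --recursive
```

`git clone --recurse-submodules` avoids the second step, but only if everyone remembers to pass it; that asymmetry is a big part of the complaint.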
CrayonUpMyNose@reddit
There might be an interesting experience here, care to elaborate?
SoulsBloodSausage@reddit
Ehh not much to say. Last company I worked for relied heavily on submodules instead of mono repo. It was a massive pain in the ass to push a full fledged feature because sometimes you’d have to break it up into multiple PRs across multiple repositories.
snarkhunter@reddit
In addition to git, I get to use Perforce Helix Core, which has Streams, which are what we all wish git submodules were.
drakgremlin@reddit
Usually along team or product lines.
beefsack@reddit
The worst one is actually when companies try to put elements of a tightly coupled application into separate repositories, then do so much gymnastics to try to keep changes compatible between them.
Saetia_V_Neck@reddit
This is why I’m very pro-monorepo. In my experience it is much easier to handle disparate components in a single repo than tightly-coupled pieces in different repos. Having to coordinate pull requests and deployments is a massive headache and time sink.
valarauca14@reddit
> Let's keep our protobuffers in 1 repo that stores all our bindings!
> We'll add a bunch of new fields, rebuild the bindings, and deploy!
> Why isn't application B, C, and D responding to the new fields?!? Why aren't they sending the new fields?!?
> > Did you rebuild them so they have the latest bindings?
> That is $other_team's job.
goqsane@reddit
You’re lucky you are using ProtoBuffers. My insane inept “ChIEF aRcHiTeCt” made their own freaking standard and serializer/deserializer for C#/C++ but at least I guess he adhered to the main problem that you’re pointing out here. One central repo. No local pulls and rebuilds. :D
glaba3141@reddit
Protobuf is garbage so I don't blame them
safetytrick@reddit
The kind of insanity you can get away with when you are focused on solving your own problems instead of just doing what others do...
Your number one priority should always be solving your problems, somewhere after that is following well known industry patterns.
Moonshoedave@reddit
Going to start using $other_team at work lol
Worth_Trust_3825@reddit
Even if they did read them without any changes, would they be supported?
valarauca14@reddit
The only way they'd get transferred along is if the original protobuf message was copied, instead of, say, fields being individually assigned after a new object is created.
etcre@reddit
Forwarding this to my management
st4rdr0id@reddit
Absolutely. Creating libraries or modules when you just need packages is overkill. Downloading a whole suite of projects when you only need to work in one is overkill. Both are equally bad.
Monorepos don't make sense most of the time, though. In my entire career I've encountered only a single case where a monorepo might have made sense. In the end it is all about reducing complexity, bloat and cognitive load.
gmes78@reddit
Git lets you clone only a subset of files for this use case specifically.
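Concretely, that's `git sparse-checkout` (the cone-mode porcelain is git 2.27+; the toy monorepo below is made up). For truly large repos you'd combine it with a partial clone via `--filter=blob:none`:

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# A toy "monorepo" with two projects.
git init -q -b main mono && cd mono
mkdir -p frontend backend
echo ui  > frontend/app.js
echo api > backend/main.go
git add . && git -c user.email=a@example.com -c user.name=a commit -q -m init
cd ..

# Clone it, then restrict the working tree to backend/ only.
git clone -q mono work && cd work
git sparse-checkout init --cone
git sparse-checkout set backend
ls  # frontend/ is gone from the working tree; backend/ remains
```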
hippydipster@reddit
I doubt creating modules is overkill. If the environment warrants a serious discussion about monrepo vs multiple repository, then I doubt modularization is overkill.
disposablevillain@reddit
I don't understand why this is so common.
Backson@reddit
Hmm, there was another thread a while ago where people were hating on monorepos and I was like "why though, my team has 4 repos for what is basically a single application, it sucks" and I got downvoted and people were like "no, monorepos are SHIT, how DARE you!" So I totally get that people splitting their app into a million repos would be a common problem.
Naouak@reddit
Because of the broken window effect, and because people tend to break windows all the time if no one is watching them.
The broken window effect is basically telling us that the more something is considered bad, the easier it is to introduce even more bad things. https://en.wikipedia.org/wiki/Broken_windows_theory
It's really easy to introduce coupling between two services, really easy. It's also a lot faster to introduce coupling than to try to make everything less coupled. It's usually faster to code an HTTP call to a service than to emit an event and consume it (not the best example, but the most common culprit I've seen regarding coupling). Once it's been done, it's even easier to continue like that. And then one day the coupling is untenable and everything needs to be updated for any change.
Add to that that people are really bad at separating concerns in general, and there's a tendency not to redefine things during their life. You could have a blog service at first; then, once it becomes too complex, that blog service could become just a blog-listing service while a blog-article-management service is created on the side. You would almost never see developers telling you that the blog service is no longer a "blog service". When something like that happens, people tend to put features in the wrong place and slowly create more coupling.
EliSka93@reddit
Wasn't broken window theory basically proven to be racist bullshit?
Naouak@reddit
In the context presented here, it has nothing to do with criminality and the criminality application of that effect are not relevant to my explanation.
The concept has been used many times outside criminology to explain that you try to keep something pristine if it already is, and if it isn't, you tend to be more careless about its final state. You can probably observe this in your everyday life: whenever you get something new and expensive, you're usually very careful with it for the first few days to keep it pristine.
itsgreater9000@reddit
in terms of an economic theory, i think yeah, it's pretty useless. i prefer tragedy of the commons for what they are describing
defmacro-jam@reddit
Build time. If you have a monorepo that takes an hour to build -- you may be able to bring build time down to a small fraction of that by only building the part that changed.
i860@reddit
Imagine if you took it a step further and separated that part that changed into its own separate repo because it’s its own separate thing.
defmacro-jam@reddit
That's what I meant.
TimeRemove@reddit
Because people will follow "best practices" without a full understanding of the CONTEXT. The core idea behind micro-architectures, including isolated repos, is that there is an abstraction layer making different parts of the system somewhat to very uncoupled.
A classic way of doing this is via WebAPI for abstraction, with an API that they're required to keep backwards compatibility for (or, failing that, migration paths/notice of end-of-life) for their [internal] customers.
However, in the real world for performance/ease/speed you'll have different parts pass rich objects or leak internal workings, resulting in tight coupling and needing to move together.
BenE@reddit
This is not making things uncoupled. It's coupling them through a flaky, broadly scoped layer without any static checks, the worst possible type of coupling.
i860@reddit
You’re basically arguing that it should be as coupled as possible so you can test it all at once.
Terrible. Test things in isolation. Separation of concerns!
CherryLongjump1989@reddit
If anything, a bad API or poorly maintained contract might make the communication fragile, but that's not more coupling; it's just poor design.
You're basically just promoting tight coupling. Words have meaning. Decoupling is when you avoid direct dependencies between two systems. You are still allowed to have a contract, but you're not allowed to have one component depend directly on the other at compile time.
i860@reddit
The last paragraph is a sign of absolutely bad engineering regardless of it being “well that’s just the way we do it here” and why I’ve been harping on monorepos. They encourage and enable said engineering to continue rather than providing the healthy boundaries that multiple modular repos do.
Things can and will move independently of each other using interfacing layers to pass data and programmers need to keep this in their mind at all times. This doesn’t mean supporting 20 major versions back but it does mean not side-stepping or ignoring it.
moonsun1987@reddit
bounded context IS hard, and it is not supposed to be a job for programmers. These are hard business decisions that the business has to think long and hard about, not bang out in a two-week sprint.
Ravek@reddit
It seems like a lot of people just are unable to project the consequences of their current decisions into the future. Let’s tightly couple the code for four distinct features maintained by four different teams with different management structures, geographical location, and user needs. What could go wrong?
amestrianphilosopher@reddit
Because on average our profession is filled with amateurs. The guy who has piled on technical debt like this at my workplace is worshipped as a god by product and management. He’s driving nearly all of us to quit
nightofgrim@reddit
I have one of those too! All the engineers around him know it, but the suits up top think he’s amazing because he created a thing that every blog in the tech world will tell you not to do, but he did it “better”.
(Staying vague for reasons)
jamesj223@reddit
He built the torment nexus?
EliSka93@reddit
No, just the orphan crushing machine.
davehax1@reddit
I heard it was the hyper orb of agony. Pure rumour, of course
cheapskatebiker@reddit
Yes, I see that all the time: one guy/gal delivers tremendous value to the business and every other nerd complains about 'technical debt', 'support overhead', and other mumbo jumbo.
Why can't they just shut up and show up in the office like real people?
Don't be daft it's /s
fiah84@reddit
hey now I get paid for my amateurism, that makes me a professional!
946789987649@reddit
ashsimmonds@reddit
Sometimes management outsources - eg one project I was tech lead on we needed some front-end work done, I was busy building the API and stuff. Boss found some cheap folk O/S but we couldn't expose our business logic, so had to split into multiple repos. PITA but it worked out ok.
ImNaughtyShiba@reddit
Monorepo A, with packages AA and AB.
Repos B, C, D, E depend on AA.
AB depends on packages B, C, D, E.
fml. And literally 0 time dedicated from business for cleaning up.
Reverent@reddit
Pretty easy way to separate.
Do two services interact via an API? Yes? Separate repos. No? Then no.
jet2686@reddit
I wish I could give you multiple upvotes!
Evilan@reddit
Our team has found that multi-repo works best for splitting out technologies (Client in one repo, web UI in another, backend in a third, etc etc). However, we do use a monorepo for splitting up the modules that make up our multi-repo strategy (ie our backend has a core module, data module, external module, api module, etc etc).
It's probably not perfect, but it works pretty well for our use-case.
lIIllIIlllIIllIIl@reddit
How do you handle changes to span the backend and frontend? Multiple PRs?
Evilan@reddit
Yep, if a change that affects the backend also affects the frontend, we make multiple PRs depending on what is impacted.
At the same time though, our modules limit the actual scope of what is impacted across those repositories. We also use GitHub for our repository manager and it makes linking to other repositories and PRs ezpz
New-Championship7579@reddit
I’m not the person you asked, but I’ve found that I prefer having changes split across multiple repos because it forces you to break them up into digestible chunks which results in better code review feedback. It’s easy to link a related PR in another repo if someone needs it for context. When rollout of those changes needs to be coordinated across multiple repos, feature flags are your best friend.
pdpi@reddit
Either is fine, as long as you fully commit to your choice, and invest in appropriate tooling. As it stands, publicly available tooling (either open source or commercial) for a multi-repo setup is much more mature, but my experience with well-setup monorepos has been pretty stellar.
_Pho_@reddit
Yup.
I worked at a larger enterprise where they had a couple of devs working full time on ensuring the monorepo's stability and DX. It was an insanely good idea and paid huge dividends.
baseketball@reddit
Unfortunately, leaders at smaller shops look at big tech and think we should do what they're doing, without realizing the scale and resources required to make these practices make sense.
tristanjuricek@reddit
I’ve struggled enough with both systems, and am currently in a hellscape of a monorepo, to know that “mono vs many” is rarely the significant decision. I wish more places would just monitor lead times (from approved commit into production). It’s rare that the choice of monorepo or many repos is really a major factor; instead, it’s random manual steps, terrible testing environments, etc, that always cause the real problems.
DayByDay_StepByStep@reddit
Weird, I have found the exact opposite to be true. Could you list a few of these multi-repo tools? I haven't had much luck.
CallOfCoolthulu@reddit
Sourcegraph for search and batch changes, Renovate for dependency management.
mixedCase_@reddit
This. Monorepos can be amazing but they need the investment to back it up or it goes to shit if you "move fast and break things". Unless common sense can be enforced and mandated top-down with as much automation as possible, it's much better to let the shit-shovelers have their own repos with their own (lack of) standards so each repo has their own quality tier and inexperienced devs with big mouths don't bring down the quality of everything else.
KevinCarbonara@reddit
Yeah, there is. Don't use monorepos. The big companies you've heard of using monorepos have a lot of software to allow them to treat monorepos like many repos. And some of them are actually using many repos and just calling it a monorepo.
i860@reddit
The most hilarious part is they then write all of this tooling to basically “invert” the monorepo into separate smaller parts but never actually realize they’re reinventing the wheel, badly. They become pot committed to the idea and the entire thing turns into a massive sunk cost.
“But I can clone everything at once!”
“But I can search the whole repo for interdependent code!”
“But I can change multiple things at once!”
Always a symptom of a malady somewhere.
KevinCarbonara@reddit
Okay, where can you not do this? I can search like, all of github. And I'm pretty confident github is larger than your company's monorepo.
i860@reddit
That's my point. You can still do this through any number of mechanisms, even without GitHub involved.
All arguments for monorepos are inherently rooted in convenience and bad design.
enaud@reddit
The best way is to have 2 siloed teams in your company, one using a monorepo and the other using micro repos. Eventually the company will shrink to 1 team that has to context switch between both
dinosaursrarr@reddit
Hello colleague
Worth_Trust_3825@reddit
Oh. So that's why I had such combinations of repositories.
light24bulbs@reddit
😆 did this happen to you, do you need to talk about it?
toddffw@reddit
Is the git history in the room with us right now?
HTTP404URLNotFound@reddit
I guess we all have the experience
urbrainonnuggs@reddit
Or better, your company keeps buying other companies with completely different tech stacks in every different cloud possible so you force every team to start using a terribly over complicated hybrid cloud tool for deploys!
darkpaladin@reddit
Are we co-workers?
paralio@reddit
yes, many repos. no doubt.
light24bulbs@reddit
Monorepo is far better for tightly integrated code, 100%. You should fucking never split modules of the same thing or even documentation between repositories. It sucks balls if you do, and it's fine if you don't so mono repo wins
i860@reddit
How is that really a monorepo then? The code is highly related and effectively part of the same repo. A monorepo involves multiple projects of sometimes completely unrelated code.
light24bulbs@reddit
Yeah wait until you see people try to split interrelated projects across repos because they think that's the same thing as modularity, it is a nightmare.
A good example of what makes something a monorepo would be having the front end and back end and documentation all in the same repository. Or let's say you have two back end services and they share some code. They go in the same repository too. That's a monorepo. It's not just one piece, it's the product
i860@reddit
As to your first point, that's just a dumb way to do it, with the caveat that just because something is related doesn't necessarily mean it goes in the same repo. You absolutely can have front and back end split off separately, or even client vs server for that matter.
One of the things I don't think you're considering here is that software that works with other software is inherently versioned. It has to be compatible with older versions that exist for a multitude of reasons. One doesn't get the luxury of just updating everything involved at exactly the same time.
Separate repos keep this requirement honest.
lIIllIIlllIIllIIl@reddit
You're saying that because that's what you're familiar with.
If I refactor a function in my codebase, I don't need to make sure the refactor is backwards compatible. I just update it. I don't gain anything from making everything backwards compatible or versioning my change.
Backwards compatibility is only required when you don't control all the parts and you can't update all the parts atomically. Maintaining backwards compatibility adds a lot of complexity to a project and slows down things a lot. If you value stability over all else, it might be a good deal. If you value time to market, it definitely isn't a good deal.
You can avoid backwards compatibility issues by having atomic builds and deploys inside a monorepo.
Monorepos aren't dishonest about needing versioning, they literally don't need it.
i860@reddit
No. This isn't just something I'm familiar with. It is a core FACT that this is how software operates. You have no guarantee of being able to deploy all involved parts in one transactional swoop. It is never atomic. As such, software has to be written to account for use of older versions for some reasonable period of time, and not depend on the impossible goal of everything happening all at once (which is a terrible way to safely deploy things as well).
I'm not arguing about a single function refactor. That should be invisible to your consumers anyway. But if you believe refactoring means changing a bunch of arguments and return values such that you need a monorepo to pull it off, then it's just busted. You must provide compatibility layers until everything has converged, at which point you can clean it up.
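The compatibility-layer idea can be sketched in a few lines. Below is a minimal Python illustration, with hypothetical function names, of keeping an old entry point alive until every consumer has converged on the new one:

```python
import warnings

# New API: the refactored function with the signature we actually want.
def fetch_user(user_id: int, *, include_profile: bool = False) -> dict:
    user = {"id": user_id}
    if include_profile:
        user["profile"] = {}
    return user

# Compatibility layer: the old entry point stays until every consumer
# (possibly running an older release) has migrated to fetch_user.
def get_user(user_id: int) -> dict:
    warnings.warn("get_user() is deprecated; use fetch_user()",
                  DeprecationWarning, stacklevel=2)
    return fetch_user(user_id)
```

Once telemetry (or a grep of the consumers) shows nobody calls `get_user` anymore, the shim can be deleted in its own change.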
lIIllIIlllIIllIIl@reddit
Most monorepos are modular monoliths. It's all the same project, but there are multiple parts that may be separated in multiple packages, written in multiple languages, use different tech, etc.
For example, you might have a Go backend with a JavaScript front-end, and one performance-heavy backend module written in Rust. You want your developers to be able to build and run the entire thing during local development using a single command.
That's what a monorepo is.
i860@reddit
Most monorepos at large companies are not actually modular monoliths. They're massive repos with every piece of software involved in the "platform" checked into a single repo.
This isn’t something where you have a client/server code base with an agnostic network accessible API and multiple per-language implementations in the same repo (IMO even those should be split out) but instead every single piece of software involved in the platform in the same giant repo. They then write tooling to make working with this not be a total nightmare or wall of noise.
And then they try and argue that this is actually somehow sane. It never is.
BenE@reddit
It's sorta nuanced. My take is that you want to reduce code entropy. This means defaulting to monorepos and monoliths early on, in order to get tightly scoped, hierarchically organized logic that reduces the surface for various problems. Then later, maybe carefully break out parts that could clearly benefit from being separate, but only after having hardened them, and always being aware that you are broadening their scope, that you are coupling them through less reliable, less statically checked global layers, and that they will be more difficult and dangerous to change once they are at that layer, so they have to be more mature.
https://benoitessiambre.com/entropy.html
i860@reddit
You’re arguing for tightly coupled code from the start to reduce “code entropy.” On top of that you’re arguing that breaking them out to be less coupled is “scary” because you can’t test everything all at once.
I have no idea why this pathological approach to engineering keeps being pushed. If you don’t have separation of concerns then you have brittle badly designed code. The ability to test it all at once to find problems is a SYMPTOM not the solution.
BenE@reddit
No I'm arguing for tight, hierarchical scopes, not tight coupling. Ultimately, coupling is more determined by the domain requirements anyways. Doing it via apis and queues does not result in less coupling.
Having lots of broadly accessible interconnected pieces makes for a chaotic and unpredictable code base. If you whiteboard this it actually looks like a plate of spaghetti.
Hierarchically organized code with parts tightly scoped to their relevant areas is much more orderly.
Testing units in isolation can be achieved by slightly broadening the scope of a unit when necessary (after weighting the trade-offs of doing so).
pabs80@reddit
It depends a lot on the tooling available and your organization. At my previous employer, we had separate repos for the frontend and backend of the same app. I combined them and it saved me from a lot of problems where we had to keep coordinating pull requests. But I wouldn’t have put the entire company’s software in only one repo, that would have been awful. We were using Github. At my current employer, a very large tech company, there’s a monorepo for the entire company, and that works out very well and you can configure things by folder, stuff that in GH would be at repository level.
hammonjj@reddit
Break it up along team boundaries and have mono repos within a team. Releases get so boned when you have to push multiple repos for a single feature.
vplatt@reddit
One could rationally argue that a given repo should correspond to one of three things:
A set of files that get used by pipelines across multiple repos (not binaries!)
A project that builds to a single deployable service or app.
A project that builds and publishes a binary for later use in a dependency management tool chain (e.g. GitHub Releases with Artifactory)
But... reasonable people can disagree on that too. Barring that, the number of repos should be determined by weighing the amount of chaos you want to endure in branches/PRs against the extra pain of dealing with extra repos. I mean, if you're not going to use a solid standard for this, then at least the subjective feel has to be weighed.
The only thing I'm absolutely convinced of now is that, especially with PRs and other peer review processes, monorepos shouldn't be the default anymore. It's simply too chaotic to allow multiple teams with multiple ongoing reviews and PRs to be operating out of the same repo or ADO project.
i860@reddit
Monorepos were created because coordinated work across multiple independent but interrelated repos is fundamentally hard. The solution to that was to throw everything into the same repo and declare "success."
People are paid multiple hundreds of thousands of dollars a year to fundamentally regress our approach to software engineering because they cannot be bothered to do all of the hard stuff that actually makes for good engineering.
After it all implodes under its own weight they’ve usually left the company by that point.
ososalsosal@reddit
If code is being shared by multiple products, then why merge those multiple products into one codebase instead of splitting out that shared code into a module that can be maintained separately?
doktorhladnjak@reddit
Mainly because it forces you to keep the shared code in sync with the code using it. This comes with an overhead cost, but that cost does not go away if you keep them separate; it is just deferred until later, at which point resolution may be more complicated and more expensive.
Consider the case where you have 3 components, each in their own repo, that depend on some shared code. You need some change in the shared code on version 1.1 for one of the components. You code it up as version 1.2, vendor the change, integrate it into the component. Everything is good!
Three weeks later, someone needs to make a change to the shared code for another component. But it turns out your change breaks their code. Your change didn't consider their case and needs updating. Fixing it now blocks that other change from happening, so they must choose between two options:
1. Fix or otherwise integrate your change before making the other change as 1.2.1.
2. Add the change as a patch versioned 1.1.1. The team is unblocked and can move fast, but the 1.2 integration still needs to happen.
Option 2 means a form of tech debt is building up. If this gets delayed repeatedly, they might be on 1.1.5 while the mainline shared code is up to 1.4. If those patch fixes conflict with the minor version changes, it may be even more painful to resolve everything. In the worst case, the codebase eventually becomes forked because the resources are never available to resolve the two. Bug fixes get missed or must be done twice from scratch.
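The fork scenario described here is really a constraint-intersection problem: once two components pin incompatible ranges of the shared library, no single version satisfies both. A rough Python sketch (the tilde-pin semantics below are a simplification for illustration, not any real package manager's rules):

```python
def parse(v: str) -> tuple:
    return tuple(int(x) for x in v.split("."))

def satisfies(version: str, pin: str) -> bool:
    # "~1.1"-style pin: same major.minor, any patch at or above it.
    base = parse(pin.lstrip("~"))
    v = parse(version)
    return v[:2] == base[:2] and v >= base + (0,) * (3 - len(base))

def resolve(released: list, pins: list):
    # Pick the newest released version every component accepts, if any.
    ok = [v for v in released if all(satisfies(v, p) for p in pins)]
    return max(ok, key=parse) if ok else None
```

With `released = ["1.1.0", "1.1.1", "1.2.0"]`, a lone `"~1.1"` pin resolves to `"1.1.1"`, but pins `["~1.1", "~1.2"]` together resolve to `None` — the codebase has effectively forked.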
ososalsosal@reddit
Package management usually includes versioning. When it's time for everyone to update they can. Handle it like any other library you'd use that you have no control over.
If significant code is shared between different products then discussions need to be had because the tight coupling you're talking about is a bad idea no matter how you handle it at the repo level. The shared code should be a library with an api and documentation, like any other library
doktorhladnjak@reddit
If your organization has the discipline to do that, it can work relatively well. The longer iterations can be intolerable for a lot of organizations though. Like anything, there’s tradeoffs involved where different organizations might fall on one side or the other.
i860@reddit
You’re basically arguing for bad engineering over good engineering because your organization has no patience for the latter and demands the former.
ososalsosal@reddit
My org is reasonably slow moving, but key products move very fast as we catch up to the promises that sales make to customers...
So we tend to YOLO proof of concept stuff until it works well enough to deploy, then at some point a similar thing will be built in another product, one of their devs will remember the same function in the other product (some of us tend to shift around every 3 months or so between projects) and steal that code from the other repo and integrate it.
If the feature is nontrivial, at some point we'll decide it is its own internal product and should be maintained as such. Then we flesh out and settle on an api for it and make it a module. Any updates that need to happen on it will most likely have already happened during the development of the features that rely on it, so the api is usually pretty stable and all that remains is maintenance. If it's not stable then it'll probably end up as 2 separate chunks of code in 2 projects that will gradually drift apart over time.
i860@reddit
Nothing about a monorepo fixes any of this. Just because it “forces” you to do it all upfront doesn’t mean the same fundamental problem isn’t there.
Your issue is one of integration and backwards compatibility. If your change broke an unknown use case it’s because your use cases aren’t really documented anywhere. Chucking it all into the same repo so you can grep everywhere for it at once isn’t the solution whatsoever.
light24bulbs@reddit
Because now, when you make a change in that shared code that you want to affect your other code, you have to put that shared code through a release process and get it released just so you can consume it in the product you're actually trying to improve. It's a state-synchronization nightmare as soon as there are even two modules in the chain, let alone three. It's not continuous integration, and it literally cannot be. It doesn't scale.
The answer is to do exactly what you just said but put them all in the same repo, which solves both problems admirably.
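The publish cascade being complained about here is just a transitive closure over the dependency graph: everything downstream of the changed package must be republished, in dependency order. A rough Python sketch (assuming an acyclic graph; names are illustrative):

```python
def rebuild_order(deps: dict, changed: str) -> list:
    # deps maps each package to the set of packages it depends on.
    # Everything that transitively depends on `changed` must be
    # republished, and each publish must see its fresh dependencies.
    dirty = {changed}
    grew = True
    while grew:
        grew = False
        for pkg, uses in deps.items():
            if pkg not in dirty and uses & dirty:
                dirty.add(pkg)
                grew = True
    # Topologically order the dirty set: a package is placed once all
    # of its dirty dependencies have been placed before it.
    order = []
    while len(order) < len(dirty):
        for pkg in sorted(dirty):
            if pkg not in order and all(d in order or d not in dirty
                                        for d in deps.get(pkg, set())):
                order.append(pkg)
    return order
```

For a chain `{"B": {"A"}, "C": {"B"}}`, changing `A` forces three sequential publishes: `["A", "B", "C"]` — which is the multi-repo release treadmill in miniature; in a monorepo the same change is one commit.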
ElecNinja@reddit
Yeah encountering that now in my job and it's pretty annoying to have to wait a day or two so that I can start using the changes I made in the other repository.
i860@reddit
Just wait until you’ve been part of a massive production outage. Maybe you already have and don’t think it’s a big deal. It is a big deal - especially when you’re working for a very large or important company. Release processes exist for a reason.
There are ways of handling things that don’t require you to wait until the latest change is released in order to work on downstream code. It’s called a development, staging, or integration environment.
i860@reddit
Yes everything you just described in the first part is what is required as part of good software engineering practices. Yes, it’s hard. No, a monorepo isn’t the fix - it’s a shitty hack.
Tiquortoo@reddit
Everything should align with communication and permission structures. Multiple products don't always do either and even less rarely both.
dylan_1992@reddit
With package managers why would we need a monorepo?
doktorhladnjak@reddit
I worked at a company that did this with thousands (yes, thousands) of repos. It was a nightmare.
The biggest problem was that you'd vendor in some internal library, only to discover it needed a new version of some dependency. Then that would conflict with some other dependency that still depended on an old version of something. Sometimes some legacy library was needed which was no longer supported and therefore had no plans to upgrade. So then you'd have to decide if you wanted to spend the time to fix it, and risk becoming the new owner by being the last to work on it.
The second big problem was that people would make breaking changes all the time that weren’t properly communicated through semver. So fixing some small bug affecting your service meant having to update your code to keep using the library. Library owners didn’t have the time to be doing careful patch releases on some legacy minor version. They’d just make all changes on the latest minor version then cut a new patch.
At least in a big company, these two problems are solved by a monorepo. There's one version of every dependency. When upgrading, you have to upgrade all the code that depends on it. Similarly, if you change your library, you have to fix user teams' code. You can't just throw it over the wall for them to deal with later.
The downside is that making these changes becomes much more expensive. But it always sort of was. Monorepo just forces you to deal with it immediately.
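The "one version of every dependency" policy mentioned above can be enforced mechanically: scan every package's pins and flag any dependency pinned at more than one version. A minimal Python sketch over hypothetical per-package manifests:

```python
def check_single_version(manifests: dict) -> dict:
    # manifests: package name -> {dependency: pinned version}.
    # Returns the dependencies that violate the one-version policy,
    # mapped to {version: [packages pinning it]}.
    seen = {}
    for pkg, deps in manifests.items():
        for dep, version in deps.items():
            seen.setdefault(dep, {}).setdefault(version, []).append(pkg)
    return {dep: versions for dep, versions in seen.items()
            if len(versions) > 1}
```

A CI job running this check is essentially what "there's one version of every dependency" means in practice: the build fails until every package agrees.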
dylan_1992@reddit
So what’s the difference between a mono repo and setting all dependencies to pull in the snapshot in your packager manager?
doktorhladnjak@reddit
You still have to package code before it becomes available. It still means multiple commits in different repos to make a change, as opposed to potentially one atomic commit.
i860@reddit
The "atomic commit" that hits multiple projects at once in a monorepo is such an obvious symptom of a bad approach. You don't need to be doing this to "make a change": you need to be making the core change and then updating the "client" repos after the fact, but before they're fully updated your core change needs to be backwards compatible with potentially older versions.
Imagine if every change in the Linux kernel involved updating all of GNU user land at the same time and they all had to be deployed together. Most sane engineers would argue that’s completely insane and yet here we are.
Forbizzle@reddit
This is honestly a major skill and culture issue.
i860@reddit
That’s heavily associated with your typical monorepo users.
i860@reddit
Yep. This is because monorepos encourage terrible fucking engineering where cowboy engineers just assume everyone is using the latest HEAD version of everything everywhere. If you have separate repos you’re forced to think about interfacing and this is why bad engineers like monorepos: proper abstraction and interoperability is hard.
light24bulbs@reddit
Because synchronizing an atomic state across multiple repos and hoping your package manager just solves the problem for you is a fucking nightmare. If you don't have a monorepo you are not continuously integrating.
Have fun making sure that a change in A is published so that it gets consumed by B and then making sure that B rebuilds so that it consumes A and then having C rebuild because it depends on the new version of B which depends on the new version of A. You're fucking fucked. It. Is. A shitshow
i860@reddit
Your software design sounds absolutely terrible.
induality@reddit
Because source code management is an easier problem than dependency management
ArtPsychological9967@reddit
After working on a massive Go monorepo. Never again.
doktorhladnjak@reddit
I can assure you working in a company with many Go micro repos is no picnic either. The real issue is large code bases require good tool support to be usable.
i860@reddit
The real issue(s) tend to be tight coupling and lack of proper encapsulation and separation of concerns. Multiple repos make this highly visible (and something that should be fixed!). Monorepos make the problem “go away.” The problem is still a problem.
bwainfweeze@reddit
Massive code base is its own problem.
Fantastic_Credits@reddit
really depends on so much.
If you ask most architects especially if they aren't writing code themselves they will always want a solution chunked as small as possible from the get go as that gives them flexibility to break up applications from an infrastructure perspective.
In the end it comes down to your organization.
Do you plan on sharing or passing off components to another business entity soon?
The real benefit to a separate repo is portability. If you're making something like an npm, nuget, maven, or whatever package, it may make way more sense to place that in a separate repo. Some other items, like a class library or anything that isn't the primary application(s), might be better off living in a separate repo.
Is your CI/CD solution or Development Operations silo capable of handling a monorepo?
I encounter a number of companies that have an unsophisticated devops team who owns the CI/CD process, and a monorepo might be beyond their ability to ingest; at times that silo or a COE has an approved process that doesn't account for this type of repo. Also, side note: please stop siloing DevOps, and stop hiring people who haven't been in software development for at least 5 years as devops people. It's not an entry-level position; it requires an understanding of software development. It's a senior developer position and a training/teaching role, not a new silo.
Does your device handle multiple IDE instances well?
This one may sound stupid, but I've seen it before. If the company gives their developers a potato, then breaking up a repo makes it half impossible for people to do their job. A monorepo means 1 IDE window (sort of) and just requires less computer.
Do the tools of your language/framework/tooling support monorepo features?
Most dev languages and frameworks easily support this, but for some it's not as easy. Make sure whatever you're working in has good support for it.
How big is your organization?
Different architectures, languages, and tools work better for different organization sizes, and how you store your code is no different. If you're a small shop with a short list of products you support, then monorepos are likely the way to go just for convenience's sake. Just ensure you're using coding best practices, implementing interfaces, and writing modular code that can easily be broken off into a separate repo or library if needed; anything you produce in a monorepo should be easy to break away if necessary. A monorepo doesn't work for a company with thousands of developers, but it works great for a company with 5, and if the organization grows and certain code needs to be shared, then you can just break it out into a new repo.
I'm sure there are considerations I'm missing here, but for the most part I really think this is a business/organization-specific decision. In the end, go ahead and do a monorepo; the worst issue you may experience is needing to create another repo for each lib later.
onetwentyeight@reddit
The many repos in that image should be hundreds to match reality.
honeyryderchuck@reddit
Many monorepos.
funciton@reddit
Tightly coupled, with a 1:n version mapping.
How does that work, you might ask? That's the neat part, it doesn't.
roniadotnet@reddit
Money repos!
wineblood@reddit
The worst of both worlds.
bpikmin@reddit
The worst of many monoworlds
CrayonUpMyNose@reddit
The metaworld, as it were
voxelghost@reddit
There are many answers, but no monoanswer
kyune@reddit
So.....Flux?
God I hate writing Spring Reactive
F1_Legend@reddit
I like writing it, but man do I hate debugging it when it goes wrong...
throwaway_568765@reddit
Perhaps the answer lies in string theory
often_says_nice@reddit
I will say having worked on both, I now prefer many repos. Mostly because you can copy paste the entire repo’s code into ChatGPT and ask “where/how is X implemented”. Or “change this code to implement Y”.
Right now context windows don’t really work well with monorepos. You need to start doing vector embeddings and keep those up to date. It’s a much more involved process
quicknir@reddit
r/AngryUpvote
TCB13sQuotes@reddit
The monorepo trend is bullshit. This causes more issues than what it supposedly solves, and one must be crazy to think it's a good idea to have 300 apps inside the same repo.
hbthegreat@reddit
I'll propose the 3rd newer type of repo. Optimised for LLM ingestion and output
mladi_gospodin@reddit
1 repo = 1 tech stack. No?
Commercial-Ranger339@reddit
Been using nx with a monorepo for 2 years now. It’s an absolute joy
JonDowd762@reddit
Like most questions, the answer is "it depends". There are pros and cons to each approach, and the best solution will depend on your project's needs.
Just stay away from submodules, that's all cons.
sudhakarms@reddit
Monorepos with proper setup. Been using Nx monorepo toolkit for years and it works great.
Computation Caching - Reuses already built artefacts in both local and ci/cd pipelines
Compute/Execute only affected tasks.
Dependency graph generation
Code generation
Define constraints for better code organisation
More info at https://monorepo.tools
salamisam@reddit
One of the companies I work for uses NX.
Tooling for monorepos is very important and adds a lot to the user experience. NX does a good job of this.
QuotheFan@reddit
If you want to separate access, many repos is a good way to do it. For example, in HFTs, people strictly want to keep knowledge proprietary, so everyone only gets access to code they need. So, we go the many repos way. If you are anyways going to give access to everyone on all these repos, why separate them in the first place?
FloydATC@reddit
There is, but you're not going to like it:
It depends.
NiteShdw@reddit
It's about tooling.
Multi repo has the problem of consistency between repos. Updating any of the tooling requires updates to all the repos. When a repo doesn't get updated, it gets out of date, and you end up having many different versions of the same tools, or worse, different versions of different tools.
Monorepos have the benefit of establishing the same tooling across the board, same commit hooks, same linter, same formatter, same package manager, same CI process, etc.
But, you also have downsides where small changes trigger a build that takes a long time because it has to compile and test everything.
So Monorepos need better, more complex, tools to be efficient.
Multirepos end up with a complex web of different tools and processes that can be equally frustrating.
So... Weigh the pros and cons. Discuss as a team. Make a RATIONAL decision, not an emotional one.
i860@reddit
You can completely automate all of the repo shared hooks, lint definitions etc with all manner of approaches. This doesn’t require everything being chucked into a giant monorepo. Doing that is going backwards.
NiteShdw@reddit
Can you give some examples? I'm dealing with this right now with many repos being copy and pastes of each other but all slightly different and it's a pain in the butt.
i860@reddit
Use a single repo for all the shared definitions. Package it. Install it. Have your repos use it just like any other file on the file system. If that’s not possible write lightweight tooling to ensure dependent repos keep it in sync.
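The "lightweight tooling to keep it in sync" could be as small as a version-drift check run in CI: compare each repo's pin of the shared-config package against the latest release and flag stragglers. A hypothetical Python sketch (names and inputs are illustrative):

```python
def out_of_sync(latest: str, repo_pins: dict) -> list:
    # repo_pins: repo name -> version of the shared-config package it pins.
    # Returns the repos still pinning something older than `latest`.
    def parse(v: str) -> tuple:
        return tuple(int(x) for x in v.split("."))
    return sorted(r for r, v in repo_pins.items() if parse(v) < parse(latest))
```

Wiring this into each repo's pipeline (or a nightly job that opens update PRs) gives multi-repo setups most of the "same hooks everywhere" property without a monorepo.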
NiteShdw@reddit
I'm already doing shared packages for some of the tooling but I still have to go manually update each one when the version changes. It's MUCH simpler in a monorepo.
i860@reddit
Why do you need to manually do anything? You're missing automation somewhere. Are you saying you have nothing actually enforcing latest or specific package versions on a particular development host? There should be a release process that handles this. Obviously, for working on the shared tooling itself, you can locally use the in-progress repo until you've got the work sorted out. But this stuff should still be packaged and released afterwards.
NiteShdw@reddit
Yeah that's exactly what above my post above was about. It's about tooling. That's what I said.
Multi repo isn't magically better than monorepo, and vice versa. Tooling is what's needed in BOTH situations. So to claim that multi repo is better but then dismiss the cons by saying "you need better tooling" is ignoring the primary issue.
i860@reddit
The tooling I'm referring to isn't the same kind of tooling as monorepo tooling, even though you're equating the two. You still need host and package management for everything regardless of how repos are used, whereas tooling for monorepos is specific to monorepos. One is healthy and normal; the other was created to deal with a fundamentally regressive way of working with a version control system.
You're cutting corners here.
NiteShdw@reddit
I'm cutting corners? As in I have neither the time nor the approval from management to spend my time building a bunch of tooling to deal with my current issue and so I have to do the best I can?
Not sure what exactly you're arguing here.
Both options have pros and cons. Neither is inherently the best or worst option. It all comes down to each particular situation and whether which pros outweigh which cons.
It's a poor engineering mentality to have a predetermined preference without being open to evaluating different options.
kitd@reddit
Note you can do this in the Java world with Jitpack
ayrusk8@reddit
If two applications are tightly coupled and interdependent, a monorepo approach is ideal. Otherwise, it’s best to maintain them separately. However, managing multiple repositories comes at a cost—primarily the increased maintenance effort.
Let me share a rather absurd example from my organization. We have a single application that receives messages from external clients via SQS, processes them, and returns a response. Despite its simplicity, the team decided to create 12 different repositories for this small piece of functionality: separate repos for the receiver, processor, parser, and even individual repos for the IaC code. Now, whenever an issue arises, fixing it takes hours because changes have to be made across multiple repos, followed by time-consuming deployments.
jamescodesthings@reddit
I worked for a company for a few months that was an absolute hellhole.
On the first day it took an hour to clone their monorepo. Fuck ever doing that again.
It was also horrifically mismanaged by someone who wanted to be a big fish in a little pond. The monorepo was the least of their problems.
ryanstephendavis@reddit
The good answer is don't use either in extremes
flytotop@reddit
Google is a monorepo, meaning all projects like Search, Ads, Gmail, etc. are in one place. Can engineers working on Ads see the source code of Gmail? And how do they get enough storage to keep that much in a single repository?
RecognitionOwn4214@reddit
We're currently moving from multi to monorepo. The only thing that has come up in about a year is that working on multiple problems in multiple independent projects within the repo makes you switch branches more often.
gfranxman@reddit
How many teams do you have? 5? Five repos. 1? One repo. Software is best organized as the organization that creates it.
wmjdgla@reddit
Isn't change-request-based workflow something offered by git forges, not git itself? And as you've also noted, the git ecosystem has built various extensions / add-ons to address its various shortcomings. The same could have been done (and probably has been done) for the other VCS.
qsxpkn@reddit
I'm very surprised author mentions submodules. I thought everyone agreed they were bad and moved on. Anyway, Monorepo all the way. It has many benefits (code reuse, atomic commits) but there's one benefit that I can't live without: eliminating dependency hell.
We use monorepo, and our codebase is Java, Python, and Rust (and a bit of Go -- but we don't really care about Go). We use Pants as our build system. It's great.
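Not their actual config, but for context: in Pants, each directory gets a small BUILD file declaring targets, and the build system infers most dependencies from imports. A sketch for a Python package (target names are illustrative):

```python
# BUILD file in a Python package directory (Pants 2.x, python backend enabled)
python_sources(
    name="lib",     # the package's importable sources
)

python_tests(
    name="tests",   # tests colocated with the sources
)
```

Per-directory targets like these are what let one build tool drive a multi-language monorepo: each language backend only sees the targets it owns.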
thebuccaneersden@reddit
I guess it depends on who you work for and with. Some people like creating new repos to leave their mark. Some people like keeping things together for the convenience of reading git commits. IMO, I try to follow OSS ideals, so somewhere in between.
18randomcharacters@reddit
At my project, we have many micro service teams. Each backend micro service is its own repo.
But our front end is a monorepo.
I much prefer the backend/smaller repo way
fried_green_baloney@reddit
One opinion mostly favorable for monorepos:
https://danluu.com/monorepo/
zaitsman@reddit
Many repos all the way
i860@reddit
Monorepos are simply a terrible idea. They only exist to let teams make multiple changes at once, so that everything operates in kitchen-sink mode. Backwards compatibility and interoperability take a back seat (avoiding them is one of the primary reasons for using a monorepo), and the code quality of each individual component suffers as a result.
Separate repos force correct approaches to software engineering:
- Modularity
- Healthy abstraction with low coupling
- Backwards compatibility and interoperability
Yes you can do all of the above with a monorepo but most do not.
And this isn’t even getting into the massive size problems.
twistacles@reddit
The answer is it depends.
CubsThisYear@reddit
I’ve always thought there’s an easy answer to this question. Your repo strategy should be governed by your release strategy. Whatever code you release together as a single version, that’s your repo. It should also follow that there is a single build process (which might of course have sub-parts) for this repo.
This is the essence of what a (git) repo is supposed to represent: an atomic unit of code that is developed, built and released together.
The reason this is important is because it strikes the right balance between assurance and flexibility. If you have two repos that are always released together, they should be one repo, because then you allow your build process to provide a more holistic correctness guarantee (because it gets to “see” all of the code at once). Similarly if you have one repo that contains multiple, unrelated build processes, this should be split up because now you are forcing developers to pull in more code (and thus more complexity) than they really need. You’re also breaking git’s central idea of whole repo versioning because now you are going to have commits that don’t affect one module or the other at all.
Tiquortoo@reddit
Follow team alignment based on actual permissions not roles. Stay mono as long as possible. It simplifies a lot of core workflows and only adds a small bit of actual complexity.
recycled_ideas@reddit
The advantage of a monorepo is that every dependency is immediately obvious and the person who broke shit can fix it right away.
If there's no dependency or the person doing the breaking isn't able and allowed to fix the errors a monorepo is a disaster.
It's that simple.
Don't put a bunch of unrelated shit in a monorepo.
Don't put things you plan to allow multiple live versions of in a monorepo.
Don't put things in a monorepo if you're not going to build the entire repo before merging.
Do put things that need to be kept in sync together.
Do put things that the same people work on together.
FAANG do things that make sense for the way they work, but a lot of the ways they work are stupid artifacts from broken start-up culture.
angrybeehive@reddit
Every good language has package support. Use that to share code. If you really need to obsess about reusing other things, submodules and version tags are the way.
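A minimal local sketch of that "submodules plus version tags" pattern (the repo paths, tag, and identity are made up for illustration):

```shell
set -e
rm -rf /tmp/shared-utils /tmp/app

# Toy "shared library" repo with a tagged release
git init -q /tmp/shared-utils
git -C /tmp/shared-utils -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m "shared code v1"
git -C /tmp/shared-utils tag v1.0.0

# An application repo consumes it as a submodule pinned to that tag
git init -q /tmp/app
cd /tmp/app
# file:// submodules need an explicit opt-in on modern git (>= 2.38)
git -c protocol.file.allow=always submodule add /tmp/shared-utils libs/shared-utils
git -C libs/shared-utils checkout -q v1.0.0   # pin to the version tag
git add .gitmodules libs/shared-utils
git -c user.email=dev@example.com -c user.name=dev \
    commit -qm "Pin shared-utils at v1.0.0"
```

The consuming repo only moves to a new version when someone explicitly checks out a newer tag in the submodule and commits the updated pointer, which is exactly the decoupling the parent comment is arguing for.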
rongenre@reddit
As long as everyone is on the same release cadence, mono is fine
Elmepo@reddit
This. A while back I had to do a lot of work to separate out some of my team's functionality, specifically because we used trunk-based development, aiming to deploy every day, while every other team in that repo used gitflow to release every 3 weeks.
Monorepo is fine imo, but it needs tooling plus strong alignment on your git workflow and release cadence.
light24bulbs@reddit
Why do you say that? I disagree with this one hard. It's actually much harder to synchronize releases and state between multiple repos during release time.
Code in the monorepo is continuously integrated by definition. It lends itself very well to continuous deployment.
If anything multi repo needs a lot of synchronization and timed deployment much more. So I don't quite understand your point.
justmebeky@reddit
Yes, monorepo.
RoastmasterBus@reddit
No one's mentioned a monorepo connecting to many leaner peripheral satellite repos, like a solar system, or smaller towns surrounding a large city.
I have noticed many projects usually end up organically going down this route anyway, regardless of how they initially structure their project, as it's usually the easiest to work with.
joost00719@reddit
My previous job had a monorepo and it was such a nice development experience. We did need some more RAM because Visual Studio ate it all, but it really allows for quick results, and it's so easy to navigate the code and see all references.
I'd go back if I could.
supermitsuba@reddit
How does VS use up memory for git?
sebnukem@reddit
We have a monorepo with a pretty good devops team, and it's a much more enjoyable dev experience.
SirLestat@reddit
No
PrefersEarlGrey@reddit
Yes, the good answer is adapt to whatever fits your teams skillsets and needs best. There is never a one size fits all solution for every tech scenario.
pinpinbo@reddit
Monorepo without the hard work of writing the tooling sucks
PeachScary413@reddit
Monorepo unless you have a really really good reason to not have it.
Never split your code repo along org lines; having two repos just because there are two teams doesn't make sense. If your team members can't adhere to not changing unrelated services they don't own (without checking with the owners), then you have bigger issues.
HashtagFour20@reddit
monorepos only work when you have an entire team dedicated to making it sane
Isogash@reddit
In most cases, one repo per team. If a team covers multiple functions then you need to revisit your team structure.
redbo@reddit
My dev manager pushed us to do ~20 repos for one product and I told him no. I'm sure it's possible to succeed that way, but it seems like swimming upstream.
ososalsosal@reddit
I can't even imagine why that thought would cross their mind
augustusalpha@reddit
Multi repo.
LOL .....
cmprsd@reddit
Multi-repo is superior in a few ways. If you hire outside devs it's great, as they don't get access to your entire codebase. A lot less messy and more secure.
I find each repo is more focused and less interconnected too. Monorepo definitely needs more tooling.
lunchmeat317@reddit
Nope
Crandom@reddit
Polyrepo, but actually build tooling for making changes across all the repos and managing deployments. Monorepos are a never-ending losing battle against scale: builds, IDEs, merges, etc.
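For what it's worth, the skeleton of such cross-repo tooling can start as a simple loop. A self-contained local sketch (the repo names, config change, and identity are made up):

```shell
set -e
rm -rf /tmp/fleet && mkdir -p /tmp/fleet

# Two toy repos standing in for a fleet of polyrepo services
for repo in service-a service-b; do
  git init -q "/tmp/fleet/$repo"
  printf 'log_level: debug\n' > "/tmp/fleet/$repo/config.yml"
  git -C "/tmp/fleet/$repo" add config.yml
  git -C "/tmp/fleet/$repo" -c user.email=dev@example.com -c user.name=dev \
      commit -qm "initial config"
done

# The "tooling": apply one logical change everywhere, committed per repo
for repo in /tmp/fleet/*; do
  sed -i.bak 's/log_level: debug/log_level: info/' "$repo/config.yml"
  rm "$repo/config.yml.bak"
  git -C "$repo" -c user.email=dev@example.com -c user.name=dev \
      commit -qam "Lower default log level to info"
done
```

Real tooling would also open a review per repo and track which services have deployed the change, which is precisely the coordination work the polyrepo side has to pay for.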
standing_artisan@reddit
Just monorepo.