Avoiding extraction as the root cause of spagetthification?
Posted by Wooden-Contract-2760@reddit | ExperiencedDevs | View on Reddit | 29 comments
I’ve seen this happen over and over: code turns into a mess simply because we don’t extract logic that’s used in multiple places. It’s not about complex architecture or big design mistakes—just the small habit of directly calling functions like .Add()
or .Remove()
instead of wrapping them properly.
Take a simple case: a service that tracks activeObjects
in a dictionary. Objects are added when they’re created or restored, and removed when they’re destroyed or manually removed. Initially, the event handlers just call activeObjects.Add(obj)
and activeObjects.Remove(obj)
, and it works fine.
Then comes a new requirement: log an error if something is added twice or removed when it’s not tracked. Now every handler needs to check before modifying activeObjects
:
void OnObjectCreated(CreatedArgs args) {
var obj = args.Object;
if (!activeObjects.Add(obj))
LogWarning("Already tracked!");
}
void OnObjectRestored(RestoredArgs args) {
var obj = args.Object;
if (!activeObjects.Add(obj))
LogWarning("Already tracked!");
}
At this point, we’ve scattered the same logic across multiple places. The conditions, logging, and data manipulation are all mixed into the event handlers instead of being handled where they actually belong.
A simple fix? Just move that logic inside the service itself:
void Track(Object obj) {
if (!activeObjects.Add(obj))
LogWarning("Already tracked!");
}
void OnObjectCreated(CreatedArgs args) => Track(args.Object);
void OnObjectRestored(RestoredArgs args) => Track(args.Object);
Now the event handlers are clean, and all the tracking rules are in one place. No duplication, no hunting through multiple functions to figure out what happens when an object is added or removed.
It doesn't take much effort to imagine that this logic gets extended any further (e.g.: constraint to add conditionally).
I don’t get why this is so often overlooked. It’s not a complicated refactor, just a small habit that keeps things maintainable. But it keeps getting overlooked. Why do we keep doing this?
severoon@reddit
Sounds like the service might also be the wrong place to do this. Why not do it during object creation/restoration itself?
In such a system where you want to track active objects in a managed pool, delegate all actions that result in adding / removing from the pool:
The problem with this approach is that the caller can't delegate creation, right? It needs to set up the object or whatever. Then use the functional pattern that allows this:
In this approach, you allow callers to specify what should be done with an object they don't create / restore / etc:
drnullpointer@reddit
Two rules I follow:
1) if you can't abstract it well, usually it is better to keep duplication and not try to abstract it at all (yet)
2) only abstract it when the abstraction is meaningful in some way, represents something. A dead giveaway of a bad abstraction is that it is hard to come up with a good name.
Creating strange, meaningless abstractions is usually more damaging to the project that having duplication.
Also, I notice that sometimes once you have some more duplicated parts it is easier to form an idea how the abstraction should look like.
kidajske@reddit
I've had many situations like you describe where I spend an inordinate amount of time trying to create a reasonable abstraction because two things feel like they're similar enough that it would be bad practice to have a lot of duplication but in reality as you say they're just different enough that abstracting would create a mess of hard to follow conditional logic and create a bunch of edge cases that would further bloat the implementation. I'd then be stuck with 3-4 large classes that technically work for both use cases but are actually more code that's also garbage code than if I'd just duplicated the original implementation and tweaked it for the second use case. Felt really liberating to stop caring about code duplication as much in this context.
Careless-Childhood66@reddit
Yes yes, lets tightly couple everything so we can avoid writing 3 more lines of code
HademLeFashie@reddit
There's a concept I like to call, "sameness volatility".
I ask myself the following: * Is the repetition between these areas inherent to the problem/business logic, or is it coincidental? * Is the repetition likely to diverge in the future?
And then I weigh that against the simplicity savings, which as you pointed out, is negligible.
ninetofivedev@reddit
This is another component. Perhaps not in this example, but sharing code paths is great until you need to branch. Then you either break it apart or create actual spaghetti passing in 18 parameters and having conditionals everywhere.
Careless-Childhood66@reddit
Finding meaningful abstractions is hard. Extracting "common" patterns based on isomorphy is a bad approach imho. Just because two methods are syntacticly equal doesnt make them equal. Something the DRY crowd needs to accept.
ninetofivedev@reddit
DRY has always been a terrible acronym, but “Maybe Think About Not Repeating Yourself” just doesn’t have the same ring to it.
IronSavior@reddit
Extract doesn't always mean you have to elevate the new thing into a top level namespace. I prefer to err on the side of extracting a lot but only lifting it as high as necessary and making sure that I avoid sharing state between the extracted bit and the part it was extracted from.
For example, (in JS) I will define a function within the current function until it's clear I need it beyond the scope of the current function. When I do, I'll move it up to the enclosing function or module before adding an export or a new module.
In go, I'll go through a similar progression: new function as local var, then maybe non exported package func, then internal submodule, then maybe internal submodule of a parent module.
The point is to aggressively avoid creating a thing that becomes a shared dependency until it's proven advantageous to do so.
Wooden-Contract-2760@reddit (OP)
Yeah, it does very different in non-typed or loosely typed languages. Using result pattern could be a neat counterargument otherwise.
IronSavior@reddit
It might seem easier in a duck-typed language because you might fool yourself into thinking that you don't even have to think about leaking a type dependency. I think the problem is a little more difficult because then interfaces are not explicit and tools can't help you as much.
Wooden-Contract-2760@reddit (OP)
There's a lot to consider when extracting across layers or classes, that's certainly true.
-Dargs@reddit
I swear I saw this exact post a few weeks ago
Wooden-Contract-2760@reddit (OP)
C'mon, with my made up dumb method name of
Track
? Doubt.ninetofivedev@reddit
Software is essentially the equivalent of having 10 authors work on the same book.
What you think is clean, I think is horrendous.
That and there is just little value perceived from tweaking the way code looks. Is it more performant? No. You’re saying it’s more maintainable, according to whom?
LetterBoxSnatch@reddit
Right, and then the next requirement comes in: "the logged messages need to include the context where the attempt was made." It's not actually just that it can be encapsulated within the original site. It's that there is an inherent tension of relevancy; there's relevancy both to the calling context and within the call. It's that tension that results in spaghetti over time, and the larger and more complex your codebase becomes, the more this tension increases, which results in an even greater amount of spaghetti. For every "just do this" solution, you will have a requirement come through that works in opposition to that simple solution that will require a work around. I'm not suggesting anyone try to account for these scenarios, and I'm not suggesting you shouldn't strive for the simple solution, I'm only suggesting empathy around why the unsimple solutions come into existence: it's usually in an attempt to do the simple thing.
freekayZekey@reddit
a part of it is not refactoring in a separate branch. i feel like people instantly reject PRs that include refactoring along the way, so people don’t refactor when they see the same logic all over the place. is it bad? yeah, but it does keep prod safe
eslof685@reddit
The age old battle of SOLID/DRY vs KISS/YAGNI.
Wooden-Contract-2760@reddit (OP)
KISS only suggests that the extraction in the above suggestion should not take place right away.
Just because you have a dictionary to track stuff, you don't have to build a DictionaryService to Add and Remove since that's what exatly the dict can do.
Additionally, just because you cal it in one place and add some decorations (e.g. the logging), you still don't need to extract since it's straightforward what's happening.
When multiple events result in the same execution, that's where DRY kicks in, and it seems the only sensible choice to extract, since geneeralisation is also a sort of "keeping it simple", compared to two different implementations that may or may not be maintained to behave the same forever.
In my opinion, KISS is not violated, and the extra step of
Ctrl+. -> Extract
is hard to argue with being too much effort. Compare it to the effort of guess-searching all calls to the dictionary.Add when there is also a `GetDictionary` property to expose, and a `TryAdd` because one case threw an exception... and a `dictionary[key] = value` call because it felt simpler to the latest contributor...I don't know, arguing with KISS on my example feels just pure laziness to me.
G30RG3D@reddit
Extracting the logic into an encapsulated shared method can be a good thing but it doesn’t solve for these other problems. I have heard folks make the argument that the extraction makes it messier because now you have some places using this refined method which builds a false sense of trust when you really still have to do the guess-searching. Not sure I 100% buy that argument but I can understand it.
Also in your example your extracted shared method is called track but it is really Add that happens to also handle errors. I would find it confusing that something called track is doing something other than tracking. Maybe just a product of it being a simplified example but I think it also speaks to the challenge of finding the right implementation here.
eslof685@reddit
No idea what you're talking about. KISS stands for "Keep it simple, stupid".
Wooden-Contract-2760@reddit (OP)
I set my point to bold in the previous comment.
TL;DR
KISS doesn’t mean avoiding refactoring—it means avoiding unnecessary complexity. If refactoring makes things clearer and more maintainable, it aligns with KISS and DRY.
eslof685@reddit
I don't know why you are replying to me.
Here are the definitions if that's what you're asking about:
https://en.wikipedia.org/wiki/SOLID
https://en.wikipedia.org/wiki/Don%27t_repeat_yourself
https://en.wikipedia.org/wiki/KISS_principle
https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it
Wooden-Contract-2760@reddit (OP)
I intended to clarify my standpoint, since you said you have "no idea what you're talking about". I'm sorry if I keep failing to clarify.
I thought what you don't understand is why I disagree with your categorization as "age old battle of SOLID/DRY vs KISS/YAGNI", since I see no conflict and think my perspective enforces both sides of the coin.
What have you not understood though? Or let's just continue and focus on why you think this is a battle... (I may have wrongly presumed you think my suggestion goes against KISS)
eslof685@reddit
I guess
eslof685@reddit
The first implementation would be more in the SOLID/DRY category, the second implementation would be more in the KISS/YAGNI mindset.
binarypie@reddit
And you need both. Mastery comes from knowing when to do what and how to refactor in either direction as needs change.
GumboSamson@reddit
I’d go one step further: go immutable and stateless. Shared, mutable state is the root of 99% of bugs (including the one you mentioned here).
Sharing state? Make it immutable.
Mutable state? Keep it private.
Of course, if you’re working with legacy systems, then you just do what you can, then vent on Reddit afterward.
Wooden-Contract-2760@reddit (OP)
The example is not purely aligned with the problem statement, I agree.
There exists many ways to streamline caching, like ConcurrentDictionaries, locks and semaphores, channels...
My general impression is that many devs decline any additional adjustment to code to keep things organized, so we don't even get to discuss the how.