Advice on a major tech upgrade that seems impossible
Posted by ish123@reddit | ExperiencedDevs | 46 comments
I work at a smaller company that has been very successful over the last 25 years, but has been kicking the can down the road on tech debt for a long time. The sheer volume of the system is hard to describe. We have older J2EE apps that are stuck on Java 7 and an old middleware. We have a modern microservices+angular stack, and some functionality from the old apps has been rebuilt in the new stack, but for the most part, there is a very large number of pages and code that has not moved.
We are now getting pressure from the organization to update to a modern middleware and supported JDK. The problem is, it's tech debt all the way down. The web layer is on Struts 1. DB Layer uses an unsupported, very old ORM with no upgrade path. Code is spaghetti: There is some attempt at separation of concerns, but lots of JSPs have scriptlets and directly access the database. Stuff like that. We're talking hundreds of JSPs, thousands of classes, business logic in JSPs and Action classes, ORM objects used and updated everywhere, minimal unit testing, etc.
My job is to help the organization understand the task before us. Right now executives have the opinion that we can just swap out the middleware for something else. That does not seem possible. Going to modern middleware requires a modern JDK, which means we can't bring the old libraries with us.
Furthermore, I see no way to migrate one thing at a time and keep things working. The app can't run some pages on struts 1 and some pages on struts 7 or whatever modern MVC we choose. So to me, that means we are talking about a rewrite, where we start a new app and move over functionality that we do want to keep. That will be a monumental undertaking.
- Are there resources that discuss options for this sort of task (start over with a rewrite versus upgrade in place)?
- Do you have any tips for helping me convey that this is the culmination of 25 years of tech debt and bad choices, and there is no viable upgrade path? I think my only option is to meticulously outline the work required to upgrade an app, and discuss how there is not even a strategy available to execute. Executives are not developers and will not want to hear this.
EvilCodeQueen@reddit
Greenfield is always the temptation, but it rarely goes as well as people anticipate. Most legacy applications are woefully undocumented, with lots of dead-ends, buried bodies, and "temporary fixes" that are still running after years. In most cases, the people who know where the bodies are buried and why are long gone. Because the business isn't going to pause while you rewrite, you essentially need to double your staff and run old and new in parallel during development, or basically put the old system on life support until the new one is ready, which means pissing off everyone who touches the old system.
Because of this, I'm generally team "replace in place". I'm assuming the legacy stuff is all old school "full round trip" web applications as opposed to javascript clients hitting APIs. If that's the case, I would focus on getting more APIs defined and built, even if the back-end behind the API remains the legacy code. Once you're working with solid APIs, it's a lot easier to replace a single API than a chunk of a monolith. Add in tons of integration and contract testing on the APIs instead of trying to unit test every little thing. (I do recommend heavy testing of any business logic.)
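As a rough illustration of the contract-testing idea (a sketch in Python for brevity, not taken from anyone's actual codebase): the same inputs go to the legacy-backed endpoint and to its replacement, and any difference is reported. `legacy_get_customer`, `new_get_customer`, and the response fields are hypothetical stand-ins; in practice these would be HTTP calls against the two back ends.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Response = Dict[str, Any]


def legacy_get_customer(customer_id: int) -> Response:
    """Hypothetical stand-in for the endpoint backed by legacy code."""
    return {"id": customer_id, "name": f"Customer {customer_id}", "active": customer_id % 2 == 0}


def new_get_customer(customer_id: int) -> Response:
    """Hypothetical stand-in for the rewritten endpoint behind the same API."""
    return {"id": customer_id, "name": f"Customer {customer_id}", "active": customer_id % 2 == 0}


def contract_mismatches(
    old: Callable[[int], Response],
    new: Callable[[int], Response],
    inputs: Iterable[int],
) -> List[Tuple[int, Response, Response]]:
    """Call both implementations with the same inputs and collect any differences."""
    mismatches = []
    for value in inputs:
        expected, actual = old(value), new(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


if __name__ == "__main__":
    problems = contract_mismatches(legacy_get_customer, new_get_customer, range(1, 6))
    print("mismatches:", problems)
```

An empty result only shows agreement on the inputs that were tried, so the input set matters more than the harness itself.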
As far as selling this to management, it can help to structure your arguments as if you were a management consultant. Use industry reports and sources like Gartner Group to support your case. Make sure that you're addressing their concerns first (which is almost always cost, but occasionally it's something else, like competitive landscape.)
DUDE_R_T_F_M@reddit
I believe struts2 does allow for running in parallel to struts1. It's still a huge pain, but it might be a path.
SpriteyRedux@reddit
However difficult it is to set this up in a way that allows you to upgrade it iteratively, I promise you it will be less difficult than a complete rewrite. The overwhelming complexity of a system will never become easier to deal with when starting from a blank slate. You just have to recreate the overwhelming complexity from scratch. You have to come to understand it either way, the rewrite just takes 5 years longer.
ish123@reddit (OP)
Thank you, I generally agree, I am just concerned there is not a technically viable way to do it piecemeal. Exploring that is something to work on.
Blahbort@reddit
I'm right in the middle of doing this right now with an old JEE stack but I've managed to keep the server side upgraded somewhat over the years. The front-end is a different story, very old libraries and hundreds of pages.
I did find a way to piecemeal migrate the front-end by running the old front-end war alongside the new one as I'm upgrading it to a different component library. I'm securing both with SSO and using feature toggles to enable pages for users when specific pages are finished migration.
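A minimal sketch of how per-page feature toggles might decide between the old and new front end, written in Python for brevity. The `PageToggles` model, the `/legacy-app` and `/new-app` context paths, and the user names are all assumptions made for illustration; the actual setup described above is not shown in the thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class PageToggles:
    """Tracks which pages are migrated and which users may see the new version."""
    # page path -> user ids enabled for the new front end (hypothetical model)
    enabled_users: Dict[str, Set[str]] = field(default_factory=dict)
    # pages switched on for everybody
    fully_migrated: Set[str] = field(default_factory=set)

    def use_new_frontend(self, page: str, user: str) -> bool:
        if page in self.fully_migrated:
            return True
        return user in self.enabled_users.get(page, set())


def resolve_url(toggles: PageToggles, page: str, user: str) -> str:
    """Return the page URL under the old or new WAR's context path (paths are made up)."""
    base = "/new-app" if toggles.use_new_frontend(page, user) else "/legacy-app"
    return f"{base}{page}"


if __name__ == "__main__":
    toggles = PageToggles(enabled_users={"/orders": {"alice"}}, fully_migrated={"/login"})
    print(resolve_url(toggles, "/orders", "alice"))
    print(resolve_url(toggles, "/orders", "bob"))
```

Pages that are not mentioned anywhere default to the legacy app, which keeps the toggle data small while most pages are still unmigrated.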
Even though the front end libraries are about 20 years apart, I'm able to make the new pages look almost identical to the old ones, so when the user seamlessly navigates from an old page to the new one, they shouldn't notice the difference. By keeping the design the same it's quicker because there is no time spent redesigning pages for new interactions. This lends itself to quicker verification and no retraining effort for users.
Due to the large number of pages, I've written a migration for most of them. It pretty much maps old tags to new ones. It's a bit more involved than just that, but that is the core of it. Once I run the migration, a page only needs to be tweaked for anything that is unique to it.
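To make the tag-mapping idea concrete, here is a small sketch in Python. It is an assumption about what such a migration could look like, not the tool described above: the `ui-*` component names are invented, and a real migration would also need to translate attributes and handle page-specific markup.

```python
import re
from typing import Dict

# Hypothetical mapping from old tag names to new component names.
TAG_MAP: Dict[str, str] = {
    "html:text": "ui-input",
    "html:submit": "ui-button",
    "html:form": "ui-form",
}

# Matches the name in an opening or closing tag, e.g. <html:text ...> or </html:form>.
TAG_NAME = re.compile(r"<(/?)([A-Za-z][\w:-]*)")


def migrate_tags(source: str, tag_map: Dict[str, str] = TAG_MAP) -> str:
    """Rename mapped tags; leave attributes and unmapped tags untouched."""
    def rename(match: "re.Match[str]") -> str:
        slash, name = match.group(1), match.group(2)
        return f"<{slash}{tag_map.get(name, name)}"

    return TAG_NAME.sub(rename, source)


if __name__ == "__main__":
    page = '<html:form action="/save"><html:text property="name"/><html:submit/></html:form>'
    print(migrate_tags(page))
```

Because unmapped tags pass through unchanged, the output can be diffed against the original to see exactly what still needs manual attention.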
This is still in progress, but I can see the light at the end of the tunnel and I've done other large scale refactorings like this with success.
morosis1982@reddit
The best part about doing it iteratively is that you get to learn about the edge-case complexities slowly, as you go, rather than needing to do it all at once. You can then ensure that there are at least integration-level tests for them so that they don't break as you continue. You need to account for this by leaving room for occasional refactors along the way.
SpriteyRedux@reddit
Good luck! These projects are so challenging but rewarding when they're finally done
marx-was-right-@reddit
A project like this happened at my work, almost the exact same situation: a Spring MVC project built on Apache Felix and OSGi on Java 6. There was no path forward to upgrade it because multiple dependencies were EOL.
The solution ended up being that every single business case from the system had to be explicitly defined and painstakingly migrated to a clean mockup of the old system in Spring Boot. Cost the company a ton of money, but you know what cost more? The 4-5 failed attempts to "just upgrade the versions, how hard could it be!" where they paid people to bash their heads into a wall for months until realizing they couldn't do it, and repeat.
Usernamecheckout101@reddit
Re-write the app
Murky_Citron_1799@reddit
There's a book titled Working Effectively with Legacy Code that can help you
Frenzeski@reddit
Kill It with Fire is the book, I think. Highly recommend it
touristtam@reddit
This one? https://understandlegacycode.com/blog/key-points-of-working-effectively-with-legacy-code/
I can confirm that making sure the code coverage is decent and relevant will make it easier to move anything forward.
Maybe a middle ground is identifying the best JDK version to upgrade to with the minimum effort involved. No need to jump to 21 or 22 when 8 might be enough.
BoBoBearDev@reddit
Can you explain why microservices cannot solve this problem? You can easily migrate small, independent DB tables into a microservice and just communicate with JSON, which is platform independent. It should be easy to propose this because everyone is doing it. And microservices are very easy to make: once you have one example with dummy endpoints, you just copy that into a new repo. You can scale up so easily with this.
IAmADev_NoReallyIAm@reddit
I've done this a number of times. Hell, I just realized I'm still doing it, just on a different scale.
First thing you'll want to do is recognize that you don't want to replace the system piecemeal. That's just going to slow down the process. Greenfield the whole project. Start from scratch. Treat it like you're starting from nothing, because essentially you are. Throw away everything you know about the existing system and start fresh. I did this with a client. They kept insisting that they wanted the new system to work just like the old system. They even gave me the user manual for the old system. The next morning at the meeting, I literally threw the book back at them and asked what the hell we were doing there. If they wanted the old system, why were we there? This is their chance to make improvements, not just to the system but to their business processes as well (this was the key point): find the pain points and the places where we could speed things up, and make things easier and faster to process. Nothing was off the table, we were going to greenfield everything. Man, the pop in the room as their heads suddenly came out of their asses was so loud...
Do the same thing. This is a chance to design the system properly. Take all that tech debt, and do it right. Will you make all new mistakes and create new tech debt? You bet. And the next guy will fix those in 15 years when he wonders what in the hell you were thinking when he goes to redesign and fix it.
But honestly I would treat this like new development: gather new requirements, and DO NOT ACCEPT "Whatever the system does now, I want it to do that" as a requirement... get them to nail down specifics. When I click this, that happens.
If you need talking points: doing new development will be faster and more efficient. It will have a cleaner code base, be faster to implement and easier to validate. Maintaining it over the longer term will be cheaper and less prone to errors, since it will be a straight-up replacement. If you do a piecemeal replacement, the code could become messy and error-prone, since it will continuously need to interface with legacy code while also requiring constant upgrades and modifications as you work through the upgrade process.
fuckoholic@reddit
How do you know the new thing does the exact same thing as the old thing?
josetalking@reddit
Respectfully, hundreds of pages, thousands of classes, 25 years in the making: a rewrite from scratch has a high likelihood of failure.
Probably there is business knowledge in that code that no one in the company can explain or remember (some of it will be valid and some not).
If I was in that company and someone suggested something like that I would oppose openly.
Btw: I work with VB6 code migrated to .NET, originally written about 25 years ago. Nobody really talks about trying to rewrite that; instead, layers have been created on top of it to adapt it to the new architecture.
IAmADev_NoReallyIAm@reddit
Believe it or not, it has less of a failure rate than one might think. Been there, done that twice: one converting VB6 to .NET and one converting a monolithic JSP application to React with microservices... Both with massive histories behind them. And one of them with the directive of "whatever the app does now, do that...", which is how I know not to accept that... Because yeah, digging through 20 years of code is shit. That's not how to get requirements. But if you tell them this is a chance to improve their process and fix pain points, they start singing a different tune.
josetalking@reddit
I think you are talking about a different code base size.
The code base I work with is in finance and is worked on by literally hundreds of devs daily.
Trying to rewrite that from scratch would be my signal to start looking for a new job.
ZucchiniMore3450@reddit
I agree with you, if their system is well defined.
Usually it is not well documented and a lot of knowledge is hidden in spaghetti code. Then the problem is: they don't even know what they want.
morosis1982@reddit
My view on these types of problems is that the strangler fig pattern is always the way.
There's an old saying: do you know the industry term for a project specification that is comprehensive and precise enough to generate a program? Code.
https://www.commitstrip.com/en/2016/08/25/a-very-comprehensive-and-precise-spec/
Especially with older stuff you just don't know what it does until you start unwinding and replacing it. Using the agile mindset, you want to get value as often as possible, so you want to replace it in small chunks.
To use one of your examples, the DB access from JSP: should these perhaps be API calls, or at least calls to a service interface? It's likely there's some duplication going on, and centralising it into a service can help to identify the commonality. Can you restructure those as part of the middleware and decouple them from the JSP, so that you can then replace the JSP with whatever the next step is? Ideally target one specific area of the application at a time, so that these changes are part of a related group of changes that limits scope from needing to understand everything to only needing to understand one small piece at a time.
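A sketch of that decoupling step, in Python for brevity (in the OP's stack this would be a Java interface). `OrderService`, `LegacyOrderService`, and the row fields are hypothetical names; the point is only that the view depends on an interface, so the legacy data access behind it can be replaced later without touching the page.

```python
from typing import Dict, List, Protocol


class OrderService(Protocol):
    """Hypothetical service interface that pages call instead of querying the database."""

    def orders_for_customer(self, customer_id: int) -> List[Dict[str, object]]: ...


class LegacyOrderService:
    """Wraps the existing query logic unchanged, so behaviour stays the same for now."""

    def __init__(self, rows: List[Dict[str, object]]) -> None:
        self._rows = rows  # stands in for the legacy ORM / direct SQL access

    def orders_for_customer(self, customer_id: int) -> List[Dict[str, object]]:
        return [row for row in self._rows if row["customer_id"] == customer_id]


def render_order_page(service: OrderService, customer_id: int) -> str:
    """The view only knows the interface, so the implementation can be swapped later."""
    orders = service.orders_for_customer(customer_id)
    return "\n".join(f"Order {o['id']}: {o['total']}" for o in orders)


if __name__ == "__main__":
    rows = [
        {"id": 1, "customer_id": 7, "total": 10},
        {"id": 2, "customer_id": 8, "total": 5},
    ]
    print(render_order_page(LegacyOrderService(rows), 7))
```

Once several pages go through the same interface, duplicated queries become visible in one place, which is where the commonality mentioned above tends to show up.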
That said, it's probably worth at least doing a high level analysis of the entire thing to be able to build a map that you can divide up into least to most valuable for a refactor. What are the data relationships and what service do they supply to the customer, from a relatively high level.
Dry_Author8849@reddit
Well, I would do this:
That's the rough plan. It can take time. Note that I didn't mention using AI as in large codebases you will overflow the context and it will require more work than the benefit you will get. You may try though.
We are in the same situation with a different stack and are applying this. We have 3k+ entities, most of which can be edited by users (they have a form in the UI). The code base is 20+ years old. We divided entities by complexity (easy, moderate, hard). We have written our own migration tool and are generating code from the old codebase. We are still working on a tool to help migrate the UI (it has a desktop client and we are migrating to React).
It's a titanic effort anyways, but at least possible.
Cheers!
activematrix99@reddit
Sorry to say this, but if you are asking this level of question on a forum, you don't have the necessary skills to perform this work, and should admit this to your higher-ups so they can find a consultancy to help you.
angrynoah@reddit
When has a consultancy ever solved a problem this big? They will charge 5x what the company is paying their own devs, and after 2 years, 3 years, 5 years... will have delivered nothing (nothing but "billable hours" anyway).
Just a surefire way to torch money for no results.
activematrix99@reddit
Plenty of good consultancies out there, sorry your experiences have been poor. I have done major transitions in transportation/aviation, medicine, entertainment, manufacturing, and retail, and all have been successful in streamlining processes, reducing technical debt, and migrating and optimizing service allocations. You pay for this one way or another. Failed migrations cost more than anything else.
reboog711@reddit
I guarantee to get nothing done with only one and a half years of billable hours. Hire me!
chafey@reddit
I suspect they trust you for technical decisions but not for business decisions. Since this is a huge business decision (projects like this can kill the company!), you need to get someone that the executives will trust to help them understand the business implications of the different options. Work with the executives to select a consultant that they will trust. If they balk at the added cost of the consultant, tell them that you want a second set of eyes to help you identify risks and validate the approach.
ish123@reddit (OP)
Thank you, this is one good idea I am taking away from here. We considered consultants to implement some of the work, but not to scope the project, and that is a useful idea.
Miserable_Double2432@reddit
Consultants are very useful for getting organizations to accept things that they already know. There’s something about an outsider saying it that routes around systemic inertia
botskiller1942@reddit
First, don't fall for "bad choices were made", this mindset will make your job annoying. It is as it is.
I was part of two such projects in the insurance industry. We identified where the strengths of the devs were and went with a strategy that made use of them. The consultants came with brilliant, shining plans, but we did not have the skills needed to back those up, so we ignored their ideas.
Because one of our core skills was refactoring, we identified where we could make cuts. For example, we decided to rewrite parts of the frontend and refactored the Java backend step by step to provide specific endpoints, keeping the business logic alive as it was.
As time passed we were able to identify a lot of code that was no longer needed and were able to remove it. Other parts were rewritten when the changes were big enough and the code in question was small enough.
Were we fast? By no means. Was it a good solution? It was for our team. As time passed we learned a lot about refactoring strategies like the Mikado Method. At some point we were confident enough to make bigger changes. It was a slow evolution, not a revolution. We didn't need a big project that would have failed anyway. Did we work without a plan? We had an idea of where we wanted to be, but challenged it from time to time.
Is there a guide you can follow? We didn't find any, just trust your colleagues and work together.
bwainfweeze@reddit
We had a monolith that was written as if it were modular, but wasn’t really, and that created a bunch of problems. We had a couple of minor services that reused bits of that code, and some of those were only used for batch processing, so they were effectively offline services. Over time I expanded some SDLC tools to use code we already had instead of duplicating it poorly.
When a little bit of your code runs in a separate process you can start tackling upgrades. If the breaking changes are deep in your codebase, you’ll have to fix these first. But internal tools have a lower SLA so you can move fast and bend things. You can use these to reason about performance improvements or regressions your main app might see.
For the rest, you need a way to upgrade and downgrade a couple of developers’ sandboxes so that it’s not just one person working on knocking down bugs and backporting. IMO this works best if you fix the sandbox so that a person can have two copies on one machine, so make shared file locations and port numbers configurable.
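One way this could look, sketched in Python: every port and directory is derived from a sandbox name and index, so an "old JDK" copy and an "upgraded" copy can run side by side. The environment variable names, port numbers, and paths are all assumptions for illustration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class SandboxConfig:
    name: str
    http_port: int
    debug_port: int
    data_dir: Path


def load_sandbox(env: Mapping[str, str] = os.environ) -> SandboxConfig:
    """Derive ports and directories from a sandbox name and index (all names hypothetical)."""
    name = env.get("SANDBOX_NAME", "main")
    index = int(env.get("SANDBOX_INDEX", "0"))
    root = Path(env.get("SANDBOX_ROOT", "/tmp/sandboxes"))
    return SandboxConfig(
        name=name,
        http_port=8080 + index * 100,   # 8080, 8180, 8280, ...
        debug_port=5005 + index * 100,  # separate debugger port per copy
        data_dir=root / name,           # no shared file locations between copies
    )


if __name__ == "__main__":
    print(load_sandbox({"SANDBOX_NAME": "jdk7", "SANDBOX_INDEX": "0"}))
    print(load_sandbox({"SANDBOX_NAME": "jdk17", "SANDBOX_INDEX": "1"}))
```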
Then you just try to get it to boot, then get a few pages to work, or get the unit tests to pass. Then it’s more pages or the integration tests and so on. Any changes that can go on trunk before the upgrade should be aggressively moved there to keep your branch as small as possible. Get used to rebasing so it’s not a mess of merges come PR time.
angrynoah@reddit
Struts 1! Amazing. Haven't seen that in ages.
I have no useful advice for you. In my experience non-technical executives are incapable of understanding this kind of problem, because all their intuitions about the world are rooted in Atoms, and this is a phenomenon of Bits.
evergreen-spacecat@reddit
Never ever say it’s impossible. NASA patched an old probe 15 billion miles away running 46-year-old software https://www.jpl.nasa.gov/news/nasas-voyager-1-resumes-sending-engineering-updates-to-earth/. Simply figure out the effort and communicate it. If it is very expensive, management will go for your alternative solution
metaphorm@reddit
steps to doing an overhaul like this:
1. identify your entire list of desired upgrades and prioritize them
2. take the highest priority upgrades on the list and try to reason about the potential impact and blast radius of each one. if any of them have a well-contained blast radius, upgrade those first.
3. for the bigger ones that are heavy lifts due to high blast radius and lots of inter-related dependencies, come up with an action plan to work the problem. the action plan should focus on both resource allocation (time and personnel) as well as sorting out the dependency graph.
4. grind it out. it's gonna suck but you'll be glad you did it when it's over.
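The steps above can be sketched as a dependency-ordered plan, here in Python using the standard library's `graphlib`. The upgrade names, their dependencies, and the blast-radius scores are made-up placeholders; the idea is just that upgrades become eligible once their prerequisites are done, and among eligible ones the best-contained goes first.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical upgrades mapped to the upgrades that must happen before them.
DEPENDS_ON: Dict[str, Set[str]] = {
    "jdk": {"orm", "web-framework", "logging-lib"},
    "middleware": {"jdk"},
    "orm": set(),
    "web-framework": set(),
    "logging-lib": set(),
}

# Rough blast-radius score per upgrade (made-up numbers; lower = better contained).
BLAST_RADIUS: Dict[str, int] = {
    "logging-lib": 1,
    "orm": 8,
    "web-framework": 9,
    "jdk": 5,
    "middleware": 3,
}


def upgrade_order(depends_on: Dict[str, Set[str]], blast_radius: Dict[str, int]) -> List[str]:
    """Order upgrades so prerequisites come first, preferring small blast radius."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    order: List[str] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda name: blast_radius[name])
        for name in ready:
            order.append(name)
            sorter.done(name)
    return order


if __name__ == "__main__":
    print(upgrade_order(DEPENDS_ON, BLAST_RADIUS))
```

A `CycleError` from `prepare()` is useful information in itself: it points at upgrades that cannot be done one at a time and need a combined action plan.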
lmullen3@reddit
Just start from scratch
Subject_Bill6556@reddit
Build from scratch on top of the existing db. Run in parallel. Sunset as needed.
gguy2020@reddit
I was in a team which was brought in to completely rewrite an application. The old one was a spaghetti mess of .NET, MS SQL Server and hundreds of stored procedures, making debugging a nightmare. Response time on some of the web pages could reach 20 minutes!
My team chose a modern UI framework, rewrote the entire backend in Java and replaced MS SQL Server with a NoSQL database. We hosted our entire stack on AWS instead of onsite at our web provider.
We ended up saving the company tens of thousands of dollars in licensing and hosting fees. Because of hugely increased stability and performance, the company was able to reduce the support team by 80%. The entire conversion took almost two years for 5 developers.
It's not a cheap undertaking but if the company has enough vision it pays off in buckets.
sakkdaddy@reddit
I faced similar problems before and was able to solve them like this:
- describe things in terms of “costs of change”, where spaghetti/dirty systems have exponentially higher costs per change compared to “clean” systems, where initial costs are higher but long-term costs of change stay low. avoid too much technobabble here when speaking with business executives. focus on cost.
- provide time estimates for iteratively improving the system
- provide time estimates for replacing it with a well-architected, modular, modern stack
if the cost of replacement is similar to or cheaper than the cost of iterative improvements, which it often is, then it is a no-brainer for the company. if the cost is higher, then it requires a bit more careful thinking. but emphasize that the long-term goal is to have a stable system where the cost-per-change stays low. and make sure that they understand that any system will eventually need to be changed a bit just to keep up with technology trends and security updates etc.
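The "cost of change" framing can be turned into a toy model, sketched here in Python. Every number is a made-up placeholder in arbitrary cost units, not a measurement: the legacy system has no upfront cost but each change costs 5% more than the last, while the replacement has a large upfront cost and a flat cost per change.

```python
from typing import Optional


def cumulative_cost(upfront: float, first_change: float, growth: float, changes: int) -> float:
    """Upfront cost plus `changes` changes, each costing `growth` times the previous one."""
    total = upfront
    cost = first_change
    for _ in range(changes):
        total += cost
        cost *= growth
    return total


def break_even(max_changes: int = 200) -> Optional[int]:
    """First number of changes at which the replacement becomes cheaper overall.

    All figures below are illustrative assumptions, not data from any real project.
    """
    for n in range(1, max_changes + 1):
        keep_legacy = cumulative_cost(upfront=0, first_change=10, growth=1.05, changes=n)
        replace = cumulative_cost(upfront=500, first_change=5, growth=1.0, changes=n)
        if replace < keep_legacy:
            return n
    return None


if __name__ == "__main__":
    print("replacement pays off after this many changes:", break_even())
```

The useful part for an executive audience is less the exact break-even point than how sensitive it is to the growth rate, which is the number worth estimating carefully from real change history.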
I have successfully convinced two executive teams to replace horrific legacy spaghetti with new, clean, modular systems using this approach AND delivered on the promises…within reason anyway. (time estimates were a bit off due to training devs how to write good code instead of spaghetti, but the projects were still big successes)
PragmaticBoredom@reddit
This is also a known minefield. The classic rookie mistake in SWE is to look at a legacy codebase and imagine your replacement version will be fast, handle all of the same edge cases, and be bug free.
TedW@reddit
Ok, but what's the alternative? Upgrades are minefields too, and it can't be left as-is forever. There's risk either way.
PragmaticBoredom@reddit
I’m not suggesting one is good and one is bad. If you can accurately estimate both then you weigh one against the other and make an informed choice.
The common mistake is to assume the rewrite will be easy but the legacy code will be too hard without giving both options a fair chance. People prefer working with new code that they wrote instead of legacy code that someone else wrote, so they will project that preference on to their decisions. The degree of projection is inversely correlated with experience, usually.
PickleLips64151@reddit
Add to this the cost of having clean code versus spaghetti code. Low quality code is 3x more expensive to maintain over time.
Even if you could swap out everything, the future costs will be more expensive to add new features.
MathmoKiwi@reddit
This sounds like the stuff nightmares are made of
The more I read, the worse it got
Just do a complete rewrite
yxhuvud@reddit
Can you think of parts of the system that would make sense as separate services? If so, one approach is to start by extracting those. It won't solve the whole problem but it could solve some parts of it.
Antique-Stand-4920@reddit
For your second bullet, don't ever say that there's "no viable" option to executives. Instead, just explain what it would take to get the job done regardless of how terrible it is. The executives will be the ones to decide if it's worth the time and money to pursue this project.
Another thing to consider is if you offered another option where only parts of the old system are migrated. It's possible some parts are easier to move than others. So some value could be gained instead of having an all or nothing situation.
GlasnostBusters@reddit
For example, new middleware scaffolding complete -> take a legacy endpoint -> convert to modern -> test -> repeat.
awkward@reddit
LLMs aren’t too bad at this use case: large numbers of mechanical changes from one well-known platform to another.