Trying to validate if problem is real or not

Posted by _h4xr@reddit | ExperiencedDevs | 15 comments

Hi community,

I am a Staff engineer and have always worked in the infrastructure space. Over the last few quarters, as AI adoption has been pushed hard on everyone, I have started seeing some inefficiencies. I am trying to build a product to address them and wanted to check with the broader experienced-developer community whether the problem is even real, or whether I am operating in a silo and over-experiencing it.

What I am trying to build is a core infrastructure platform for agentic coding. In my company, I have seen migrations take months to quarters when they involve a core library that is used in 1000s of other code repos. Management is pushing teams to leverage agentic coding solutions to perform these migrations.

While we provide the relevant prompts and context to the agentic workflows, tracing the exact blast radius of a change is nearly impossible today. This generally leads to AI agents coming up with modifications that cause incidents in production.

The other bit is that, while agents are good at writing code that uses open source libraries (because they are trained on them), they struggle with internal enterprise codebases (resorting to expensive runtime decompiling, or hallucinating functions that cause compilation to fail).

We are building for this: an execution context engine that blends static analysis of the codebase with runtime data, so that agents can trace through method calls and their performance characteristics, reason about them, and use that when working on coding tasks.

Wondering whether the community thinks this problem is real.

DeterminedQuokka@reddit

1000s of repos is a very specific and small scoped problem. Most companies don’t have 1000s of repos.

The general need for AI to help with a migration is real. But I wouldn’t buy a product for it. I can do it using the existing tools and skills that teams can build themselves.

_h4xr@reddit (OP)

This is true. Most companies definitely won’t have 1000s of repos. Probably need to focus more on exploring where the real value addition is

But also, the repository count alone doesn’t account for the whole problem. We are also looking at complexity of the codebase, dynamic runtime calls through DI frameworks, etc.

That is where my thinking has been coming from.

DeterminedQuokka@reddit

I work at a place with a large, complex codebase. Its problem is that the code is bad, which makes AI not work that well. I don’t need a random product. I just need skills that tell it how to use our codebase, which our engineers have built.

_h4xr@reddit (OP)

That’s fair enough, and we have been doing something similar too: repo-side context files, agentic skills, etc. The iterative journey is ongoing, and we have taken a similar approach of building MCPs around our internal infrastructure and tooling.

Most of my thinking comes from seeing the incidents, performance issues and long-running migrations that continue to exist, even with the promises being made about the current agentic stack.

_h4xr@reddit (OP)

This is true. Most companies definitely won’t have 1000s of repos. Probably need to focus more on exploring where the real value addition is

Thank you 😇

Icy_Cartographer5466@reddit

The library migrations thing is a solved problem for the typical big tech use case where deploys are cheap: for changes that don’t break APIs you automatically release them to increasingly large cohorts at build time; for changes that do break APIs you mark them as deprecated however you like and eventually start failing builds that refuse to update.
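
A rough sketch of the "increasingly large cohorts" part, assuming each repo is assigned a stable hash bucket per release. The function names, repo names and release string are hypothetical and not taken from any particular build system; the deprecation and build-failure side is not shown.

```python
import hashlib


def rollout_bucket(repo: str, release: str) -> int:
    """Map a repo to a stable bucket in [0, 100) for a given library release."""
    digest = hashlib.sha256(f"{release}:{repo}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def should_receive(repo: str, release: str, percent: int) -> bool:
    """True if the repo is inside the cohort currently receiving the release."""
    return rollout_bucket(repo, release) < percent


if __name__ == "__main__":
    repos = [f"service-{i}" for i in range(1000)]  # hypothetical repo names
    for percent in (1, 10, 50, 100):
        cohort = [r for r in repos if should_receive(r, "corelib-2.4.0", percent)]
        print(percent, len(cohort))
```

Because the bucket depends only on the repo and release, a repo that is in the 10% cohort stays in every larger cohort, so widening the rollout never removes a repo that already picked up the change.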

The other thing is probably 99% solved by giving the agent a code search tool that it can use to follow dependencies around and look up source code in different repos.
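
A minimal sketch of what such a code search tool could look like, assuming the repos are checked out under one local directory and a plain regex scan is good enough. `search_code` and its parameters are hypothetical; a real setup would more likely wrap an existing indexed search service.

```python
import re
from pathlib import Path


def search_code(root, pattern, suffixes=(".py",), max_hits=50):
    """Return (path, line_number, line) tuples for lines matching a regex."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than fail the search
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), lineno, line.strip()))
                if len(hits) >= max_hits:
                    return hits  # cap results so the agent's context stays small
    return hits
```

An agent could call this with a symbol name, for example `search_code("/src", r"\bparse_config\(")`, to find call sites across repos before proposing a change.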

Buttleston@reddit

> used in 1000s of other code repos

I think I found your problem

arkantis@reddit

I think you should ask your coworkers or leadership to determine if this is useful or not. The Internet is not the teammate this benefits. Maybe it sounds useful, but TBH it is a bit hard to grok what exactly is being proposed.

If you write up a concrete proposal and socialize it with other teams or orgs they can agree or disagree. If it sticks and you do more things like this then congrats you're on your way to becoming a senior staff engineer or equivalent 🙂.

_h4xr@reddit (OP)

That was my initial plan and something I wanted to do. But my organization doesn’t seem to be very supportive, with leadership’s focus on investing in areas with quick ROI.

I have had a few conversations with people I know from my own company, as well as from a small subset of other companies, and have generally heard a positive response. Hence, I put this up here to get a more unbiased view.

In terms of the idea, in simpler terms, what I am trying to do:

1. Perform static analysis of code repos and map interfaces, classes, structs, methods and method calls.
2. Leverage runtime telemetry to bridge service boundaries and also capture performance data like CPU cycles spent per method, memory allocated, etc.

Then leverage the above two to form a code-level knowledge graph that agents can query when they perform coding tasks.
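
To make the idea concrete, here is a toy sketch for a single Python module, assuming Python's `ast` module stands in for the static analysis and a hand-written dict stands in for runtime telemetry. `SAMPLE`, `TELEMETRY` and all function names are made up for illustration; nothing here bridges service boundaries or handles DI frameworks.

```python
import ast
from collections import defaultdict

# Hypothetical module under analysis.
SAMPLE = '''
def parse(raw):
    return raw.strip()

def load(path):
    return parse(open(path).read())

def handler(path):
    return load(path)
'''

# Hypothetical telemetry keyed by function name.
TELEMETRY = {"parse": {"cpu_ms": 12.5, "alloc_bytes": 4096}}


def build_call_graph(source):
    """Map each function name to the set of names it calls (static only)."""
    graph = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        callees = graph.setdefault(node.name, set())
        # Calls inside nested functions are also attributed to the outer one.
        for inner in ast.walk(node):
            if isinstance(inner, ast.Call):
                if isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
                elif isinstance(inner.func, ast.Attribute):
                    callees.add(inner.func.attr)
    return graph


def blast_radius(graph, target):
    """Return every function that reaches `target` directly or transitively."""
    callers = defaultdict(set)
    for caller, callees in graph.items():
        for callee in callees:
            callers[callee].add(caller)
    seen, stack = set(), [target]
    while stack:
        current = stack.pop()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


def describe(graph, telemetry, name):
    """Bundle static and runtime facts about one function for an agent query."""
    return {
        "function": name,
        "callees": sorted(graph.get(name, ())),
        "affected_callers": sorted(blast_radius(graph, name)),
        "metrics": telemetry.get(name, {}),
    }
```

Calling `describe(build_call_graph(SAMPLE), TELEMETRY, "parse")` returns the callees of `parse`, the callers that a change to it could affect, and whatever metrics were recorded for it. Name-based matching like this is imprecise (two unrelated methods with the same name collide), which is part of why the real problem is harder than the sketch.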

arkantis@reddit

I would recommend starting backwards then. Prep a tiny demo of improved agentic code output with the context you suggested gathering.

I suspect you will find that there is a real cost to gathering that data you propose and then in doing this demo you can compute that into a real-ish ROI/value. And either it's worth it and you can prove it, or it's not because it'd take X effort to gain Y and you've learned something along the way hopefully.

metaphorm@reddit

do those repos have integration tests and CI systems running? do your CI systems build the app? an agent can open a PR with package migrations for the repo and let CI do the work it's supposed to.

the training problem for internal tools is a real one. this is not fully solved by anyone as far as I know. at my company we use an agentic harness that establishes context about the repo. we have also set up some MCPs for internal APIs and produced some internal RAG knowledge bases to be used by agents. this helps. it's an incomplete solution though, and human-in-the-loop steering is still quite important.

_h4xr@reddit (OP)

CI/CD systems are there, and they will block something that breaks the app or, to a large extent, something downstream.

We also have context files that explain different infrastructure components. But mostly, what I am seeing is that integrating a new library, or running a migration across 1000+ code repositories with agents, has led to very mixed results.

And that is why I have been wondering whether giving the agent easy, direct access to a code-level knowledge base, where it can trace everywhere it needs to make a change or see which methods a library really exposes, will help or not.

T0c2qDsd@reddit

I think there’s a few things that I’ve found that help, even before finding a way to “create the right context”.

First is dividing up the problem — instead of “fix everything that calls this”, it becomes “here is a repository, it needs to be updated. here is an example of common ways it has been updated safely.” If the codebase is over a certain size, just break it down further.

Second is ensuring that you have reliable/deterministic signals that the work is done. Think build/test signals. Those don’t pass? Changes were bad, fix them. (Those signals not being good enough is a harder but common problem.)

Last trick is giving a second agent review criteria and telling it to check the first agent’s work and flag if test coverage might be a problem. That’s a signal that a person should go look more closely at it.

Here’s the other thing — if it’s a mechanical change, honestly, probably see if you can automatically trigger basic tools on a repo and try that before turning to AI. If it isn’t a mechanical change, make sure it’s reviewed.
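
The per-repository split and the deterministic build/test signal described above could be wired together roughly like this. It is a sketch under assumptions: `attempt_change` is a hypothetical callback that stands in for whatever agent or tool edits the repo, and `commands` is whatever build/test commands the repo already uses. The second-agent review step is not shown.

```python
import subprocess


def run_checks(repo_dir, commands):
    """Run each build/test command in repo_dir; return (ok, failures)."""
    failures = []
    for cmd in commands:
        result = subprocess.run(cmd, cwd=repo_dir, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((cmd, result.stdout + result.stderr))
    return (not failures, failures)


def migrate_with_gate(repo_dir, attempt_change, commands, max_attempts=3):
    """Call attempt_change until the checks pass or attempts run out."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        # The callback receives the previous failure output as feedback.
        attempt_change(repo_dir, feedback)
        ok, failures = run_checks(repo_dir, commands)
        if ok:
            return {"status": "passed", "attempts": attempt}
        feedback = "\n".join(output for _, output in failures)
    # Checks never went green: hand the repo to a person instead of merging.
    return {"status": "needs_human", "attempts": max_attempts}
```

The point of the design is that "done" is decided by exit codes rather than by the agent's own report, and a repo that never goes green is escalated rather than retried forever.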

metaphorm@reddit

the agent has to be able to read the code to know what kind of code changes to make. at bare minimum, it will need a change log and thorough technical documentation of the dependency library in the knowledge base.

_h4xr@reddit (OP)

Thank you. That makes sense and also seems to be in line with what I am observing.