When you write the CI/CD policy for an AI agent, are you writing it against your internal review or against what 1,000+ teams found works?

Posted by nkondratyk93@reddit | ExperiencedDevs | 14 comments

Honestly been sitting with this since LangChain Interrupt opened this morning. Harrison Chase keynoting at 9:30 PT, synthesizing what teams at Clay, Rippling, Workday and the long tail actually shipped in production this past year. SAP Sapphire closing same day with 200+ agents under one stated design rule (governance first).

For the last two years my deployment authorization for any agent has been a single-reader document. Internal compliance signs, internal security signs, we ship. The reader was always us.

Today there's a public practitioner record. So now the question I think most teams haven't answered yet: when you write the CI/CD gate for an agent (scope of credentials, retry policy location, blast radius column, cost ceiling per action), are you writing it against your team's policy review or against what the published synthesis says actually works at scale?

Those two specs collide in places. Per-action cost ceiling vs per-month budget. Credential per logical agent vs per family. Retry policy in the harness vs in the prompt itself.
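
To make the first collision concrete, here is a minimal sketch of a check that enforces both limits at once, so a per-action ceiling and a per-month budget don't have to be an either/or. Everything in it (`SpendPolicy`, `SpendLedger`, `authorize_action`, the dollar figures) is a hypothetical name or number for illustration, not something taken from any published spec or library.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SpendPolicy:
    """Hypothetical policy: both limits are enforced, neither replaces the other."""
    per_action_ceiling_usd: float
    monthly_budget_usd: float


@dataclass
class SpendLedger:
    """Hypothetical running total of what the agent has spent this month."""
    month_to_date_usd: float = 0.0


def authorize_action(policy: SpendPolicy, ledger: SpendLedger,
                     estimated_cost_usd: float) -> tuple[bool, str]:
    """Deterministically approve or reject one agent action by cost."""
    # A single expensive action is rejected even if the month has headroom.
    if estimated_cost_usd > policy.per_action_ceiling_usd:
        return False, "per-action ceiling exceeded"
    # Many cheap actions are rejected once they would exhaust the month.
    if ledger.month_to_date_usd + estimated_cost_usd > policy.monthly_budget_usd:
        return False, "monthly budget exceeded"
    # Only approved actions are recorded against the budget.
    ledger.month_to_date_usd += estimated_cost_usd
    return True, "ok"


policy = SpendPolicy(per_action_ceiling_usd=0.50, monthly_budget_usd=1.00)
ledger = SpendLedger()
first = authorize_action(policy, ledger, 0.40)      # within both limits
too_big = authorize_action(policy, ledger, 0.75)    # trips the per-action ceiling
second = authorize_action(policy, ledger, 0.40)     # still within the month
over_month = authorize_action(policy, ledger, 0.40) # would push the month past 1.00
```

The ordering is a design choice: checking the per-action ceiling first means a runaway single call gets the more specific rejection reason.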

Asking because I gave my own doc 45 minutes this morning and found five gaps I would not have written down a week ago. Curious where everyone else is landing. Anyone running an agent fleet with a different production-floor reading?

Empanatacion@reddit

This word salad is so opaque that I'm not sure if this is a joke.

nkondratyk93@reddit (OP)

not a joke, just poorly written. fair catch.

Empanatacion@reddit

Well, goodonya for taking the smack talk gracefully.

nkondratyk93@reddit (OP)

haha thanks. comes with the territory

Fluffatron_UK@reddit

This is your brain.

This is AI.

This is your brain on AI.

but_good@reddit

W. T. F.

nkondratyk93@reddit (OP)

lol yeah, overcooked it

but_good@reddit

I’m not sure you did anything other than what your LLM said to do.

nkondratyk93@reddit (OP)

fair. the policy is easy - getting anyone to enforce it isn't

TheTacoInquisition@reddit

This is almost unreadable. I'm going to answer the title, as I don't think the actual post content adds anything.

You should be writing your CI and CD pipelines to solve YOUR OWN problems. That is the only sensible answer. Using someone else's solutions assumes you have exactly the problems they have and that the same solutions will work for you, which is cargo culting.

nkondratyk93@reddit (OP)

fair on the readability - the post was doing too much at once. and yeah, solve your own problems first is the only honest starting point. everything else is just benchmarking theater.

CubicleHermit@reddit

This.

Also, CI/CD shouldn't be a policy read and interpreted by AI - if it is, you're doing it wrong.

It's obvious if you replace "AI" with "people": if you had to have a quality engineer reviewing test outputs or a release engineer doing release readiness checklists, we'd all go "no, that's not CI/CD," but replacing people with a stochastic system somehow makes us forget that.

CI and CD should be deterministic systems.

AI can help you build those systems, and probably should, these days.

AI almost inevitably will be writing at least some of the code that's the upstream input to those systems.

...but AI should never be the primary or only go/no go gate, any more than "it passed a manual code review" or "it worked on my machine" should be that sort of gate.
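
A rough sketch of what that could look like, assuming a gate whose decision comes only from deterministic checks while an AI review is carried along as an advisory note. The names here (`GateInput`, `release_gate`, the individual check fields) are made up for illustration and don't correspond to any particular CI system.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class GateInput:
    """Hypothetical inputs to a release gate."""
    tests_passed: bool
    lint_passed: bool
    # Advisory only: None means no AI review was run.
    ai_review_approved: Optional[bool] = None


def release_gate(inp: GateInput) -> tuple[bool, list[str]]:
    """Return (go, notes). Only deterministic checks affect `go`."""
    notes: list[str] = []
    if not inp.tests_passed:
        notes.append("BLOCK: tests failed")
    if not inp.lint_passed:
        notes.append("BLOCK: lint failed")
    # The decision is fixed before the AI verdict is even looked at.
    go = not notes
    if inp.ai_review_approved is True:
        notes.append("advisory: AI review approved")
    elif inp.ai_review_approved is False:
        notes.append("advisory: AI review raised concerns")
    return go, notes


# An AI approval cannot rescue a failing test run.
blocked = release_gate(GateInput(tests_passed=False, lint_passed=True,
                                 ai_review_approved=True))
# Passing deterministic checks ship with or without an AI opinion.
shipped = release_gate(GateInput(tests_passed=True, lint_passed=True))
```

In this sketch a negative AI verdict also doesn't block on its own; a team could choose to route that to a human instead, which is a policy decision this example deliberately leaves open.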

Empanatacion@reddit

Mods, please don't delete this post. It's like performance art.

nkondratyk93@reddit (OP)

lol better than deleted i guess