the Claude Code source leak today is a good reminder that AI tooling in your release pipeline needs the same code review discipline as everything else

Posted by vitaminZaman@reddit | sysadmin | View on Reddit | 73 comments

512,000 lines of Anthropic's own source code went public this morning because a source map file in their npm package pointed to a publicly accessible zip on their R2 bucket. Human error in the release packaging process, nobody caught it before it shipped, and now the code is permanently mirrored across GitHub, Gitlawb and torrent networks regardless of what any takedown notice says.

The part worth paying attention to isn't the IP exposure, it's the process failure. A misconfigured .npmignore or files field in package.json caused this, which is the kind of thing that should get caught before a package hits a public registry, not after someone downloads and decompresses it. Anthropic's own statement confirmed it was a packaging issue not a breach, which almost makes it worse because packaging hygiene is a solved problem.

It also coincided with a completely separate npm supply chain attack where malicious axios versions with an embedded RAT went live the same morning, so anyone who updated Claude Code between 00:21 and 03:29 UTC today has a different and more serious problem to deal with.

The release pipeline question this raises is whether anyone is actually running automated review on packaging configuration and release artifacts the same way they run it on application code. In most teams the answer is no, release scripts and packaging config get less scrutiny than the code they ship, and that gap is where this kind of thing lives.

[-]

Ok_Abrocoma_6369@reddit

This keeps happening because we treat AI tools like browser extensions instead of enterprise software. If a developer uses a CLI tool that connects to a third party backend it should go through a zero trust gateway by default not bypass controls. The industry keeps chasing new tools but ignores that the underlying network architecture is outdated. You need a unified edge like Cato to enforce policy at the user level regardless of what new agent workflow is introduced.

[-]

LuckPsychological728@reddit

The scary part is the combo event accidental exposure and active supply chain attack at the same time.

That is where defense in depth actually matters because you are no longer dealing with a single failure mode, you are dealing with overlapping ones.

A lot of orgs are now adding extra runtime and network level monitoring so even if something slips through CI CD, it is still visible when it tries to move or exfiltrate data. That is where platforms like Cato come into play, not at the build stage, but at the what is this process actually doing in production layer.

[-]

tastyratz@reddit

I wish this happened to OpenAI.

I want them all to burst but we need Anthropic to stay in that market and fight as long as it exists.

I wonder how damaging it will actually be.

[-]

RegretNo6554@reddit

won’t happen because their CLI is already open source and if they do burst that’d be terrible for competition a lot of people would be priced out

[-]

MedicatedDeveloper@reddit

AI written post about AI. Nice.

[-]

ThatOldGanon@reddit

how can you tell?

[-]

MedicatedDeveloper@reddit

Several it's not x it's y's in there.

General 'density', primarily English speakers tend to be far more terse in getting the main point across than a LLM.

LLMs also tend to write in a weird quasi second person way. It's hard to describe but since it's derived from a chat it seems to steer into a weird nebulous framing.

[-]

ThatOldGanon@reddit

Several it's not x it's y's in there.

the one I see is "the part worth paying attention to isn't the IP exposure, it's the process failure" which definitely doesn't feel like something a real person would say. are there others?

General 'density', primarily English speakers tend to be far more terse in getting the main point across than a LLM.

agreed, there are a couple superfluous uses of the word "own" which one wouldn't normally include.

LLMs also tend to write in a weird quasi second person way. It's hard to describe but since it's derived from a chat it seems to steer into a weird nebulous framing.

I think I see what you mean.

[-]

TheGreenTormentor@reddit

"It's not x, it's y" is basically my sleeper agent activation phrase at this point.

[-]

Infninfn@reddit

Corrective negation is what Opus told me to call it. Seems it's the result of all the RLHF training steering LLMs towards this type of response. Because it makes the response points clearer, and scores evals higher as a result.

[-]

nemec@reddit

sad that nobody can write a four-paragraph post by hand anymore

[-]

DharmaPolice@reddit

Beyond the disclosure issue, isn't there a problem with having publicly accessible files that you don't actually want to be made public?

[-]

Ssakaa@reddit

Yeah. The URL coming out that points to your internal package of source code should not, under any circumstances, make that accessible to anyone outside your organization's boundary. That's the real fuckup, and it's architectural, not a one off "mistake".

[-]

JesradSeraph@reddit

Unpopular opinion: LLM source map files trained from datasets containing any public domain data should not be copyright-able. The training process contains no meaningful act of artistic creation.

[-]

goferking@reddit

Isn't that the default/favorable opinion???

[-]

Free_Treacle4168@reddit

There isn't precedent yet. Code copyrighting is kind of weird to me anyways. Copyright is meant for artistic works, and although code can be art, most of the time it's not IMHO.

[-]

OkDimension@reddit

You must be from outside North America - there are whole companies here that solely exist on software patent licensing and infringement cases.

[-]

Ssakaa@reddit

The fact that people are exploiting the system doesn't mean the system's built on any form of correct logic. There's also companies that exist solely on abusing the DMCA to attack people who really are operating under Fair Use.

[-]

Free_Treacle4168@reddit

I do understand, Code itself is normally copyrighted, specific technologies implemented in code can be patented.

Like I said, I don't think there is firm legal precedent about LLM generated code if it's copyrightable or not.

[-]

bbbbbthatsfivebees@reddit

AI tooling in your release pipeline needs the same code review discipline as everything else

Yes? Is anyone NOT doing this and deploying AI-generated code directly to production?

If so, you have genuinely failed as a company and actually deserve the consequences that come your way when it's revealed that your "vibe coded" prod is actually full of security holes and nonsensical inefficient slop. Anyone who is allowing this nonsense to go live without review is failing at their job.

[-]

DrStalker@reddit

Wasn't Anthropic recently boasting about putting Claude in charge of writing and managing this code?

[-]

dustojnikhummer@reddit

Ouroboros, but instead of eating itself it is creating itself.

[-]

Moontoya@reddit

Ouroslopus

It's eating it's own shit to shit more

[-]

ClarityOfALotus@reddit

this one made me laugh so hard!

[-]

hbdgas@reddit

https://imgur.com/whatever-comes-out-one-end-we-feed-to-other-also-indian-food-tTT23nl

[-]

dustojnikhummer@reddit

I prefer Ouroslop lol

[-]

DrStalker@reddit

Ouroleakus, the snake that posts its own source code online.

[-]

ilyas-inthe-cloud@reddit

the .npmignore / files field thing is what gets me. we've all been there where you assume the build pipeline catches it but nobody actually verified what ends up in the tarball. i started running npm pack --dry-run before every publish specifically because of stuff like this. takes 5 seconds and shows you exactly what's going in. the real lesson here isn't even about AI tooling specifically, it's that packaging is unglamorous work that nobody wants to own, so it falls through the cracks. same reason docker images ship with debug tools and .env files in prod.

[-]

Og-Morrow@reddit

Did think it might be a April Fools Joke at all?

[-]

DrCain@reddit

Confirmed to be a manual deploy step that should have been better automated.

https://xcancel.com/bcherny/status/2039210700657307889

[-]

thatpaulbloke@reddit

Better controlled, maybe, but better automated just means that the mistake happens faster.

[-]

thortgot@reddit

Better automation includes governance. Being manual means you real on a user getting right every time.

[-]

HittingSmoke@reddit

I have to fight this at work sometimes to justify automating systems.

Why do you need to budget four weeks to automating this task when it currently takes a human a few minutes a week to handle?

Because this is how much we currently spend cleaning up mistakes humans make...

[-]

thatpaulbloke@reddit

Governance is governance no matter the identity taking the action - a DLP guard should stop a human just as much as a bot account and PR testing applies whether it was a bot that submitted the request or a person. When you have a manual step that isn't checked or guarded against error then automating it will just allow the same error to happen. Maybe the automation won't make that mistake and maybe it will, but the guards need to be in place either way because Mr Cockup always comes for a visit eventually.

[-]

thortgot@reddit

We're saying the same thing. Enhancing the automation to abide by the governance controls.

Removing the manual ability to push code is a normal CICD pipeline control.

[-]

UltraEngine60@reddit

Gitlawb

I thought this was yet another technology created last week that I did not know about.

[-]

Current-Ticket4214@reddit

The underlying protocol is git. They just added a fancy protocol on top to get you to download their CLI.

[-]

insufficient_funds@reddit

Anyone else the type of sysadmin that knew the words in the OP but has no idea what any of it meant?

[-]

Icedman81@reddit

Ooh wee.

Give a man a fish pole, he'll fish to survive. Give him a net, he'll overfish a little, but still to survive. Give him a trawler, he'll get all the fish extinct. Except with AI, it's more like...

Give the moron developer a tool - he'll use it efficiently
Give the moron developer a tool that doesn't need declared Types - he'll create a fucking monster (Javascript) that can be considered a bane of civilization
Give the moron developer a tool, that is at best 95% correct at guessing statistically and you create an Internetocalypse

People are lazy. Doesn't matter who you are, it's in our nature. This is what this is.

[-]

GroundbreakingMall54@reddit

the fact that it was a .npmignore issue is almost poetic. everyone obsesses over supply chain attacks and zero days but the actual threat is just someone forgetting to exclude a directory before npm publish

[-]

Rentun@reddit

The reality with information security is that there are vanishingly, ridiculously few incidents where the victim organization did everything right but still got breached.

They're almost always the result of a lapse somewhere. Unpatched software, improperly broad permissions, debugging or backdoors left in production deployments, and on and on and on.

It's so rare that it may well not even be considered in most organizations threat modeling that a company gets breached solely because of a 0-day they had no chance of defending against.

[-]

Zolty@reddit

For real how many times does the entirety of the social security database get leaked? Your data is out there for sale.

It doesn't mean you shouldn't try and protect it but we are very much going to a post privacy world. I just don't think there's enough empathy in the world for it.

[-]

HotTakes4HotCakes@reddit

Which is the sort of issue that LLMs exacerbate. It encourages corner cutting, instills a false sense of confidence in people that should not have it, and just perpetuates laziness. You're not using them because you want to work harder.

[-]

HeKis4@reddit

LLMs are only as good as the average public git project. Would you trust a random github project with production ?

[-]

Free_Treacle4168@reddit

They're much worse than the projects that came out before the vibe coding 'revolution'.

[-]

YLink3416@reddit

Would you trust a random github project with production

I already do

[-]

Techwolf_Lupindo@reddit

A quick google shows this is an AI company. Was anything of worth in that code?

[-]

IngSoc_@reddit

I'm sorry, are you telling me you haven't organically heard of Anthropic or Claude before just now?

[-]

UltraEngine60@reddit

woosh

[-]

IngSoc_@reddit

Lol I wouldn't be so sure

[-]

Characterguru@reddit

The release pipeline is the last place most teams apply rigor, and that's exactly backwards. Treat your packaging config like production code: version-controlled, peer-reviewed, and gated behind the same CI checks you'd never skip on application logic.

[-]

joedotdog@reddit

Wonder if we'll get a full 1024 next time? At least the math checks out.

[-]

Sharp_Animal_2708@reddit

this is the part that kills me about the rush to ship AI tooling. we treat npm packages like throwaway wrappers but they're running in prod pipelines with access to secrets and source.

had a similar close call last year -- not a leak but a misconfigured artifact bucket that sat open for weeks before anyone noticed. the fix wasn't better automation, it was adding the same PR review checklist we use for app code to infra and release configs. boring but it actually works.

how many teams here actually review their CI/CD configs with the same rigor as feature code?

[-]

Careful-Criticism645@reddit

Anyone letting AI just build stuff without a human approving every step is an idiot.

[-]

Fine-Platform-6430@reddit

What’s interesting here is that this kind of failure is usually not a tooling problem, but a missing “release boundary check” in CI/CD pipelines. Most teams already validate code, dependencies, and even secrets, but still treat build artifacts (source maps, npm publish config, bundled outputs) as a secondary concern.

practice, that’s where the real attack surface often appears — not in the source code itself, but in what gets packaged and shipped. It feels like supply chain security still hasn’t fully caught up with modern frontend + AI-heavy build pipelines.
Anyone here is actually enforcing artifact-level validation in CI pipelines today, or if this is still mostly manual?

[-]

Fine-Platform-6430@reddit

In practice, that’s where the real attack surface often appears — not in the source code itself, but in what gets packaged and shipped. It feels like supply chain security still hasn’t fully caught up with modern frontend + AI-heavy build pipelines.

Anyone here is actually enforcing artifact-level validation in CI pipelines today, or if this is still mostly manual?

[-]

Fallingdamage@reddit

I still dont publish my binaries or script sources to any public repo sites. I keep them to myself. So far no leaks!

Git and the like are like the social media of coding. Everyone wants to put their goods up in a public place as some sort of flex? Better to keep things to yourself and this type of thing wont happen.

[-]

1h4veare4lpr0bl3m@reddit

512k lines sounds like a small-ish library.

[-]

HotTakes4HotCakes@reddit

regardless of what any takedown notice says.

What's that? They take issue with their work being used by others without their consent?

[-]

skat_in_the_hat@reddit

Nah that couldnt be. They trained it on everyone elses code/tech blogs that they just happily slurped up, and now sell it for revenue. So they must be totally okay with us doing the same to them... right? ... guys? /s

[-]

skat_in_the_hat@reddit

buh buh buh but the AI... It will replace ALL your Engineers! AGENTIC GUYS AGENTIC!!!
Make sure when you hear people talking about how this tech replaces us, you laugh right in their face.

[-]

scandii@reddit

respectfully, shit happens.

no system you can imagine has any chance to cover all vectors of stuff going wrong at even at unreasonable cost, nevermind reasonable cost.

to pin this as an AI issue is honestly just not it.

[-]

synept@reddit

Eh, accidentally shipping source code is not something that typically "just happened" in the more traditional world of software development.

[-]

scandii@reddit

I agree, typically we see API keys to very expensive services in the wild instead which is orders of magnitude easier to catch than this multi-layer slip-up.

[-]

scrittyrow@reddit

Dude what? Im a self taught developer and I know to put gitignores in my code. This is beyond a lapse its completely agregious for a company of that size to publish .env or secrets lmao

[-]

scandii@reddit

...yeah, and yet it happens.

[-]

gumbrilla@reddit

Yeah, I'm with you 100%, it's novel, borderline hilarious.

[-]

Ssakaa@reddit

Ah, worse. They're not even looking at the dumbfuckery. This mistake just exposed their bigger problem.

a publicly accessible zip on their R2 bucket

[-]

TheCyberThor@reddit

We’ve gone so deep into AI that we forget even before AI, shit happens.

[-]

gangaskan@reddit

Every single person is guilty of one thing or another.

With automation, and even ai, it's bound to happen. Especially when hastily running pushes

[-]

techretort@reddit

Thats what I don't get. We all know shit happens, but in this case there were no safeguards or policy's that were there to stop it from making a single mistake by one person into a massive breach like this. Where's the second set of eyes, the review process, the change advisory board sign off?

[-]

gangaskan@reddit

Good question.

[-]

vitaminZaman@reddit (OP)

exactly :(