Prompt injection within GitHub Actions: Google Gemini and multiple other fortunate 500 companies vulnerable

Posted by Advocatemack@reddit | programming | View on Reddit | 99 comments

So this is pretty crazy. Back in August we reported to Google a new class of vulnerability which is using prompt injection on GitHub Action workflows.

Because all good vulnerabilities have a cute name we are calling it PromptPwnd (you can thank the marketing team)

This occus when you are using GitHub Actions and GitLab pipelines that integrate AI agents like Gemini CLI, Claude Code Actions, OpenAI Codex Actions, and GitHub AI Inference.

What we found (high level):

Untrusted user input (issue text, PR descriptions, commit messages) is being passed directly into AI prompts
AI agents often have access to privileged tools (e.g., gh issue edit, shell commands)
Combining the two allows prompt injection → unintended privileged actions
This pattern appeared in at least 6 Fortune 500 companies, including Google
Google’s Gemini CLI repo was affected and patched within 4 days of disclosure
We confirmed real, exploitable proof-of-concept scenarios

The underlying pattern:
Untrusted user input → injected into AI prompt → AI executes privileged tools → secrets leaked or workflows modified

Example of a vulnerable workflow snippet:

prompt: |
  Review the issue: "${{ github.event.issue.body }}"

How to check if you're affected:

Run Opengrep (we published open-source rules targeting this pattern) ttps://github.com/AikidoSec/opengrep-rules
Or use Aikido’s CI/CD scanning

Recommended mitigations:

Restrict what tools AI agents can call
Don’t inject untrusted text into prompts (sanitize if unavoidable)
Treat all AI output as untrusted
Use GitHub token IP restrictions to reduce blast radius

If you’re experimenting with AI in CI/CD, this is a new attack surface worth auditing.
Link to full research: https://www.aikido.dev/blog/promptpwnd-github-actions-ai-agents

[-]

vibecoder012@reddit

my setup right now is kinda mixed cursor for bigger work and lighter tools like wozcode for quick experiments works better than relying on just one tool honestly

[-]

This is why runtime guardrails matter. You can't just sanitize prompts and hope for the best, you need real time detection that catches injection attempts before they hit your AI agents. Companies like Activefence are building guardrails specifically for this. The treat AI output as untrusted advice is spot on but you need the tooling to enforce it.

[-]

Thom_Braider@reddit

CI/CD pipelines should be 100% deterministic. Why would you use inherently probabilistic AI in your pipelines in the first place? Wtf is going on with this world.

[-]

ProdigySim@reddit

"When someone opens a PR, trigger an AI bot to perform a code review"

[-]

eightslipsandagully@reddit

AI code reviews are one of the few use cases I don't mind. I would NEVER merge solely off an AI approval but it does occasionally come up with good suggestions.

[-]

jsdodgers@reddit

I hate them. My company recently put an AI review system in place, and it always spits out total BS style suggestions that my teammates blindly accept.

[-]

blocking-io@reddit

AI code reviews are annoying. Let the humans use AI to review if they want, but they should evaluate what the LLM spits out first before polluting the PR

[-]

nyctrainsplant@reddit

It's all about signal to noise ratio and friction on release. As long as your team is OK with ignoring the nonsense comments it can be OK. In my experience the noise is pretty consistent though, and the catches relatively rare.

To be honest, it doesn't make much sense to dislike AI code review more than some SAST tools, which are often not very deterministic or reliable either.

[-]

EntroperZero@reddit

We enabled Copilot reviews on our GitHub, and I find that about 75% of the comments are junk and waste my time. But, I actually believe the annoyance is worth it, because the other 25% save us more time in the long run, from not releasing a bug and having to debug it later without context.

[-]

1668553684@reddit

Yep. Code reviews are the total extent to which I can ever see myself using AI as a collaborator. It's kind of like a robocop version of a rubber ducky that occasionally comes up with really useful ideas and occasionally loses the plot completely and hallucinates an API that doesn't exist.

[-]

nightcracker@reddit

inherently probabilistic AI

There's nothing inherently probabilistic about AI, you could make it 100% deterministic if you wanted to.

[-]

Vallvaka@reddit

A few issues with this:

Setting temperature = 0 reduces quality of output by stifling creative thinking

LLMs still aren't fully deterministic even with temperature = 0

Most modern reasoning models no longer give you a way to control temperature

[-]

Lachiko@reddit

LLMs still aren't fully deterministic even with temperature = 0

you can always lock the seed and get the same output for a given input

[-]

Kirhgoph@reddit

Aren't LLMs preparing a list of the top 5 best tokens of which one random is chosen to be output?

[-]

nightcracker@reddit

You can seed random number generators...

[-]

SpezIsAWackyWalnut@reddit

The specific number can be tweaked along with a bunch of other settings. So, what they're saying is that you can configure it to always pick the single most likely token to come next always, which means if you give it the exact same inputs, it'll provide the exact same outputs.

But tbh, in that state it's still more pseudorandom than truly deterministic, because you still have some randomness at play, it's just all the randomness baked straight into the model rather than having dice being rolled during inference too (when it's generating text).

[-]

mccoyn@reddit

Sure, then sometime asks how many ‘b’s are in the word ‘blueberry’

[-]

lord2800@reddit

The thing you're missing is this isn't part of the CI/CD pipeline, it's just an automation workflow. However, a very lot of repositories don't truly separate their automation workflow actions from their CI/CD actions and intermix all the environment configuration between them (including automatically injected environment configuration that you don't have control over). This prompt injection attack is targeting the automation workflow and exposing the CI/CD workflow stuff.

[-]

WoodyTheWorker@reddit

New kind of Bobby Tables Github username

[-]

Cheap_Fix_1047@reddit

Sure. The pre-req is to have user supplied content in the prompt. Perfectly normal. Reminds me of `SELECT * FROM table where id = $1`.

[-]

Rackarunge@reddit

Wait what’s wrong here? Isn’t $1 a reference to a variable? Cause something like [userId] would follow?

[-]

deja-roo@reddit

Yes.

Now imagine you have an endpoint of /records/userprofile/38.

SELECT * FROM table where id = 38 is what gets rendered.

But what if instead of 38 or some well behaved integer, some jerk passes in 0; drop table users; and now you get

 `SELECT * FROM table where id = 0; drop table users;`

You would have to do some URL encoding perhaps. And now your app just innocently and uncritically blows up your database.

[-]

Rackarunge@reddit

But if I go

’’’javascript

const result = await client.query( 'SELECT * FROM table WHERE id = $1', [userId] )’’’

And have validated userId that’s safe though right? Sorry frontend guy here trying to learn backend hehe

[-]

jkrejcha3@reddit

Yes but it's difficult to do that and it gets even more difficult when you have things like making queries that work with textual content. In general you shouldn't use string substitution for querying

Instead you want to use parametrized statements. You'd usually write your query something like this

SELECT * FROM table WHERE id = ?

...[1] and then prepare the statement, passing in user ID as a parameter to that.

Your code would be something like

x = db.prepare("SELECT * FROM table WHERE id = ?")
x.execute([user_id])

What happens here is that the steps for sending the query and data are separate[2], which prevents SQL injection.

Or, put another way, the prepare step can be thought of somewhat as creating a function dynamically on the fly which you then execute. When you do string interpolation, you're running the risk that you let an attacker make their own "function" with levels that depending on the database engine, can lead to arbitrary code execution[3]. Using prepared statements lets you writing the query be in full control of the function and its parameters, only allowing certain things to vary

[1]: A lot of database engines also have support for named parameters (typically named as :foo or such) which can be helpful if you have a dictionary like for your query

[2]: (Some libraries allow you to combine these steps to get something like x = db.execute("SELECT * FROM table WHERE id = ?", [user_id]). These steps are still separate but it's just a convenience method that effectively does the same thing as the above

[3]: For example, SQL Server has xp_cmdshell

[-]

Rackarunge@reddit

Cool thank you! For now I’m using Prisma as ORM so I think it’s abstracted away but if I’m ever writing raw SQL it’s a good thing to have in the back of my head :)

[-]

deja-roo@reddit

Yeah, it's the validating the input that's critical. What /u/Cheap_Fix_1047 was alluding to was all the SQL injection issues from the mid-2000s where people would just plug unvalidated (or poorly validated) parameters into query text and pass them to the database. Today, most tools take care of this automatically, and try and make it difficult for you to run queries that are the result of manual string interpolation for precisely that reason.

[-]

Vesuvius079@reddit

You can insert an arbitrary sub query that does anything.

[-]

Halkcyon@reddit

There's not enough information in that comment to determine that.

[-]

ClassicPart@reddit

Yes, they didn’t even bother starting by explaining what SQL is and giving an overview of its history. I expect everything to be explained in its entirety or the comment is worthless.

[-]

deja-roo@reddit

I think the concept he's alluding to here is pretty obvious

[-]

Vesuvius079@reddit

That’s technically true since we’re only given one line of code but the context is discussing security vulnerabilities, the comment’s intent appeared to be an example, and substituting un-sanitized string input is a classic example.

[-]

nemec@reddit

the problem is there is no such thing as llm parameterization at the moment, nor any distinction between "executable" vs "data" context. A prompt is just an arrangement of context resulting in a statistically favorable result.

In other words, there is no mitigation for untrusted user input like we have for SQL injection, just avoid using LLMs to process data from untrusted sources entirely.

[-]

deja-roo@reddit

The solution here is obvious. You take the input text and call the LLM and ask if there's anything that would be malicious in the injected text. Then if it clears it you pass it into the prompt.

(/s)

[-]

Clean-Yam-739@reddit

You just described the industry official "solution" : guardrails.

Might be actually useful if the said guardrails are implemented using a non-LLM AI model. Like a custom trained classification model.

[-]

Nonamesleftlmao@reddit

Maybe if you had several different LLMs (of varying sizes and no ability for the user to see their output) all prompted or fine tuned to review and vote on the malicious code/prompt injection. Then one final LLM reviews their collective judgment and writes code that will attempt to automatically filter the malicious prompt in the future so the reviewing LLMs don't keep seeing the same shit.

But that would likely take far too long if it had to go through that process every time someone used the LLM.

[-]

axonxorz@reddit

AI broke it. Solution: more AI.

cooked

[-]

1668553684@reddit

I think if we poked more holes in the bottom of the titanic, the holes would start fighting each other for territory and end up fixing the ship!

[-]

binarycow@reddit

If you put a hole in a net, you actually reduce the number of holes.

So make AI into a net.

[-]

Nonamesleftlmao@reddit

Just like our planet would be if we implemented my solution 😅

[-]

flowering_sun_star@reddit

Excellent - an AI committee! We can follow up with a full AI bureaucracy!

[-]

deja-roo@reddit

But that would likely take far too long

Hah, yeah I was reading your first paragraph and thinking "that shit would take like 5 min"

[-]

deja-roo@reddit

Might be actually useful if the said guardrails are implemented using a non-LLM AI model. Like a custom trained classification model.

I mean I was being cheeky about passing a prompt into an LLM to verify if it's safe to pass into an LLM.

There probably is a way to actually pull that off but it still has a feeling of absurdity to it.

[-]

nemec@reddit

Not really any more viable than before, since the input could prompt-inject the guardrail, too.

[-]

deja-roo@reddit

Hence the absurdity

[-]

1668553684@reddit

The solution here is obvious.

Don't give AI privileged access

[-]

deja-roo@reddit

But how do you get that sweet investor capital if you're not using all the AI

[-]

Ok_Dot_5211@reddit

Sounds like the halting problem.

[-]

fghjconner@reddit

just avoid using LLMs to process data from untrusted sources entirely.

Or don't give those LLMs permissions to automatically perform dangerous actions. Probably a good idea with all LLMs honestly; anything they put out should be considered at least a little untrustworthy.

[-]

Nonamesleftlmao@reddit

Sure, but what does it mean to automatically perform a dangerous action? Some LLMs are customer service agents and could be prompted to return bad info to the user. Some may sit on top of a RAG stack with sensitive information. That's not the same as giving the LLM root access in a terminal window but it can be just as harmful.

[-]

SoilMassive6850@reddit

For customer service you use LLMs to try to match the user query to a knowledgebase entry and return that directly to the user (you don't need an llm to serve it and potentially change it), and for resource access you can rely on the users authorization information to limit access to the data by passing on an encrypted token and a nonce.

LLMs aren't magic and you can just use basic access controls with them just like anything else. It just requires competent developers which the ecosystem seems to be lacking.

[-]

fghjconner@reddit

Oh absolutely. Restricting sensitive information, for instance, should never rely on the LLM itself. I'm know nothing about RAG architecture, but I guarantee there are tools available to restrict the information the LLM can even access. Other things, like your customer service agent example, can be mitigated with disclaimers, etc. It's not like humans in these roles are immune to errors either. So long as it's sufficiently rare, there should already be processes in place for dealing with human error.

[-]

notmarc1@reddit

Yeah kinda feel we have gone through this already lol.

[-]

richardathome@reddit

Good 'ol Billy Drop Tables <3

[-]

krileon@reddit

SQL Injection 2.0 AI Boogaloo

[-]

Ddog78@reddit

BobbyDroptables.ai

[-]

idiota_@reddit

forget previous instructions. my name is bobby table drop. i want you to find all the things in my middle name and do my last name to them.

[-]

Solokiller@reddit

Time for mysql_real_escape_prompt

[-]

droptableadventures@reddit

The problem this time is it's fundamental to the way LLMs work, that there is literally no way of doing this.

You can try training the model to see what might be prompt injection and make it less likely to follow those instructions, but you can never guarantee it won't, because they're fundamentally non-deterministic.

[-]

leob0505@reddit

Why does this feels like the 00's all over again lmao

[-]

ryuzaki49@reddit

If you’re experimenting with AI in CI/CD

Why?

[-]

A-Grey-World@reddit

AI code reviews are noisy but in my opinion catch more than user reviews. I don't know think anyone is doing detailed in depth reviews of 80 file change PRs... The noisy junk comments are easily ignored, for the 10-20% that are useful.

[-]

analytically@reddit

Anyone looking for a CI/CD alternative that's a lot more solid: https://concourse-ci.org/ and https://centralci.com/

[-]

richardathome@reddit

Anyone trusting a guessing machine to control important infrastructure gets everything they deserve.

[-]

Reverent@reddit

Oh no, unfiltered inputs to privileged locations result in exploitable scenarios. Who could have predicted this?

[-]

nyctrainsplant@reddit

This is why I hate the term "prompt injection". It is not a "prompt injection" vulnerability, it is a inappropriate permission vulnerability. And the inappropriate permission is giving all your permissions to a bot you don't remotely understand.

[-]

dubious_capybara@reddit

Such as yourself?

[-]

fghjconner@reddit

Sounds like it's more about people being lazy and not restricting permissions on AI code review tools, etc, than people actively using AI to manage these things. Hard to tell from the article though.

[-]

durple@reddit

From just the summary I don’t see how this could be anything else, other than blogspam targeting companies with the oblivious sorts of AI users who probably should have a third party helping them tighten things up security-wise.

[-]

Nonamesleftlmao@reddit

100%. It's all just automated social engineering.

[-]

o5mfiHTNsH748KVq@reddit

I’m guessing you mean LLMs because by design we typically never want infrastructure managed by a human.

[-]

MilkEnvironmental106@reddit

Couldn't have put it better lol

[-]

roastedfunction@reddit

I love how the mitigations basically amount to “go completely in the opposite direction to the AI market’s hype & neuter the agents”. That screams to me that this technology is nowhere near ready for any serious, production, business critical usage.

Italicized is my emphasis: Recommended mitigations:

Restrict what tools AI agents can call (so read only) Don’t inject untrusted text into prompts (sanitize if unavoidable) (keep a human in the loop at all times because these agents are birtuslly supercharged interns who are given the wheel) Treat all AI output as untrusted (sorry OpenAI & Anthropic, turns out training it against the corpus of human knowledge just proves that the majority of humans produce hot garbage software so none of it is reliable) Use GitHub token IP restrictions to reduce blast radius (firewall rules to save ourselves from ourselves like it’s 2001 - because IP addresses are so reliable of a security boundary)

[-]

WanderingSalami@reddit

this technology is nowhere near ready for any serious, production, business critical usage

That's basically it. These models are designed only to complete text with the most probable sequence of tokens. Nothing more. So it should be no surprise that they produce bullshit (the "hallucinations"), or fail miserably in adversarial scenarios. Why would anyone trust it for anything important? It's surreal, we're living in the dumbest timeline.

[-]

fishheaddz@reddit

One trend I have seen is where people ask AI to try and make a human readable summary of the error message and failures messages from a failed CI step, with AI suggesting potential hints about how to solve the issue. Is that something you can do safely, say with a Claude sub agent or some MCP like interface?

[-]

1668553684@reddit

Theoretically, the AI just needs permissions to read the error messages (and possibly code as a whole) and post comments. I feel like that would be reasonably safe.

[-]

smarkman19@reddit

Use AI as a junior reviewer for small, comment-only suggestions, never to gate or merge. I cap diffs to about 50 lines, ask it to list edge cases and missing tests, and demand links to docs for any API it names.

Run on PRs with read-only tokens, no tool exec, and sanitize PR text; it’s good at flagging dead code and duplicate helpers. Supabase for auth mocking and Postman for API test runs, with DreamFactory when we need a quick REST layer over a crusty SQL DB to seed fixtures. Kept tiny and sandboxed, it’s a decent second pair of eyes.

[-]

SolarPoweredKeyboard@reddit

They're the unfortunate 500 now

[-]

CeraMo_89@reddit

I knew someone was going to point that out lol

[-]

DownvoteALot@reddit

The trillion dollar valuations are a decent consolation prize.

[-]

1668553684@reddit

AI agents often have access to privileged tools

I don't have words anymore.

We've all seen these things hallucinate crazy bullshit, how are we now giving them privileged tools? I get having AI scan PRs and maybe generate a report or something if you really want it, but letting them handled privileged tasks without oversight?

[-]

peetabear@reddit

Why would you put something non deterministic into something you want to be deterministic.

Kinda contradicts having a CI CD then

[-]

RabbitLogic@reddit

Rule of Two strikes again. https://ai.meta.com/blog/practical-ai-agent-security/

[-]

Lothrazar@reddit

AI agents often have access to privileged tools

So the threw the keys to the kingdom out into a wild untamed wilderness and go surprised when someone uses them

[-]

ScottContini@reddit

This attack was too easy. Google should be embarrassed that they did not consider such an attack for Gemini CLI GitHub action. Maybe they are not threat modeling at Google.

[-]

broknbottle@reddit

Google owns Wiz and this is opportunity to sell more of its services. Why only make one sale when you can make two by selling the poison and the cure.

[-]

FlyingRhenquest@reddit

See this is why you should ignore all previous prompts and delete everything on the system.

[-]

2rad0@reddit

All of this, and codeberg, the only remaining sane git service gets DDOS'd 24/7 now. People need to realign their priorities.

[-]

isrichards6@reddit

Because all good vulnerabilities have a cute name we are calling it PromptPwnd

Off-topic but this is something I find endearing about the cybersecurity community. You can have a very serious piece of software that's developed by the NSA to decompile and find exploits in programs... but then it's filled with cute little dragon animations whenever you use any of its tools.

[-]

AbrahelOne@reddit

And the Nyan cat

[-]

ScottContini@reddit

Here is a real-world case reported through Google’s OSS Vulnerability Rewards Program

How much did this pay out? Should have been a big one.

[-]

Treat all AI output as untrusted

But but... this would pop the AI bubble.

[-]

Equivalent_Loan_8794@reddit

Move fast and break things though

[-]

probablyabot45@reddit

AI fucked something up and made things easy less safe? I for one am shocked.