Security researcher earns $25k by finding secrets in so-called “deleted commits” on GitHub, showing that they are not really deleted
Posted by ScottContini@reddit | programming | 115 comments
New-Anybody-6206@reddit
github's own dmca request repo has orphaned commits with pirated software in it, you just have to know the link to it.
one of the more hilarious examples of this was a repo for a decompilation project for a pokemon game, someone made a PR containing the entire leaked source of the real pokemon game, and now that link exists forever.
joemaniaci@reddit
Reminds me of how Al Qaeda would use a draft email to send messages without sending the email. Just updating and reading the draft so that nothing was ever actually sent.
kronik85@reddit
wasn't Trump's campaign manager, Paul Manafort, caught doing this?
Worth_Trust_3825@reddit
i would like to know more
rom_ok@reddit
As soon as a secret key is leaked, it’s meant to be considered leaked forever no matter what you did to revert it.
CherryLongjump1989@reddit
Yeah - attempting to delete it is stupid in the first place.
acdha@reddit
No. It’s not your way of preventing abuse but it means you never need to talk about it again. If you leave it in the history, you will periodically have to spend time showing that it’s unusable every time you get a new security tool or person.
Plus the time doing it will stick in people’s memories and hopefully lead to being careful in the future.
bleachisback@reddit
Although force pushing, as demonstrated by this article, doesn't prevent this. Ideally auditors would be scanning for this kind of leak now, and as far as I can tell there isn't a way to delete this leak.
acdha@reddit
Right, my point wasn’t that you shouldn’t revoke credentials and set up better safeguards but rather that it wasn’t “stupid” to use a force push to purge the history. The time you spend on the initial cleanup is guaranteed but you can likely save future time talking about old mistakes.
bleachisback@reddit
Right, my point is that if auditors are diligent in checking for this kind of mistake, force pushing won't save future time talking about old mistakes because force pushing won't hide it from auditors. It will simply move the question from "hey do you realise these keys are still public in your commit history? You may need to disable them" to "hey do you realise these keys are still public in your github archive history? You may need to disable them"
andrewsmd87@reddit
I see you have had to go through info sec audits before
CherryLongjump1989@reddit
Taking it out of your history is not the same as deleting it. They're two different things.
acdha@reddit
Not irrelevant, just distinct but related concerns. Revoking the secret prevents it from being used. Removing every reference you can find prevents you from repeatedly having to prove that you have already revoked the secret.
CherryLongjump1989@reddit
Unless you're an absolute numpty, you're not going to run your security tools over dangling commits. Dangling commits aren't even transferred over by default when you clone a git repo for the tool to run on.
acdha@reddit
You scan all of the data which an attacker could potentially reach because you want to avoid surprises. If you think that’s security theater, you badly need to learn what that term means.
CherryLongjump1989@reddit
Have at it, mate. Scan for all the invalid credentials that you like.
acdha@reddit
You’re close to getting it: think about how you prove it’s invalid rather than hoping so. Is that more or less work than not having it there any more?
CherryLongjump1989@reddit
There's no such thing as an unreachable commit that didn't start out as a reachable one, in particular because commits are pushed into a quarantine environment. You can read up on it if you like https://git-scm.com/docs/git-receive-pack#_quarantine_environment
What this means for you is that there is no such thing as a credential that ends up in your git repo that didn't pass through a number of hooks that could have prevented it from making it into it in the first place, or else told you that you need to rotate out your keys should they already make it into your main object store.
A live secret in an unreachable commit isn't merely a failure state, it's an indication that you have to rotate out every single credential in your entire corporation just as a matter of course. Because your engineering practices are deficient, and because you'll never actually know just how many secrets were already swept up by bots that you'll never discover because the GC already ran.
dakotahawkins@reddit
You might as well check dangling commits, they're still commits. Otherwise it turns into the place where you allow secrets.
Dangling commits can get garbage collected anyway, so if you actually want to guarantee they exist you'd point a tag or branch or some kind of refs at them at which point they're no longer dangling.
CherryLongjump1989@reddit
I'm not one to make arguments from authority so don't look at it as such, but I just want to contextualize what you're saying here.
It's literally something that GitHub support will refuse to do for you. From their own documentation:
dakotahawkins@reddit
GitHub isn't git
Supadoplex@reddit
Keeping all leaked keys in a list, with a comment explaining that they are no longer in use would probably achieve that goal better.
acdha@reddit
Sure, but then you have to maintain that list and the supporting evidence - few auditors I’ve worked with are just going to take your word on it, and they might change the level of detail from their predecessor.
Either approach can work, but my thought is that running a tool to purge the history once means you never spend time on it again whereas everything else has ongoing maintenance costs. I generally favor preventing future costs, especially when the level of effort is low, and this should really be a rare occurrence unless you have a broken management culture.
CherryLongjump1989@reddit
You still haven't justified how a dangling commit causes some sort of problem for any of the workflows you mentioned.
dakotahawkins@reddit
You haven't justified why deleting it is "stupid in the first place."
I kind-of see what you're saying and that'd be a fine way to go but so would excising it from your history if you want to do that instead.
I'd probably lean towards removing it while being transparent about that, and the reason would be to keep it from being found by automated tools. Depending on how the key was leaked writing a test to check your own history could fail before passing on key removal.
Plenty of options for transparency and honesty either way you go.
CherryLongjump1989@reddit
Here's the justification: rotate your keys.
Running GC is expensive and does not address any legitimate security concern.
dakotahawkins@reddit
Rotating keys isn't a justification because nobody is saying you shouldn't do that. You should do that first.
You can rotate the keys, assume they're stolen, then clean up your history if you want. What you need to provide is some kind of argument against that third step. Where's that?
CherryLongjump1989@reddit
The third step...
dakotahawkins@reddit
And nobody is saying it does!
CherryLongjump1989@reddit
The article is proof of why following woo security fads is bad. Some people tried to delete active keys, but did not rotate them. Woo. It'll bite you in the ass every time.
dakotahawkins@reddit
This article is proof some people tried to ONLY remove published keys. THAT is stupid. Everybody agrees on that. You're just arguing with yourself, how are you losing?
CherryLongjump1989@reddit
And that is the only thing of substance offered up by the article. Something everyone already knew -- no new information.
nikolaos-libero@reddit
Do you sell a service or solution that is making you incapable of responding accurately/honestly?
CherryLongjump1989@reddit
It's like the Metallica song. Rotate your keys, and nothing else matters.
nikolaos-libero@reddit
Nah, don't pull that "you're confused" weapon on me. At this point I find it unlikely that it isn't dishonesty on your part.
The only question remaining is if it's some kind of authoritarian ego stroking or if it's economically incentivized.
The previous posts made it incredibly clear. Bye bye.
CherryLongjump1989@reddit
So you're going to accuse me of arguing in bad faith, but then take offense when I -- arguing in good faith -- assume that there's some confusion?
Well, okay. There's no accounting for feelings.
axonxorz@reddit
Not realistic on most codebases
dreadcain@reddit
In what way?
axonxorz@reddit
Altering git history has some major pitfalls and they're compounded with every added team member and every added branch.
Don't get me wrong, I amend and rebase locally extremely often, several times a day on average. But once it hits upstream, it's locked.
dreadcain@reddit
It has pitfalls but none that rise to the level of making it unrealistic.
CherryLongjump1989@reddit
If this is not realistic for your codebase then neither is this entire topic.
axonxorz@reddit
It being unrealistic to rebase history on a 20+ person team (it's shitty with 5, too) and deal with unfucking conflicts for at least a business day means that the non-code-related action of revoking an API key is unrealistic?
You asked for a concrete example, but it seems the goalposts have moved.
CherryLongjump1989@reddit
I've rebased history on a 400+ person team.
dreadcain@reddit
It may as well be for your average boot camp grad
wrincewind@reddit
Key, date of leak, explanation of how leak happened, and steps taken to prevent it happening again...
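For anyone who wants to keep such a list, here is a minimal sketch of what one entry could look like. Everything in it is an assumption for illustration: the `LeakRecord` name, the field names, and the choice to store a short SHA-256 fingerprint instead of the key itself so the list does not become a second leak.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def fingerprint_of(secret: str) -> str:
    """Return a short SHA-256 fingerprint so the raw key is never stored."""
    return hashlib.sha256(secret.encode("utf-8")).hexdigest()[:16]


@dataclass(frozen=True)
class LeakRecord:
    """One entry in a hypothetical leaked-credential register."""
    key_fingerprint: str          # fingerprint of the leaked key, not the key
    leaked_on: date               # date the key became public
    how_it_leaked: str            # short explanation of the incident
    revoked_on: date              # date the key was rotated out
    prevention_steps: tuple = ()  # what was changed to stop a repeat


# Example entry with made-up values.
record = LeakRecord(
    key_fingerprint=fingerprint_of("not-a-real-key"),
    leaked_on=date(2025, 1, 10),
    how_it_leaked="committed in a config file and pushed to a public repo",
    revoked_on=date(2025, 1, 10),
    prevention_steps=("added a secret scanner to pull requests",),
)
```

As noted further down the thread, an auditor may still want evidence of the revocation beyond an entry like this.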
TheLifelessOne@reddit
I accidentally leaked a password in a private repo. Removed the commit, revoked the password, and since then have been extremely careful to double- and triple-check that my staged diffs don't have any credentials in them.
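The manual double-check can be backed up by a script. Below is a rough sketch, not a replacement for a real scanner: it looks only at added lines in unified diff text, and the two regular expressions (an AWS access key ID and a classic GitHub personal access token) are assumptions based on the commonly cited formats. The function name `scan_added_lines` is made up for this example.

```python
import re

# Assumed token shapes; real scanners cover far more formats.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_classic_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_added_lines(diff_text):
    """Return (line_number, pattern_name) for each added line that matches."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only lines added by the diff matter; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample_diff = """\
--- a/config.py
+++ b/config.py
@@ -1,2 +1,3 @@
 DEBUG = False
+AWS_KEY = "AKIAIOSFODNN7EXAMPLE"
-OLD = 1
"""

findings = scan_added_lines(sample_diff)
```

Feeding it the output of a staged diff before committing would be one way to wire it in. A match still means the key should be rotated if it was ever pushed.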
Mikatron3000@reddit
oh nice, good to know a reset and force push doesn't remove the code
antiduh@reddit
Git itself does support obliterating commits, which is useful in a context other than github.
redisgreener@reddit
It all depends on the behavior of the GC process and how aggressive it is. If that loose object containing a secret is buried deep in an older packfile, you need to set your parameters correctly to truly obliterate it. Github meanwhile needs to balance really aggressive GCs while being cost sensitive to compute resources.
emperor000@reddit
How expensive in compute resources would it really be, though? I wouldn't think it would be something they have to do constantly. At least when somebody does a git push --force(-with-lease) it should be able to pretty easily look for commits that get orphaned by that.
I wish (and maybe it does; if not, I'm sure it could be done with a hook) git would track this locally itself, just for some added confidence to anything that might create orphaned commits.
redisgreener@reddit
On a per repository basis the cost could vary wildly. Aggressive GCs against large very active mono-repos can, in some circumstances, run for hours on end. Also keep in mind they likely pack as many containers per node as possible, leaving some overhead for GCs, but not enough to run them aggressively. If it was me, I would have run the calculations ahead of time to determine how much extra compute I’d need to consistently run GCs aggressively vs a pared back set of options that makes it into “good enough” territory. From their perspective, why add 5% extra in compute for the rare dangling git object buried in an old pack file when I can just tell users, something vague like, “it’ll eventually get GC’d”
emperor000@reddit
Are you talking about git's normal GC or something specific to GitHub? We might be talking about two different things.
All I'm saying is that it doesn't seem like this is something that constantly has to be computed. There are a limited number of situations where orphaned commits would be created. If nothing is touching a repository, no orphaned commits can be created. So there's no reason to run something like git gc "every now and then". You could look at the operation a user (human or bot) performed and if it is one that creates orphaned commits then just clean those up.
As far as I know the reflog is local only and isn't shared with the remote, which would have its own. So it seems like, if desired, it would make sense to clean up orphaned commits on the remote by default (or as something configurable).
gefahr@reddit
Yes, but to be clear to others reading this: if you pushed a repo to github where that commit was even briefly reachable, it got scraped by an untold number of bots. Some of them are scanning for keys so they can disable them (AWS, SendGrid, etc.) while others are from bad actors who will try to use/sell them.
TLDR: If you commit and push sensitive material to a public github repo, it's no longer secret. Period.
CherryLongjump1989@reddit
Issuing a pull request with a credential is enough. Even if you try to close it and delete it, you'd better rotate out those keys.
gefahr@reddit
Issuing a pull request includes pushing your branch to some remote repo on github. Whether it's the same repo as the desired merge base or a different one (eg a fork in your personal namespace), so, yes.
Good clarification for those not familiar with git mechanics though, thank you for adding it.
mpyne@reddit
But even there, it won't do it soon after you force push over a branch, the old commit is still in the repo somewhere, orphaned, until you go out of your way to do a cleanup (or wait for git to auto-gc at some point in the future).
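To make the orphaning concrete, here is a toy model in Python. It is not how git stores objects; it assumes a commit is just an ID pointing at a parent ID and a ref is a name pointing at a commit ID. The names `reachable` and `gc` are made up for this sketch. It shows that moving a ref leaves the old commit in the store until a collection pass drops whatever no ref can reach.

```python
def reachable(commits, refs):
    """Walk parent links from every ref and return the commit IDs found."""
    seen = set()
    for tip in refs.values():
        current = tip
        while current is not None and current not in seen:
            seen.add(current)
            current = commits[current]
    return seen


def gc(commits, refs):
    """Return a new store holding only commits reachable from some ref."""
    keep = reachable(commits, refs)
    return {cid: parent for cid, parent in commits.items() if cid in keep}


# History a1 <- b2 <- c3, where c3 is the commit containing the secret.
commits = {"a1": None, "b2": "a1", "c3": "b2"}
refs = {"main": "c3"}

# "Force push": a replacement commit d4 is added and main is moved to it.
commits["d4"] = "b2"
refs["main"] = "d4"

# c3 is still in the store, just no longer reachable from any ref.
dangling = set(commits) - reachable(commits, refs)

# Only an explicit collection pass actually removes it.
after_gc = gc(commits, refs)
```

In this model `dangling` contains only `c3`, and `c3` is gone from `after_gc`. On a hosted remote you do not control when, or whether, that second step runs.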
emperor000@reddit
Yeah, I kind of assumed GitHub would destroy orphaned commits, for this reason, as well as to optimize storage.
Obviously if you ever had the commit up there then it is considered compromised and I don't mean assumed as in I relied on it. I just would never have thought they'd be keeping my garbage around.
silv3rwind@reddit
It will be removed when you garbage-collect the repo on the server, but this action is not currently available to the git client, though it should be.
SawADuck@reddit
Yea, it's useful when you screw up locally. A pain when you've got git hosting.
vowskigin@reddit
prouxi@reddit
I see piss-filter AI slop imagery, I downvote.
Also this is just how Git works.
rinyre@reddit
Piss filter...?
voyagerfan5761@reddit
https://www.reddit.com/r/ChatGPTPro/comments/1jls6cv/why_do_many_of_chatgptgenerated_images_look_like/
rinyre@reddit
Lmao the whining
Familiar-Level-261@reddit
Eat your AI slop your little piggy
rinyre@reddit
? I think my short comment may have been misunderstood; I was mocking the folks who were complaining their output has that filter. I love that it's becoming more obvious even when the text improves. I kept wondering what it was about the preview image that gave it away besides it being an overly specific image that could've been stock art instead, and now that yellow filter makes a ton of sense.
It also explains why I keep thinking a new local business decided to be lazy and have a generative garbage machine make their logo.
AnAwkwardSemicolon@reddit
"discovered?" Congratulations to them for reading the documentation. This isn't new behavior, and has been present since early days of GitHub. It's even explicitly referenced in GitHub's "Remove sensitive data" help pages. Orphaned commits aren't purged until you explicitly request a GC run via GitHub support.
bwainfweeze@reddit
Do you have any comprehension of just how much of being a subject matter expert boils down to, "read and retained most of the documentation"?
Way higher than it should be.
droptableadventures@reddit
It's a little different than you may think from the headline. He didn't bug bounty this to GitHub and get $25k.
They used this already known technique across almost every publicly viewable commit on GitHub made since 2020 - and then filed a bug bounty request to every company that made this mistake with a bug bounty program.
The $25k was the total amount received across many many different companies, not a single payout for "finding" this single issue.
AnAwkwardSemicolon@reddit
I'm not arguing against the bug bounties, or the process they used- it's all valid. I take issue with their entire "What Does it Mean to Delete a Commit?" section. It makes zero mention of any of GitHub's documentation (including the ones that discuss the specific behavior they're taking advantage of), they fail to actually address the proper way of clearing these commits, and act like this is novel information.
Specifically, bits like:
"Discovered" my ass- has been well known for over a decade at this point.
DoingItForEli@reddit
they got 25k for reading the documentation?
ScottContini@reddit (OP)
I didn’t put the best title here evidently.
He got $25k by scanning public repos for “deleted commits” and finding real secrets that he could exploit. One case was getting admin access (via a GitHub personal access token) to all of the open source Istio repositories (36k stars), which would have allowed him to perform a supply chain attack. $25k is rather meagre in comparison to the amount of abuse that could have been done.
CherryLongjump1989@reddit
He never seems to check if those secrets weren't also found in the normal, reachable commits. You'll typically also have unreachable commits that go along with normal commits because of things like squash merges or --force pushes during the code review.
On the other hand, there is no such thing as an unreachable commit that didn't start out as a reachable one. And people run credential scanners on pull requests. What I suspect is happening here is that people are abandoning or --force pushing into these PRs because it got picked up by the scanner, instead of rotating out the key at that point.
Larimitus@reddit
welcome to corporate
Trang0ul@reddit
Even if you request a deletion, you never know who already copied that data, so such a purge is futile.
Weird_Cantaloupe2757@reddit
Yes if it’s a public repo, that code was published to the open web — deleting it is just shutting the barn doors after the horses are already scattered across four counties.
AnAwkwardSemicolon@reddit
Yup! Had some contractors push a SendGrid API key up on one project, and less than an hour later we had the account locked and the key disabled (SG scans public commits for their keys). If there's sensitive data pushed up to a repo- especially a public one- always assume that someone else already has a copy of it.
SuitableDragonfly@reddit
Obviously if they got that many bug bounties out of it, a lot of people are not in fact reading the documentation and do in fact need an article like this to be aware of it.
arkvesper@reddit
I mean, if they got 25k for it.... then, yeah?
somnamboola@reddit
I was gonna say the same, there is no sensation here
vowskigin@reddit
Love the article. The flow is great, keeps you hooked
mofojed@reddit
GitHub documentation for deleting sensitive data covers this: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository#fully-removing-the-data-from-github
voyagerfan5761@reddit
It sounds like GH don't really want to be on the hook for processing every credential-removal request they get:
Eckish@reddit
People really should, even if that wasn't their policy. Once it is in an insecure location, everyone should assume that it was snagged up immediately.
ScottContini@reddit (OP)
The title I put on this article misrepresents what he got the payout for. The money came from finding so-called “deleted commits” and reporting them to various bug bounty programs. He got $25k by scanning public repos for “deleted commits” and finding real secrets that he could exploit. One case was getting admin access (via a GitHub personal access token) to all of the open source Istio repositories.
CherryLongjump1989@reddit
This "research" sounds like another security industry scam.
The assumption that people who rewrite their git history are trying to "hide" something is bullshit. Competent organizations know that they can't trust a junior engineer not to commit a key and then paper it over by pushing up another commit before anyone notices. It's common practice to run security scanners across the entire git history to make sure that any key that was ever committed into history ends up getting rotated out. Therefore it becomes necessary to rewrite the git history once the keys get rotated out, just to make sure that the security scanner doesn't continue getting hung up on it. So the attempt to rewrite history has nothing to do with trying to "delete" these credentials. It's just part of the workflow of rotating them out.
It's also well known that rewriting your git history can result in dangling commits. This is a necessary feature, otherwise it would be completely impossible to undo a bad git command that results in lost work. The commits go away once you run garbage collection on the repo. There is no mystery here.
Helpful-Pair-2148@reddit
Why do you comment on an article you obviously didn't read? You think they got $25k just from their "findings" that git commits aren't automatically erased when you revert the commit, really?
CherryLongjump1989@reddit
I'll be honest with you, it's hard to get past the first paragraph because it's so preposterous.
Helpful-Pair-2148@reddit
Being a hacker isn't just finding zero-days every day lol. Pointing out security mistakes such as leaking secrets in git, even if it's something extremely basic, is still essential work, and at the end of the day the $25k comes from the pocket of these companies who made the mistakes, so I fail to see how it isn't a good thing?
CherryLongjump1989@reddit
I can't speak to the competence of an organization that puts up a bounty for leaked secrets but doesn't use a credentials scanner on their pull requests. That's on them and no one else.
What I can speak to is that every PR that gets merged into a git repo has a very high probability of creating unreachable commits with a copy of the changes. So if you want to come up with the most convoluted way to check for leaked credentials, then check all the unreachable commits without ever checking any of the regular refs.
Helpful-Pair-2148@reddit
Feel free to try out your ideas, let me know when you make $25k from finding secret leaks.
CherryLongjump1989@reddit
I have better things to do than taking candy from babies.
Helpful-Pair-2148@reddit
Such as posting reddit comments on articles you haven't read, very productive.
CherryLongjump1989@reddit
But I'm not doing this for money. I'm doing it for the betterment of mankind.
In all seriousness, the important part isn't to find a bounty, but to not be that developer that ends up being responsible for leaking sensitive information on millions of Americans, as it happens every other week in the software industry.
Blinxen@reddit
That is not completely true. It is Git and not GitHub that stores this. A commit is a fancy object for related blobs. Just because you deleted a commit does not mean that you also deleted the blob. Git does not have automatic garbage collection. What you need to do is use git rm to actually delete files (blobs) from Git.
Which_Policy@reddit
Yea and no. You are correct about git. However the problem is github. There is no git rm command that will force the blob to be deleted from GitHub.
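A small sketch of why the blob outlives whatever stopped pointing at it: a blob's ID is derived only from its content, so the object sits in the store under that ID regardless of which commits or working-tree files refer to it. This assumes the default SHA-1 object format; the function name `git_blob_id` is made up for the example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID git assigns to a blob (SHA-1 object format)."""
    # Git hashes a header of the form "blob <size>\0" followed by the bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# The same bytes always produce the same ID, whichever commit held them.
hello_id = git_blob_id(b"hello\n")
empty_id = git_blob_id(b"")
```

The value of `hello_id` should match what `git hash-object` prints for a file containing `hello` and a newline. Removing the file from later commits does nothing to an object already stored under that ID.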
SanityInAnarchy@reddit
Another surprising Github behavior: Any commit pushed to any repo is accessible to anyone who has access to, not just that repo, not just any fork of the repo, but to anything anywhere in the graph of forks of the repo.
One caveat is that you need the commit hash... except with Github, as with most Git stuff, you can use a prefix instead. So it's possible to enumerate commits.
Maybe the clearest example of people not getting it is open-source template projects. For example, here's someone's idea of a base React starter project, all ready for you to clone and start working on your own app. They literally tell you to do that. But when you push it back to Github, there's a good chance Github will see it as a fork of react-starter, and so every commit you push is effectively public to anyone who cares.
You can imagine the mess with dual-licensed projects. Think anything that has a "community" and "enterprise" version, where the "community" one is open-source on Github, but you have to pay for the "enterprise" binaries, and they are not open source at all. The obvious way to do that would be to fork the "community" into a private repo. It'd be convenient to be able to push any open-sourceable change (let alone third-party contribution) to the community version, then merge them into the enterprise version...
So yes, if a secret ever gets committed anywhere, it's probably best to rotate it -- even without any of this, Github employees may have seen it! And, frankly, secrets that you have to manually rotate should probably be replaced with more robust IAM mechanisms anyway. But Github's behavior is pretty unintuitive, even to people who know a fair amount about Git.
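On the prefix point above, a back-of-the-envelope sketch of how small the search space is. It assumes a four-character hex prefix is the shortest one accepted, which is an assumption for this example rather than something stated in the thread; `short_prefixes` is a made-up helper name.

```python
from itertools import product

HEX_DIGITS = "0123456789abcdef"


def short_prefixes(length=4):
    """Yield every possible hex prefix of the given length, in order."""
    for chars in product(HEX_DIGITS, repeat=length):
        yield "".join(chars)


# 16 ** 4 candidates: small enough to walk through exhaustively.
all_prefixes = list(short_prefixes(4))
candidate_count = len(all_prefixes)
```

Under that assumption there are only 65,536 candidates, which is why needing to "know the hash" is not much of a barrier.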
Leliana403@reddit
There's no git rm command that will force a blob to be deleted from other contributors either, regardless of github. So no, the problem is not github.
Which_Policy@reddit
Exactly. That is why the secret should be rolled. This has nothing to do with git rm. Once the push is done it's too late.
Leliana403@reddit
Yep. A lot of people here seem to have forgotten the golden rule of the internet, and they're blaming github for their own mistake.
Once you publish something on the internet, it's there forever.
yawara25@reddit
Unless it's something you're spending all day 20 years later scouring every corner of the internet to find. Then it's lost in the abyss forever.
wintrmt3@reddit
It is, they should regularly gc any repo that has changes, without having to involve support.
Leliana403@reddit
Other contributors should regularly gc any repo that has changes, without me having to ask them.
txmasterg@reddit
You can only GC a repo you have actual file access to. You can't GC the history itself and this article is already about how deleting the refs doesn't do a GC run.
neckro23@reddit
That's not what git rm does at all. It only removes a file and stages the removal in the index. The history for the file (and its blob) is still there.
anewdave@reddit
Git has automatic garbage collection, at least by default. Orphaned commits are removed after 90 days.
mrinterweb@reddit
If people understood how git works, they would know this isn't a GitHub issue. It's just how git works. The reflog keeps everything.
yawaramin@reddit
TL;DR:
all_is_love6667@reddit
wait so he earned 25k by basically knowing how git works?
ScottContini@reddit (OP)
He got $25k by scanning public repos for “deleted commits” and finding real secrets that he could exploit. One case was getting admin access (via a GitHub personal access token) to all of the open source Istio repositories (36k stars), which would have allowed him to perform a supply chain attack. $25k is rather meagre in comparison to the amount of abuse that could have been done.
rcfox@reddit
The payout is based on the severity, not the effort to get to it.
vplatt@reddit
🤦♂️
Due_Satisfaction2167@reddit
Literally a fundamental aspect of git security.
Trang0ul@reddit
Old news. Besides, any data published on the Internet should be treated as leaked.