[Copy-Fail] Debunking owLSM CVE-2026-31431 Mitigation: 90 upvotes and no security
Posted by LeChatP@reddit | linux | 21 comments
Since the owLSM owners published their reddit post about Copy-Fail, it has collected almost 90 upvotes, and none of those upvoters seem to have checked under the hood of this "solution".
I want to completely debunk this solution here.
First, I already argued in the comments there why this is wrong, so I don't think it is necessary to duplicate the arguments here. I got few upvotes, so I think people were skeptical about my comment. Maybe my explanations were too complex, I don't know.
Paths to bypass the owLSM mitigation
So today I decided to make a demo that completely bypasses their "mitigation", just for fun. It is not that difficult, and several people in the comments section already showed it. All those comments still got few upvotes.
I first thought about using run0, which does not use setuid at all, only DBus communication, but it is not widely available on minimal cloud images. Then I thought about rewriting pam_unix.so at the location of the pam_sm_authenticate call with a return 0; in order to skip the password check completely. But to avoid triggering owLSM, that needs SSH with root login allowed, and minimal images generally disable it.
So my last thought was the motd feature, which is still implemented in Ubuntu minimal cloud images and executed by root. All I would need is to rewrite the /etc/update-motd.d/00-header file to insert a local bind shell, and then connect to it with a simple client.
All of these approaches are trivial when you are performing a pentest, and I believe there are many other ways to obtain ruid=0 without using SUID.
In the end, nothing that complex is necessary
And guess what? I found an awesome GitHub account named Crihexe that performs the exploit in an elegant way and set out to make it as tiny as possible. This exploit is far more elegant than the first Copy-Fail PoC.
The exploit rewrites the credentials of the current process to set ruid=0 directly, so all it needs to do at the end is execve /bin/sh and you're root, leaving the LSM completely blind.
Since people are still skeptical, I decided to install owLSM and test it against Crihexe's exploit. Fortunately, the project already ships with all the tools needed to test it inside a VM. Only a few modifications were needed to get owLSM working... and done!
I forked the repo here and enabled owLSM (which, by the way, looked unsafe and insecure from what I saw during the installation process). Anyway, I tried the exploit... and it just works, even with the owLSM policy enabled. I did it on a livestream! So here is the clip showing that the mitigation simply does not work.
Discussion
We all make mistakes. That's OK. But unfortunately, the owLSM owners decided to advertise this big mistake. This is problematic because it damages the credibility of the project.
I also saw that the project is AI vibe-coded, since Cursor is listed as a contributor. So my guess is that the owLSM owners asked their AI whether their idea would work, and the AI naively said yes, sending them in the wrong direction from the start.
Conclusion
None of my conclusions satisfy me. I wanted to publish this because it was important to me to explain to all those upvoters that what they thought was security was actually flawed. Just be careful next time.
avprince26@reddit
Really cool, did you see how low people went with the exploit? (copy.golf for reference)
natermer@reddit
The easy and correct fix is to install the corrected kernel, which more than likely has already been shipped as an update by your distribution.
I don't know the details but it seems that the person/people that published the information in the CVE practiced "responsible disclosure" and notified Kernel developers and distribution community well in advance of making the information public.
For example Debian updated Trixie a couple days ago, https://security-tracker.debian.org/tracker/CVE-2026-31431
Similar situation for other distributions, like Fedora.
BCMM@reddit
For a number of reasons, I don't think that happened.
Firstly, distros were not ready. With some previous vulns, fixed packages have come out within a couple of hours of the public announcement. This time, there was quite a bit of delay, and opportunistic stuff like owLSM came out before distro patches.
This is supposed to be coordinated through the private linux-distros mailing list. A member of one of the security teams on that list said that didn't happen.
Secondly, the usual way of doing things is to maintain total silence until the agreed embargo date. If the 29th even was an embargo date, that didn't happen either.
The CVE was published a week in advance. The patch was on LKML a week before that. However, neither of them made much of a splash, because neither of them mentioned privilege escalation.
It looks a lot like the Linux kernel security team found out you can get root with this at the same time the rest of us did. Another bit of evidence supporting this is that they didn't backport the patch to longterm branches until after the public disclosure (which will have made it harder for distros to prepare packages in a hurry).
LeChatP@reddit (OP)
Yes, I was a bit wrong, since I had read some people criticizing that they didn't have a fix available.
I now believe the main issue with the disclosure is that cloud providers didn't have time to update their images before the blog publication. But I guess that is all a matter of cloud providers and quality of service.
natermer@reddit
Yeah I noticed that if you are using a "Cloud distro" they don't have patches.
For Amazon, see https://explore.alas.aws.amazon.com/CVE-2026-31431.html
Luckily for them, algif_aead is a module and can be blacklisted to mitigate the problem. Pretty easy to fix if you have a configuration management system in place (puppet, ansible, etc).
I am not sure about the others.
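The blacklisting approach mentioned above could be sketched roughly like this. It is only a minimal sketch: the helper names and the file path in the comment are hypothetical, and it assumes the standard modprobe.d directives `blacklist` and `install` are honored on the target system. It does nothing useful if the code is compiled into the kernel, and it does not unload a module that is already loaded.

```python
MODULE = "algif_aead"


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            # The first whitespace-separated field is the module name.
            names.add(fields[0])
    return names


def modprobe_fragment(module):
    """Render a modprobe.d fragment that blocks loading of `module`.

    `blacklist` stops alias-based autoloading; the `install` line makes
    an explicit modprobe of the module run /bin/false instead.
    """
    return f"blacklist {module}\ninstall {module} /bin/false\n"


if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        text = ""  # Not on Linux, or /proc is not readable.
    state = "loaded" if MODULE in loaded_modules(text) else "not loaded"
    print(f"{MODULE} is currently {state}")
    # Hypothetical destination: /etc/modprobe.d/disable-algif-aead.conf
    print(modprobe_fragment(MODULE), end="")
```

A configuration management tool would typically just drop the rendered fragment into /etc/modprobe.d/ and, if the module is already loaded, remove it separately.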
imbev@reddit
AlmaLinux can be patched with an update - https://almalinux.org/blog/2026-05-01-cve-2026-31431-copy-fail/
Jannik2099@reddit
I don't understand why owLSM is trying to prevent the exploit in such utterly arcane and ineffective ways. Since they are already using eBPF, they can just block the bind() call like I did: https://github.com/Jannik2099/copyfail-ebpf-mitigation
I don't want to necessarily call it AI slop, but it's clear that the owLSM devs have no idea that a posteriori methods are woefully unsuitable to disable attack vectors.
BCMM@reddit
Based on how they answered my question, I will go so far as to say that the author does not appear to understand what it does.
Jannik2099@reddit
yikes, that's bad lol
Kangie@reddit
CERN did this independently, too.
LeChatP@reddit (OP)
I saw your project on GH. It's definitely better. However, even in Rust, it is inefficient for production, since it adds at least 3% overhead on the system (owLSM put my empty VM at 50% load...). The fix can be applied even without a reboot: just rmmod and load the fixed LKM, nothing fancy here.
Anyway... I think you'll be interested with this 😉
Jannik2099@reddit
We use eBPF LSM filters extensively, and I decided against Aya because it lacks CO-RE and map-in-map. The libbpf loader just has *way* more relevant features.
Care to explain how you're estimating a 3% overhead? My filter specifically only hooks on socket bind operations. The userspace component does exactly nothing as it just epoll_waits on the event map.
LeChatP@reddit (OP)
I am reviewing a research article that surveys hundreds of solutions, and it states, more precisely, a 3.6% overhead for such eBPF security solutions. I am still reviewing it, so I cannot disclose anything else for now.
Jannik2099@reddit
For general EDR systems like owLSM I can certainly see it, but I must politely claim bs in this case:
My filter adds roughly 10-30ns of latency to a bind() syscall. Even if you had a synthetic userspace workload that does nothing but call bind() in a loop, you're still spending multiple microseconds in the rest of the syscall + the context switches.
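To put those numbers side by side, here is a small back-of-the-envelope calculation. The 2000 ns baseline is an assumption standing in for "multiple microseconds" per bind() call, not a measured value, and the function name is hypothetical.

```python
def relative_overhead(added_ns, baseline_ns):
    """Fraction of extra time the filter adds to one bind() call."""
    return added_ns / baseline_ns


# Assumption: a bind() syscall costs about 2 microseconds without the filter.
BASELINE_NS = 2000.0

for added_ns in (10.0, 30.0):
    share = relative_overhead(added_ns, BASELINE_NS)
    print(f"{added_ns:.0f} ns on top of {BASELINE_NS:.0f} ns -> {share:.2%}")
```

Under that assumption the filter costs between 0.5% and 1.5% of a single bind() call, and that is for a synthetic loop doing nothing but bind(). Any real workload spends most of its time elsewhere, so the system-wide share would be far smaller.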
LeChatP@reddit (OP)
Interesting to know, thanks. I think I need to dig a bit into the values stated in the article.
Jannik2099@reddit
To be precise: any bind() call will invoke the generic LSM hook handling, which iterates over hooks provided by the active LSMs, plus any eBPF hooks. All my filter does in the happy path is check the uid & socket family, which is just a handful of instructions.
If you look at the bpftool xlated output https://bpa.st/VYWA , you can see that the happy path is just 10 instructions (take the exit branch at insn 7). Hell, if that's still too much for you, you can move the AF_ALG check before the uid check to save another 4 insns.
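The check ordering described above can be modeled in plain Python. This is not the actual eBPF program: it assumes a policy of "deny bind() on AF_ALG sockets for non-root users", which is a guess at the filter's intent, and the function names are hypothetical. AF_ALG is 38 on Linux.

```python
AF_ALG = 38  # Linux address family number for the kernel crypto API
AF_INET = 2


def deny_uid_first(uid, family):
    """Assumed policy, checking the uid before the socket family."""
    if uid == 0:
        return False  # root is allowed (assumption)
    return family == AF_ALG


def deny_family_first(uid, family):
    """Same assumed policy, with the AF_ALG check moved to the front."""
    if family != AF_ALG:
        return False  # common case: exit after a single comparison
    return uid != 0


if __name__ == "__main__":
    for uid in (0, 1000):
        for family in (AF_INET, AF_ALG):
            verdict = "deny" if deny_family_first(uid, family) else "allow"
            print(f"uid={uid} family={family}: {verdict}")
```

Both orderings give the same verdicts. The family-first variant simply lets the overwhelmingly common non-AF_ALG bind() leave after one comparison, which is the kind of saving the reordering suggestion is about.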
mina86ng@reddit
I’m confused why people invent ways to mitigate the issue considering mitigation was provided with description of the vulnerability:
Jannik2099@reddit
this disables more than just the attack vector. While userspace use of aead is rare, it does exist. You need other methods if you want a more selective approach
it's a builtin on rhel and others
dougmc@reddit
Ok, but that only works if algif_aead is a module as opposed to compiled in, and some distros compile it in, like rhel10.
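One way to tell the two cases apart is to look at the modules.builtin file that kbuild installs next to the modules, which lists compiled-in code as paths ending in .ko. A minimal sketch, assuming that file lives at /lib/modules/<release>/modules.builtin (the helper name is hypothetical):

```python
import os
import platform


def is_builtin(modules_builtin_text, module):
    """Return True if `module` appears in modules.builtin-style text."""
    target = module.replace("-", "_") + ".ko"
    for line in modules_builtin_text.splitlines():
        # Entries look like "kernel/crypto/algif_aead.ko".
        basename = os.path.basename(line.strip()).replace("-", "_")
        if basename == target:
            return True
    return False


if __name__ == "__main__":
    path = f"/lib/modules/{platform.release()}/modules.builtin"
    try:
        with open(path) as handle:
            text = handle.read()
    except OSError:
        text = ""  # File missing: cannot tell from here.
    if is_builtin(text, "algif_aead"):
        print("algif_aead is compiled in; blacklisting will not help")
    else:
        print("algif_aead is not listed as builtin in", path)
```

If the module shows up as builtin, blacklisting is a no-op and you need either the fixed kernel or a filter on the syscall path.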
Other_Class1906@reddit
I thought upvotes are simply a way of bringing attention to a topic, not necessarily showing agreement... Whereas down-voting would be the same but with explicit disagreement.
aloobhujiyaay@reddit
This is a good reminder that blocking a PoC isn’t the same as fixing the vulnerability