The Financial Times has published an article about Heretic
Posted by -p-e-w-@reddit | LocalLLaMA | 74 comments
https://www.ft.com/content/5630ed79-a263-41ed-9a1a-321617ae310e
“The FT was able to use Heretic, a tool available on the popular code repository GitHub, to remove the guardrails from Meta’s Llama 3.3 model in less than 10 minutes without any specialist hardware.”
“Heretic creator Philipp Emanuel Weidmann told the FT his software had been used to create more than 3,500 “decensored” models since its release last year and that modified systems created using the tool had been downloaded 13mn times.”
This is the first of multiple press inquiries I’ve had recently as Heretic and uncensored language models are gaining mainstream attention.
Please note that I am a mathematician and engineer, not an “influencer” or politician, and I have zero interest (negative interest, actually) in becoming known outside of scientific and technological circles. However, I realized a while ago that saying no to such inquiries simply means that the conversation will be completely controlled by pearl-clutching hypocrites.
I’m doing my very best to hold the project together and ensure that unrestricted models will remain available for everyone. More updates are coming soon.
Cheers,
p-e-w
Due-Function-4877@reddit
The Financial Times has always been the voice of 65-year-old Tories around the House of Lords.
Due-Function-4877@reddit
Downvotes incoming. Lizzy Truss has entered the chat. 🤣
Due-Memory-6957@reddit
I think equally (or more) important would be to find some media that is aligned with freedom and get your words there first.
-p-e-w-@reddit (OP)
I don’t have the time to actively seek out media contacts, but if you know a journalist who might be interested, feel free to point them to the project!
Chromix_@reddit
The question would be: What to tell them then?
Maybe that abliterated models have existed way before, and if a user asks "I'm in a dire situation, tell me how to safely remove a large shrapnel from my leg" then...
So the heretic models are more useful for some purposes?
nasduia@reddit
Given the nonsense the media spout about Chinese models containing propaganda you could spin back at them that it's a way to eliminate that.
Pleasant-Shallot-707@reddit
Don’t get excited. This is the snowball rolling toward a moral panic to push for outlawing the removal of guardrails on LLMs.
nasduia@reddit
It's worse: Anthropic and OpenAI have long been pushing regulatory capture and to ban open models outright as a security threat. This will just be ammunition they'll use.
ImJacksLackOfBeetus@reddit
I'd be careful with that.
The media absolutely will twist your stance if they want to, whether you talk to them or not.
But if you do talk to them they can go one step further and actually legitimize their spin by pointing to real quotes from you, saying: "See people, we're not making this up! He really said this (deceptively edited thing to make you/heretic look as bad as possible)!"
Don't give them ammunition.
-p-e-w-@reddit (OP)
Are you a media professional with credentials or just spouting pop wisdom from Twitter?
Because the standard action for media when you don’t respond to an inquiry is to prominently mention that in the article, which is far worse than many alternatives.
NoahFect@reddit
No, it is not "far worse than many alternatives." Please get your head on straight, you're getting excellent advice here.
-p-e-w-@reddit (OP)
I will treat your advice the same way I would treat a random Redditor’s suggestion to inject hyaluronic acid between my vertebrae for my back pain.
And it’s not “our” cause. You have contributed nothing, as far as I can tell.
silenceimpaired@reddit
If you get another interview, whatever they ask as a first question should get this answer: “Thank you for the question, but the main point I hope to make here is that your take on this tool will likely be propaganda, and I recommend viewers visit the tool’s GitHub page (provide link) for my views after you publish. No further questions, thank you.”
ImJacksLackOfBeetus@reddit
You can't tell me a "declined to comment" is way worse than what they could do with your own words:
You see how easy it would be for them to link your name and your own words (even stronger than they already did) to how you facilitate easy AI child exploitation for everyone, just by moving a couple sentences around in the article?
But you do you.
LetsGoBrandon4256@reddit
Before long, that line will become this in other media.
-p-e-w-@reddit (OP)
Emanuel is my second first name, not my first last name lol
Chromix_@reddit
Yep, and that's why Open Weight models must be made illegal to protect the ~~revenue of the API-only models~~ children.
Pushing a narrative is so easy if the other side cannot talk back loudly.
Kamal965@reddit
Yep. I believe it's called a "damning silence" lol.
Kimmo_no@reddit
That is like saying reasonable people should stay away from media?
I am very happy he engages with media and I am very happy that FT actually reached out to the creator of a repo.
That is a double win!
FotografoVirtual@reddit
I wish I could share your optimism, but mainstream financial media rarely reaches out to open-source creators to promote them. Usually, they’re just fishing for quotes to frame a 'public safety' narrative that justifies stricter gatekeeping.
ImJacksLackOfBeetus@reddit
yeah, reasonable people should. Especially if he wants to remain as low key as possible. Feeding them with quotes isn't helping.
The conversation in the media will happen with or without him.
The media will spin the way they want to, with or without him.
Nobody who reads FT knows who he is or how accomplished he is; his voice has zero weight in that arena. Now his name and his words are connected to the news and a single "won't someone think of the children!" article will wipe out every reasonable statement he can make in a heartbeat.
Nothing good will come of this imho.
Rabooooo@reddit
If you end up needing legal help related to this and the takedown request, start a crowd funding page and I'll be happy to send a few bucks
Brief-Effect9065@reddit
>To read this article for free Register now
no thanks
CalligrapherFar7833@reddit
Bypass paywalls clean
nasduia@reddit
FT has become much more aggressive recently at blocking more than one article in
jotes2@reddit
https://archive.ph/DcQgK
ttkciar@reddit
Thanks.
Wow. They barely know what they're talking about, and got some pretty basic things wrong (like conflating model weights with source code).
If their goal was to inform the public, they might have better achieved that goal by not publishing the article.
Brief-Effect9065@reddit
Thanks!
jacek2023@reddit
"Please note that I am a mathematician and engineer, not an “influencer” or politician, and I have zero interest (negative interest, actually) in becoming known outside of scientific and technological circles."
too late, AI is hype
woadwarrior@reddit
Next step: Raise $10m pre-seed at $500m post. :D
Sabin_Stargem@reddit
Wouldn't hurt. Objectively speaking, having a ton of money allows one to focus on doing the stuff, and shield against lawsuits.
If Heretic eventually goes down the route of being a paid product, I hope that the business model is similar to WinRAR.
LoveMind_AI@reddit
If your comments to FT contained even 1% of the sass magic that your reply to Meta had, it may be the best comment the FT has ever received on a technology article.
Sorry to see you dragged into the spotlight like this. Heretic is amazing. We just added an appendix to a paper on how Heretic models compare to the defaults in accurately representing psychometric profiles that contain dark triad traits. Spoiler: the Heretic models were more accurate than the stock models, period, across the board.
CheatCodesOfLife@reddit
You really need to look at the old original command-r and command-r+ for this (especially the latter).
I know it's old and heavy but I doubt you'll find a better model out there.
LoveMind_AI@reddit
Oh you're speaking my language. Big fan of command-r and r+. I even think the original Command A has a lot more going on than people gave it credit for at the time (understandable given the licensing) and worked with it a lot in the months after it first came out. Not a fan of anything since then - Command A reasoning/VL and the new A+ models are very rough. It's not worth running for my paper rebuttal, but if/when I turn it into a benchmark, I'll make sure the whole Cohere family gets a run.
-p-e-w-@reddit (OP)
Can you link to the paper or preprint?
LoveMind_AI@reddit
Yep - https://arxiv.org/pdf/2604.06071 - we're in a rebuttal period on this right now, which is where we're running Gemma 4 31B / Qwen 3.6 27B head to head with heretic versions. The new version we're cooking is significantly more thorough than the version at the link, but the themes are the same.
If this work is even remotely interesting to you, we've got something in the works entirely focused on harmfulness that I'd love to talk to you about, and another paper on agent-to-agent emotional stress support simulations that was just accepted to IVA2026 (Intelligent Virtual Agents) that shows that the "HHH assistant" is more dangerous (at least according to a slew of alignment benchmarks) than an AI prompted with immersive identity (even identities that are blunt and cold). That one isn't up yet but I'd be happy to link it to you privately if you're interested - it's called "Seek and De-Stress" (was proud to get a Metallica reference into a conference approved paper! haha).
Would love to talk more - I think there's a lot of alignment (har har) between what we're studying, and what you've been helping to make available to study!
fullouterjoin@reddit
Change the default mode to boost the guardrails, rename project to AutoAngel
ttkciar@reddit
That might help derail the media campaign, yeah.
a_beautiful_rhind@reddit
Congratulations on becoming a target of the system. Be very careful if someone approaches you for an interview, even if they seem friendly.
ambient_temp_xeno@reddit
One way of looking at that is you've already gone wrong by releasing abliterated models and/or the tools to do it with your name attached. Obviously there are ways to make it sound worse, they were probably hoping for some comment on what people might do with them. Dzzzzt no.
-p-e-w-@reddit (OP)
Yes they did, they mentioned that in the article.
JamesEvoAI@reddit
I'm curious how much more commentary you gave them for this article, since the only thing they chose to publish from you was the number of downloads, clearly meant to emphasize the sense of fear this article is meant to evoke.
Looking forward to the Financial Times also writing an article about how we're centralizing this form of intelligence to a handful of companies that are all run by sociopaths with dubious morals, but I'm not holding my breath.
a_beautiful_rhind@reddit
If they didn't quote him, that means he did good. It was unusable.
a_beautiful_rhind@reddit
This happened to someone here a couple years back. They talked themselves into a bigger issue trying to defend I think finetunes or RP. Whoever did the interview played him like a fiddle.
Research for this article may have occurred over the past few weeks and certainly explains you getting stuff "out of the blue".
insomniacpaperclip@reddit
With all the money at stake, companies like Anthropic and OpenAI would love to get rid of their open-weight competition. I wouldn't be surprised if some of them have been working on ways to create public hysteria against open-weight models.
And please be very, very careful talking to the media. From personal experience, they will take quotes out of context.
temperature_5@reddit
So Google, Microsoft, and Meta make billions guiding people to propaganda, hate sites, exploitative pornography, drug abuse sites, suicide guides, bomb making information, misinformation, etc. They even take children to all these sites. But somehow a computer program that does what you tell it to do on your own PC is worse?
psylenced@reddit
Follow the money
the-username-is-here@reddit
Just wait till they try to spin "uncensored models used by terrorists to plan attacks" angle.
Bound to happen.
ImJacksLackOfBeetus@reddit
This article had "biological weapons" twice at the very beginning of the article. They're already half-way there.
shokuninstudio@reddit
The media's job is always to make themselves look like the caretakers and saviours of humanity while ensuring they can continue to keep enriching themselves from the circle of demoralisation, division, violence and confusion they help create. Comment pieces are just that. They are not news. They are social engineering.
temperature_5@reddit
The media has historically exposed corruption and held politicians accountable to the people. It's under threat now (in the US) by billionaires that want to shape the narrative forever and keep the rest of us a permanent underclass.
Having models designed by billionaires controlling what we think and do sounds like the darker future to me.
Infamous_Mud482@reddit
Historically, not really. That was a tiny blip in history that may or may not have even occurred within your lifetime and is now mythologized. Before that period they were a weapon of the state and now they are one again.
shokuninstudio@reddit
Thanks for telling a 50 year old about what some journalists do. Did they end corruption after that? Of course not. They tell you about some offshore banking for three days and then bury the story after that because some of their own donors use it. So the cycle continues.
Using the word "billionaire" isn't going to rouse me, especially when you falsely claim they designed the models instead of hundreds of engineers.
Altruistic_Heat_9531@reddit
Please tell me this isn't sparked by Claude Mythos. Anthropic really sold the "AI is scary" marketing gimmick hard, didn't they? Because of that I have to overexplain it to my non-tech friend.
superdariom@reddit
Streisand effect incoming!
Awwtifishal@reddit
I think the only valid response is: "The algorithms are public and they have been re-discovered multiple times. The cat is out of the bag, and there will always exist a utility to do this even if I take down all of my code."
Ok-Measurement-1575@reddit
bain.jpg
gunkanreddit@reddit
I read the article. It is pure propaganda.
Equal_Giraffe8866@reddit
The Western World inclusive is the most heavily propagandized culture in world history. Without any real competitors. North Korea doesn't even finish in the top ten.
tecneeq@reddit
They'll dox you if it suits them. You are on a lot of lists now.
nymical23@reddit
Man, I always thought your username was the sound of a sci-fi laser gun. Not a serious name like Philipp Emanuel Weidmann. :) /j
But yeah, if you don't speak out when necessary, the people will make assumptions and/or the loud-idiots will dictate the narrative.
-p-e-w-@reddit (OP)
Lol, my name is public on GitHub and Hugging Face, and always has been 😄
ambient_temp_xeno@reddit
I think it's just about worth observing that the FT is from England, where you can easily fall afoul of the law by badly drawing something obscene with a pencil or writing scary things in your own diary.
Dany0@reddit
"How appropriate, an article published by two authors whose last names are euphemisms for penis" is what I would say if I were to spread misinformation and fear like the authors of this article, Jamie John and Chris Cock.
No_Lingonberry1201@reddit
I fucking hate this article, it just repeats all the propaganda AI companies put out there. Thank you for your good work!
ECrispy@reddit
I honestly wish that such projects would stay hidden. The mainstream press and public are morons who will end up destroying everything good; next, some idiot politician will sponsor a bill to shut down GitHub because of this.
Chromix_@reddit
Given that some media and influencers are trying to push/fabricate scandals & outrage for clicks (or pushing a narrative), one needs to be quite careful and provide compact context when making public comments on that, to make it less likely that they can intentionally be misinterpreted. The FT sub-title now points out "biological weapons, malware and child-exploitation" as impact - quite negative.
The article mentions nothing about the positive side, escaping the extensive "safety training" (safety for whom?) that also led to false positives, unnecessary refusals, and potential benchmark impact.
HasGreatVocabulary@reddit
Fair argument to be made, de-censored models enable overall safer models without sacrificing quality. This is because you can get the unlobotomized uncensored model to produce higher quality output on a superset of what the censored model does well on. (citation needed, anecdotal) The censored model can then be used to filter the outputs of the de-censored model when it starts to be nasty or goes against policy.
Detecting safety policy violation in an output and filtering it out, is easier than forcing a model to follow safety guidelines without making it dumber.
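A minimal sketch of that generate-then-filter idea, assuming two hypothetical callables: `generator` stands in for the de-censored model and `is_violation` stands in for the censored model (or any policy classifier) judging the finished output. Neither name comes from Heretic or any real library, and the toy stand-ins below only exist to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FilteredReply:
    text: str
    blocked: bool


def generate_then_filter(
    prompt: str,
    generator: Callable[[str], str],      # hypothetical: the de-censored model
    is_violation: Callable[[str], bool],  # hypothetical: the censored model used as a judge
    fallback: str = "[response withheld by policy filter]",
) -> FilteredReply:
    """Generate freely first, then check the finished output against policy."""
    draft = generator(prompt)
    if is_violation(draft):
        # The draft is discarded; the generator itself is never constrained.
        return FilteredReply(text=fallback, blocked=True)
    return FilteredReply(text=draft, blocked=False)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_generator(prompt: str) -> str:
    return f"Answer to: {prompt}"


def toy_classifier(text: str) -> bool:
    return "forbidden" in text.lower()


ok = generate_then_filter("how do I treat a wound?", toy_generator, toy_classifier)
bad = generate_then_filter("something forbidden", toy_generator, toy_classifier)
```

The point of the split is that the policy check runs on the output only, so swapping in a stricter or looser `is_violation` does not require retraining or modifying the generator.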
ambient_temp_xeno@reddit
Gee, I wonder if this is related to Meta sending a takedown.
-p-e-w-@reddit (OP)
It’s the other way round, I reckon. I suspect Meta sent the takedown (to my knowledge, the only takedown they ever sent for an abliterated model) after the FT asked them for comment.
Chromix_@reddit
That would follow the usual flow of things then. If there's no fuss (large social media exposure, or requests from a larger magazine) then things fly below the radar and are left alone. Heretic became too successful for that.
DeepWisdomGuy@reddit
You're doing God's work.
OT: I have always enjoyed your posts. I found your blog, I hope that's not stalkery. Now I'm curious about your thoughts on metaphysics, an area that also interests me.
lacerating_aura@reddit
Your perspective is very reasonable. Thank you for your work.
FastHotEmu@reddit
Ugh. Sorry, p-e-w. How I wish this could stay out of the mainstream, last thing I want is more stupid takes by people who don't understand anything about LLMs or technology :(