How likely do you think an Ashley Madison-style widespread breach exposing users and conversations is in the next few years?
Posted by Antique-Account-2359@reddit | LocalLLaMA | View on Reddit | 38 comments
I was quite naive with my usage of ChatGPT, and my mind won't stop replaying a doomsday scenario where every single user's chats leak and there's like a searchable database or some shit like that. If one were to take place, how do you think the event would transpire? I'm probably shamelessly seeking validation but I don't think I care anymore. My life could change drastically for the worse if this were to happen. (Nothing illegal, but enough to ruin relationships and be publicly humiliated.)
thetaFAANG@reddit
the way these big leaks drop kind of annoys me, too much all at once, people lose interest in a week
they could trickle people’s names out once a week and make a big spectacle out of it for a couple years
Material_Policy6327@reddit
It’ll happen eventually
SlowFail2433@reddit
Uh, it's way worse: the result of the NYTimes case is that your messages can be read out in a televised court case LMAO
nomorebuttsplz@reddit
Not really, don't falsely drive OP to suicide. It's not going to name users in court.
OP, consider what such a leak would do to OpenAI's stock value (after an IPO).
It would be like if gmail's database were leaked. Such a breach would have irreparably damaged Google's reputation. Which is why it hasn't ever happened.
I say this as someone who is very careful about what I put into openai's chats.
SlowFail2433@reddit
I’m afraid you are mistaken: email address, full name, phone number, IP logs and conversation history can all be spoken in court.
This is due to a ruling in the NYT vs OpenAI lawsuit.
This is not speculation; it is actual law now (the US works on judicial precedent, as it is a Common Law country).
annoyed_NBA_referee@reddit
Trial court doesn’t set precedent, only appeals courts. The rulings in this case are not binding for anyone other than NYT and OpenAI in this specific case.
SlowFail2433@reddit
Lower courts can set persuasive precedent, but not binding precedent. Persuasive precedent is still a form of precedent though.
But to address your point more directly: we do have hundreds of pieces of caselaw with actual binding precedent on disclosures and discovery which have not been blocked in any way, so if you want to go down the precedent route, the body of precedent is broadly against us.
annoyed_NBA_referee@reddit
But you said “this is actual law now”. You’re overstating it, a lot.
SlowFail2433@reddit
There are hundreds of pieces of existing binding judicial precedent at the appeals court level and above that still apply and would allow a judge to go after those details: the standard caselaw on discovery and disclosures, just the standard stuff you find in a textbook on those issues. There isn’t anything blocking that existing precedent from being used. The OpenAI v NYT case makes it even worse, because it is a piece of persuasive precedent that specifically mentions the LLM category.
nomorebuttsplz@reddit
You should use an llm and educate yourself about the law a bit before giving these hot takes
SlowFail2433@reddit
These aren’t hot takes; what I have said in this thread is all just the basics of the legal system, covered in any first-year textbook.
nomorebuttsplz@reddit
Where did you go to law school?
SlowFail2433@reddit
Took some law school modules during undergrad and postgrad, and then have worked with corporate contracts for a few decades.
nomorebuttsplz@reddit
Judge Wang ordered OpenAI to produce 20 million “de-identified” or “anonymized” consumer ChatGPT chat logs to the NYT and other news-publisher plaintiffs, under an existing protective order. Public reports and the court’s language explicitly say the chats must be de-identified (anonymized by OpenAI) and reviewed under a legal protective order (attorneys and experts only).
SlowFail2433@reddit
I see where the confusion is
You are assuming that because the logs were anonymous in this case, they must be in the future. That is not the case: they do not have to be anonymous, and the names can be revealed.
nomorebuttsplz@reddit
I don't think there is confusion. It's simply not the case that "email address, full name, phone number, IP logs and conversation history can all be spoken in court"; any statement that that is "actual law", as you stated, is false, and any speculation about the law changing is just speculation.
SlowFail2433@reddit
What you are saying is inadvertent misinformation I am afraid. I get that you don’t want it to happen, I don’t either, but existing case law goes against us.
There is currently no legal barrier to a judge requesting that information if it can be justified as necessary, relevant and proportionate to the case, within existing discovery rules.
The existing disclosure / discovery caselaw still applies and that caselaw is very much full of that information being released in court.
txgsync@reddit
> It would be like if gmail's database were leaked. Such a breach would have irreparably damaged Google's reputation. Which is why it hasn't ever happened.
TL;DR: The company itself doesn't need to be breached for their users' data to be stolen.
PracticlySpeaking@reddit
^ This is much more likely.
And, FWIW, the Ashley Madison thing was not a "breach"; it was a targeted attack by people motivated by conviction. We will see what happens with the new 'adult' capabilities from OpenAI...
On-The-Red-Team@reddit
Locally, practically nil... but that's the whole reason people use LocalLLaMA
YesterdaysFacemask@reddit
Go to privacy.openai.com. There you can download everything that you’ve ever typed or generated in ChatGPT. Be aware the file size could be huge if you’ve uploaded and downloaded a lot of images. You can also request that they permanently delete your information. By law, I believe they are required to actually ensure proper deletion of your data if you go through this process. So if you’re worried, do that.
And also note that previous breaches have generally been pretty hard to get at unless you’re pretty determined. So if a multi-petabyte leak happened, it would take substantial resources just to store and host it somewhere, AND the person who did so would be the target of a million lawsuits and law enforcement. We’re not talking about a CSV of a million passwords. Everyone’s chat history all together would be gigantic. So I wouldn’t worry that much about it.
But do the delete option if you’re still concerned.
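The "gigantic" point above can be sanity-checked with a rough back-of-envelope calculation. All the numbers below are made-up illustrative assumptions, not real OpenAI figures:

```python
# Rough back-of-envelope for the size of a hypothetical full chat-log dump.
# Every number here is an illustrative assumption, not a real figure.
users = 500_000_000          # assumed number of accounts
chats_per_user = 200         # assumed average conversations per account
bytes_per_chat = 20_000      # assumed ~20 KB of text per conversation

total_bytes = users * chats_per_user * bytes_per_chat
total_pb = total_bytes / 1e15  # bytes -> petabytes
print(f"{total_pb:.0f} PB of raw text")  # → 2 PB
```

Even with these conservative guesses you land in petabyte territory, which is well beyond the "paste it on a forum" scale of typical credential dumps.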
txgsync@reddit
> You can also request that they permanently delete your information. By law, I believe they are required to actually ensure proper deletion of your data if you go through this process. So if you’re worried, do that.
Unfortunately, US law is scattershot on this count. Under California's CPRA/CCPA and Virginia's VCDPA, you can exercise Data Subject Access Rights and Data Subject Deletion Rights, much like GDPR/EUDA in the EU. But enforcement is lax: if someone's engaged in interstate commerce with the company and the company does not have a presence in Virginia or California, they aren't required by law to comply with data subject deletion requests.
Given that OpenAI, AFAICT, does have a CA presence, they'll probably do it. But it's possible to skirt the law in sneaky or non-obvious ways: delete the data subject from your special "California Residents" database, but leave them in databases for other states. So if someone has ever accessed the service over a VPN or from a different US state, those records might persist.
Should I ever run for Congress, correcting our lackadaisical patchwork of privacy laws would be at the top of my agenda...
YesterdaysFacemask@reddit
I trust your explanation of the law, and I also don’t have a lot of conviction that companies would really scrub your data so well that even law enforcement couldn’t get at some trace. But I don’t think they’d be so egregious as to not even make some attempt to delete when requested, or to not try to treat California users differently. Ultimately the hard part is ensuring compliance within the system. I’ll be skeptical about whether they’re doing that well or to the letter of the law, but building a specific system to violate that law (e.g. just flagging California residents in the database and marking them “deleted”) seems unlikely to me. It would also be a nightmare if it ever became an issue in court.
So do I 100% trust that every bit of personal data is actually securely deleted from every server or backup they own when it’s requested? Not really. Do I think they have any incentive to try and specifically violate their regulatory responsibilities? Also not really. So I trust enough that the deletion is good enough to provide some safety to someone who’s just generally worried about privacy but maybe not if a user has been asking ChatGPT how to launder money or help them do a terrorism.
txgsync@reddit
Reasonable take. And under the W3C's Data Privacy Vocabulary (DPV) version 2.2, holding data due to legal investigation or government oversight is explicitly called out as a potential legal basis for data retention.
The ontology is really useful to know in my day job programming privacy stuff at a car company. I do my level best to make sure we comply with the spirit as well as the letter of the law: if the user asks us to delete data, we retain the request and the proof that the data once existed but was deleted by user request. And the very identity of that data subject becomes scrambled in a way that we can only prove the former existence of data in response to formal legal process.
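A deletion "tombstone" along those lines can be sketched as follows. This is a hypothetical design illustrating the idea described above, not any company's actual implementation; the key name and field names are invented:

```python
# Sketch of a deletion "tombstone": keep proof that data existed and was
# deleted on request, while scrambling the subject's identity so it can
# only be re-linked to a real person under formal legal process.
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Secret key held separately (e.g. in an HSM or under legal escrow);
# without it, the tombstone cannot be linked back to a real user ID.
LEGAL_ESCROW_KEY = b"example-key-held-under-legal-escrow"

def make_tombstone(user_id: str, dataset: str) -> dict:
    # HMAC rather than a bare hash, so the subject ID can't be recovered
    # by brute-forcing likely user IDs without the escrow key.
    scrambled = hmac.new(
        LEGAL_ESCROW_KEY, user_id.encode(), hashlib.sha256
    ).hexdigest()
    return {
        "subject": scrambled,  # 64 hex chars; opaque without the key
        "dataset": dataset,
        "deleted_at": datetime.now(timezone.utc).isoformat(),
        "reason": "data_subject_deletion_request",
    }

tombstone = make_tombstone("user-12345", "chat_history")
print(json.dumps(tombstone, indent=2))
```

The original records are deleted; only this keyed-hash receipt remains, which is enough to answer "did you delete this person's data?" without the tombstone table itself being a re-identifiable copy of the user list.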
HanzJWermhat@reddit
Because your messages are likely being used to train the next model and because there’s no open weights… yeah…
venerated@reddit
You have to think about how much data would leak if it did. You’d be a drop in the ocean.
If you’re that worried about it, do a GDPR-type request to have your data deleted. In the future, use an email address that’s not associated with you, don’t use your real name, and don’t use any significantly identifying info in conversations.
Please don’t let this drive you to suicide, it’s honestly not that big of a deal. Unless someone specifically leaks your data and only your data, any transgressions will be lost in the pool of data.
a_slay_nub@reddit
Wasn't there a leak on ChatGPT already where anyone could view anyone else's conversations? It looks like there have actually been a lot of them.
PracticlySpeaking@reddit
https://techcrunch.com/2025/07/31/your-public-chatgpt-queries-are-getting-indexed-by-google-and-other-search-engines/
...and returned in search results.
SlowFail2433@reddit
Yes, I remember this; sama apologised directly too.
mr_zerolith@reddit
The government forces these big AI services to log everything, standards for data security in the US are very low, and the federal govt gets hacked multiple times per year. And ChatGPT has in the past leaked private chats in various ways.
I'd say the likelihood that your data is safe is very low. I would advise you to immediately stop using it, that's the best you can do.
The chance it will be connected to you ( someone targets you ) is MUCH lower, unless you are some influential / powerful person who has a high profile.
You would be far from the first person to be in these shoes. You could put it behind you by stopping and coming out about it later.
Cool-Current-134@reddit
Calm the fuck down. Get off Reddit. Risk isn’t zero, but if a huge searchable database were made, OpenAI and governments would be scrambling to shut it down. Please, you don’t need to unalive yourself over this.
grannyte@reddit
At this point it's not a question of if, it's a question of when.
But whatever you have in there won't be important enough compared to all the dumbasses with security clearances using it to make life and death decisions.
durden111111@reddit
lol. the grok imagine stuff could be searched directly with google
a_beautiful_rhind@reddit
meta prompts were going into search too. along with their facebook? and picture.
a_beautiful_rhind@reddit
Yea and I've only used any services anonymously for this reason. Too much chance for it to be used for blackmail or targeted attacks against you by any number of parties, including the companies themselves.
It also already happened to some AI RP site and users were being contacted about the contents of their chats. They had used emails and details traceable back to them when making accounts.
jonahbenton@reddit
Absolutely guaranteed.
suicidaleggroll@reddit
I’d say it’s practically guaranteed. Tech companies have proven over and over and over again that they aren’t capable of or willing to implement proper security practices to protect their infrastructure. Mostly because it costs money, and there’s practically zero punishment for having a breach. So these breaches will continue to happen again and again.
HarambeTenSei@reddit
lol don't share your real name with chatgpt and don't use your main accounts for login