Asked our head of sales if putting client addresses in ChatGPT was data sharing. She looked at me like I was the idiot.
Posted by shangheigh@reddit | sysadmin | 430 comments
Had a weird convo with our head of sales last week. She was showing off how she uses ChatGPT to polish client emails. The prompts had full names, deal sizes, internal pricing strategy. One even had a client's home address.
I asked if she thought of that as sharing data. She looked at me like I was slow and said no, she’s just asking for help with wording.
Training clearly isn't landing. People genuinely don't see it as data sharing. Policy posters aren't fixing this one.
Call-Me-Leo@reddit
We need an AI plugin on computers that will detect when people are putting sensitive information or data into AI chatbots. Bring back Clippy and have it say “Are you sure you want to upload this to the cloud? It looks like private information.”
ribsboi@reddit
It's called Purview
Icy_Performer_9675@reddit
you use it ?
ribsboi@reddit
yes
khantroll1@reddit
How do you leverage purview in this way?
We have a blanket policy against AI, so I haven’t had to keep up with it.
bbanda@reddit
Label sensitive data and build policies around it. If they can't get it out of the environment at all, they can't get it into unapproved tools.
Aggravating_Refuse89@reddit
That would only work if you run a really tight MS ship. Even then, it's more than Purview. You're talking about DLP and classification, which Purview includes. Purview may be a piece of it, but it's not a magic bullet, and most IT shops don't have the resources to go after this.
ConflictResident5253@reddit
That's not an accident. According to our MS admins, the answer to defective MS shit is always to buy and use more MS shit. It's a way to get you to sunk-cost more and more of the IT budget into MS products. Trust me, even if you run the tightest MS ship in the world, you'll still have a leaky boat.
ribsboi@reddit
What @bbanda said + use Defender for Cloud Apps to block all AI apps.
Aggravating_Refuse89@reddit
Must be nice to have that
psiphre@reddit
reddit doesn't use @ notation
postbox134@reddit
I knew what they meant
VividVigor@reddit
How do you apply a Purview document label to copy-n-paste?
ribsboi@reddit
It's not actually working with labels, because clipboard data can't carry labels. There are two very useful policy actions for this: "Paste to supported browsers" and "Copy to clipboard". You design a policy with the types of sensitive data you want to catch, attach these actions, and decide whether to block or just audit the matches.
See https://learn.microsoft.com/en-us/purview/endpoint-dlp-learn-about#endpoint-activities-you-can-monitor-and-take-action-on
VividVigor@reddit
I was not aware Microsoft Defender for Endpoint could enforce actions on the clipboard. Thanks. I thought a secure browser or TLS inspection on a cloud gateway were the only two ways to detect and block this.
mrmugabi@reddit
This right here. It worked surprisingly well when I rolled it out a while back.
belzaroth@reddit
This sounds like the blind leading the blind.
After all, who checks the AI plugin? My ADHD has me here thinking you need AI to watch the AI that watches the AI, and another to watch that one...
belarm@reddit
Yeah, this seems like a fool's errand. What is sensitive varies greatly and depends on context that the model is just not going to have, even after scanning all your docs. That "filter" is also now a honeypot full of sensitive data.
Away-Sea7790@reddit
SentinelOne has this plugin.
RoseRoja@reddit
Look into the prisma browser from palo alto networks
Morkai@reddit
DLP tools can definitely do this.
PotatoOfDestiny@reddit
My company runs an "internal" AI tool that combines a couple of different LLMs with regulatory data privacy requirements (I work in healthcare), and we straight up blocked every single other one in the firewall. Letting users access random public tools is a really bad idea.
Optimus_Krime555666@reddit
That's unfortunate.
SentinelOne was the biggest piece of shit I've ever encountered in 15 years in IT, so hopefully the browser extension is better.
zvii@reddit
Exact opposite here, S1 works great. Paired with BeyondTrust for PAM. A little tuning up front, but it's very clear why things get flagged or killed/quarantined. A few exclusions and false positives, but in 6+ years never anything let through that I didn't want or wasn't alerted on. I'm going to say it's probably an implementation or admin issue more than anything.
DeifniteProfessional@reddit
I reached out to SentinelOne to get pricing and they sent my request to an MSSP who would give me S1 Lite as part of an MDR service. Frankly it seems like they don't actually want business.
snatchpat@reddit
$20 says you flipped the switch but never tagged your data for DLP to actually “work”. Nobody does.
VividVigor@reddit
How do you tag copy-n-paste? Detection engines are getting better (with help from AI 🤦♂️)
lotekjunky@reddit
we use zscaler
Annonimbus@reddit
If you use Copilot you can force that the document you upload needs to have a Purview classification and only certain levels of confidentiality are allowed to be uploaded.
Malicious intent isn't stopped by that as you can just classify the data as public or copy and paste it into the box.
But for a case like OP where people are just ignorant it helps.
Morkai@reddit
I would refuse to take that bet because you're quite likely right. Unfortunately there's been a significant amount of turnover in our team over the last ~5 years (I technically have the longest tenure currently, and I've been here less than a year).
There are many things we're still reviewing, consolidating, and cleaning up. Every other week there's another of what we've come to call "landmines": some undocumented system, platform, license renewal, or subscription that we have to drop what we're doing to address immediately.
snatchpat@reddit
That’s not a you problem. Leadership gets in a bind over exfiltration and suddenly it’s top priority just like all the other priorities. They’re shmucks for not supporting the IT operation to begin with. A properly empowered technical group would have this solved five years ago. Tell them to eat shit while you train your agents to outdo them.
Morkai@reddit
Unfortunately it's a me problem insofar as something breaks and it's largely on me to fix it as best as possible, assuming I have the tooling or the money to do so. It's a small company and an even smaller team, so there are not many other resources to call on.
Awkward_Pingu@reddit
We have something integrated into all the microsoft apps that detects PII and sets a confidentiality level from 1-4 on it. Also the Business level AI app.
Mission_Process1347@reddit
These tools definitely exist
dinkleberg01@reddit
Torii does this iirc. There's an extension for browsers. it will alert you that you are sending sensitive data via xyz platform and prompt you change it.
iamoldbutididit@reddit
What time we live in.
We used to call program that recorded everything a user typed spyware... Now we have to buy the same type of product to keep our data confidential.
Ansible32@reddit
I love that people think this is a good way to keep data confidential and not a prime security hole in and of itself.
lotekjunky@reddit
it's just matching patterns locally in your browser. It doesn't store it.
Ansible32@reddit
And how do you know? have you read the source code? That's one code change away from deliberately exfiltrating things.
lotekjunky@reddit
because I've watched the payloads in .har files and through fiddler. you could do the same if you actually cared to understand how shit works
MrHaxx1@reddit
I don't know how Torii did it, but it could very much be done locally.
DeifniteProfessional@reddit
Claude teams has a global DLP/enterprise data scanning thing apparently. I've not yet used it (we're still onboarding it) but hopefully that's actually decent
Vesalii@reddit
Our manager had a meeting with a company that has this as a browser extension. If it detects you're sending data to an AI it substitutes the data for fake data.
It would cost us 30k per year 😂 No way that's happening.
UpsetMarsupial@reddit
People will click anyway. Just like "This site might be insecure. Do you wish to continue?" type questions. Just like "There are updates to install. Do you wish to update or postpone". Users ignore the essence of the warning and instead click to progress with whatever task they have in mind.
AppropriateSpell5405@reddit
I think MS rolled out a GPO for detecting this, at least from within Edge.
Aroe2k@reddit
We use Netskope for this, it’s configured to only allow AI usage with approved tools and domains.
TheBigBeardedGeek@reddit
Crowd strike does this on Windows I think. Ultimately what we need as orgs is better DLP implementation
FriendlyITGuy@reddit
These do exist. We were testing out Forcepoint Remote Browser Isolation specifically for this but unfortunately it was nothing but issues for the testers.
If anyone has any other suggestions I'd be interested in hearing them.
neon___cactus@reddit
Sentinel One has a browser plugin called Prompt Security that lets you see what people are putting into the AI tool and block based on category.
AbfSailor@reddit
Zscaler can do this
Packagedpackage@reddit
We are encouraged to use AI for as much as possible. The only thing I imagine not being entered is W2-related stuff from HR. People put account numbers and such into it. Nobody cares except the Karens.
Previous-Low4715@reddit
This is literally what Purview does.
maxis2bored@reddit
Plenty of tools do this.
ReptilianLaserbeam@reddit
Compliance and security suites can do that. We have automatic alerts with Purview when someone uploads files into generative cloud apps, even alerts triggered by prompts that include sensitive information.
Tarwins-Gap@reddit
Checkpoint has this
cnrdvdsmt@reddit
Clippy comeback would be elite. We use LayerX, which does basically that minus the paperclip.
shangheigh@reddit (OP)
We would really appreciate one of those.
kennetheops@reddit
I'm building this tool right now. It's early, but we are seeing a ton of interest.
postbox134@reddit
Proxy rules: block any non-onboarded AI tool. Give them a helpful link to approved tools.
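As a toy sketch of that allowlist idea (every hostname and the intranet URL here are invented examples, not a real config):

```python
# Toy sketch of a proxy-style allowlist: onboarded AI tools pass,
# everything else is blocked with a pointer to the approved list.
# All domains and the intranet URL below are hypothetical examples.
APPROVED_AI_HOSTS = {
    "copilot.cloud.example.com",   # example: onboarded enterprise tool
    "internal-llm.example.com",    # example: self-hosted model
}

APPROVED_LIST_URL = "https://intranet.example.com/approved-ai-tools"

def filter_request(host: str) -> str:
    """Return 'ALLOW' for onboarded tools, else a helpful block message."""
    if host in APPROVED_AI_HOSTS:
        return "ALLOW"
    return f"Blocked: {host} is not an onboarded AI tool. See {APPROVED_LIST_URL}"
```

A real deployment would do this at the proxy/gateway layer, but the decision logic is the same.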
Previous-Low4715@reddit
There’s a reason the enterprise versions advertise that they don’t train on the data
shangheigh@reddit (OP)
Either way, I wouldn't really want to trust customer PII with any model.
Krigen89@reddit
If your customer data is in SharePoint, then you can put it in Copilot. Same TOS.
Same stuff with Google Drive and Gemini.
Legal_Situation@reddit
Just tossing this out there that some features may have different TOS, such as things within Google Labs. Currently I think Google Flow (AI Video) doesn't use the same TOS for example.
moonski@reddit
Thing is, Google/MS could well say "we don't use this to train AI" in their TOS and just use that data anyway... What's going to happen? They get fined a few hundred mil 5-10 years down the line? They don't care.
PowerShellGenius@reddit
That is true, but it is true whether it is "AI" or not. If you do not trust Microsoft not to scoop up data for AI training in direct violation of a binding contract, you cannot store any data where they technically could do that. This includes SharePoint, OneDrive, Exchange Online, etc.
AI may be the motive (training) for misusing customer data. But there is zero correlation between whether the service you input the data to is Copilot, and whether Microsoft can be trusted to honor a contract and not misuse it. If they are stealing data for AI training they would definitely train it on your Teams chats as a source of Natural Language samples...
So if you are going down the "what if cloud providers are blatantly breaking their own ToS" rabbit hole, you're back to everything on prem.
moonski@reddit
Would it surprise you if they'd already done that?
Krigen89@reddit
Maybe. Probably.
On personal accounts, with different TOS.
ConflictResident5253@reddit
No one has ever enforced a contract against Microsoft successfully, to my knowledge. What would you do, sue 'em? They'd drag it out for decades like they did the class action about fraudulently pushing Office purchases in Canada.
And what are you gonna do, NOT use Windows? Microsoft is unenforceable and judgement-proof.
lotekjunky@reddit
This is BS. MS provides compute and AI for highly regulated industries. They do not retain your information, as it would make them legally liable. If you have any proof of this ever happening, please post it. Otherwise, stop spreading misinformation.
moonski@reddit
Yeah just implicitly trust the mega corporation that has routinely done all sorts of illegal stuff in the past. They'd never lie again right? Or do something they shouldn't to get competitive edge
lotekjunky@reddit
so you got nothing and continue to spread misinformation.
moonski@reddit
oh yeah bro let me just dig up all the evidence to prove they are doing this right now... did you actually expect that? mate
ConflictResident5253@reddit
I mean, they fake contractual obligations all the time in the most highly regulated industries on the planet. Like, if you fake FedRAMP and no one gives a fuck at the end of the day, then no customer should believe anything you promise, ever again.
Justgetmeabeer@reddit
"I know these companies actually have been all caught in lies, antitrust suits, and lawsuits for their entire existence, and there is ultimately an insane profit motive for them to use your data anyways, but idk guys. They say they are telling us to trust them, I can't see any reason not to"
lotekjunky@reddit
you don't understand the difference between consumer and enterprise services.
MagicWishMonkey@reddit
The number of idiots in threads like this always surprises me. The fact that some rando IT guy thinks it's OK to give legal advice on something they know nothing about is just weird. Surprised they haven't been smacked down by OGC or senior leadership for not routing questions like those to the people who are actually capable of giving an informed response.
lotekjunky@reddit
I'm the guy the lawyers call when they need to understand how something works. It's my job to work with them to safely enable AI. I spend way too much time with the lawyers and compliance goons.
MagicWishMonkey@reddit
They call you to understand how something works BEFORE contracts are signed, once the agreement is in place it's not your rodeo.
I say that as someone in the same exact position, and it frustrates me to no end how many supposedly intelligent IT people seem to think ChatGPT/Claude/etc. are somehow different from any other enterprise tool. I did a demo of some agentic workflow automation I created for my team and it freaked a lot of people out, and it took a lot of time for me to do damage control with the CISO. Super annoying.
ConflictResident5253@reddit
Dang, son.
ConflictResident5253@reddit
They faked FedRAMP and nothing happened. It follows that they fake everything else too.
Legal_Situation@reddit
Technically, yeah. But then again, you could kind of say this about anything. They'd also likely get a large number of lawsuits from companies with their own legal teams and the capital to take up that fight, so it's not as cut and dried as cartoon villainy.
Really this is more about making the legal ramifications of taking an action like that actually have teeth. US Privacy laws aren't great based on my layman's understanding of them. That said, who knows what that would look like when corporations feel their IP was threatened by it.
I was mostly just mentioning this because I wanted to add the nuance that I know of to the conversation.
IlIlllIIIIlIllllllll@reddit
MS provides services like email for hospitals. It's possible that software companies that handle patient data are just stealing it instead. But what are you going to do, personally audit every bit of enterprise code your company provides you?
Krigen89@reddit
Same thing with SharePoint and other parts of their platform.
reillan@reddit
We have a copilot enterprise account and that's all we're supposed to be using.
ConflictResident5253@reddit
Ain't it interesting how this pinky-promise on paper keeps customers from buying other products instead? Maybe that's the scheme.
CernerBurner2000@reddit
Isn't Copilot Enterprise different though? I don't use Copilot often because it is terrible, but I have a company tab and a public tab. I thought the company tab had access to the data in our tenant and was not used to train public AI models?
Krigen89@reddit
Not really.
Public one is the garbage personal one, which might train on your data.
Enterprise is "paid" through your M365 license, so they don't train on your data. It does not have access to your tenant's data initially though.
If you want it to be able to access your data, (and be able to summarize your Teams calls, and other stuff) you need to add the optional Microsoft 365 Copilot license.
Ferretau@reddit
That's dependent on the Copilot you're currently plugged into. Depending on where in the OS you use Copilot, the TOS it operates under can change.
Krigen89@reddit
Not when logged in with a business/corporate licensed account.
Previous-Low4715@reddit
Wish people understood this, plenty of reasons to be alarmed but this one is not high on the list. It’s been that way since copilot was “bing chat” lol
nyokarose@reddit
I mean, most of us trust that data with cloud providers. Can you help me understand why models are different? (Genuinely curious).
ConflictResident5253@reddit
Models ingest (your) data and then potentially regurgitate it to strangers.
nyokarose@reddit
Companies like Microsoft have specific corporate data protection guarantees that they don’t allow this with the models they’re running. They could just as easily scrape all your Outlook emails in M365 and share with a third party… but we trust them not to do that. I’m not saying that I trust Microsoft to always do the right thing, just trying to understand why this risk is different than any other time we give cloud providers access to our data.
Are you suggesting that the model is doing this in the background undetected by Microsoft and sending it back to the model creation company? That seems like data they’d notice leaving…
ConflictResident5253@reddit
No, I'm saying Microsoft doesn't even seem to read its own contracts, ignores their obligations to their customers, and does whatever it wants. There's a visible history of just not delivering on obligations and not caring.
Plus, the JET-derived database that backs M365 data doesn't have the kind of architecture that can separate sensitive and non-sensitive data. There's no way this promise has the technical capacity to be realized.
They say a lot of things. Doesn't make it real.
nyokarose@reddit
Which then amazes me that nearly everyone has their data in M365, with or without LLMs.
Thank you for helping articulate some of the holes!
Tetha@reddit
German data protection laws, and I think the GDPR as well, have this idea of "data frugality": frugal use of personally identifying data. This means you should always be questioning whether data has to be stored, processed, or, even worse, handed to third parties at all.
I think that is a very good wording and mindset to get into: Does this improve the answer from an LLM? Like, if I want it to check an email I am writing, does something like a personal address or a phone number actually increase what the LLM could improve or analyze? If not, it should not be included by principle.
On the other hand, in production, customers pay us to store this kind of data. So we have to.
nyokarose@reddit
Yep, I work with EU so am familiar with GDPR, you’re right. I also know that in many cases it’s impossible to strip some identifying information when using a SaaS service; the European Council themselves use M365 (and ostensibly copilot along with it, to read emails…. Which all have identifying info in the signature…) I expected more than just data privacy from the original comment.
Tilted5mm@reddit
Exactly.
coastsofcothique@reddit
Because in practice, the vendor agrees to security on their end, gets risk reviews, allows the company to secure it per industry standards.
Random LLMs being used all over the place without going through company risk reviews have no organizational oversight. That's the risk.
ls--lah@reddit
Lol have you read the Microsoft 365 Terms of Service? It basically says "We may get hacked, you may lose data, we are not liable".
lotekjunky@reddit
have you ever heard of a Microsoft DPA? https://learn.microsoft.com/en-us/answers/questions/2236249/how-to-sign-a-data-processing-agreement(-dpa)-with
itskdog@reddit
We're talking about approved LLMs the company pays for not to train on the data, and people still being skeptical of that.
Best-Conclusion5554@reddit
Unless the LLM training is somehow bypassing the information security that applies to the company's data when used for 'normal' purposes (applications, analytics, etc) by 'normal' users. I am old and cynical enough to think that may frequently be the case.
TheChance@reddit
For roughly the same reason a storage locker is different from leaving your shit on the sidewalk while an army of robots sort it for you.
NQ-QB@reddit
Because AI bad.
Lambs2Lions_@reddit
We have a BAA among others. Head of legal, compliance, and security all sign off as long as the AI servers are in the USA and data never leaves the states. shrug, at that point it’s not my monkey, not my circus.
MemeMan_Dan@reddit
Really the only "safe" way to do this is via a local model run internally without access to the outside internet.
Kholtien@reddit
Not even a self hosted one?
NetworkingNoob81@reddit
Only if completely disconnected from the internet, then maybe.
Kholtien@reddit
Why disconnected? They’re more useful when they can search the internet
gameoftomes@reddit
It really is about intent. If I put customer PII in, I don't want a random tool use to let the model push the PII to the internet.
OpenAI just released a privacy model that detects PII. You could use that and some code to filter actual PII and put placeholders in.
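As a rough illustration of that filter-and-placeholder idea, here's a minimal sketch using plain regexes rather than any vendor model (the patterns and names are invented for the example, and real PII detection needs a proper classifier):

```python
import re

# Invented example patterns -- these regexes only sketch the idea;
# real PII detection needs a proper classifier.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "US_PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(prompt: str):
    """Swap anything that looks like PII for a typed placeholder
    before the prompt leaves the machine; also return what was found
    so the matches can be audited."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.findall(prompt):
            findings.append((label, match))
            prompt = prompt.replace(match, f"[{label}]")
    return prompt, findings
```

The `findings` list doubles as an audit log, which is the same block-or-audit choice the DLP tools discussed above offer.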
lotekjunky@reddit
Azure has had PII detection on their OpenAI models for years now.
fadingcross@reddit
Are the emails this person is rewording with ChatGPT in Exchange Online? If so, you're already sharing the data, so why do you care if it's in ChatGPT?
plinkoplonka@reddit
No, I wouldn't either.
alochmar@reddit
Pinky swear!
ArbitraryMeritocracy@reddit
Just like 23andMe wasn't going to share or sell your data. Google once even had the motto "Don't be evil".
Previous-Low4715@reddit
The same TOS for data covers SharePoint and Onedrive. They want the EU to use it, where there is some modicum of respect for data privacy.
ConflictResident5253@reddit
EU doesn't believe it tho. They're all adopting divestment plans now.
Previous-Low4715@reddit
That’s about digital sovereignty which is slightly different, but very real. I’ve been asked to look into it as a possible project too.
Finn_Storm@reddit
But thats not totally true. Labs and Flow have a different TOS
guareber@reddit
No one wants to be evil, that's the basic reason PR firms exist.
HermyMunster@reddit
No one wants to be perceived as evil, that's the basic reason PR firms exist.
Clyzm@reddit
PR firms exist literally because they do want to be evil but don't want to be perceived as evil.
porkchameleon@reddit
The full version was "Don't be evil like that" /s
Adventurous-House-32@reddit
Also the wifi password on the gBus if you're ever cruising down 101 in NorCal
pdp10@reddit
It was supposed to be, "Don't be Microsoft", but counsel said not to use a trademark.
Ubuntu bug #1 was "Microsoft has a majority market share".
Yuugian@reddit
That and $12 will get you a Starbucks. Y'all can trust them if you want, but I obfuscate the name of the org I'm signed in with.
guareber@reddit
The problem is that from your org's perspective, it doesn't matter if it's true - it only matters if the contract say it's true. Your org doesn't care about anything other than its liabilities.
MagicWishMonkey@reddit
That, and violation of contract would give the company grounds to sue. It's crazy how many people here think OpenAI is just ignoring their contract agreements. If that were true and it became public, the lawsuits would sink the company. There's a reason that almost never happens.
ConflictResident5253@reddit
Blatant contract violations happen all the time, though. No one holds vendors accountable, ever. They throw all their capital into the Legal and Marketing departments to help dilute consequences.
I mean, after we found out that Microsoft's security is so fake that they faked FedRAMP, got caught, and DOD just said "oh, well...", I'm not sure why anyone would believe any tech company wasn't lying its ass off all the time.
RememberCitadel@reddit
Anytime I do anything with AI there is a placeholder for anything sensitive in the data. [Customer name], [public IP], and similar make it super easy to run a find and replace later.
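That find-and-replace workflow can be sketched in a few lines (the values in the mapping are invented examples, not real data):

```python
# Sketch of the manual placeholder workflow: swap real values for
# bracketed tokens before prompting, then find-and-replace them back
# into the model's answer. Mapping values are made-up examples.
PLACEHOLDERS = {
    "Acme Corp": "[Customer name]",
    "203.0.113.7": "[public IP]",
}

def to_placeholders(text: str) -> str:
    """Replace real values with tokens before sending text to a model."""
    for real, token in PLACEHOLDERS.items():
        text = text.replace(real, token)
    return text

def from_placeholders(text: str) -> str:
    """Restore the real values in the model's reply."""
    for real, token in PLACEHOLDERS.items():
        text = text.replace(token, real)
    return text
```

The bracketed tokens tend to survive an LLM rewrite intact, which is what makes the later find-and-replace reliable.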
sunburnedaz@reddit
Gordon Freeman and black mesa are my go to for test users/dummy data.
Questionsiaskthem@reddit
The right man in the wrong place can make all the difference in the world. So, wake up, Mr. Freeman. Wake up and smell the AI slop.
spittlbm@reddit
Reminds me that I need to find cs_beta.zip
Trooper27@reddit
This is the way.
Obscure_Marlin@reddit
This is the way
Gh0st1nTh3Syst3m@reddit
Think about the profile other FAANG companies have already built on you, though. Even with placeholder data, if you were a high-value target they could at minimum infer methods of operating, business processes, etc. Just brainstorming, not really calling you out. Once you are online, unless you are on public wifi a state over with a burner device and a cash-only VPN or hacked exit node... then there is a profile out there on you with some company lol
Bocchi_theGlock@reddit
How does one become a high-value target?
This is a key thing I'm always wondering. How many people could even do that if they tried? Like without destroying their life
JasonDJ@reddit
Work in defense sector, you'll be beating off attackers with both hands.
Bocchi_theGlock@reddit
Good example, thanks
Yuugian@reddit
That doesn't mean I have to make it easy for them. And apparently I don't. I have a work profile, a gaming profile, and a private profile. I don't get any of the same ads on them, and the private one gets such generic garbo that there is no way they know who I really am.
they have a profile, and the profile sucks
Ron-Swanson-Mustache@reddit
Doubt
aVarangian@reddit
based
fearless-fossa@reddit
People heavily underestimate how much knowledge you can get by connecting the right data. There is a semi-famous video by a data scientist who used just the dates articles were published and the authors' names to make educated guesses about the internal structure of the newspaper and the people working there: SpiegelMining – Reverse Engineering von Spiegel-Online (33c3)
Only available in German though.
Most of the employees of a company who use a website over a period of time can rather easily be mapped to that company, the prompts they enter then reveal not only a lot about what the company is currently doing, but also where it is headed. If you are able to access data about them at multiple points - eg. all the interfaces where Microsoft is listening, this doesn't start and end with just CoPilot - you can get a scary amount of data passively.
Previous-Low4715@reddit
I was able to pull user passwords from a copilot agent by putting one on top of my service desk’s Sharepoint site where they were stashing user passwords in an excel spreadsheet. That was a fun call to my risk manager
danstermeister@reddit
You wouldn't say that TO a FAANG company, because you'd assume they would keep everything internal somehow, and better yet, feel they DESERVE to.
But somehow you tell smaller companies here they should give up and lol?
RememberCitadel@reddit
Oh, I meant placeholder data inside a paid enterprise instance.
7fw@reddit
Copilot already has access to my entire pile of data from Exchange, Teams/SharePoint, OneDrive, everything. I don't put in names, but as soon as I send it, or save it, it has access. I don't trust it, but that's up to Cyber Security
I share no data with AI in my personal life.
Happy_Love_9763@reddit
This is what I do: use fictional names and corporations, like Homer Simpson and Contoso.
Careful-Criticism645@reddit
There's little reason not to trust them on this matter. These companies do not want random data from users polluting their models.
Educational-Wing2042@reddit
I work for a fortune 25 company that deals with sensitive data and our data security team doesn’t trust them, we have enterprise and are still restricted from entering any PHI/PII.
atbims@reddit
What data do you think models are trained on if not random users? 🤔
f0urtyfive@reddit
What would be the value of training on user input data? What does the existing AI that constructed that data learn from training on its own output?
That IS how CLONES of models are trained in China: you can replicate an existing model through its bulk outputs by capturing users through "free" apps. That is NOT how frontier labs train models, because the model would just output nonsense all the time.
Yuugian@reddit
Very specific users from Reddit
sdeptnoob1@reddit
We don't own the company (unless you do, then good on you). By signing an enterprise agreement we did our part; it's legal after that, especially when the C-suite wants it. You can give warnings, but hey, the agreements are there to make it easier to use, and they let us sue if they're violated.
bananenkonig@reddit
There's a reason that a lot of companies are getting internal models. My company has two, one that is only attached to the intranet that is used for company related questions, and one that is disconnected from that one on a separate internal network for quick data searches on program specific documents.
lotekjunky@reddit
No local model has a 1 million token context.
RobbinDeBank@reddit
Not every task requires that.
bananenkonig@reddit
I'm not sure what you're talking about. I don't deal with AI at all. Our system is completely offline. I don't see any mention of tokens though.
tylerwatt12@reddit
He's saying local AI is nowhere near as good as cloud AI when it comes to context. You can either pay a subscription of $100 a month, or invest half a million dollars to roll it yourself.
shikkonin@reddit
Doesn't fix the data protection issues.
Previous-Low4715@reddit
https://learn.microsoft.com/en-us/microsoft-365/copilot/microsoft-365-copilot-privacy
shikkonin@reddit
Good for you. Doesn't fix the issue of (personal) data protection.
Previous-Low4715@reddit
That’s why I compared the personal versions to, I quote myself, “the enterprise versions”.
shikkonin@reddit
Yes, the enterprise versions.
You, as an enterprise, are still responsible for protecting all the personal data you have been entrusted with. That includes to whom and under what circumstances you give that data to third parties.
Previous-Low4715@reddit
I know, I did data protection for an F500 and I’ve spent the last year setting up copilot for a large government organisation. I’m not sure what you’re getting at.
shikkonin@reddit
That, depending on the jurisdiction you're operating in, there's simply no way you can ever use Copilot or any of the other LLMs legally if you provide personal data to them.
Previous-Low4715@reddit
It’s the same handling as any other data in M365 in a GDPR etc bound location, you use a tool like purview with trainable classifiers to identify the data in your environment and enforce data boundaries through DLP and so on. Copilot data does not leave your tenant.
shikkonin@reddit
That's nice and all, but doesn't (always) matter.
Previous-Low4715@reddit
It’s enough for the GDPR compliant government department I rolled it out for.
shikkonin@reddit
Good. However, it would not be legal for my government.
Previous-Low4715@reddit
Pray tell which government has stricter enterprise data requirements than the EU.
bentbrewer@reddit
And allow you to connect to their api for dlp.
Previous-Low4715@reddit
Aye
TSiQ1618@reddit
Something weird happened when I was using Copilot to troubleshoot a simple piece of code. While it's "thinking", it does this thing where it gives usually pointless temporary messages that are supposed to convey what it's doing, like "lining things up", "checking online database", "reformatting output", "adding comments". But it was taking a while this time, cycling through a few different messages, and for a second it said something like "have a potential solution, one moment, checking for any intellectual property legal violations". That was kind of weird. It got me wondering: what information was it accessing that it needed to check for legality? Was it stealing code it wasn't sure it was allowed to share?
Cley_Faye@reddit
General consumer version : "we'll train on your data"
Enterprise version : "we'll pretend we won't train on your data"
It's a good deal, really…
lesusisjord@reddit
Copilot attached to enterprise Microsoft accounts has the same data protection as the rest of the 365 stack, so if I can include it in an email/Teams/SharePoint, I am safe to include it in my Copilot chat.
Is this wrong?
Previous-Low4715@reddit
Correct, though some of the features for compliance monitoring in purview are in preview
lesusisjord@reddit
I’m asking for my personal experience - I don’t want to be responsible for any data-related issues myself, but I don’t manage our 365 tenant, so the organization as a whole is not my concern.
lotekjunky@reddit
we can read all of your prompts, like we can read your email... just remember that :)
lesusisjord@reddit
Wow! For real‽
lotekjunky@reddit
I don't know if you downvoted me for that, but yes, it's true. It's all in Purview. We need access to it because the prompts were deemed "business records", and I work in finance where we deal with legal hold, compliance, and audits...
lesusisjord@reddit
Yes, I did. We are in the sysadmin sub and not tech support.
Previous-Low4715@reddit
You can see whether you’re protected or not, at least in Copilot, by hovering over the protection icon in most Copilot-enabled apps. But like the other guy said, in theory all your prompts are visible to IT or the information management team.
lesusisjord@reddit
I couldn’t care less who sees my prompts. Copilot is for work shit and ChatGPT is for everything else.
I’m only worried about the company data and not being responsible for data loss. I get that shield on the top right of Copilot and that’s enough for me.
lotekjunky@reddit
this is correct
lesusisjord@reddit
Thanks!
jeffrey_smith@reddit
They're saying none of the versions train on your data, to drive shadow-IT enterprise use up. All of them continue to retain data though.
lotekjunky@reddit
r/conspiracy is over there ->
Previous-Low4715@reddit
In Copilot, prompts and interactions are held in user mailboxes so they can be searched for compliance and ediscovery.
Fearless-Assist-127@reddit
Ah yes. Adverts. Famously reliable, transparent and truthful.
Previous-Low4715@reddit
That’s not what “advertise” means. Judging by the replies I’m getting, some Americans can’t process the idea that the EU has strict data sovereignty and processing requirements.
https://learn.microsoft.com/en-us/microsoft-365/copilot/enterprise-data-protection
jimicus@reddit
America has something called the CLOUD act - which basically says “to hell with where the data is held or the corporate structure of the subsidiary that holds it; if it’s held by a company with a US head office, they must hand it over on demand”.
nem8@reddit
Yeah, I feel people forget about this, or don't know about it..
ls--lah@reddit
Countries like North Korea and China were always "unsafe" because they had similar laws. But it's okay when the US does it!
Fearless-Assist-127@reddit
I'm not American. What I do see though is how much surveillance the likes of Microsoft force on everybody, everywhere; deals done between Palantir and the NHS; global databases of identity openly being built and forced on everybody; etc.
We're in a de facto third world war and one of the fronts it is being fought on is data and surveillance. I don't trust any multinational, no matter how many platitudes and unproveable promises they make.
spin81@reddit
I'm in talks with them for a really cool purchase. I'm not saying what but let's just say it will connect Manhattan to Brooklyn!
SideburnsOfDoom@reddit
Sending this data to some third-party service over the internet is sharing it. It's that simple. You have no real control over how that service logs it, stores it, uses it, and trains LLMs on it.
SearchAtlantis@reddit
I mean we still do training about not putting PII or PHI in. Internal IP is fine on the enterprise version.
gryghin@reddit
I remember when Google appliances first landed in the data centers so that we would have that functionality without exposing sensitive IP.
You would think this would be a thing for ChatGPT.
But then again I retired in 2023.
RobotBaseball@reddit
Is this ChatGPT enterprise?
kennetheops@reddit
that is PII leakage, I’m sure of it. The auditors are about to get paid because of AI
sdbrett@reddit
It’s not just from the consumers either. The AI companies aren’t doing a good job with data segregation, like the copilot bug which allowed summarizing documents and emails with data sensitivity labels.
ConflictResident5253@reddit
They actually can't. LLMs don't work that way. If there are sensitivity claims being made, they're necessarily false. LLMs just recognize and replicate patterns. They don't know what they're reading or saying. They don't know whether it's private or what "private" is.
kennetheops@reddit
holy hell that’s big bad
shangheigh@reddit (OP)
Tough truth. What scares me most is that most employees aren't even aware of how much they're putting the company at risk
kennetheops@reddit
Imo AI has set us back 15 years in cybersecurity. We are giving up security for convenience.
Curious on your thoughts on not just PII but also confidential information being leaked
ConflictResident5253@reddit
It's more like 2002, when Windows had no defense at all against network worms. Mixing fake cloud-platform security (O365) with incompetent endpoint OSes (Windows) and adding code that can't do anything BUT ingest and regurgitate language-like patterns made of what used to be your company secrets.
Like, it's an apocalypse and no one cares.
Bogus1989@reddit
lmao,
they will keep doing it till companies' cyber insurance payouts require they have full compliance… I actually bet cyber insurance companies are looking very lucrative right now… they found a new way to not pay out
Aggravating_Refuse89@reddit
Cyber insurance leads to a lot of products being purchased and box-ticking theater. It's super lucrative for security vendors and companies like KnowBe4, which are lip service to doing something about email, which is by far the biggest successful attack vector. Cyber insurance has done some good, such as forcing MFA on a lot of things it needs to be on. But for the most part it forces people to buy products they won't really use or know how to use, just to check the box.
The downside of all this is that poor implementation of a lot of these products leads to security fatigue and user confusion. We are sending them legitimate emails that look like phishing and then wonder why they get phished.
Point is, all this worry about AI exfiltration is not wrong, but it really is a much much lower actual risk than a lot of things people do every day.
I am all for security. But cyber insurance style box ticking is a lot of theatrics and $$$ for the vendors.
I swear KnowBe4, which is the industry's weak-assed attempt to solve the phishing problem, is in cahoots with the insurance issuers. Otherwise there is no way that company would even exist. Yes, I said phishing is the biggest risk and then called something that addresses phishing useless. I know that seems wrong.
AthiestCowboy@reddit
Yes and we will see AI workloads come on prem.
MagicWishMonkey@reddit
If your company doesn't have an enterprise agreement, you get what you have coming to you.
If you do have an agreement, it's not PII leakage or anything of the sort, it's no different than storing something in google drive or sending an email. The fact that so many people think "AI" is some magic thing different than other tools is really confusing to me.
I will say that if you DO have an enterprise agreement and you're giving users legal advice, I would be all over your ass if you were one of my reports. Not only is that not your lane, you have no business giving anyone legal advice, and you obviously have no clue what you're talking about.
Phreakiture@reddit
Thing is, were I a customer, I would be pissed if I found out this was done.
But our modern world is not built on consent. Like, at all. Matters of intimacy are the one major exception to that.
Outside of government, sales and marketing are some of the most invasive, DGAF-about-consent types around. I'd kinda like to sic a PI on every marketing guy to give their lives a metaphorical proctology exam. Let them see how it feels to have someone up their ass without their permission.
linoleumknife@reddit
Yet company leadership pushes employees to use AI for everything they do.
ency@reddit
I'm willing to bet most employees, even if they know, just don't give a damn.
anortef@reddit
Worked at places where the CEO's response when told the risks was legit: "As long as the sales outpace the fines, there is no problem."
ency@reddit
That's always the case for the C-level people. Breaking the law isn't an issue if the fine is less than the profit, ignoring safety issues is fine as long as the payout is less than the profit, ignoring security isn't a problem as long as the cost of the fix after getting called out is less than the profit.
I'm all for capitalism and think it's the best of a lot of bad options as long as it's constrained and regulated. The corporate death penalty has to become a thing and the C-levels need to be held accountable for much of their BS. I'm a huge cynic but I am still fairly optimistic when it comes to people. If the companies started to give a damn and showed it, then the employees would as well. Doing that would go a long way in plugging a bunch of the easy-to-reach holes when it comes to security. But as things stand I don't do shit and I don't say shit when I see issues unless it's gonna affect my role. I'd gladly let the company burn to the ground while making sure the items in my job description are taken care of. It's not my responsibility to make sure others do their part.
Bogus1989@reddit
oh i hate that…
The intent seriously matters.
It is why I vehemently despise Google. They actually believe their shit don't stink… they couldn't be that dumb, right? Microsoft, Amazon, Facebook?
You can't continue just giving the middle finger, guys… you have to pay the piper at some point…
Still can't believe a lawyer out there finally got them. She's smart enough to understand it all, and has proven Google's track record indicates they can no longer be believed to operate in good faith. Love that ruling. They are working out how best to not ruin the entire business. I guess Google just gets to try to convince them it wouldn't work… they said, well, it's your funeral… you can either help mitigate risks or be no part of it at all.
Black_Patriot@reddit
Throw some CEOs in jail and see how quickly the rest make following the rules a top priority...
gandhinukes@reddit
Real businesses that care about PII are paying Microsoft for enterprise Copilot and a guarantee that no data leaks out to train outside LLMs. And that employee is messing up... Lots of trust being given for $$ though
Mrhiddenlotus@reddit
https://openai.com/index/openai-for-healthcare/
kennetheops@reddit
I have heard about it being a pain to get a BAA from them so most folks have gone through the hyper scalers
WRB2@reddit
Natural stupidity hides a great number of issues with artificial intelligence
99infiniteloop@reddit
It is stunning how often people can't see ChatGPT and the like as either a third party or vendor when it comes to privacy and security concerns. Sure, it can do more than other traditional SaaS tools, so it's "different." That doesn't mean your data isn't going somewhere.
missingcolours@reddit
So... under GDPR and many similar laws there is the concept of a "data processor". This is someone who processes your data on your behalf. This could be anything from using cloud apps to outsourcing data analysis tasks. GDPR requires certain contractual agreements between data owners and data processors.
Enterprise versions of things like ChatGPT in covered jurisdictions typically include such agreements and are thus GDPR compliant. So if you're using a licensed enterprise version, you're probably fine. If you're not in Europe or California, there may not be legal restrictions around data sharing. If you're in Europe or California and using free ChatGPT, you might be in trouble.
Finorix079@reddit
The mental model people have is "I'm asking a tool a question," not "I'm sending data to a third party." Those are very different in how a brain processes risk. Posters won't fix it because posters address policy, not the model. What works is making the safe path easier than the unsafe path, like an enterprise tier that auto-redacts before the prompt leaves the browser, or a corporate ChatGPT instance that defaults to no-training. Friction-on-the-unsafe-path is policy theater. Friction-removed-from-the-safe-path actually changes behavior.
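That "auto-redacts before the prompt leaves the browser" idea can be as small as a pattern scrub on the client side. A rough sketch in Python (the patterns below are illustrative placeholders, nowhere near a real DLP ruleset):

```python
import re

# Hypothetical pre-send scrubber: strip obvious identifiers from a prompt
# before it is ever transmitted to a chatbot API.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact(prompt: str) -> str:
    """Replace each match with a placeholder token like [EMAIL]."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

print(redact("Ping jane.doe@client.com re: SSN 123-45-6789, cell 555-123-4567"))
# -> Ping [EMAIL] re: SSN [SSN], cell [PHONE]
```

Real DLP (Purview-style) does this with trained classifiers rather than regexes, but the principle is the same: make the safe path automatic instead of relying on posters.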
dumblebees@reddit
it’s not your job to train people. your job is to cover your ass and take home a paycheque that is as large as possible for as long as possible.
cjcox4@reddit
All AI was created from stealing end user data. All of it. Something to remember when your favorite AI model says they "won't use" the input data you send for "training". Do not trust these folks.
Nagroth@reddit
Enterprise versions have legally binding agreements. If they don't uphold their end there are civil and in some cases criminal ramifications.
Using any version that isn't signed off by the Legal department can get you into the same sort of trouble as posting it on your public social media page. So ya, don't do that.
Aggravating_Refuse89@reddit
One problem is this talk is all theoretical. You post stuff on Facebook, it's easily seen, but what is the worst-case scenario of someone putting sensitive data into an AI?
They could have a breach? Yes so could anything.
If this threat is as bad as many say it is, and I am not convinced, what is an example and how would it look?
Most users could not give two rat turds about the company. But they care very much about getting caught for real. Assuming the org doesn't have all the whiz-bang monitoring shit or anyone to look at it, let's say a dummy puts customer info into ChatGPT. How does that play out in a way that they get caught? Breach? Suit?
To convince end users, I think you'd have to show it's a reasonable risk that someone could get the data they put into ChatGPT, prove it was them that entered it, and that their boss could get that info from the leak itself, not from internal monitoring.
"It's theoretically bad" is not gonna stop end users. Saying "if you put that in there and something bad happens, it's going to clearly come back to you", that scares end users.
I have yet to see one person do this.
If I go and put Joe Blow's private info into ChatGPT. How does this go bad and how does it come back to me? Assuming my company does not monitor for this and many people in my org have access to Joe Blow's data?
Real-world stuff that makes sense to users. Cause I don't even have that for myself.
I would not do it because I do not trust these companies. But this feels very theoretical and not very relevant to the user.
Nagroth@reddit
It's very simple. Don't put company data anywhere that you haven't been told it's ok, in writing.
Aggravating_Refuse89@reddit
Average user also does not care about the blame shifting over to legal which is really all Enterprise agreements get you.
illhaveubent@reddit
There's also civil and criminal law against stealing intellectual property, but every model has been trained on it anyway without permission or compensation. I don't think the law is really being enforced on these companies because the government sees AI as a national security interest.
Nagroth@reddit
If Corporate Legal says I can give it company info then idgaf about any of that, let the lawyers eat each other.
illhaveubent@reddit
And that's fine, but let's not pretend there are civil or criminal ramifications for misuse of the data.
Nagroth@reddit
Why would you pretend that there's not?
illhaveubent@reddit
You're just going to ignore what I said in the previous comment and play dumb?
cjcox4@reddit
Legally binding only matters if caught and one has compelling evidence of violation. Agreed?
I've worked for fortune 100 companies that have done far worse things.
Careful-Criticism645@reddit
If there's no compelling evidence of a violation, then what's the issue?
Jimthepirate@reddit
For me it is funny how with AI people suddenly are so aware of their data, but then do not think twice about using cloud providers to store emails and all their files. I had people with a serious face tell me how concerned they are about MS Copilot, but then have all their email run on Exchange Online and SharePoint storing 99% of their documents.
Yes, there is good reason to be concerned, but at the end of the day, unless you run your own thing, you are putting trust in someone. You just need to do due diligence when choosing your stack and evaluate your vendors. For free stuff you are always the product. Did you know that, for example, with Gemini a human reviewer can look at your prompt and output "to ensure better quality"? I find "someone peeping at your conversations" is a way more effective argument to get people to listen.
Aggravating_Refuse89@reddit
Google is the absolute worst for prying eyes.
cjcox4@reddit
I think it's also funny that "giving up your privates" is defined as "modern day zero trust". Ironic.
SquareWheel@reddit
Public websites are not end user data, and scraping websites is not illegal.
PatHeist@reddit
Scraping anything copyrighted from a website is in fact illegal.
SquareWheel@reddit
It's certainly legal under US law, even if the website wishes to prevent it (see hiQ Labs v. LinkedIn). Even using that data commercially can be legal under Fair Use if deemed transformative (Bartz v. Anthropic, Kadrey v. Meta).
Do you have any specific counterexamples?
PatHeist@reddit
hiQ Labs v. LinkedIn resulted in a settlement after the court concluded hiQ Labs breached LinkedIn ToS.
Both Bartz v. Anthropic and Kadrey v. Meta rulings specifically state they're not an endorsement of the potentially illegal acquisition methods of the copyrighted material, only that the use in training is transformative.
SquareWheel@reddit
Yes, but neither of those points change the precedents that were established. Scraping was found to be legal, and training AI models was found to be transformative. In the words of Judge William Alsup, it is "exceedingly transformative". The fact that Anthropic pirated content was a separate matter entirely, and had no bearing on that determination.
You made the claim that scraping copyrighted material is illegal, but have not yet shown that to be true.
PatHeist@reddit
Copyright law makes copying copyrighted material without permission from the rights holder illegal outside of situations that can be classified as fair use. In the US, fair use has been consistently established as a case-specific matter that cannot be ruled on generally, and needs to be tested separately for each case based on the facts of that case. Both of the summary judgements in those two cases re-affirm this.
And both summary judgements specifically do not rule on the illegal acquisition of copyrighted material. Kadrey et al v. Meta Platforms, Inc. is currently an ongoing class action and Bartz v. Anthropic resulted in a $1.5 billion settlement over the piracy claims.
https://copyrightalliance.org/wp-content/uploads/2025/06/Bartz-v.-Anthropic-Order.pdf
https://law.justia.com/cases/federal/district-courts/california/candce/3:2023cv03417/415175/598/
Please find the part of these rulings that sets a carte blanche precedent that scraping copyrighted material is legal.
SquareWheel@reddit
This is a reversal of how the law is structured. You don't need to find legal exceptions to determine if something is allowed. Acts are permitted by default unless determined to be illegal. The burden of proof rests with those claiming that something is illegal.
In this case, however, I'm not arguing that a carte blanche exists. Scraping has been shown to be legal under more specific scenarios such as accessing public information, but not when bypassing captchas, login pages, or other technical barriers. That access is enough for AI training, and does not violate the CFAA.
You claimed that general copyright law protects published data, however simply downloading (public) website data has not been shown to be a copyright violation. It's how that data is used is the deciding factor.
We've already gone over case law, but there's many other precedents which show legal use of web content under fair use. Examples are web indexes such as Google Search, news agencies quoting tweets or reddit comments, YouTube remixes, and reverse image search tools.
That's true. Fair use is a defense, not a general legal protection. In cases where it's disputed, it may go to trial. That's what we're seeing now with a number of AI cases. However, so far training these AI models has been found to be highly transformative, and thus good candidates for fair use exceptions. Once case law is better established, a stronger precedent will be set.
Again, the issue of Anthropic and Meta utilizing piracy is irrelevant to the question of training. Piracy is clearly illegal, and they should be and are being fined for it. But the original question was over public web scraping, and as discussed here, there is strong precedent for scraping being legal.
PatHeist@reddit
Copyright infringement is explicitly illegal in and of itself. There are exceptions to what is generally illegal if you can demonstrate fair use. Fair use being an affirmative defense matters. You are right to say that acts are legal unless they are illegal, but in this case you're doing the equivalent of arguing that murder is legal because killing in self-defense can be ruled to have been justified.
You say there has been strong precedent for scraping being legal. Cases I have heard of affirm that scraping of copyrighted material in general is not legal. The judgements in the cases you've linked outline at multiple points that copying copyrighted materials to make a database of them without a specific purpose is in and of itself obviously illegal, and establish at length that their rulings on later using those databases to train AI are a separate question from acquiring the training material. You appear to be conflating the training rulings, which are explicitly divorced from the data-acquisition question, in a way where they comment on data acquisition. I struggle to see how this makes sense. In other cases, companies like Google have repeatedly lost regarding scraping and retaining copyrighted material where they could not establish a fair use defense, and they have had to modify practices as a result.
I maintain that if you want to claim that "scraping", which I understand to mean "downloading and retaining from websites", of copyrighted material is generally legal, contrary to copyright law, you should be the one to produce cases that establish that precedent. It seems like you think the cases above do, but obviously I am not reading the judgements the same way you are. If I am missing the part where the judgements establish that scraping copyrighted material is legal, please point it out. Or support your argument with one of the other cases you say establish this precedent.
SquareWheel@reddit
It seems at the very least that we agree that scraping under fair use is permitted. There's disagreement on if scraping for general use - even if it may not be used in infringing ways - is also permitted.
You've referenced specific lines in the judgment that I would need to spend time reviewing -- more time unfortunately than I have right now. I would however be willing to review and evaluate shortly. If I've made a mistake, I'd like to correct it, and would recognize the error.
To bring this discussion back to its starting point though, since we agree that scraping under fair use is accepted within the US legal framework, and that training AI models specifically qualifies as fair use, it follows that training models on public website data is permitted. Or to be more precise, would be highly defensible in a suit. This may change depending on the outcome of future court cases, but existing precedent does point in this direction.
The specifics of the generalized case for scraping is something I'll need to review more of in my free time. I agree with the definition you gave of the term.
postbox134@reddit
A hot legal topic on 'fair use'
cjcox4@reddit
Technically, your "bank" (for example) is a public website.
SquareWheel@reddit
Scraping public pages on a bank's website is completely okay, such as seeing what types of accounts they offer. Scraping anything behind a login page would likely be a violation of the Computer Fraud and Abuse Act.
cjcox4@reddit
Their reach is more egregious than your "okay" makes it seem.
Superb_Raccoon@reddit
Nobody wanted the IBM models that had clean training data. Any of the Granite models are copyright-clean.
AriesCent@reddit
Pii!!
Jacmac_@reddit
If a user puts client information into a google search prompt, full names, deal sizes, internal pricing strategy, etc., is that data sharing?
What's happening here is that admins, by which I mean the AI wannabe police, have decided that data that goes in, which might later be used to train models, will somehow be used nefariously by the model owners, or that third parties will later somehow co-opt ChatGPT to leak this information directly to them because it was trained on it. This simply isn't the case. It is hyperbolic to think that a user putting data into their context is going to have the data leak out to the rest of the world. If they save the context and someone else logs into their ChatGPT account, well, then I guess you could look at that. But the mere fact that they pasted information into a ChatGPT prompt does not mean anyone else in the world gets to take that raw data as if it was freely shared.
Aggravating_Refuse89@reddit
A lot of this is a reaction to a theoretical problem that could happen. I think that is why it doesn't get taken seriously.
CPAtech@reddit
The issue at hand is that you are transmitting sensitive data to a third party as soon as you enter it into a prompt.
Jacmac_@reddit
You could make that case about searching. Seriously, the threat is being blown way out of proportion. There is more danger of the user copying the information to an unsecured USB device and losing it than sending it to ChatGPT.
CPAtech@reddit
Are you inputting PII into search engines? That actually is the same case.
Jacmac_@reddit
OK so how are you stopping users from putting PII into search engines and what is the known exploited risk?
phunky_1@reddit
It's your job to give people AI tools that they can use safely.
Convince decision makers that this is a risk, have them get staff a paid subscription to ChatGPT, Claude, etc.
Then it's a non issue.
nyckidryan@reddit
Since when is it my job to give people access to AI tools? Should I give them access to Tor as well?
phunky_1@reddit
I guess if you just blindly run IT without suggesting tools that can improve business, knock yourself out.
People are going to do it regardless, you might as well make it safe for your data.
Aggravating_Refuse89@reddit
Safe is the wrong word. Make it someone else's responsibility, like Uncle Micro$oft.
Linkpharm2@reddit
A GPU with Qwen3.6 27b and vLLM will help a lot to solve this problem.
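For anyone curious what "rolling it yourself" looks like, a minimal sketch (the model ID and flags are illustrative; check vLLM's docs and your VRAM budget before copying anything):

```shell
# Serve an open-weight model behind an OpenAI-compatible API on your own GPU.
# The model ID below is illustrative; pick one that actually fits your hardware.
pip install vllm
vllm serve Qwen/Qwen2.5-32B-Instruct \
    --max-model-len 32768 \
    --gpu-memory-utilization 0.90
# Any OpenAI SDK can then point at http://localhost:8000/v1,
# and prompts stay on your network instead of going to a third party.
```

Whether that beats a $100/month subscription depends entirely on how much context and throughput you actually need.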
kylethedesigner@reddit
Absolutely, and honestly a lot of office tasks don’t even need AI when a simple python script would accomplish the same thing faster and for free.
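As a sketch of that point: the kind of chore people paste into a chatbot, say drafting renewal notices from a spreadsheet export, is a few lines of stdlib Python, with the data never leaving the machine (the column names and template here are made up):

```python
import csv
import io
from string import Template

# Hypothetical mail-merge chore, done locally instead of in a chatbot prompt.
TEMPLATE = Template("Hi $name, your renewal of $amount is due on $date.")

def draft_emails(csv_text: str) -> list[str]:
    """Return one drafted email body per CSV row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [TEMPLATE.substitute(row) for row in rows]

sample = "name,amount,date\nAcme Corp,$12000,2025-07-01\n"
print(draft_emails(sample)[0])
# -> Hi Acme Corp, your renewal of $12000 is due on 2025-07-01.
```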
Aggravating_Refuse89@reddit
Many would need AI to write that script.
Snarky response
03263@reddit
It's probably in a million different data leaks already.
Everyone and their mother has potential access to my full name, home address, phone numbers, SSN, insurance info, email, leaked passwords, etc. I've cashed enough data breach settlement checks to be sure of it.
So pardon me if I don't really care about it continuing to happen. Just keep sending those checks.
Aggravating_Refuse89@reddit
This is the attitude I am talking about.
OldGeekWeirdo@reddit
Can you go to ChatGPT at your computer and using a different account, ask questions about company deals?
If it shows it knows about it and will tell strangers, it will land much harder.
wazza_the_rockdog@reddit
It's not an immediate thing and may not allow a direct lookup due to guardrails it sets up, more of the concern is that someone looking for similar info in the future may be given that live data as an example.
Aggravating_Refuse89@reddit
This is the problem. It's not immediate and very much may not tie back to the actual user. Employees don't care about the company. At all. They care about their ass. If that is not directly threatened, they are not going to care about the rest. If I could go search for data and find that Jeff entered it, Jeff would be scared. But I have not heard of any situation where it comes to that.
cpz_77@reddit
I was about to say something like “you guys have training and policy posters for AI stuff?” till I realized you probably mean security/data sharing training. And yeah unfortunately as you mentioned most users probably aren’t going to make that connection, at least not the less technical ones. We’ve had some users that had asked questions indicating they do understand it (generally more technical “power users”) but others have said or done things that makes it very clear they don’t.
We desperately need some actual "AI best practices" training for all users I think (specific to AI), but part of the problem is I don't think we (IT) even really fully know what those are yet. Obviously there are security standards we can and do already train users on, but taken outside of the context of the IT world, many don't know how to apply that to other situations like AI. Plus there's the whole topic of productivity and "moving fast" ( 🙄 ) vs. maintaining process and security. Every company/entity has to decide for themselves where they want to set that line.
But I don’t think any of this is really a thing yet, anywhere. Everybody and their brother is balls deep in this AI craze but I haven’t heard anyone say their company is spending cycles actively trying to draw up an official AI “best practices” or “usage guide” and/or put together some sort of AI training for users at their company. Doesn’t mean it isn’t happening of course - maybe some places somewhere are doing that - but I certainly haven’t seen it anywhere (least of all at my own place 🤣 ).
mysysadminalt@reddit
Meanwhile we had to turn off TLS inspection for these platforms so DLP no longer works 🙃
Aggravating_Refuse89@reddit
This is some reality here. If you do not have that, it really doesn't matter what else you do have.
therankin@reddit
Oh crap! You just reminded me that there's a problem with my TLS cert and I forgot to make a reminder about it!
Thank you for inadvertently reminding me!
NapalmNorm@reddit
I had a director on our AI committee who helped review and approve AI policy for the Engineering department. He started using a free Claude account (not an approved platform at the time) on his personal computer, entering confidential company data (whole separate conversation), to generate side-by-side samples showing how much better it is than Copilot. He put it all into a presentation to show me and our COO, and all I could say was… what the fuck.
Aggravating_Refuse89@reddit
That was stupid. He definitely should not have done that and, more stupidly, shown it to you and the COO. Idiots are everywhere.
Slivvys@reddit
Switch to copilot for enterprise, turn on enterprise data protection to prevent it from being used to train foundational models.
MeatPiston@reddit
This is the correct thing to do. Unfortunately Cope Pilot is useless.
Slivvys@reddit
Usefulness is subjective to what you need it to do. For most businesses using it for general admin and sales, it's more than capable.
Using it for code? High level anything? Not without enabling Claude or chatgpt plug-ins for it.
Aggravating_Refuse89@reddit
This is exactly how Microsoft stays in business. Corporations want a minimum viable product. This is also why the "AI is going to replace you" hype is BS: not as long as it's limited, for most, to being only on par with Teams, Sharepoint, Bing, Zune, and Windows 11. Copilot is a Microsoft turd version of something both useful and dangerous. The entrepreneur types want innovation but lawyers don't. AI is going to be at best the next Google and some automation.
MrHaxx1@reddit
I genuinely don't understand why people think Copilot is useless.
It's great for anything in M365, and for everything else, it does the job. I kind of hate how much of a pain in the ass it is to use connectors, whereas it's two clicks in Claude, but otherwise it's fine.
Cyhawk@reddit
Copilot can use the majority of models.
MagicWishMonkey@reddit
An enterprise agreement with any of the big AI companies will prevent your data from being stored or used to train anything. There's no need to "switch" to anything.
I'll bet anything the OP has no idea what agreement/contract is in place but has strong opinions anyway.
Slivvys@reddit
The difference here is the risk of shadow IT. Most likely their org already uses M365 with a hybrid joined DC, so computers can default to being logged into Copilot. If you reduce the hurdles your staff have to go through to access a tool, they'll be more likely to follow that route instead of buying their own tool that requires additional steps to access, and that they may lose access to should the organization decide to block it down the road.
I prefer Claude myself, but I also understand when my users are left to their own devices they make insane choices.
wavemelon@reddit
I had exactly the same conversation with the founder of our company the other day; he wanted to upload a client's list of their clients' contact details and calls to it to output a contact trend analysis…. One day, on the plus side, if you ever forget your phone number, address and social security number you can just ask ChatGPT, haha
spittlbm@reddit
Mythos to the rescue!
Aggravating_Refuse89@reddit
This is a lot more of a risk than anything an LLM could possibly ever imagine.
spittlbm@reddit
If the rumors are true, golly. Mozilla alone already used it to identify and patch 271 vulns.
shangheigh@reddit (OP)
It's all fun until someone else asks your AI about your card number, social security and address.
Aggravating_Refuse89@reddit
And then exactly what? Some company has that info. Yet we trust other companies and people we don't know with that info every day. While it's possible, the actual risk there is pretty low.
Express-Pack-6736@reddit
Most companies are sitting on way more than 23 and just don't know it. The real number is probably way higher if you count the stuff people use on their phones and personal laptops. The visibility gap is massive and it's only gonna get worse as AI tools multiply. We recently onboarded LayerX and it painted a much worse picture than we originally thought.
Aggravating_Refuse89@reddit
What's LayerX?
itskdog@reddit
Just for GDPR purposes I'd love to roll out Teams to replace the numerous WhatsApp group chats staff have set up, but there'd be no buy-in from school leadership.
Rajvagli@reddit
It's called PII, and your sales person is an ass butt.
https://csrc.nist.gov/glossary/term/personally_identifiable_information
Aggravating_Refuse89@reddit
Ass butt?
Hello Castiel
UpperAd5715@reddit
Just share this with HR or whoever does infosec and wash your hands of it; you don't want to bite off more than you can chew on this one.
Working in a regulated industry our users are pretty good with all of this stuff but there is also very good retention so we don't get a boatload of people that don't care. A business analyst scrambled an ancient dataset a few times and people still threw a hissy fit because the name of a higher level employee appeared in it while all data was fictional, just in case someone might consider it to be real.
Aggravating_Refuse89@reddit
This is the real answer. Everyone drinking the microsoft generic grape beverage and looking to solve a people issue with tech.
Sobeman@reddit
she is head of sales, sales people are immune to any disciplinary actions
randalzy@reddit
"but I'm Pagliacci"
DaftOnecommaThe@reddit
100% this, if there is a compliance/governance officer they need to be made aware.
leogodin217@reddit
This used to really worry me until I realized there are all kinds of tools we put client information in. Salesforce, mail provider, etc. Is the risk any greater with an LLM? Any one of those companies could break the contract and train AI on the data.
dllhell79@reddit
Yes, that is a valid mindset from the perspective of an IT professional. It's really not our responsibility to micro manage and govern every single action that the end user takes. The question I'd also pose though is what happens when you get audited, and an auditor discovers this is going on with no safeguards or solutions in place to prevent such behavior in the first place? That is where things get into grey territory, especially since many of the AI governance tools do not actually exist yet.
leogodin217@reddit
Governance is a real concern right now. My question is what is different between using Gmail (or whichever provider the company uses) to send an email and using AI to help draft it? Both cases are sending customer data to an external system.
therankin@reddit
Yea, that's basically where my mind is now too.
Turbulent-Pea-8826@reddit
I imagine this is why AI is huge right now. All of these companies have a direct link to your company data. Even if they aren’t training on it they are just sucking up all of this useful information that they can use for their own benefit.
jgrig2@reddit
It depends : was she using a free version or an approved enterprise version?
Curious201@reddit
this is exactly the kind of thing companies need a simple internal rule for, because “can i paste this into chatgpt” should not be decided by each employee in the moment. client names, addresses, private order details, contracts, support tickets, source code, credentials, screenshots with customer data, all of that should be treated as not safe for public tools unless the company has an approved setup and a written policy. i also think training matters because a lot of non-technical staff do not think of an address or spreadsheet row as sensitive, they just see it as text they need help rewriting. the safest practical rule is to anonymize first, use fake names and dummy numbers, and only paste the structure of the problem, not the actual client data.
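That anonymize-first rule can be sketched as a tiny pre-flight redactor. This is a toy illustration, not a product: the patterns and placeholder labels here are assumptions, and real PII detection needs far broader coverage.

```python
import re

# Illustrative patterns only; real detection needs many more, plus context.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace recognizable PII with labeled placeholders before pasting."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact("Reach Jane at jane@acme.com or 555-867-5309, SSN 123-45-6789"))
```

The structure of the request survives ("reach [EMAIL] or [PHONE]"), which is usually all the chatbot needs to help with wording.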
elementsxy@reddit
Has she been fired yet?
Dry_Inspection_4583@reddit
The average user doesn't know where the information is; they postulate it's "in the AI". I wonder if maybe your expectations need to be lowered.
caylyn953@reddit
D'oh! This is why you must sign up to the enterprise version (you have done that, right? Otherwise this is 100% all your fault)
Then it won't really matter so much at all if people are doing this or not
SlickAstley_@reddit
Surely you'd have to play dumb in that scenario tho.
I know its wrong and sometimes still do it anyway
BroaxXx@reddit
Is she using a public api? If so this should be blocked by default.
Aggravating_Refuse89@reddit
Going to go against the grain and say: it's bad. It's probably against policy and could get someone fired, depending on said policy and whether they're trying to get rid of someone. Don't put real info in AI.
Their bigger risk is that the IT department will be tasked with documenting it and it will be used as a way to fire someone.
The actual real risk of your data leaking out of an AI is not zero, but its not likely. I will say that super proprietary info or classified info is a bad idea. But most likely unless there is a breach of the AI and it can be tied back to coming from you, the risk is pretty low.
A lot of orgs use DLP to catch this
Now in person I would totally say the opposite because the security dogma states you will believe it. Just like I believe a lot of "security" is placebo box ticking. But to say any of that is blasphemy if you work in IT.
So bottom line: you are in IT. You caught someone. The risk is low. But what does the policy say you do about it?
Its astronomically low chance this will cause a real problem but its a definite no no in security dogma.
Scullyx@reddit
Give employees a 'Do Everything' button
Employees push 'Do Everything' button
Im shock
dllhell79@reddit
You should have called it "disclosing proprietary company information and trade secrets" instead of "data sharing". Sure, she may not be doing that fully, but it sounds more forceful and impactful. It sounds like a negative thing that could have consequences. The term "data sharing" almost implies that it's a good thing.
PotatoOfDestiny@reddit
People type some wild-ass shit into chatGPT too, all of which is potentially discoverable in a legal proceeding. Never type anything into a chatbot that you wouldn't want read back to you in court.
Deltrus7@reddit
I work in Healthcare.
Don't put anything in AI that you don't think you should. We keep all patient information out of it.
CernerBurner2000@reddit
I think that within the next 5 years we are going to see our first case of "extortionware", where a bad actor gets hold of enough company-specific PII and forces them to pay or else it will be released to a competitor or the public.
My wife can tell me one time that our dishwasher sucks and we need a new one, and my ads on Facebook for the next two weeks are nothing but dishwashers. If Meta AI is listening to me while I'm at home then it's also listening while I'm at the workplace, and if they are listening, all the others are too.
w1na@reddit
There is an option in ChatGPT to turn off model training on your input. And sincerely, if you are part of IT and you do not advise your department about how to handle data in AI tools, that is your failure.
Secret_Account07@reddit
I mean do you have an enterprise license?
I don’t use ChatGPT but most AI enterprise licenses have data protection. I can drop full server info on Copilot just like I can with Sharepoint
MagicWishMonkey@reddit
Reading all these replies by people who A) have no idea what their actual contractual agreement is while B) assuming they are informed of the legal implications of using licensed software.
Not only is that 100% not your fucking job, but you're also not informed enough to have an opinion and you definitely shouldn't be telling users what they can and cannot do. These people are just looking for ways to get fired, it's wild.
postbox134@reddit
This is why ChatGPT enterprise exists.
If they demand the tools you've got to provide it or shadow IT happens. And no Copilot isn't good enough (although easier to deploy and manage), people want the native tool (ChatGPT or Claude).
bobo_1111@reddit
Lol Copilot IS ChatGPT and Claude. It’s just a front end and also is enterprise so data doesn’t leave your tenant or get used for training.
postbox134@reddit
It's significantly worse than the native tools. Yes, the wrapper makes it more controllable, but if it's useless then no user wants to use it.
_-pablo-_@reddit
Copilot is just the wrapper, they still get access to the models.
The Orgs that jumped onto ChatGPT enterprise are having a hell of a time integrating it with o365 and that suite
postbox134@reddit
Copilot is only better at O365, everything else is much worse imo
Finn_Storm@reddit
But... Copilot is shite for M365? Like, I asked it to build a simple query with 10 lines of code in Power Automate and it couldn't even do that. Gemini makes spelling mistakes and does funny stuff like hallucinating commands that don't exist, and ChatGPT just straight up lies half the time.
I've heard good things about Claude, but I've yet to try it out and I'm skeptical of it being as good as they promise
F0rkbombz@reddit
Claude does the same crap. I do a lot of work with KQL and Copilot and Claude will both confidently make up table names, properties, and produce some of the most inefficient and ineffective KQL I’ve ever seen.
I haven’t used ChatGPT in a while, but last time I did it was making the same mistakes on stuff.
All the LLM produce slop and will confidently lie about it to convince you it’s good.
Finn_Storm@reddit
ChatGPT can handle PowerShell okay-ish, I suppose. It will do some things inefficiently, but mostly it's logically explained and crafted. But that's also basically all I use it for, aside from its deep research feature.
K-Rose-ED@reddit
Claude is very good, but it used to be better. The problem with all these apps is your service quality can suddenly tank; all of them have been constantly changing models and limits, reducing how much they can do. It's a mess.
Finn_Storm@reddit
Can't you just return to the old models? /gen
nyokarose@reddit
Yeah… but who wants to pay dual licenses….?
MrHaxx1@reddit
Our company does that lol
Bogus1989@reddit
yeah, our gemini instance is hilarious. it just treats me like an end user sometimes and says i could be performing unauthorized things….
but with the change of one word, it spits the answer out. 🤣
Visible_Soup_5484@reddit
Copilot offers Claude Opus FWIW
postbox134@reddit
Model != features and UI/UX
jayybeegeee@reddit
Recommending chatGPT enterprise while ignoring that Copilot Enterprise is an odd take
007bane@reddit
This right here.
flummox1234@reddit
I'm willing to bet the same person would lose their shit if someone did that with their data. There might be a solution for you there, probably not.
EvilGreg13@reddit
When was the last time you were impressed with the level of intelligence of a sales person.
UpsetMarsupial@reddit
I remember a time when a sales person was impressed with my knowledge. This was when Maplin was still trading. I was buying individual components, and the sales person asked what I was making. I assumed (naively in retrospect) that he therefore was knowledgeable and therefore talked about it and asked him a question about an as yet unsolved problem I had. He just looked at me with a face of bewilderment and said something like "You know your stuff". An almost heartbreaking moment, as when I went there as a kid 35 years ago I'd get loads of support from the staff.
GermanAf@reddit
Imma just call the GDPR regulators and they will take care of it :)
mydogcaneatyourdog@reddit
When I see people comparing the use of blob storage and s3 to using private data in LLM prompts, it concerns me greatly. This thread makes me wonder how many people don't understand their tools or are just straight bad faith bot replies.
thatirishguyyyyy@reddit
I am almost certain this is violating contract agreements and you should probably tell someone higher up... or let your clients know.
mmmaaaatttt@reddit
I think about this. And then I usually can’t be fucked using placeholders so just share the data anyway.
VividVigor@reddit
I would stop calling it “data sharing” and call it data exfiltration or data breach. Get serious about the name of this behaviour because “sharing” sounds like a nice, collegial thing to do.
Send emails with phrases like exfiltration attempts detected. Data breach results in loss of revenue and brand reputation and directly impacts every employee. Yada yada.
FlyingBishop@reddit
ChatGPT is just an app. Uploading data to it is no different from putting it in Google Docs, if you have an enterprise agreement. Yes, they store it on their servers. Sure, you don't trust OpenAI but why do you trust Microsoft when you upload PII to O365? Why do you trust Google with Google Docs? There's no breach unless you can demonstrate the data was used by someone outside the company.
Bogus1989@reddit
exactly this.
or I say, do you actually want your employer to have dirt on you? easy pickings for layoffs.
doubleopinter@reddit
Oh I can’t tell you the things sales and marketing are doing in ChatGPT. People are fuckin stupid. We’ve caught ppl giving production credentials to chat gpt. I think we’re making it a fireable offence soon.
GX_EN@reddit
Prod creds? JFC.
I have a friend who has uploaded un-redacted personal medical info right into ChatGPT because she's a hypochondriac and constantly trying to diagnose her "problems" instead of you know, talking with her doctor..
RadlEonk@reddit
Send her an email, copy your Legal, Risk, Compliance department(s), and explain that she can help contact the clients and states attorneys general when the breach happens.
Sobeman@reddit
if you are allowing your org to access LLMs without guardrails then you have bigger problems then training
RIPGoblins2929@reddit
This would get your license yanked if you're an attorney.
Unlike sales people we have professional responsibility rules we are mandated to follow.
Varrianda@reddit
If it’s an enterprise version it’s fine, that’s the whole point. If y’all are just using ChatGPT.com though….
thepatientwaiting@reddit
I couldn't even use Grammarly at my last job because it was risking client confidentiality.
Competitive_Smoke948@reddit
That's definitely a GDPR issue, AND she's putting your proprietary data, sales, strategy etc. in a place competitors can find it... that's a firing offence... report her to HR.
Ok-Measurement-1575@reddit
It's far worse than that. People aren't actually stupid, they just don't care and who can blame them?
Fake jobs, fake economy, everything is busywork :D
DocterDum@reddit
I disagree with that entirely - We see plenty of execs and owners who are just as dumb despite having a high stake in the company. Users that are actively asking for help but struggle to understand basic concepts. I’m sure some people are just lazy, but the stupid runs far deeper.
Bogus1989@reddit
i think the best ones probably know and acknowledge they dont know, so they actually ask wtf is going on. I love these people. I am one of those people always, because if i know i dont know? its time for me to go learn.
Bogus1989@reddit
yeah, I even hate explaining it to people. It even annoys me.
Yuugian@reddit
"You see Bob, It's not that i'm lazy. It's that i just don't care" -Office Space
NetworkingNoob81@reddit
It's a problem of motivation, all right? Now if I work my ass off and Initech ships a few extra units, I don't see another dime; so where's the motivation?
pdp10@reddit
Later in the film, the protagonists prove that automation can make fractions of a dime very motivating. I think the lesson here is pretty obvious: why were they working so hard in the first place?
ency@reddit
That's pretty much how I feel. I work for money and these companies have shown me they have no loyalty. So I'm going to do the job in such a way that makes things easy for me. I don't give a damn about pasting in SOPs, client info, and pretty much anything else. That's the companies responsibility to sort out, not mine. They gave me an AI prompt box and told me to use it...OK...
Observer422@reddit
100%
Bogus1989@reddit
her attitude is WAY the fuck out of line. im actually really thankful for HIPAA. because of its existence nurses and medical personnel take that shit DEADASS serious, and they actually come to me worried if even by accident they might have done something wrong. A lot come for clarity, too… Im glad my org actually is doing things right. I sure wouldn't be quiet if they weren't. Id tell end users the truth if I stumbled upon real issues no one did anything about.
newbietronic@reddit
Sales is always doing shit like that. Know how there are laws against recording without consent? Almost all tech sales calls over the phone are recorded without consent. I worked in there. Sales does not give a shit.
mitharas@reddit
I was hella confused because I thought IP addresses are okay. Only later did "client address" turn out to mean customer details.
lolschrauber@reddit
Completely wrong approach.
Ask if she got consent from the people on that list for their data to be shared with third parties instead.
TheIntrovertedHuman@reddit
lmaoo we are so cooked
_30Harsh_@reddit
You are training the GPT good enough bro
stromm@reddit
It’s implicitly against PII best practice.
pacman6642@reddit
Why would you ask your head of sales this question. You know their answer
GreenWoodDragon@reddit
Entertainment value TBH. A bit like poking a wasp nest.
StinklePink@reddit
Ya can’t fix stupid
WolfAffectionatefk@reddit
How does this work?
Geminii27@reddit
Get it in writing or an email from her, for when it inevitably blows up and she blames you for 'not preventing it'.
Djimi365@reddit
That's why I'm starting to see companies properly locking down AI tools now. Can't even access the sites of tools which aren't specifically allowed in the AI policy.
2c0@reddit
Then you show them the policy and tell whoever is responsible for them. You provide the tools, and policy dictates the usage. After that, not your problem.
tejanaqkilica@reddit
People are stupid. Next
shdwbld@reddit
A $2000 Mac Studio with Ollama is perfectly capable of running models capable of doing stuff like this locally. Or AMD Ryzen AI.
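For anyone curious what that looks like in practice, here is a minimal sketch against Ollama's default local REST endpoint (`/api/generate` on port 11434). It assumes an Ollama server is already running on the machine, and the model name is just an example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(prompt: str, model: str = "llama3") -> dict:
    """Assemble the JSON body Ollama's /api/generate expects."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local model; nothing leaves the machine."""
    data = json.dumps(build_request(prompt, model)).encode()
    req = urllib.request.Request(OLLAMA_URL, data=data,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Same prompt-polishing workflow, but the client addresses and pricing sheets stay on local hardware.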
hobovalentine@reddit
Not much different than a public google search but yeah not a good practice to put customer data where you don’t control the data.
Do push for a corporate account though for sure
toasterdees@reddit
Isn't company info like emails and addresses and phone numbers publicly available? I don't see why you can't use those.
Cayayu@reddit
That is considered personal information. You shouldn’t be sharing it.
Professional_Rip103@reddit
Enterprise ChatGPT doesn't train on data, but that doesn't actually solve the core issue, which is that sensitive data is still leaving your network and hitting a third-party server. Compliance doesn't care if OpenAI trains on it or not. The data was exfiltrated either way. That's why we switched to a model where the detection happens at the browser before anything gets sent anywhere.
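The browser-side gating idea boils down to a simple check before the request leaves. A toy sketch, assuming you maintain your own term list and patterns; a real DLP gate would key off classification labels and far richer detectors, not a word list.

```python
import re

# Illustrative only: a couple of terms and one SSN-shaped pattern.
BLOCKLIST = ["internal pricing", "confidential"]
PATTERNS = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b")]

def flags(prompt: str) -> list[str]:
    """Return reasons the prompt should be held back; empty if it looks clean."""
    hits = [term for term in BLOCKLIST if term in prompt.lower()]
    hits += [f"pattern:{p.pattern}" for p in PATTERNS if p.search(prompt)]
    return hits

def gate(prompt: str) -> str:
    """Block flagged prompts; pass clean ones through to the AI tool."""
    hits = flags(prompt)
    if hits:
        raise ValueError(f"blocked before send: {hits}")
    return prompt
```

The point is where the check runs: before the request body is built, so nothing sensitive transits the network at all.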
itskdog@reddit
The same happens when you use Outlook or SharePoint.
Morlark@reddit
I don't think you understand what exfiltration actually is. I'll give you a clue: data leaving the network is a necessary but not sufficient criterion for exfiltration.
Companies work with external partners in ways that involve them processing data on your behalf all the time. As long as you have appropriate contracts in place delimiting what they are permitted to do with that data (process it on your behalf for purposes that you specify), and what security they have in place (obviously), it's 100% compliant.
It is, by definition, not exfiltration if you give it to them willingly.
zbignew@reddit
On the other hand, you can get a BAA agreement with Anthropic and then compliance doesn’t care at all about HIPAA.
The whole point of HIPAA was to make sure customers are lightly inconvenienced so they think something is being done. And then they will assume that the notes from their pelvic exam aren’t being read by temp office staff at the medical billing company.
Empty_Allocution@reddit
Coming from an education perspective:
I'm lucky because we get to block all of this shit. Staff only have access to a ring-fenced Gemini.
However, few years back we absolutely had braindead shit going on, like proper sensitive data getting dumped into ChatGPT to create 'reports'. It was an absolute nightmare. GDPR is lumped with me too, so we had to take a step back and basically give everyone the third degree and tune our web filters to be harsher.
The big problem in education is that every app under the sun is adopting some form of AI as a selling point. Many kids and teachers go mad for these flashcard generators with LLMs built in. We have seen so many instances where you can just break the LLM out of its rules etc, and get it to write code, do your homework etc.
I don't think it is an issue that is going away. And I agree, I have found you can't policy your way out of it. People are gonna keep dumping all kinds of data into these platforms, and they'll do it at home if they can't do it at work.
wannito@reddit
Two things. First, at this point most businesses need to embrace AI and LLMs and pick a tool that's officially supported so, at least in the terms/MSA, they don't train their models on your data. Otherwise people are just going to circumvent policy. Second, yeah, our data and customer data is already vacuumed up. As long as you're not using some Chinese model/platform (and even then, who cares) it's moot at this point.
twhiting9275@reddit
I mean, she was correct
Potatus_Maximus@reddit
It’s terrifying to see how people see LLMs as helpers and can’t grasp how much data is getting cached and mined. Voluntary data exfil at its finest. Not sure if we should laugh or cry
Careful-Criticism645@reddit
Why would these companies mine the garbage that their users are uploading? It's pretty much worthless and it would open their models to being poisoned by malicious actors.
Sweaty_Marzipan4274@reddit
New initiative, large committee with all the stakeholders, long ass weekly meetings at the end of day to design new policy posters. Simple.
Player2Systems@reddit
You can tell who’s never had to sit through a data retention policy review 😂 Even if the vendor says they “delete” inputs, liability still follows the sender. I’ve had luck framing it as “treat it like a public pastebin unless legal says otherwise.”
povlhp@reddit
We are clear: people can use Copilot and Gemini with company credentials. Everything else is off-limits for GDPR reasons. You make all the data public and the next user might see it. For developers, I tell them API keys become public property. They have to rotate every time they use AI on the codebase.
We are working with Anthropic to get a data processor agreement and no data sharing.
madasfire@reddit
Artificial intelligence is no match for no intelligence
PaleoSpeedwagon@reddit
I think the CISO would appreciate an anonymous email.
Nonaveragemonkey@reddit
They're probably the one that showed them how to do it..
recourse7@reddit
I know a loan agent that pastes in full on loan documents. Fully filled out.
SirEDCaLot@reddit
Honestly I wish companies would have a basic tech literacy test for all new hires. It would solve so many problems.
tifu_tifu_1000@reddit
That look is a classic corporate defense mechanism—usually deployed when someone either doesn't know the answer or realizes they’ve been playing fast and loose with compliance.
Thunar13@reddit
Head of sales..
habitsofwaste@reddit
I have dealt with this a lot in security. It’s like they think it only counts with a file or something. Silly people!!
Leather-Arachnid-417@reddit
They will after the first GPT related breach
SpongeJake@reddit
Wow. Seems obvious now that everyone needs to learn about AI and the problems it creates.
I used to work in IT for an outfit here in Canada. We deliberately disabled CoPilot (or didn’t enable; not sure how it works) solely because we don’t want Canadian data showing up on the U.S. side of the border. We had clients and we treated their data carefully.
Goodlucklol_TC@reddit
I would have banned use of ChatGPT immediately following that conversation.
Julio_Ointment@reddit
These companies are HOPING for this policy and training failure.
Financial-Chemist360@reddit
20 years ago people didn't understand all of those apps that were taking your entire contact list and selling it. You seriously think people understand that all these models are being trained with their data and nothing is walled off anymore?
eejjkk@reddit
This is my biggest concern every time someone brings up AI at work. It's like company sanctioned DLP circumvention.
Manitcor@reddit
sensitive data is for local models only. everyone is going to have a gaming laptop now
LandoCalrissian1980@reddit
It's amazing what employees will do with other people's data, but if they were asked if they want their data in public systems they'd blow a gasket. I see it so much with small retail companies and credit cards/credit applications. "Just email me a picture of your SSN & DL so I can print it 100 times on printers all over the world."
HippyGeek@reddit
Point out the policy violations and associated financial risk to leadership. Once people start getting fired, maybe behavior will change.
Public_Fucking_Media@reddit
Do you have paid models for them to use that aren't trained on your data, by contract?
If not, isn't that kind of on you?
shangheigh@reddit (OP)
We have pitched that, but the higher-ups don't buy it
Public_Fucking_Media@reddit
Tell them to enjoy your shadow IT data leaks while also getting beaten by companies that know how to use AI properly.
FlipMyWigBaby@reddit
Even Reddit has these types of basic safeguards! For instance, if you type in your credit card number, its expiration date, and CVV number, the SNU AI automatically detects that and obfuscates it with asterisks.
For example, here is my VISA card
\*\*\*\*-\*\*\*\*-\*\*\*\*-\*\*\*\* exp \*\*/\*\* and CVV \*\*\*
Go ahead and try yours to see for yourself!
proigor1024@reddit
The marketing person expensing Perplexity on her personal card is such a specific kind of chaos. We found our design team using some AI image generator one person put on their personal cc and expensed as a software subscription; we only discovered it because LayerX flagged it. Leadership was more mad about the expense policy violation than the security risk. Priorities, man.
HeligKo@reddit
We have tools to block that outside our managed platforms.
Infamous_Horse@reddit
Yeah, this is the real problem: the gap between what users think counts as data and what actually is.
Sales people are the worst about it too, because they're trained to share information freely; it's literally their job. We had the same fight until we put LayerX in place to flag the content before it leaves the browser.
Legitimate_Put_1653@reddit
Laziness will overcome security consciousness every time.
tobascodagama@reddit
Jesus fucking Christ these fucking people.
StockMarketCasino@reddit
Atakama browser management can selectively block uploads and or downloads to all or a curated list of sites you specify.
Don't let them upload into an unauthorized AI or get to those sites.
dedjedi@reddit
You cannot reason a person out of a position they did not reason themselves into.