TheaterFire

"You are the product" | Google as usual | Grok likes anonymity

Posted by BidHot8598@reddit | LocalLLaMA | View on Reddit | 112 comments

Conscious-Tap-4670@reddit

What is the methodology here?
View on Reddit #53707886

Ragecommie@reddit

Regardless of what it is, everyone collects everything in the end. With or without permission.
View on Reddit #53708942

Conscious-Tap-4670@reddit

My issue is with the framing of the LLM itself vs. the app frontends that people use it through.
View on Reddit #53776868

orrzxz@reddit

Yup. It's the sort of thing you pay into knowing it's the only way to get into the class action when it's inevitably uncovered that they lied and did collect the data they told you they wouldn't.
View on Reddit #53712287

woadwarrior@reddit

The source is likely Apple App Store privacy nutrition labels for the respective apps.
View on Reddit #53709450

troglo-dyke@reddit

Which only gives you information about what is being requested from the OS. And there's a big difference between Gemini accessing the name on your Google account so that it can call you by name, and having access to conversations where you talk about your mental health.
View on Reddit #53711937

BidHot8598@reddit (OP)

Here you go : https://www.voronoiapp.com/technology/Gemini-Collects-More-User-Data-Than-Any-Other-AI-Chatbot-4429
View on Reddit #53709497

binheap@reddit

What a dumb methodology, since those are self-reported. I'm surprised that ChatGPT doesn't collect location data since, if I recall, you can query it for vague location.
View on Reddit #53709923

PossibleCicada4926@reddit

This is why you should only ask trivial things on web-hosted services. Local LLMs are the key. DeepSeek is indeed a key development for locally hosted LLMs. Hope many more like Qwen pop up soon.
View on Reddit #53706794

mustafar0111@reddit

This. I trust online LLMs about as much as I trust the Google search bar.
View on Reddit #53706981

Impossible-Cry-1781@reddit

It's cute you'd think China would spend the money to deal with anyone non-influential outside their country having anything to say about Tiananmen or anything else they'd be offended by.
View on Reddit #53707897

mustafar0111@reddit

It was a joke. That said try saying shit about it on your social media then entering China with your passport after and you'll find out just how seriously they do take that stuff.
View on Reddit #53707942

Iory1998@reddit

Same for many countries, not only China. Do that for any Middle Eastern, African, or Latin American country, most of the Eastern European countries, and even the US. Most of the world, basically. Shouldn't countries protect their national interest? I'd like to see you praise Hitler and the Nazis and then travel to France or Germany!
View on Reddit #53708495

mustafar0111@reddit

So you support China arresting or detaining people for discussing *Tiananmen Square*?
View on Reddit #53708557

Desm0nt@reddit

Saying that this is not unique to China and that all big countries do the same doesn't automatically mean that we support it. It just means that this is how the modern world works (not China specifically), with no judgment about whether it's good or bad. But the fact that you try to focus only on China specifically, ignoring a similar problem in, for example, the USA: this can already be judged as “bad” and as a biased and distorted presentation of information.
View on Reddit #53721331

hugthemachines@reddit

> Shouldn't countries protect their national interest?

Decent democracies do not punish visiting people for being negative towards something national on social media. There are certain exceptions, of course. Where I live it is not legal to do "persecution of an ethnic group", but that is illegal for citizens and visitors alike. A decent democracy has no problem with that because it is not so fragile that some social media post complaining about something political would damage it. Also, free speech is important to a decent democracy, so we would not like it if our government limited it unnecessarily.
View on Reddit #53710402

Iory1998@reddit

I agree 100% with you. But democracy doesn't work for every country, and that's OK. As long as people live in prosperity and happiness, they choose the system of governance they like. I am just tired of the double standards the Western countries are applying to China, and by extension to many other countries they don't like. So, it is OK to censor anyone who supports Gaza and advocates for the Palestinians' freedom to choose? Where is the democracy here? I don't care about what happened 50 years ago. I care what's going on NOW!
View on Reddit #53712245

hugthemachines@reddit

I agree that there will always be countries with malfunctioning democracy. However, bad democracy is not binary. I don't think it is OK to censor someone who supports Gaza, but it is on a different level than China's mass internment in their still-existing reeducation centers. They are erasing Uyghurs by forced sterilization. That kind of stuff I would consider on another level, and it happens right now. The censoring in China is also on a very major level. Like you say, the censoring of anyone who has sympathy for suffering people in Gaza is very bad. No matter what the people with the weapons do, having sympathy for the civilians who get injured is very natural. I guess some people in power conclude that any support for Palestine in any way is support for Hamas.
View on Reddit #53715870

Iory1998@reddit

You seem like a rational person, and I share your view. However, I lived in China, and I can tell you that before I relocated I took months to think carefully because of all the stories we hear in the West. But I can guarantee you, China is really a great place to live in. As long as you are paying your taxes, not involved in any crime, and minding your own business, no one cares what you do. The West has been running propaganda against China for years now, and that's not right. If you can, I highly recommend you visit the country and see for yourself.
View on Reddit #53718863

hugthemachines@reddit

It is my view too that China is nice for many of its citizens. However, I can't see a country as great when it commits horrible crimes against groups of its citizens. Also, extreme censoring is a clear sign there is something wrong on a basic level.
View on Reddit #53720269

_w_8@reddit

Have u had any issues with China immigration or are you just making stuff up?
View on Reddit #53708341

mustafar0111@reddit

It's literally mainstream news and has been for years. [https://www.theguardian.com/world/article/2024/sep/02/how-chinas-internet-police-went-from-targeting-bloggers-to-their-followers](https://www.theguardian.com/world/article/2024/sep/02/how-chinas-internet-police-went-from-targeting-bloggers-to-their-followers) [https://www.axios.com/2021/01/30/china-social-media-criticism-arrest](https://www.axios.com/2021/01/30/china-social-media-criticism-arrest)
View on Reddit #53708438

_w_8@reddit

Nice thanks for links. I think it’s a bit different for citizens vs foreigners though.
View on Reddit #53709109

mustafar0111@reddit

Probably depends who you are and exactly what you are posting. If you are a foreign government official or someone the Chinese government wants to come in I'd assume you'd get a free pass.
View on Reddit #53709219

brahh85@reddit

Sweetheart, people are getting deported now in the USA for opinions they wrote on social media [https://www.indiatoday.in/world/us-news/story/us-using-ai-to-spy-on-international-students-even-instagram-likes-can-get-deported-visa-f1-mark-rubio-trump-2690421-2025-04-02](https://www.indiatoday.in/world/us-news/story/us-using-ai-to-spy-on-international-students-even-instagram-likes-can-get-deported-visa-f1-mark-rubio-trump-2690421-2025-04-02)
View on Reddit #53708380

mustafar0111@reddit

What does that have to do with what I said?
View on Reddit #53708444

iwinux@reddit

Take my bank account and I don't care. Would appreciate if they fill up the balance though LOL.
View on Reddit #53710907

PossibleCicada4926@reddit

Of course, hosting them locally is not for everyone. A lot goes into hosting locally, and anyone who doesn't need it on a regular basis should not opt for it. It is like going to med school just because you have a fever. No, dear sir/madam, please do not: it is not worth the effort you have to put in, and resource requirements are going to grow with time. So use the services, but also mask things by breaking them up into multiple segments and using multiple platforms. Wish you the best, mate.
View on Reddit #53711032

IrisColt@reddit

What’s the point of tapping into a remotely hosted, SOTA system if all we do is ask trivial questions? I save these powerhouses for the really tough stuff.
View on Reddit #53710758

DesperateAdvantage76@reddit

I've already given up because even fully anonymous web browsing reveals a ton of data about you.
View on Reddit #53708358

PossibleCicada4926@reddit

That is the irony, unfortunately: the more anonymous and private you try to be, the more unique your footprint becomes.
View on Reddit #53708453

Massive_Robot_Cactus@reddit

Yeah even in the real world, you can use a BLE beacon network to identify people walking around without phones.
View on Reddit #53709246

TheToi@reddit

Don't worry about that, the NSA collects everything from any online chatbot you would use.
View on Reddit #53707246

Former-Ad-5757@reddit

Honest question: does the NSA still keep up? There are billions being thrown at AI; I doubt the NSA budget is high enough to keep up.
View on Reddit #53708098

mustafar0111@reddit

Yes. You are talking about the government agency that monitors every phone call, email and communication exchange on the planet. Do they have a human reviewing every message? No. But they intercept everything, correlate all the metadata, and produce output reports that humans do review. Basically, if you are on their "list" or contact someone on their list, you'd get reviewed. Otherwise you are background noise. They've done this forever through programs like ECHELON and PRISM.
View on Reddit #53708382

toreobsidian@reddit

I think this really is the critical point. People thinking "it's too much" actually plays into their hands, because people feel safe then. You don't have to watch everyone all the time. There are probably two mechanisms: 1) targeted: you are on a list or have contact with people on a list; 2) purely statistical: the more data you have, the better statistics you can do. With more data it basically gets BETTER for them to see patterns and gain insights on an overall level, rather than becoming overwhelming.
View on Reddit #53718406

Former-Ad-5757@reddit

They used to monitor everything, but now with cloud/AI I am wondering if they can still keep up. Have fun monitoring Azure / AWS / Google Cloud and their data volumes. Meta releases Llama 4: boys, roll in a new truck of disk space just to keep all the fine-tunes and variations of Llama 4. Azure has a local outage of a few hours in an area: roll in a few more trucks, as Azure will transfer a lot of data around that region. Want to message securely with a third party? Just agree that when you have a message you will input it as Q&A 50353 in a dataset on HF. Or just fine-tune/overfit a 3B model to respond to a certain question. Technically this was always possible in images etc., but the data has blown out of proportion in the last couple of years.
View on Reddit #53709312

mustafar0111@reddit

Probably won't know for 20 years, but I definitely wouldn't put it past them given their track record to date.
View on Reddit #53709375

Accomplished_Steak14@reddit

the NSA sure can't keep catch up with deez nuts!
View on Reddit #53709982

blumenstulle@reddit

Check out articles about the Utah Data Center. It's absolutely massive and snagged up a good portion of the world's production of hard drives for a good while. These days it's probably in the thousands-of-exabytes range.
View on Reddit #53708680

Careless_Wolf2997@reddit

They kind of can't keep up with it, even when the Bush Administration was throwing them tens of billions during the Iraq War. Supposedly they were backlogged 15-25 years even with ten thousand employees. I don't think they will really be able to ever review what they collected during the Bush admin, much less in real time everything now.
View on Reddit #53709627

Major-Excuse1634@reddit

And everything said on Discord, SMS, and everywhere else.
View on Reddit #53707460

Apprehensive-Mark241@reddit

***I don't believe that about Grok for a billionth of a second!***
View on Reddit #53707155

No_Macaroon_7608@reddit

It's crazy how any time some good data related to Grok comes out, people on Reddit straight up decline to acknowledge it. Please grow up!
View on Reddit #53709228

Apprehensive-Mark241@reddit

It has nothing to do with Grok's quality as a model, I'm saying that I don't believe that Elon Musk has any ethics or honesty. That can be clearly seen!
View on Reddit #53710761

No_Macaroon_7608@reddit

But you have to know that there are 1000's of talented people involved in making and managing grok. It's not that because Elon is the owner his product has to be evil. One should show some respect to the talent and hard work involved in Grok by giving it a fair shot.
View on Reddit #53711982

prtt@reddit

> 1000's of talented people involved in making and managing grok

This is false. In number first and foremost. But arguably the qualifier is false too ;)
View on Reddit #53713662

Apprehensive-Mark241@reddit

And to be honest, no. I don't have to respect people who choose to work for a monster. Though to be fair to them, most of them are essentially slaves. Elon made it clear that he prefers H1B employees to Americans, for the obvious reason that they can't quit without being deported so he has taken their freedom.
View on Reddit #53712904

Apprehensive-Mark241@reddit

So? What does that have to do with what Elon will do with all of the data?
View on Reddit #53712746

QuestionableIdeas@reddit

You don't think that the AI that's hoovering up all the social security data and whose owner has a history of ignoring privacy and security concerns is on the up and up?
View on Reddit #53707844

Apprehensive-Mark241@reddit

I don't believe that Google, who make Gemini, are trying to become despots or oligarchs under a despot, and that's EXACTLY what Musk is doing. I don't think they're trying to make a database of political enemies, and I'm sure Musk is. I don't think they want to get rid of all of the immigrants and non-white people.
View on Reddit #53710831

AutoModerator@reddit

Your submission has been **automatically** removed due to receiving many reports. If you believe that this was an error, please send a message to modmail. *I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/LocalLLaMA) if you have any questions or concerns.*
View on Reddit #53713463

One_Dragonfruit_923@reddit

contact and location data....a bit scary
View on Reddit #53713349

Qual_@reddit

Damn, those annoying bakers, I can't go to the bakery and purchase my bread without the shop knowing my location.
View on Reddit #53713325

BTolputt@reddit

Grok doesn't like anonymity. Grok hasn't access to the other data. If Elon ever manages to make his "X everything app" actually go somewhere, all of the data from its extra branches will be plugged into Grok's training / access.
View on Reddit #53712459

agoodepaddlin@reddit

Go local and save your dignity.
View on Reddit #53712376

drplan@reddit

Thanks for this - super useful in any pro-local discussions/slides
View on Reddit #53710785

-oshino_shinobu-@reddit

"To compile this data, Surfshark identified the most popular AI chatbots and analyzed their privacy details on the Apple App Store." So in other words this graph is useless?
View on Reddit #53710773

LostMitosis@reddit

This must be fake. I can’t believe anything that doesn’t paint DeepSeek as evil. 😂
View on Reddit #53710549

Pkittens@reddit

This feels like an arbitrary and misleading comparison. If Grok had the ability to collect data from all the sources Google does, wouldn't it use them? Is OpenAI fundamentally against tracking purchases, or is it just that they don't have access to that information?
View on Reddit #53709944

xquarx@reddit

Mistral missing from the list
View on Reddit #53706921

ConnectionDry4268@reddit

Because as of now it's bad
View on Reddit #53708134

Juice-De-Pomme@reddit

Uh? Care to elaborate? I mostly use Mistral's Le Chat and I find it very usable.
View on Reddit #53709879

binheap@reddit

You know this graph is complete nonsense since some of the LLM apps in this chart apparently don't collect user content. How, exactly, does one send a query to an LLM without sending user content?
View on Reddit #53709792

xXprayerwarrior69Xx@reddit

That's where I am getting stuck philosophically at the moment. I see massive potential for LLMs in the business setting (RAG/agents/...), but uploading business documents to Gemini is not great. The performance of local LLMs is a hindrance too, so I don't know how to tackle it. I could get a Mac Studio M3 Ultra, but I cannot find very good/thorough benchmarking, especially at higher context (50-100k).
View on Reddit #53709432

Chance-Hovercraft649@reddit

Grok not collecting user data, LOL! The whole model is advertised as having access to real-time X posts. It collects everything.
View on Reddit #53709054

BidHot8598@reddit (OP)

shouted opinion ain't private, ye dictate
View on Reddit #53709429

Electronic-Air5728@reddit

I thought Claude was the more private option.
View on Reddit #53708991

BidHot8598@reddit (OP)

Google owns 15% of claude
View on Reddit #53709388

nuclearbananana@reddit

You mean what *chatbot apps* collect. All of these go to near zero if you use the API.
View on Reddit #53707825

Efficient_Ad_4162@reddit

I doubt the chat is collecting all this. For example, they've got purchases listed under Google, but that's talking about Google Wallet (the payment processing thing for Android) and has nothing to do with Gemini. And that's just the most obvious one; I wonder how many other entries are having their data collection 'padded' to make Grok look good.
View on Reddit #53709326

whenpossible1414@reddit

I'd be surprised if the IP address the API request is coming from doesn't reveal location info. Can you elaborate on this?
View on Reddit #53708047

nuclearbananana@reddit

Vague location, yes. Not that there's much point to that. If you're concerned about that, use a VPN. Also, if you go through something like OpenRouter, they won't get that either. OpenRouter will, though.
View on Reddit #53708293

Harshith_Reddy_Dev@reddit

Is this public or someone found it through other means?
View on Reddit #53708755

BidHot8598@reddit (OP)

Here you go https://www.voronoiapp.com/technology/Gemini-Collects-More-User-Data-Than-Any-Other-AI-Chatbot-4429
View on Reddit #53709303

ReasonablePossum_@reddit

Visual Capitalist charts have lost quality over the years. Lately I've noticed they look more like ads than actually relevant info.
View on Reddit #53709235

electricsashimi@reddit

I love Gemini. I get a lot of value from it and I'm happy to trade whatever data they want to collect. If you don't like it, don't use it. Why are people feeling so entitled that they expect to get something for nothing?
View on Reddit #53709191

virtualmnemonic@reddit

Here's the source: "[We identified the 10 most popular AI chatbots and analyzed their privacy details on the Apple App Store](https://surfshark.com/research/chart/ai-chatbots-privacy)." Which is total bullshit. The Gemini app, for example, doubles as an assistant, hence all the permissions.
View on Reddit #53708823

CptNico@reddit

I don't trust those charts at all. I had to check the data privacy of almost all of them last week (for the company I am working for): Grok was the worst, Anthropic the best... Plus it didn't say which ones have an opt-out option etc.
View on Reddit #53708813

Efficient_Ad_4162@reddit

How many of those are collected just by using Google vs. using Gemini specifically, though? If I already have a YouTube account, I might not care that there's an extra two data points on top of that, compared to sharing 20 data points with an entirely new entity.
View on Reddit #53708803

Erhan24@reddit

I thought most of Claude's data collection is opt-in?
View on Reddit #53708792

Flintsr@reddit

What do the unique data points even mean? DeepSeek takes 2 contact info and 3 diagnostics? What?
View on Reddit #53708739

popiazaza@reddit

The data source is the App Store privacy labels: https://surfshark.com/research/chart/ai-chatbots-privacy. Voronoi just remade the chart from Surfshark's article.
View on Reddit #53708631

Guinness@reddit

This is why airgapped offline models are so incredibly important.
View on Reddit #53708573

LevianMcBirdo@reddit

Just to put that into perspective: this is the maximum amount of data they collect via the app, and it doesn't hold true for a lot of plans. Not saying this is great, but Surfshark acts like this is true for all plans and ways to use the models.
View on Reddit #53708570

Betadoggo_@reddit

I don't know if I trust this infographic with 2 watermarks and data pulled from a privacy grifting company. The point values don't make any sense. What is 3 points of diagnostics vs 2? Or 2 points of location vs 1? Or 4 points of "user content" vs 1? Expect the data collection from all online providers to be about the same; if you can't verify it, it may as well be.
View on Reddit #53708362

taiof1@reddit

How do we even know what data they collect? If you just ask a model, it always says they don’t collect anything.
View on Reddit #53707134

Rubendarr@reddit

They say their source is Surfshark, afaik it's a VPN, I'm not familiar with how it works, but I'm guessing it can also analyze/display outgoing traffic, and from there let you see from certain identifiers if/what type of data is being collected?
View on Reddit #53707485

Niightstalker@reddit

Those are the privacy labels listed by the companies themselves in the App Store. I quickly cross-checked, and the counts match the categories.
View on Reddit #53708207

Rubendarr@reddit

Ah, that makes sense.
View on Reddit #53708272

taiof1@reddit

Well, data has to go out in order to enable the LLM to answer the questions asked, right? We don’t know what the companies store (that’s my understanding of "collect").
View on Reddit #53707655

kuzheren@reddit

these statistics are just fabricated
View on Reddit #53707177

Niightstalker@reddit

They actually match exactly the listed privacy labels on the App Store. So no not fabricated but listed by the companies themselves.
View on Reddit #53708158

IriZ_Zero@reddit

86.7% of all statistics are made up on the spot.
View on Reddit #53707250

Niightstalker@reddit

Maybe they count the listed data points in the privacy labels of the App Store? That would somewhat match those numbers.
View on Reddit #53708032

Additional-Hour6038@reddit

Funny how evil Deepseek collects way less data than "don't be evil" Google, oh, wait...
View on Reddit #53707380

kongacute@reddit

No surprise for Google, their extensions have access to a lot of data.
View on Reddit #53708259

Lost_County_3790@reddit

The question is how they know that data. It seems impossible to know what a company is storing if you are not an insider.
View on Reddit #53707966

hugganao@reddit

where did they get this info?
View on Reddit #53708240

seppo2@reddit

I’m using Gemini 2.5 Pro exp via the website and uBlock is blocking over 3K entities, that’s wild!
View on Reddit #53708009

hannesrudolph@reddit

Perplexity.. hahah why are they on that list?
View on Reddit #53708002

hannesrudolph@reddit

Grok collects contact info and it likes anonymity? Huh?
View on Reddit #53707987

pigeon57434@reddit

OpenAI being in 2nd place on this list is very commendable considering how much hate they get. At least they don't collect *that* much data.
View on Reddit #53707847

Conscious-Tap-4670@reddit

These numbers have no basis in reality. Maybe the author meant what chatbot *apps* collect?
View on Reddit #53707934

paulirotta@reddit

Copilot has a setting in GitHub to turn off collection. Does anyone know if that is honored by upstream LLMs also? Or just privacy washing? 
View on Reddit #53707821

Wild-Masterpiece3762@reddit

AI = user data collection in overdrive
View on Reddit #53707511

Red_Redditor_Reddit@reddit

Why do the box types have different sizes despite being the same type? Is more user data being collected on one but less on another or something?
View on Reddit #53707016

mustafar0111@reddit

Box sizes seem to be related to the number of data points collected for that category.
View on Reddit #53707113

Professional_Helper_@reddit

So are we still trashing Elon now?
View on Reddit #53706816

BidHot8598@reddit (OP)

Blud, open source his old models too ! 
View on Reddit #53706924

oodelay@reddit

...until we (always) learn otherwise.
View on Reddit #53706814