These rate limits are primarily determined by available capacity and exist to prevent abuse. Quota is refreshed every five hours. Under the hood, the rate limits are correlated with the amount of work done by the agent, which can differ from prompt to prompt. Thus, you may get many more prompts if your tasks are more straightforward and the agent can complete the work quickly, and the opposite is also true. Our modeling suggests that only a very small fraction of power users will ever hit the per-five-hour rate limit, so our hope is that this is something you won't have to worry about, and that you feel unrestrained in your usage of Antigravity.
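The policy quoted above (work-correlated quota, refreshed every five hours) can be sketched as a simple budget window. This is a toy illustration of the described behavior, not Google's implementation; the budget units and numbers are made up:

```python
import time

class WorkQuota:
    """Toy model of a work-based rate limit with a fixed refresh window."""

    def __init__(self, budget: float, window_s: float = 5 * 3600):
        self.budget = budget              # total "agent work" units per window
        self.window_s = window_s          # quota refreshes every five hours
        self.used = 0.0
        self.window_start = time.monotonic()

    def try_spend(self, work_units: float) -> bool:
        """Charge a prompt by the work it causes; heavy tasks drain quota faster."""
        now = time.monotonic()
        if now - self.window_start >= self.window_s:
            self.used = 0.0               # window elapsed: refresh the quota
            self.window_start = now
        if self.used + work_units > self.budget:
            return False                  # rate-limited until the next window
        self.used += work_units
        return True

quota = WorkQuota(budget=100.0)
print(quota.try_spend(30.0))   # a straightforward task fits
print(quota.try_spend(80.0))   # a heavy agentic task may not
```

This is also why "number of prompts" isn't fixed: two prompts can cost very different fractions of the same budget.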
The only thing I have so far is that it (Pro) is mighty impressive on my staple hallucination questions, and that in prose it responds more like an arguing machine than a creative writer.
see:
In the middle of Vienna, embedded in an expansive park landscape, lies a monument that embodies the splendor, the power, and the cultural wealth of the Habsburg monarchy like almost no other: Schönbrunn Palace. It is far more than just a tourist attraction or an architectural masterpiece of the [Baroque and Rococo] [== thorough]. Schönbrunn is a history book in stone [cant write well creatively] that tells of everything from the most intimate moments of the imperial family to world-historical turning points. As a UNESCO World Heritage Site and Austria's most-visited sight, it draws millions of people under its spell every year. But to grasp the true significance of this place, one must look [behind the facade of the "Schönbrunner Gelb" (Schönbrunn yellow)] [interesting phrasing, but also one of the most obvious logically connected phrasings you would go for - as a human] and walk through the centuries that have shaped this place.
From the Katterburg to the Imperial Palace
The history of Schönbrunn begins long before the construction of today's palace. In the 14th century the grounds held the "Katterburg", an estate owned by Klosterneuburg Abbey. Only in 1569 did the area pass into Habsburg hands, when Emperor Maximilian II bought it to lay out a menagerie for exotic animals and pheasantries [again, a doubling with a high correlation, probably a hallucination (Hellbrunn, not Schönbrunn), but unsure]. The name "Schönbrunn" itself goes back to a legend: Emperor Matthias is said to have discovered a spring while hunting in 1612, which he called a "schöner Brunnen" (beautiful spring). [This spring supplied the court with water for a long time] and gave the later palace its name. [Again it wants to stick to what seem like logic-chain correlations.]
I'm mostly miffed that I have to come up with more hallucination test questions based on obscure facts now.. ;)
In my own experiments, when using it to review and improve a white paper for a software project with attached source code files for context, it performed much worse in terms of instruction following than GPT 5.1. It shortened the given whitepaper instead of improving it, whereas GPT 5.1 followed the given instructions flawlessly and in general had a better feel for the right style to use.
I do love ARC-AGI-2, but as current techniques show, ARC performance can come from pre-processing techniques (tools) rather than being purely a signal of the LLM's strength. Gemini 3 (I claim) must be using internal tools to reach these numbers; it would be groundbreaking if this were even remotely possible purely through prompt-authoring techniques. Sure, I AGREE that it's still a big deal in absolute terms, but I just wanted to point out that these tools could be ported to Gemini 2.5 to improve its ARC-like authoring skills. Call it Gemini 2.6 at a cheaper price tier.
Doing testing; so far, chess skills and vision show major improvements. Will see about the rest as the more time-consuming test results come in, but it looks very promising. Looks to be a true improvement over 2.5.
And starting today, we’re shipping Gemini at the scale of Google. That includes Gemini 3 in AI Mode in Search with more complex reasoning and new dynamic experiences. This is the first time we are shipping Gemini in Search on day one. Gemini 3 is also coming today to the Gemini app, to developers in AI Studio and Vertex AI, and in our new agentic development platform, Google Antigravity — more below.
Looks like that Ironwood deployment is going well.
lordpuddingcup@reddit
I'm sorry!
Gemini Antigravity...
CYTR_@reddit
This IDE looks very interesting. I hope to see an open-source version fairly soon 🥸
SunItchy8067@reddit
lol
CYTR_@reddit
Update: It's crap.
Reason_He_Wins_Again@reddit
lol my man. Thanks
teasy959275@reddit
thank you for your sacrifice
Mcqwerty197@reddit
After 3 requests on Gemini 3 (High) I hit the quota… I don't call that generous.
integer_32@reddit
Same, but you should be able to switch to Low, which has much higher limits.
lordpuddingcup@reddit
I mean the limits reset every 5 hours apparently
ResidentPositive4122@reddit
It's day one, one hour into the launch... They're probably slammed right now. Give it a few days would be my guess.
Expert_Driver_3616@reddit
Yup. If there's any company I trust with generous rate limits, it's Google. These guys gave me $300 in cloud credits to start my business, which is now making $6k MRR after 5 months. I can masturbate with the Google logo now.
Reddit1396@reddit
What’s your business? You hiring?
AlphaPrime90@reddit
Could you share how to get the $300 credit?
Crowley-Barns@reddit
Go to cloud.google.com or aistudio.google.com and click around until you make a billing account. They give everyone $300. They'll give you $2k if you put a bit of effort in (make a website and answer the phone when they call you.)
AWS and Microsoft give $5k for similar.
(Unfortunately Google is WAY better for my use case so I’m burning real money on Google now while trying to chip away at Anthropic through AWS and mega-censored OpenAI through Azure.)
ArseneGroup@reddit
Dang I gotta make good use of my credits before they expire. Done some decent stuff with them but the full $300 credit is a lot to use up
CryptoSpecialAgent@reddit
You're lucky; I hit the quota during the initial setup, right after logging in to my Google account lol. It just hangs, and others are having the same problem. Google WAY underestimated the popularity of this product when they announced it as part of the Gemini 3 promo.
c00pdwg@reddit
How’d it do though?
Mcqwerty197@reddit
It's quite a step up from 2.5. I'd say it's very competitive with Sonnet 4.5 for now.
lordpuddingcup@reddit
Quota or backend congestion
Mine says the backend is congested and to try later
They likely underestimated shit again lol
TheLexoPlexx@reddit
Lads, you know what to do.
lordpuddingcup@reddit
already shifted to trying it out LOL, let's hope we get a way to record token counts and usage to see what the limits look like
TheLexoPlexx@reddit
Downloading right now. Not very quick on the train unfortunately.
lordpuddingcup@reddit
WOW, I just asked it to review my project and instead of just some text, it produced an artifact with a full fuckin' report that you can make notes on and send back to it for further review. Wow. Cursor and the others are in trouble, I think.
TheLexoPlexx@reddit
I asked it a single question and got "model quota limit reached" while not even answering the question in the first place.
lordpuddingcup@reddit
I think they're getting destroyed on usage from the launch. I got one big nice report out, went to submit the notes I made on it back, and got an error: "Agent execution terminated due to model provider overload. Please try again later." ... seems they're overloaded AF lol
TheLexoPlexx@reddit
Yeah, same for me. Too bad.
cobalt1137@reddit
This is so cool. The future is going to be so so strange and interesting.
Recoil42@reddit
https://antigravity.google/docs/plans
OldEffective9726@reddit
why? is it opensource?
_wsgeorge@reddit
No, but it's a new SOTA that open models can aim to beat. Plus there's a chance Gemma will see these improvements. I'm personally excited.
Zemanyak@reddit
Google, please give us a 8-14B Gemma 4 model with this kind of leap.
dampflokfreund@reddit
38B MoE with 5-8B activated parameters would be amazing.
a_beautiful_rhind@reddit
200b, 38b active. :P
TastyStatistician@reddit
420B-A69B
PotaroMax@reddit
Nice
DealingWithIt202s@reddit
This guy infers.
arman-d0e@reddit
666B-A0.1B
lemondrops9@reddit
Sorry but 666 isn't allowed or the dark lord will come.
layer4down@reddit
69B-A2m
allSynthetic@reddit
420?
BalorNG@reddit
69B 420M active
Actually sounds kind of legit
allSynthetic@reddit
Let's call it Blue 96b-420m
Cool-Chemical-5629@reddit
67B
mxforest@reddit
This guy right here trying to fast track singularity.
smahs9@reddit
That magic number is the 42 of AGI
_raydeStar@reddit
Stop, I can only get so erect.
For real though, I think 2x the size of qwen might be absolutely perfect on my 4090.
ForsookComparison@reddit
More models like Qwen3-Next 80B would be great.
Performance of ~32B models running at light speed
chriskevini@reddit
Me crying with my 4GB VRAM laptop. Anyway, can you recommend a model that fits in 4GB and is better than Qwen3 4B?
Fox-Lopsided@reddit
Qwen3-4B-2507 Thinking is the best one
ForsookComparison@reddit
A later update of Qwen3-4B if there is one (it may have gotten a 2507 version?)
tomakorea@reddit
30B please
AyraWinla@reddit
Gemma 3 4b is still the best model of all time for me; a Gemma 4 3b is my biggest hope.
Mescallan@reddit
me too, crazy how performant it is for its size even after all this time.
Mescallan@reddit
4b plzzzzzzzzzz
Caffdy@reddit
120B MoE in MXFP4
ResidentPositive4122@reddit
Their Antigravity VS Code clone uses gpt-oss-120b as one of the available models, so that would be an interesting sweet spot for a new Gemma, specifically one post-trained for code. Here's to hoping, anyway.
huluobohua@reddit
Does anyone know if you can add an API key to Antigravity to get past the limits?
CryptoSpecialAgent@reddit
the antigravity vscode clone is also impossible to sign up for right now... there's a whole thread on reddit about it which i can't find but many people can't get past the authentication stage in the initial setup. did it actually work for you or you just been reading about it?
ResidentPositive4122@reddit
Haven't tried it yet, no. I saw some screenshots of what models you can access. They have gemini3 (high, low), sonnet 4.5 (+thinking) and gpt-oss-120b (medium).
FlamaVadim@reddit
can you explain it? how is it possible that google is giving access to gpt-oss-120b?
CryptoSpecialAgent@reddit
it's an open-source model, so anyone can download it, serve it, and offer access to customers, whether through an app or directly as an API...
FlamaVadim@reddit
I understand now :) Funny that they brought in a competing product for this task. But Gemma 3 is a bit outdated.
Crowley-Barns@reddit
It’s open source. You can offer it to people for free if you’ve got the compute idling away too :)
ResidentPositive4122@reddit
Running in vertex I would presume. Same w/ sonnet.
FlamaVadim@reddit
I see now! It's about cloud services. Thanks for the clarification!
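The "serve it yourself" point in this exchange can be sketched: because the weights are open, anyone with idle compute can host gpt-oss-120b behind an OpenAI-compatible endpoint (vLLM and llama.cpp's server both expose one) and offer access. A hedged sketch, where the URL and model name are placeholders for whatever you actually deploy:

```python
import json
import urllib.request

# Hypothetical self-hosted endpoint; vLLM serves this API shape by default.
BASE_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "gpt-oss-120b",  # whatever name your server registered
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}
request = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment once a server is actually running:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same client code works against Vertex-hosted or self-hosted deployments; only the base URL and credentials change.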
FlamaVadim@reddit
I've used Brave and it worked. I think it's an issue with Chrome.
Birdinhandandbush@reddit
I just saw 3 is now default on my Gemini app, so yeah the very next thing I did was check if Gemma 4 models were dropping too. But no
Salt-Advertising-939@reddit
the last release was very underwhelming, so I sadly don't have my hopes up for Gemma 4. But I'd be happy to be wrong here.
shouryannikam@reddit
Google!! Give me an 8B Gemma 4 and my life is yours!!
InevitableWay6104@reddit
MOE would be super great.
vision + tool calling + reasoning + MOE would be ideal imo
ttkciar@reddit
Models in 12B, 27B, and 49B would be perfect :-)
StableLlama@reddit
Hm, the CEO said what the big achievements of Gemini 1 and Gemini 2 were. But none of Gemini 3.
So, what are the major steps that make this a full new version?
I'm sure it's a good model and better than those before. But so far no information was given about the big step it promises.
harlekinrains@reddit
More or less this: https://www.derstandard.at/story/3000000296969/gemini-3-ist-da-google-verspricht-grossen-leistungssprung-fuer-kuenstliche-intelligenz
(PR video linked in there as well - but the article is good enough to run through a translator. :) )
StableLlama@reddit
That text is basically the information from the google blog.
And it also states that it's "just" an evolution and not a revolution. That's not bad; actually, it's great when good tools get a polish to become even better. But the CEO's first sentences raised the expectation that Gemini 3 is a revolution, just as Gemini 1 and Gemini 2 were.
kvothe5688@reddit
Gemini 3 can generate UI on the fly. That will be present in the Gemini app and in AI Mode in Search. If you want to learn about a concept, it will generate UI, images, and text together to explain the topic. I think we first learned back in the Gemini 2.0 era that they were working on something like this, but they never released it.
harlekinrains@reddit
Here is the standout PR blurb: https://www.youtube.com/watch?v=rq-2i1blAlU (NYT interviewing Josh Woodward, vice president of Google Labs and Google Gemini)
harlekinrains@reddit
Jep, their PR blurb mentions nothing specific. Article also illustrates what some of the benchmarks mean.
somealusta@reddit
I will subscribe to this certainly but give me also gemma4 27B with vision.
martinerous@reddit
Let's have a drink every time when a new model announcement mentions state-of-the-art :)
On a more serious note, I'm somehow happy for Google.... as long as they keep Gemma alive too. Still, I expected to see more innovations in Gemini 3. Judging from their article, it seems just a gradual evolution and nothing majorly new, if I'm not mistaken?
zenmagnets@reddit
It just got 100% in a test on the public SimpleBench data. For context, here are scores from local models I've tested on the same data:
Fits on a 5090:
- 33% - GPT-OSS-20b
- 37% - Qwen3-32b-Q4-UD
- 29% - Qwen3-coder-30b-a3b-instruct
Fits on a MacBook (or RTX 6000 Pro):
- 48% - qwen3-next-80b-q6
- 40% - GPT-OSS-120b
JsThiago5@reddit
How did you run qwen next?
apocalypsedg@reddit
100% shouldn't scream "massive leap" so much as training contamination.
Flaky_Pay_2367@reddit
But Google Bro doesn't bother to fix ugly slow 4ss Gemini CLI :(
For me AmpCode, OpenCode or Claude are all much more snappy
AmpCode even runs on my broke Termux on Android :D
JsThiago5@reddit
One guy used Gemini to one-shot a copy of the ChatGPT front-end. So if they don't do it, it's because they don't want to lol
Ummite69@reddit
Why don't they compare it to Grok?
Tai9ch@reddit
OpenAI, Google, and Anthropic really like pretending that Grok doesn't exist.
marcoc2@reddit
everyone should do that
giant3@reddit
Why dude?
dictionizzle@reddit
google Linda Yaccarino
giant3@reddit
I know the lady, but how is she related to grok?
dictionizzle@reddit
https://www.bbc.com/news/articles/cx2gy3j9xq6o
findingsubtext@reddit
She was the CEO of X (Twitter), and people associate Grok with X because it's made by xAI (a scheme by Elon Musk to inflate the value of his ailing Twitter platform). Grok has done things such as publicly identify itself as Hitler, and not as a one-off hallucination, but consistently for everyone.
the_mighty_skeetadon@reddit
Literally released <24h ago and benchmarks can't be independently verified on it yet.
richardathome@reddit
First impressions. It's not as good as DeepSeek.
Second impressions: It's not very good at all :-(
dadidutdut@reddit
I did some tests and it's miles ahead on the complex prompts I use for testing. Let's wait and see the benchmarks.
InterstellarReddit@reddit
That complex testing: “how many “r” are there in hippopotamus”
loganecolss@reddit
to my surprise,
Independent-Fig-5006@reddit
I use https://aistudio.google.com/.
gemstonexx@reddit
google ai studio
Ugiwa@reddit
Holy hell!
Normal-Ad-7114@reddit
r/anarchyllama
TOO_MUCH_BRAVERY@reddit
new model just dropped
the_mighty_skeetadon@reddit
Naw Gemini 3 Pro gets it right first try.
loganecolss@reddit
current GPT5 also gets it right at the first try.
InterstellarReddit@reddit
So I see Gemini 3 on the web, but when I go to the app on my iPhone it's 2.5, so I guess it's still rolling out.
ken107@reddit
it's a deceptively simple question that seems like there's intuition for it, but it really requires thinking. If a model spits out an answer right away, it didn't think about it. Thinking here requires breaking the word into individual letters and going through them one by one with a counter. Actually fairly intensive mental work.
InterstellarReddit@reddit
I think it's funny though that I built a Python script to solve this. If you really think about it, we eyeball it, but intellectually aren't we building a script in our heads as well?
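A script of the kind the commenter describes is a few lines: a counter walking the word letter by letter, exactly the "intensive mental work" described above (this is a sketch, not the commenter's actual code):

```python
def count_letter(word: str, letter: str) -> int:
    """Walk the word letter by letter, incrementing a counter on matches."""
    counter = 0
    for ch in word:
        if ch.lower() == letter.lower():
            counter += 1
    return counter

print(count_letter("hippopotamus", "r"))  # 0 - there is no "r" in it
print(count_letter("strawberry", "r"))    # 3
```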
ken107@reddit
Actually, when we eyeball it we're using our VLM. The model has three methods to solve this: reason through it step by step, letter by letter; write a script to solve the problem; or generate an image (visualize it) and use a VLM. We as humans have these three choices as well. Models probably need to be trained to figure out which method best solves a particular problem.
chriskevini@reddit
4th option: aural? In my stream of thought, the "r" sound isn't present in "hippopotamus".
astraeasan@reddit
Actually kinda funny
InterstellarReddit@reddit
This is what my coworkers do to make it seem like they’re busy solving an easy problem.
Environmental-Metal9@reddit
There are 3 r's in hippopotamus:
h
i
p <- first r
p <- second r
o
p <- third r
o
t
a
m
u
s
zungesolang@reddit
how many “r” are there in hippopotamus
11/18/2025 2:33PM EST
The word "hippopotamus" has two "r"s. 🐘
They are in the second and fifth syllables: hippo-po-ta-r-mus.
InterstellarReddit@reddit
Lmao bro u imagine this stumps Gemini 3
Robert__Sinclair@reddit
Impressive reasoning. I just hope they won't soon dumb it down as they did before.
findingsubtext@reddit
I am once again begging for a Gemma-4 preferably with a 40-70b variant 🙏
idczar@reddit
is there a comparable local llm model to this?
Dry-Marionberry-1986@reddit
local models will forever lag one generation behind in capability and one eternity ahead in freedom
Scotty_tha_boi007@reddit
Until the bleeding-edge models hit a wall, which I am now realizing may never happen.
jamaalwakamaal@reddit
sets a timer for 3 months
nmkd@reddit
More like 1.5 years
Interesting8547@reddit
It will be sooner... 1.5 years in this space is forever.
Frank_JWilson@reddit
That's optimistic. Sadly I don't even have an open source model I like better than 2.5 Pro yet.
Antiwhippy@reddit
Really? I find Gemini terrible for coding honestly.
ForsookComparison@reddit
If we're being totally honest with ourselves, open-source models are between Claude Sonnet 3.5 and 3.7 tier, which is phenomenal, but there is a very real gap there.
True_Requirement_891@reddit
Exactly... 2.5 Pro was and is something else and only 3 can beat it.
MembershipQueasy7435@reddit
!RemindMe 3 months
RemindMeBot@reddit
I will be messaging you in 3 months on 2026-02-18 18:34:14 UTC to remind you of this link
Interesting8547@reddit
Soon, don't worry all local models are cooking...
a_beautiful_rhind@reddit
Kimi, deepseek.
huffalump1@reddit
And GLM 4.6 if/when the weights are released.
I wouldn't say comparable to Gemini 3.0 Pro, but in the neighborhood of 2.5 Pro for many tasks is reasonable.
FlamaVadim@reddit
how?
allinasecond@reddit
lol
No_Conversation9561@reddit
I think the gap just got wider
fab_space@reddit
I tested Antigravity and it performed like a dud.
I ended up on Sonnet there, and within a couple of minutes the high load made it unusable. A non-happy ending.
Mental_Ice6435@reddit
So when singularity?
dahara111@reddit
I'm not sure if it's because of thinking tokens, but has anyone noticed that Gemini's prices are insanely high?
Also, Google won't tell me the cost per API call even when I ask.
Johnny_Rell@reddit
Output is $18 per 1M tokens. Yeah... no.
InterstellarReddit@reddit
Bro, you're not AI rich. The new rich is not people in Lamborghinis and G5 airplanes; the new rich are people spending billions of dollars of tokens while I sleep on the floor of my apartment.
Normal-Ad-7114@reddit
Reminds me of crypto craze days and endless to-the-moon-bros
Final_Wheel_7486@reddit
Uuh... where did you get this from?
Johnny_Rell@reddit
Final_Wheel_7486@reddit
Well, that's for >200k tokens processed. That's mostly not the case, maybe just for long-horizon coding stuff. Claude Sonnet is even more expensive ($22.50/M output tokens after 200k tokens) and still everybody uses it. Now we have Gemini 3, which is a better all-rounder, so this still seems very reasonable.
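To put the numbers quoted in this exchange side by side (these are the commenters' figures for the long-context output tier, not verified official pricing):

```python
# Commenters' quoted output prices (USD per 1M tokens, >200k-token tier).
GEMINI3_PER_M = 18.00
SONNET_PER_M = 22.50

def output_cost(tokens: int, price_per_m: float) -> float:
    """Cost in USD for a given number of output tokens at a per-million rate."""
    return tokens / 1_000_000 * price_per_m

tokens = 500_000  # half a million output tokens
print(f"Gemini 3: ${output_cost(tokens, GEMINI3_PER_M):.2f}")  # $9.00
print(f"Sonnet:   ${output_cost(tokens, SONNET_PER_M):.2f}")   # $11.25
```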
featherless_fiend@reddit
what's the monthly subscription cost?
pier4r@reddit
when you have no competitors, it makes sense.
ForsookComparison@reddit
Unless you're Opus where you lose to competitors and even your own company's models, and charge $85/1M for some reason
Clear_Anything1232@reddit
It's $12
Final_Wheel_7486@reddit
Which is totally reasonable pricing for a SOTA model and in line with 2.5 Pro
doomed151@reddit
Since this is r/LocalLLaMA, anybody found the download link yet?
CheatCodesOfLife@reddit
Need to wait for deepseek-r2 for the link
Science_Bitch_962@reddit
Research power just proved Google is still miles ahead of OpenAI. A few missed steps at the start made them lose the majority of the market share, but in the long run they will gain it back.
rulerofthehell@reddit
Why do they only show open-source benchmark comparisons against GPT and Claude and not compare with GLM, Kimi, Qwen, etc.?
cnydox@reddit
Because researchers are chart criminals
ddxv@reddit
The open-source models are a threat to their valuations. Can't have people realizing how close free and DIY are. Sure, they're behind, but they're still there.
tenacity1028@reddit
Then you’ll see a much larger gap, they’re not competing with each other.
Equivalent_Cut_5845@reddit
Because open models are still worse than proprietary models.
And also because open models aren't direct competitors to them.
rulerofthehell@reddit
These are research benchmarks that they quote in research papers, and these open-source models have very good numbers on them.
We can argue that the benchmarks are flawed, sure, in which case why even use them?
PDXSonic@reddit
Found the person who bet $78k it’d be released in November 🤣
ForsookComparison@reddit
They already work at Google so it's not like they needed the money
pier4r@reddit
couldn't that be insider trading?
MysteriousPayment536@reddit
polymarket isn't regulated and uses crypto wallets
Spiveym1@reddit
who the fuck cares? Insider trading is technically against the Terms of Service of all prediction market platforms.
ForsookComparison@reddit
Impossible. They watch a mandatory corporate-training video in a browser flash-player once per year.
valhalla257@reddit
I worked at a company that made everyone watch a video on export control laws.
The company got fined $300m for violating export control laws.
zulu02@reddit
These videos even detect when they are covered by other windows; management thought of everything!
ForsookComparison@reddit
Lol my company bought that package this year. Jerks.
rm-rf-rm@reddit
you mean a poorly paid actor from some 3rd party vendor
bluehands@reddit
Only for now.
Soon it will be an AI video generated individually for each person watching to algorithmically guarantee attention & follow through by the ~~victims~~ employees.
ForsookComparison@reddit
The big companies film their own but pay the vendors for the clicky slideshow
qroshan@reddit
Extremely dumb take (but par for Reddit, as it has high upvotes).
Insider trading only applies to securities and is enforced by the SEC.
The SEC has no power over prediction markets.
Philosophically, the whole point of a prediction market is that "insiders trade" and surface information to the benefit of the public. Yes, there are certain "sabotage" incentives for the bettors. But ideally there are laws to protect against that.
ForsookComparison@reddit
My not-a-lawyer dumbass take is that this is correct, but that it's basically just as bad for your employer, because you're making them walk an extremely high-risk line every time you do this.
qroshan@reddit
Do you know how an organization starts to rot?
() - Actual Bad/Illegal Thing.
( () ) - What your average employee / HR thinks is a Bad/Illegal thing
( ( () ) ) - What you tell your colleagues is a Bad/Illegal thing, to cover your ass
Pretty soon the whole organization is filled with "we don't do that here" and covers its ass.
Then a startup comes and eats your lunch.
So, if you want to always stay competitive, prevent organizational rot and let your employees take bold risks that are actually legal.
ForsookComparison@reddit
Gambling on things they have control over by means of access granted to them by the company will change their behavior towards public releases. This particular use case is an awful example of this.
qroshan@reddit
Everything in life is a trade off.
If you hire the kind of people who want to build and ship stuff, they wouldn't be engaged in silly low-agency shit like sabotaging the launch.
A close-knit, high-agency team can easily sniff out who is actively sabotaging launches.
ForsookComparison@reddit
Don't run the risk. If they're betting on things they own or have elevated access to internally, fast-track them to be replaced/fired ASAP. Doesn't matter if it's legal.
GottBigBalls@reddit
Insider trading is only for securities, not Polymarket bets.
hacker_backup@reddit
That would be like me taking bets on whether I'll take a shit today, you betting money that I will, and others getting mad because I have an unfair advantage on the bet.
KrayziePidgeon@reddit
The family of the US president blatantly rigs predictions on Polymarket on the regular for hundreds of millions; this is nothing.
AffectSouthern9894@reddit
No. They’re not trading, they are betting. Is it trashy? Yeah. Is it illegal? Depends. Probably not.
hayden0103@reddit
Probably. No one will do anything about it.
usernameplshere@reddit
Would love to see Gemma 4 as well.
Fearless-Intern-2344@reddit
+1, Gemma 3 has been great
lorddumpy@reddit
After the Marsha Blackburn debacle, I wouldn't hold my breath.
ttkciar@reddit
Yes! If Google holds to their previous pattern, we should see Gemma 4 in a couple of months or so. Looking forward to it :-)
tarruda@reddit
Hopefully a 150-200B MoE with 5-15B active parameters
haagch@reddit
Are we still not even giving flairs or tags to proprietary api-only models in /r/LocalLLaMA?
entsnack@reddit
😂
halcyonPomegranate@reddit
In my own experiments, using it to review and improve a white paper for a software project (with the source code files attached for context), it was much worse at instruction following than GPT 5.1. It shortened the white paper instead of improving it, whereas GPT 5.1 followed the given instructions flawlessly and in general had a better feel for the right style to use.
Conscious_Cut_6144@reddit
This is the first model to noticeably outperform o1-preview in my testing.
WinterPurple73@reddit
Insane leap on the ARC AGI 2 benchmark.
jadbox@reddit
I do love ARC AGI 2, but as current techniques show, ARC performance can come from pre-processing techniques (tools) rather than purely from the strength of the LLM itself. Gemini 3, I claim, must be using internal tools to reach these numbers; it would be groundbreaking if this were even remotely possible purely through prompt authoring. Sure, I AGREE that it's still a big deal in absolute terms, but I just wanted to point out that these tools could presumably be ported to Gemini 2.5 to improve its ARC-style skills. Call it Gemini 2.6 at a cheaper price tier.
policyweb@reddit
virtualmnemonic@reddit
Needs more jpeg
the_mighty_skeetadon@reddit
Now that's a tasty treat for your cake day! Happy cake-day-ing!
Kubas_inko@reddit
Not surprised given that some insider bet on it releasing before November 22.
johnerp@reddit
Deep research delayed, sounds like they really wanted it out there - I’m with you!
PangurBanTheCat@reddit
Using the Preview on OpenRouter. It's terrible for RP. If the actual model is this locked-down, then I'll stick to 2.5 Pro.
dubesor86@reddit
Doing testing; thus far, chess skills and vision have gotten major improvements. Will see about the rest as the more time-consuming test results come in, but it looks very promising. Looks to be a true improvement over 2.5.
vogelvogelvogelvogel@reddit
yo what 2.5 pro was already top notch
Cool-Chemical-5629@reddit
GGUF when? 😏
harlekinrains@reddit
Simple QA verified:
Gpt-Oss-120b: 13.1%
Gemini 3 Pro Preview: 72.1%
Slam, bam, thank you ma'am. ;)
https://www.kaggle.com/benchmarks/deepmind/simpleqa-verified
Aggravating-Age-1858@reddit
WITHOUT nano banana pro it seems tho
:-(
when I try to get it to output a picture, it won't.
that really sucks. i hope nano banana pro comes out soon; they should have launched it together with Gemini 3.
yaboyyoungairvent@reddit
seems like they'll be rolling out the new nano banana in a couple of weeks or so, based on a promo vid they put out.
Nordic-Squirrel@reddit
It do be reasoning
thatguyinline@reddit
gemini-embedding-002?
Robert__Sinclair@reddit
impressive reasoning!
ucefkh@reddit
This is awesome
genxt@reddit
Any update on nano banana pro/2?
harlekinrains@reddit
Really good on my hallucination test questions based on arcane literary knowledge. As in, it aced 2 out of 3. Without web search.
Seeking feedback, how did it do on yours?
Kafke@reddit
No flash? 🤨
ForsookComparison@reddit
C'mon Deepseek, smack this project manager
procgen@reddit
China BTFO??
pier4r@reddit
MathArena Apex seems incredible.
fathergrigori54@reddit
Here's hoping they fixed the major issues that started cropping up with 2.5, like the context breakdowns etc
True_Requirement_891@reddit
They'll quantise it in a few weeks or months and then you'll see the same drop again.
Remember it's a preview.
_BreakingGood_@reddit
Wow, OpenAI literally in shambles. Probably hitting the fast-forward button on that $1 trillion IPO
Recoil42@reddit
Looks like that Ironwood deployment is going well.
abol3z@reddit
Damn, just in time. I just finished optimizing my RAG pipeline on the Gemini 2.5 family, and I won't complain if I get a performance boost for free!!
SrijSriv211@reddit
It was totally out of the blue for me!!