TheaterFire

Wow anthropic and Google losing coding share bc of qwen 3 coder

Posted by Independent-Wind4462@reddit | LocalLLaMA | View on Reddit | 128 comments

Wow anthropic and Google losing coding share bc of qwen 3 coder

Reply to Post

128 Comments

Melodic_Reality_646@reddit

hmmm someone pointed out that people are more likely to consume closed model using official apis. And it makes sense that enthusiasts will go for open router to try qwen exclusively. So we’re really only seeing part of the picture here. Growth on official apis probably more than compensates this here, folds…
View on Reddit #64394019

entsnack@reddit

Also ironic that /r/LocalLLaMa is essentially /r/RemoteLLaMa when it comes to useful models. https://preview.redd.it/9y26p47n8ljf1.jpeg?width=1105&format=pjpg&auto=webp&s=03f836e5e95826a6d589e552f1db73ffa96460d1
View on Reddit #64394648

ortegaalfredo@reddit

I run GLM-4.5 Locally, on GPUs at Q4, fast. Yes, it gets hot in here.
View on Reddit #64413576

entsnack@reddit

Lesgoo! Is there much of an overlap between /r/homelab and here? Seems like they're still working on downloading the internet.
View on Reddit #64416942

DealingWithIt202s@reddit

Sounds like sweet sweet training data to me
View on Reddit #64956448

GuildCalamitousNtent@reddit

I’m curious what’s the stack to do this.
View on Reddit #64418235

No_Afternoon_4260@reddit

Vllm
View on Reddit #64427873

GuildCalamitousNtent@reddit

🤦🏻‍♂️ he said that, I meant his full setup (hardware included).
View on Reddit #64434959

No_Afternoon_4260@reddit

Sry was thinking software stack
View on Reddit #64435319

ortegaalfredo@reddit

A stack of 12x3090
View on Reddit #64433576

Commercial-Celery769@reddit

2x 3090's in a room makes it very toasty 
View on Reddit #64425230

Western_Objective209@reddit

If a professional camera cost $50k to own but you could rent a camera for less then a penny per photo I imagine not a lot of photographers would own cameras
View on Reddit #64423205

entsnack@reddit

I'm talking about /r/photography not photographers. You can also apply this to /r/audiophile, another expensive hobby community. The ones who cant stomach it go to /r/budgetaudiophile instead of posting their budget builds on /r/audiophile.
View on Reddit #64423814

ttkciar@reddit

You're kind of being an ass, and as far as I can tell it's entirely gratuitous.
View on Reddit #64426794

entsnack@reddit

I think people should rent GPUs on Runpod like the folks at /r/stablediffusion do, not use sketchy Openrouter APIs and complain about being underserved. But somehow Openrouter has become the go-to here.
View on Reddit #64439807

Western_Objective209@reddit

Yeah I'm just talking about the economics of renting vs buying. I jumped through the stupid signup hoops for the first llama release to run it locally, kept up with llama.cpp for a while, and it's just hard to justify when my 3k computer with 32GB of VRAM can hardly run anything yet I can get a million tokens for $1. Working on LLMs is not particularly expensive, but the price goes up a couple orders of magnitude if you want to own the equipment, and it's not immediately obvious that there's any benefit to doing it. Even if you just rent full VMs with nvidia data center cards, it's so cheap compared to buying
View on Reddit #64427978

Lissanro@reddit

I consider R1 0528 and Kimi K2 useful model, and I run them locally daily (IQ4 quants with ik\_llama.cpp).
View on Reddit #64401403

Any_Pressure4251@reddit

This!? You would be stupid to use Open Router for anything other than tests, but there are much cheaper options for Enterprise and Enthusiasts.
View on Reddit #64394647

Specter_Origin@reddit

How do you use official api's considering they have very low usage limits, while open-router has unlimited...
View on Reddit #64399221

Ansible32@reddit

The official APIs you can pay for dollars per million tokens. If openrouter is unlimited they're probably using the models that are not as good and cost pennies per million tokens.
View on Reddit #64402384

Specter_Origin@reddit

Lol, that is not how that works, the official API's even after you pay per dollars have cap on how many request per day and they have tier limits (please read official api documentations, what I say it true for gemini, chatGpt and Claude) Also "using the models that are not as good and cost pennies per million tokens" is not true as you can chose anthropic or OpenAI as provider for their own models and you are being served by OpenAI and Anthropic...
View on Reddit #64402568

Ansible32@reddit

Google Vertex quotes like 2 requests per second on the low end, some things are higher. That's... quite a lot and I really don't know what you're doing that 2 RPS is a problem. https://cloud.google.com/vertex-ai/generative-ai/docs/dynamic-shared-quota
View on Reddit #64402988

Former-Ad-5757@reddit

2 reqs a sec is not a lot, it is practically nothing. 2 reqs a sec seems only a lot if you are doing it manually, use API and it is nothing. Practically it is not a real problem either if you have to set up you workflow first, just try the workflow and your dsq goes up and up and up. It is only a real problem if you want to switch providers and just change a single prompt.
View on Reddit #64447222

Ansible32@reddit

yeah, sure, calling the API in a loop is trivial. That doesn't mean you're doing something that warrants that much usage, and again, it costs $$. If you are actually happy spending that much money they will accommodate you, but at 2RPS you could spend $200 in a minute, the idea that they should support the kind of traffic you want all-you-can-eat is absurd.
View on Reddit #64448066

Former-Ad-5757@reddit

you could spend $200 in a minute? How? Just sending a 1M context won't get you best or even good results. I mainly see people have millions of q's which can be expressed in 2k or 4k. And with API you are not talking about all-you-can-eat at least for the api's I know.
View on Reddit #64448505

Ansible32@reddit

Gemini 2.5 Pro is $10/200K output tokens, which includes thinking. A 10K token query can easily eat 20K output tokens, so that's like 2.4M output tokens if you're doing 2RPS. Which is $120/minute. But higher is certainly possible. And you're not talking about asking questions, you're talking about a collection of automated models that are sending a bunch of data scattershot
View on Reddit #64468306

Former-Ad-5757@reddit

I don't know who you are paying, but for the rest of the world it is $ 10 or $ 15 / 1 M tokens. So basically 5 times less, so basically not $120/min but more like $24/minute. $24 is a far distance away from your claimed $200. But as you say : all your numbers are just numbers you throw out there, they have no base in any reality.
View on Reddit #64472134

Specter_Origin@reddit

If you have ever done tool use via any of the coding tools, like cline, roo code, cascade etc they will consume this limits like a chump change.
View on Reddit #64403176

Ansible32@reddit

If it's hitting the limits on Gemini 2.5 Pro I would be more worried about the bill.
View on Reddit #64403359

agentzappo@reddit

I don’t understand your logic here. Why is it stupid to use OR if you’re using paid endpoints that don’t retain your data? Speaking from a convenience standpoint, I’ve found it’s much easier to issue OR tokens to my teams so I can monitor cost per person/project and allow them access to all of the commercially-available models
View on Reddit #64395347

Ansible32@reddit

You're maximizing the likelihood that someone is retaining your data and not telling you. And most (all?) of the closed models straight-up say they review every thing you write for malicious content and will store and review everything at their discretion, so generally speaking you should assume anything you send over these things is not private.
View on Reddit #64396926

No_Efficiency_1144@reddit

Official Azure, AWS and GCP endpoints are widely considered secure but nowhere else.
View on Reddit #64398117

Ansible32@reddit

What is considered secure has only a passing relationship to what is actually secure. The question with security though is, secure against whom? With the AI models this is evolving so fast it's very hard to be sure that's what's true today will be true tomorrow.
View on Reddit #64399216

ciaguyforeal@reddit

theyre secure in the sense that they are already-bitten bullets. theyve already entangled themselves with microsoft, so whats the difference, would be the thinking. not that its 'more secure' but that its inside your existing security relationships.
View on Reddit #64399730

Ansible32@reddit

Sure, yes, using a single cloud in a business context makes sense. OP was talking about OpenRouter and using everyone and everyone who says "Just trust me bro."
View on Reddit #64402100

ciaguyforeal@reddit

definitely agree you cant just default trust open router. they could be doing anything.
View on Reddit #64408266

CommunityTough1@reddit

This. People misunderstand the providers on OpenRouter labeled as "**As far as we know**, this provider doesn't log data **for training purposes**". First of all, OpenRouter has a built in disclaimer there indicating that it's not a sure thing. Secondly, it also clearly says "for training purposes", which is NOT equivalent to "no logging at all". One such provider with this label, and I'm not picking on them, is Deep Infra. The endpoint is labeled on OR with the "...no logging..." tag, but go to their privacy policy and it clearly says the data may be retained for law enforcement or other legal purposes. Just not "for training" which is all that's required to get that yeah on OR.
View on Reddit #64399157

Any_Pressure4251@reddit

Oh really so you can get a better private enterprise endpoint from Open router than the providers themselves?
View on Reddit #64397447

purplepsych@reddit

But why did anthropic share went down then?
View on Reddit #64444391

illkeepthatinmind@reddit

Yes, but that's separate from the changes within the models used by users of Open Router.
View on Reddit #64441698

o5mfiHTNsH748KVq@reddit

What are yall using to code with open router? Do you use a reverse proxy and cursor or a different tool?
View on Reddit #64395049

llmentry@reddit

I'm old-school, and I upload a JSON of the code repository, using CherryStudio as the interface. I like screening changes, and I don't like giving LLM-driven software access to my actual files. Colour me conservative :) But there plenty of agentic solutions that work with API keys, if that's your thing.
View on Reddit #64395875

unrulywind@reddit

I have been using GitHub branches as checkpoints. Save to branch > play with llm > check > correct > send stable to branch > repeat.
View on Reddit #64430836

llmentry@reddit

I of course use git for development, but I still worry that you're always just one `git branch -D main` away from disaster. I'm probably paranoid, as it clearly doesn't happen in the wild (people would be screaming if it did). But, also -- I *like* understanding and vetting every code change, otherwise it just doesn't feel like my code any more. Plus I can spot any stupid errors/bugs/assumptions the LLM has made before they happen this way. Nobody understands my codebase the way I do, not even an LLM. And it still massively increases my productivity. But, hey, I'm old-school, like I said :/
View on Reddit #64437997

x86rip@reddit

i use RooCode
View on Reddit #64395524

scragz@reddit

I was using cline 
View on Reddit #64395118

Ok_Librarian_7841@reddit

Correct but we're talking about the change herez not the absolute usage.
View on Reddit #64420397

one-wandering-mind@reddit

Yeah. This doesn't seem like it tells much. I use openrouter to play with models. My API usage is mostly Gemini these days. For Google and OpenAI , I use through their APIs directly. But then for actual use of tokens, it is either Claude 4 sonnet via Claude code or GitHub copilot that top my usage or o3 via the chatgpt app.  My openrouter usage typically has newer models and open weights models. Qwen, deepseek, gpt-oss, Gemma. Maybe 1 percent of my total usage of models is via openrouter. I'm sure there are those that use openrouter as their primary source, but I doubt that is the bulk.
View on Reddit #64400010

claythearc@reddit

I think it’s also true for the inverse where people are way less likely to use an official Chinese api so inflates open router
View on Reddit #64399632

nullmove@reddit

For my personal use it's the opposite. OpenRouter provides a layer of (pseudo)anonimity, which I am less likely to forego when it comes to big corps.
View on Reddit #64397355

MoMoneyMoStudy@reddit

Would like to see comparison of volume of usage (tokens, etc) for the LLMs for all coding use, including CLIs, Code editing GUIs, etc. Cursor alone was at an annual Sonnet API spending rate at $1Bil annually based on usage, much of that from customers using their free limit budget allowed by Cursor's subscription plans.
View on Reddit #64397279

Down_The_Rabbithole@reddit

This is true for me. I use claude at work through official API while I experiment with OpenRouter at home to test new models for a while.
View on Reddit #64395383

usernameplshere@reddit

Love to see it
View on Reddit #64624446

maikuthe1@reddit

I contributed to that lol. I've pretty much been using qwen exclusively lately. I tried it like a week or 2 ago just to see how it is and it started getting stuff done right away so I just stuck with it.
View on Reddit #64394096

Far_Buyer_7281@reddit

what language? is it any good in c++?
View on Reddit #64394221

maikuthe1@reddit

Mostly python but I run a 2d MMO that's written in c++ and I added fishing to it the other day. I wrote the basic fishing system myself and then had qwen fill in the other features of it and flesh it out and it one shotted everything and kept everything consistent with my style. Obviously not conclusive but it did very well.
View on Reddit #64394656

ParthProLegend@reddit

How do you do it? Like making a whole ahh game?
View on Reddit #64398114

maikuthe1@reddit

Umm I'm not sure what you're asking exactly. If you're asking how to make a whole game with AI: I made this game and have been working on it for years, long before ChatGPT came out, I didn't use AI to make it. I'm just now using AI to add features.  If you're asking how to make a whole game in general: you just start working on it and don't stop working on it... Gotta chug through the burnout and feature creep.
View on Reddit #64400323

ParthProLegend@reddit

Without AI. What did you learn, language framework and other skills in the process.
View on Reddit #64571348

MoMoneyMoStudy@reddit

But but Replit, bro ! Bolt, bro !!!
View on Reddit #64402339

llmentry@reddit

Well, **GPT-5 is still BYOK on Open Router,** so it's not really a fair comparison for that model. It's also not surprising that the over-priced Anthropic model would massively lose share, now that there are cheaper models that work so well. Would be interesting to see the *total* market share, though, not the relative change.
View on Reddit #64395463

Original_Alps23@reddit

You see both in the chart. Limited to OR of course.
View on Reddit #64529172

runner2012@reddit

People using anthropic use Claude Code anyway, not openrouter.
View on Reddit #64438954

RentedTuxedo@reddit

I really don’t understand the point of the byok. The whole point of open router is that I pay for access to all the models I want. Byok defeats the purpose completely. Why does it even exist?
View on Reddit #64397761

llmentry@reddit

It's OpenAI's decision, not Open Router's.  OAI has effectively said they're struggling to serve the requests they're getting as it is, so I'm not entirely surprised they're applying this.  They've done it before.  Also, I'd guess they like knowing the identity of their users, and the provider lock-in it generates. 
View on Reddit #64399238

RentedTuxedo@reddit

I’m aware it’s OpenAIs decision. Im saying it goes against the spirit of openrouter as a service in my opinion. I’m worried that it’s a trend that will continue and then we’ll be back to needing multiple different accounts and keys for each model provider because they would rather have total vendor lock in.
View on Reddit #64405726

llmentry@reddit

Hopefully not.  I think o3 was byok before this, though, so they may just feel their flagship model is "special".  It just hasn't been as much of an issue before, since 4o / 4.1 weren't regulated this way. I don't like it either :( OTOH, I've not been using OAI for inference since the requirement to permanently retain all prompts was placed on them.  I'm very happy with my current mix of models on OR (Gemini 2.5 Pro, Gemini 2.5 Flash, GPT 4.1 and GLM 4.5), plus GPT-OSS-120B, Qwen3 30B A3B and Gemma3 locally.
View on Reddit #64424178

ParthProLegend@reddit

Byok?
View on Reddit #64398059

RentedTuxedo@reddit

Bring your own key
View on Reddit #64398097

MoMoneyMoStudy@reddit

Pairs nicely w byob
View on Reddit #64401566

Specter_Origin@reddit

I agree and hope this trend does not pick up cause basically now you are bound by usage limits etc
View on Reddit #64399362

55501xx@reddit

The single payment is a convenience for sure, but I more like the ability to try a bunch of models by just changing a string. Once you load up enough money on the underlying provider, it becomes a non issue. Plus you might have some special arrangement with the underlying provider (credits, contracts) that OpenRouter wouldn’t be able to support.
View on Reddit #64398463

MoMoneyMoStudy@reddit

Cursor CEO bro now pushing BFF Sam's LLM over Sonnet for his customers. Follow the money - not always purely a tech choice, especially when a startup needs to start moving to profitability and OpenAI's investment side gig owns a lot of shares and influence. Cursor: $50OMil in ARR, $1Bil spend rate on Claude API.
View on Reddit #64400457

lanfan675@reddit

Anthropic have GOT to get their prices down. I'm willing to use Claude at work, when someone else is paying, but if it's coming out of my pocket, I'll make do with slightly worse results from any of the cheaper models. Even Gemini Pro makes a significant difference.
View on Reddit #64468265

piizeus@reddit

No, Codex CLI, Gemini-Cli, Claude Code all give direct access via their own APIs or subscriptions. I mean openrouter is not really "industry standard" for this.
View on Reddit #64465982

LiquidGunay@reddit

This can also be explained by Cursor / Claude Code / Windsurf gaining market share.
View on Reddit #64456581

balianone@reddit

That's because it's available for free over there.
View on Reddit #64395690

ParthProLegend@reddit

What is?
View on Reddit #64398266

GreenHell@reddit

Qwen3, DeepSeek, and a whole slew of other models
View on Reddit #64410209

ParthProLegend@reddit

Ohhkk thanks
View on Reddit #64453123

laserborg@reddit

how is you guys' experience with python and typescript in qwen3, GPT-5, o3, Gemini-2.5 Pro etc compared to Sonnet 4? I've heard different opinions but for me Sonnet 4 is unbeaten, never tried Claude Code and Opus 4.1 thou.
View on Reddit #64394928

MoMoneyMoStudy@reddit

Know anyone that Vibe Coded a React Native mobile app? Advice for best stack and best approaches?
View on Reddit #64398288

RageshAntony@reddit

I vibe code an entire Flutter app. Qwen 3 coder is good at Flutter. The best is Claude.
View on Reddit #64452169

oxygen_addiction@reddit

Claude all the way.
View on Reddit #64423712

brahh85@reddit

[https://github.com/QwenLM/qwen-code](https://github.com/QwenLM/qwen-code) # 🌏 Regional Free Tiers * **Mainland China**: ModelScope offers **2,000 free API calls per day** * **International**: OpenRouter provides **up to 1,000 free API calls per day** worldwide this means that qwen coder is free so people use anthropic and google models as architects, and then qwen coder for the coding the result is qwen giving people free inference in exchange of anthropic and google outputs , to make next qwen better planner and more compatible to anthropic and google outputs and the other result is anthropic and google losing income and power.
View on Reddit #64399871

Electronic-Air5728@reddit

I tried it a week ago, and it couldn't complete a single task in my small Vue.js project. Maybe it needs to be prompted in a completely different way compared to calude code.
View on Reddit #64447981

OmarBessa@reddit

Anthropic's worst nightmare
View on Reddit #64424087

No_Efficiency_1144@reddit

Why isn’t Opus there? Do people prefer Sonnet?
View on Reddit #64393857

Down_The_Rabbithole@reddit

Sonnet is actually better for coding. It's about equivalent in output but significantly faster so you can iterate quicker on whatever your workload is.
View on Reddit #64395452

mrjackspade@reddit

I guess that only matters if you need to iterate. I use opus, but then I usually only need one version of the code I'm requesting.
View on Reddit #64423310

AaronFeng47@reddit

Sonnet is cheaper 
View on Reddit #64394470

No_Efficiency_1144@reddit

Yeah but normally for code people went for the biggest model around in the past. I wonder if we have finally reached the point where we can use a smaller model. It feels unlikely as the models are still not performing that great.
View on Reddit #64394619

scragz@reddit

opus is so much more expensive it's rarely worth it. 
View on Reddit #64395152

No_Efficiency_1144@reddit

Okay I see so in this case it is a situation of the price increase being so much more than the quality increase that users are looking to maximise benefit per dollar.
View on Reddit #64395235

scragz@reddit

from what I can tell it sounds like opus is about 2x as good but 5x as expensive. it should really only be used when claude is absolutely stuck on something and you've already tried gemini and chatgpt. 
View on Reddit #64401435

MoMoneyMoStudy@reddit

Everything is a trade off between cost savings vs. time. If the paid tool and/or LLM API usage is under $100 a month but saves u at least a couple hours when factoring in accuracy, then it's a no brainer. Getting to the quantitative comparison w your choices out there is what can be hard when emotions are involved. But beware the 1 button does all Vibe coders like Replit and Bolt. YC bro Paul Graham really pushing his Replit investment on the AI buzz crowd.
View on Reddit #64399500

cyber_harsh@reddit

Is the qwen3 coder good , I didn't find it better than the claude code.
View on Reddit #64409874

LocoLanguageModel@reddit

It's great but some people don't find it better than Claude code. 
View on Reddit #64422716

Trick_Ad_4388@reddit

isn't it super obvious that it is due to claude code? nobody in they're right mind, if they are informed, will use claude models via API when you get thousands of dollars of value of API cost for the 20 dollar plan. or 5k-10k of. API value for the 200 max plan. ofc probably no one is productive with all of that "value" but it is still much much cheaper than the API for whatever they're task is. this graph only reflects this or am I missing something?
View on Reddit #64398715

svantana@reddit

Sonnet 4 is the [number one model on OpenRouter](https://openrouter.ai/rankings?view=month), so a lot of people clearly think it's worth it
View on Reddit #64422106

Trick_Ad_4388@reddit

I don't see that as clear. not everyone uses LLMs for coding. and not everyone uses claude code or knows of the value you get from it
View on Reddit #64422210

bobith5@reddit

Even beyond that, this is specifically market share just on Openrouter. It's an interesting but incomplete dataset.
View on Reddit #64400274

AppealSame4367@reddit

Good. Since Qwen Coder and GPT-5 came out Claude Opus got reliable again.
View on Reddit #64419889

vinigrae@reddit

Qwen models are highly impressive
View on Reddit #64419448

randomqhacker@reddit

All of those (aside from GPT-5) are offering free usage on OpenRouter right now. I'm sure that helps!
View on Reddit #64416543

ortegaalfredo@reddit

Tried using Qwen3-235B for roo-code but it don't work, gets confused, can't use the tools, etc. GLM-4.5-Air work perfectly but when I finally managed to get full GLM-4.5 to work it is amazing, I don't think I need any cloud AI now. I would like to run Qwen3-Coder but it's just too big.
View on Reddit #64413465

Secure_Reflection409@reddit

My top 3 models are all Qwen.
View on Reddit #64394815

silenceimpaired@reddit

Which ones are they?
View on Reddit #64402047

Secure_Reflection409@reddit

30b 2507 Thinking, 32b and 235b 2507 Thinking.
View on Reddit #64402186

silenceimpaired@reddit

What’s your quant for 235b? I ended up deleting it because I didn’t think 150gb was worth what it gave (speed/performance) compared to GLM 4.5 Air and GPT OSS 120b.
View on Reddit #64402467

Secure_Reflection409@reddit

Bartowski's IQ4.
View on Reddit #64404046

silenceimpaired@reddit

Agreed. If GPT OSS 120b cost me money, I wouldn’t be using it.
View on Reddit #64404746

lastrosade@reddit

I have just noticed that I've been using the wrong qwen 3 for weeks using the regular one instead of the coder one.
View on Reddit #64397942

MoMoneyMoStudy@reddit

Your OSS GitHub PR code reviewer agent is "shocked". The AI Agent arguments over code superiority will now melt the GPUs, worse than a Discord human mocking by Linus or Hotz.
View on Reddit #64403181

adel_b@reddit

you are finding out that smalle fine tuned model is better than generate purpose and bigger models
View on Reddit #64402540

silenceimpaired@reddit

I was so excited to be able to run this locally until I realized what people are probably using (Qwen3-Coder-480B-A35B-Instruct).
View on Reddit #64402236

Different_Fix_2217@reddit

Yea I found qwen code quite good, near sonnet 4 level but for much cheaper.
View on Reddit #64402225

dhamaniasad@reddit

I’ve tried to like open source coding models. I didn’t like R1 and I didn’t like any other open models that people were raving about. Qwen 3 coder is genuinely a good coding model, not just a good _open_ coding model
View on Reddit #64395151

das_war_ein_Befehl@reddit

I’m not getting your point because it’s open weights
View on Reddit #64396411

noneabove1182@reddit

I think the implication is that qwen 3 coder isn't just a good compared to open, it's a good model even when compared to closed ones
View on Reddit #64397390

dhamaniasad@reddit

That’s right
View on Reddit #64401212

No_Efficiency_1144@reddit

Qwen is the first one he liked
View on Reddit #64398162

Specter_Origin@reddit

"R1" was long time ago, and I would try something like Qwen Coder or deepseek v3 for coding as R1 would omit to many use less token for thinking which is not ideal for coding... if you are on cline or something you would use thinking model for planning and non-reasoning model for actual execution or 'act' mode.
View on Reddit #64399566

Infamous_Jaguar_2151@reddit

Good. Claude terms and services are unacceptable for me. Forbids using it for machine learning in 2025!
View on Reddit #64398627

beedunc@reddit

Qwen 2.5 variants were already high on my capabilities tests, and qw3 is even better.
View on Reddit #64398238

MrDevGuyMcCoder@reddit

That is some creative bullshit statical backflips to get a chart to look like its saying what you want it to....
View on Reddit #64396183

strangescript@reddit

I love that there are still people convinced 3.7 is a better model.
View on Reddit #64395549

this-just_in@reddit

This just shows how subscriptions are impacting OpenRouter.  As people using Opus/Sonnet realize they would be better off paying for a flat rate sub than per token through OpenRouter, they move into subs.  This is the cheapest way to use those models.  Models with cheaper per token costs or without an equivalent sub continue to be price-effective to use through OpenRouter.   Separately, now that OpenRouter requires you to insert your OpenAI API key to use the latest OpenAI models, they will not have accurate metrics for them.
View on Reddit #64394338