OpenRouter: anyone whitelisting specific providers
Posted by Traditional-Gap-3313@reddit | LocalLLaMA | View on Reddit | 9 comments
I'm curious whether all the providers on OpenRouter are the same, or whether there are noticeable differences between them. I have to benchmark some models for a larger processing run (I'll spin up GPUs on Verda for the run itself), but I'd first like to evaluate a few candidate models.
I'd like to avoid benchmarking directly on cloud GPUs, since for large models I'd need a 20€/h instance, and just loading the model and setting everything up burns 20-30 minutes.
But I'd also like to avoid shit providers polluting the benchmark.
Anyone have any insight into different providers? Are they all the same?
The end goal is generating a training dataset, so it's still related to localllama...
Acceptable-Yam2542@reddit
Ngl I went through something similar when I was testing models for data processing stuff. the providers are definitely not all the same, some give noticeably worse outputs even on the same model name. ended up routing everything through a single layer that handles the switching automatically, cut my costs by like 40% and the quality got more consistent too.
Traditional-Gap-3313@reddit (OP)
can you explain this a bit more? What layer? Another service or?
Acceptable-Yam2542@reddit
not openrouter, it's a separate aggregation layer that calls provider APIs directly. cut our error rate from 12% to under 2% and saved about 30% on costs. handles failover automatically, so we stopped worrying about single-provider outages.
Acceptable-Yam2542@reddit
yeah it's basically a single endpoint that sits in front of multiple providers and routes based on availability and cost. went from managing 4 separate API keys and failover logic to just one. latency actually got better too, dropping about 40 percent. the routing logic was the part that took the most tweaking to get right.
Traditional-Gap-3313@reddit (OP)
I don't get this. Is this service layer in front of OpenRouter? Or did you write your own routing layer that targets the provider APIs directly? If it's the latter, that doesn't really help me, since my question was specifically about OpenRouter.
NoFaithlessness951@reddit
Use :exacto
Traditional-Gap-3313@reddit (OP)
Thanks, I've set exacto in the settings as preferred providers. Do you also target :exacto in the model name directly?
NoFaithlessness951@reddit
Yes, you can just do something like openai/gpt-oss-120b:exacto
Practical-Collar3063@reddit
I found that Fireworks is pretty good at not lobotomising models; they usually run unquantised versions of models.