OpenAI API Compatible access to Google Gemini and Embedding Models
Posted by theeashman@reddit | LocalLLaMA | 12 comments
Hey all, I want to share a project I've recently worked on that provides seamless, all-in-one access to Google's commercial Gemini and embedding models using the OpenAI API - completely free of charge! You can check it out here: https://github.com/ekatiyar/gemini-openai-proxy
There's been some discussion here recently about Google's Gemini models, especially regarding their 1 million token context window, which is ideal for use cases like long-form chats, creative writing, and handling large codebases. While Google provides generous usage limits for free users with their API, the lack of native OpenAI API compatibility makes integration challenging, especially since many popular open-source tools and applications only support the OpenAI API. For instance, I use Open Web UI, which only supports the Ollama or OpenAI API protocols.
While services like OpenRouter offer access to Gemini models via an OpenAI-like API, their usage costs can quickly add up, especially when taking advantage of the 1 million token context window. There are existing projects that implement proxies for communicating with Google's models through the OpenAI API, but these solutions are incomplete: some only support chat completion, while others focus solely on embeddings.
To address this gap, I've forked zhu327's excellent gemini-openai-proxy and extended it to support Google's embedding model, creating an all-in-one solution for interacting with all of Google's commercial models for free via the OpenAI API. This means you can use it both for chat and for embeddings in your applications, including those that rely on RAG.
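To give a feel for it, here's a minimal sketch of requesting embeddings through the proxy with the OpenAI Python SDK. The port, base URL, and the exact OpenAI-to-Google model-name mapping are assumptions on my part (check the README for the real values), and your Google AI Studio key is passed where the OpenAI key would normally go:

```python
# Sketch: embeddings via the proxy, using the standard OpenAI Python SDK.
# The base URL, port, and model-name mapping below are assumptions;
# see the project README for the actual configuration.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # wherever the proxy is deployed
    api_key="YOUR_GEMINI_API_KEY",        # a Google AI Studio key, not an OpenAI key
)

resp = client.embeddings.create(
    model="text-embedding-ada-002",  # translated by the proxy to Google's embedding model
    input=["a document chunk to index for RAG retrieval"],
)
print(len(resp.data[0].embedding))  # dimensionality of the returned vector
```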
Benefits:
- OpenAI-compatible access to Google's models at no cost.
- Comprehensive support for chat completion (with streaming) and embedding models.

Features:
- Easy deployment via a Docker image.
- Minimal configuration required to get started.
- Flexible hosting options: run it locally or on cloud servers.
The source code and setup instructions are on GitHub: https://github.com/ekatiyar/gemini-openai-proxy, and you can pull the Docker image from ghcr.io/ekatiyar/gemini-openai-proxy:latest
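Chat completion works the same way, including streaming. Here's a quick sketch (again assuming a local deployment on port 8080 and the upstream project's model-name mapping; verify both against the README):

```python
# Sketch: streaming chat completion through the proxy.
# localhost:8080 and the model name are assumptions; see the README.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="YOUR_GEMINI_API_KEY",
)

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the proxy maps OpenAI model names onto Gemini models
    messages=[{"role": "user", "content": "Summarize the plot of Hamlet."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```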
Give it a try and let me know what you think! Contributions and feedback are welcome.
solidsnakeblue@reddit
Thanks, this is super useful with the new AI Web Researcher tool
Cool-Bath-1339@reddit
I'm getting a model version error (HTTP 400)
coder543@reddit
Google’s Gemini models (not embedding) are already available through an OpenAI compatibility layer that Google offers: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-gemini-using-openai-library
theeashman@reddit (OP)
Man, I've tried using this but ran into an authentication error every time ... not sure what I was doing wrong, but this proxy worked for me
coder543@reddit
There’s a tab for “Environment variables”, which is how I did it. It just uses the standard gcloud CLI to get a token.
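Roughly like this (a sketch: the endpoint path, API version, and model id below are my reading of the docs, so adjust them for your project):

```python
# Sketch: calling Gemini on Vertex AI through Google's OpenAI compatibility
# layer, authenticating with a short-lived token from the gcloud CLI.
# PROJECT_ID, LOCATION, and the model name are placeholders to fill in.
import subprocess
from openai import OpenAI

PROJECT_ID = "your-gcp-project"
LOCATION = "us-central1"

# Equivalent to running `gcloud auth print-access-token` in a shell.
token = subprocess.run(
    ["gcloud", "auth", "print-access-token"],
    capture_output=True, text=True, check=True,
).stdout.strip()

client = OpenAI(
    base_url=(
        f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi"
    ),
    api_key=token,  # tokens expire, so refresh for long-running processes
)

resp = client.chat.completions.create(
    model="google/gemini-1.5-flash",  # model id format per the Vertex docs
    messages=[{"role": "user", "content": "Hello, Gemini!"}],
)
print(resp.choices[0].message.content)
```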
-johnd0e-@reddit
But you can't use it for free, can you?
-johnd0e-@reddit
Another good project with the same goals: https://github.com/PublicAffairs/openai-gemini
novexion@reddit
Gemini is already available through the OpenAI API format though? You can use the OpenAI SDK with it too.
Gravy_Pouch@reddit
Can you send a link? Google's docs are awful.
novexion@reddit
https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library
-johnd0e-@reddit
The difference is that with Vertex AI there is no free tier at all, while using it via the proxy is completely free (for personal use, of course).
SonOfWan@reddit
Very cool! I’ll give it a try