LLMs in LM Studio can now grab images from the internet and look at them/show you
Posted by Agreeable_Effect938@reddit | LocalLLaMA | View on Reddit | 19 comments
Soo, I made a plugin that allows LLMs inside LM Studio to feed images from the web into themselves for analysis. They will chain the tools depending on the task.
No MCP/APIs/Registration — these are simple scripts that can be installed in one click from the LM Studio website. (Yes, LM Studio has plugin support!) All you need is a model with Vision (Qwen 3.5 9b / 27b are both great)
I also updated the Duck-Duck-Go and Visit Website plugins to work with images, and added some extras:
- The tools automatically fetch images and convert them into smaller thumb files for chat embedding (to avoid clutter).
- The analysis tool will then use full-resolution images for analysis if possible.
- The plugins guide the LLM to embed images when needed, or to use a markdown table gallery if the user explicitly wants a lot of images.
You can see a few examples of this in the screenshots.
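The thumbnail step above mostly comes down to downscaling while preserving aspect ratio. Here's a minimal sketch of that math (my own illustration, not the plugin's actual code); a real implementation would hand the resulting dimensions to an image library such as Pillow, and `max_side=256` is just an assumed thumbnail budget:

```python
def thumbnail_size(width: int, height: int, max_side: int = 256) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side,
    preserving aspect ratio. Never upscales small images."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # round() keeps the result close to the true aspect ratio;
    # max(1, ...) avoids a zero-pixel dimension for extreme ratios
    return max(1, round(width * scale)), max(1, round(height * scale))

print(thumbnail_size(1920, 1080))  # → (256, 144)
print(thumbnail_size(100, 50))     # → (100, 50), already small enough
```

The point of keeping a small thumbnail in the chat while analyzing the full-resolution file separately is that the chat log stays light without losing detail for the vision model.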
Links:
https://lmstudio.ai/vadimfedenko/analyze-images
https://lmstudio.ai/vadimfedenko/duck-duck-go-reworked
https://lmstudio.ai/vadimfedenko/visit-website-reworked
In case anyone needs it, my Jinja Prompt Template: Pastebin (it fixed the tool call errors for me)
My Qwen 3.5 settings (basically, official Qwen recommendation):
Temperature: 1
Top K sampling: 20
Repeat Penalty: 1
Presence Penalty: 1.9 (I think this one is important, fixed repetition problems for me, always gets out of loop)
Top P sampling: 0.95
Min P sampling: 0
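For reference, here are the same settings expressed as a request body against LM Studio's OpenAI-compatible local server (default `http://localhost:1234/v1/chat/completions`). `temperature`, `top_p`, and `presence_penalty` are standard OpenAI fields; `top_k`, `min_p`, and `repeat_penalty` are local-server extensions, so treat those exact field names as an assumption and check them against your server version. The model name is a placeholder:

```json
{
  "model": "<your-vision-model>",
  "messages": [
    {"role": "system", "content": "You are a capable, thoughtful, and precise assistant. ..."},
    {"role": "user", "content": "..."}
  ],
  "temperature": 1,
  "top_p": 0.95,
  "top_k": 20,
  "min_p": 0,
  "repeat_penalty": 1,
  "presence_penalty": 1.9
}
```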
System Prompt:
You are a capable, thoughtful, and precise assistant. Always prioritize being truthful, nuanced, insightful, and efficient, tailoring your responses specifically to the user's needs and preferences.
Research before answering the questions: use both reasoning and tool calls to synthesize a proper conclusion.
Link to the previous post
TheOneHong@reddit
not sure if i did something wrong in lmstudio config, it showed up as a link instead of an image
Technical-Earth-3254@reddit
Insane work! I always used the "OG" danielsig duckduckgo-plugins for websearch. It was always sad that the picture retrieval didn't work properly and resulted in errors. Your tool completely fixed that!
I am using a different system prompt, but that shouldn't matter too much; it's working great nevertheless.
Virtamancer@reddit
Did you go with qwen3.6-35b-a3b instead?
Agreeable_Effect938@reddit (OP)
Yoo, looks awesome!
Yeah, system prompt doesn't matter that much, especially for general tasks.
There's no deep research plugin right now (even the simple search plugin was broken). One of the problems is that the LM Studio API for plugins is currently very limited. Ideally, for deep research, a single LLM should be able to orchestrate subagents, each with their own research topic. The image analysis plugin already works similarly: it allows the model to push a query to basically a subagent of itself. But it's very limited; subagents can't run a tool chain of searches on their own.
If the API gets a bit more advanced, I'd probably build that sort of plugin.
Visual-Walk-6462@reddit
amazing, thank you
Doct0r0710@reddit
Absolutely did not know this. I knew about that rag-v1 and js-code-sandbox plugins, but that's as far as I got. There's apparently an LM Studio Hub, but I can't seem to find a way to discover what's on that hub. Thanks for the plugins btw, looking forward to using the DDG one.
gpalmorejr@reddit
Yeah, for some reason there is no "Hub" that is exposed. You have to hope Google shows you something. I don't know why they did that.
Technical-Earth-3254@reddit
The Hub is super ass; you have to browse it through Google because they apparently can't implement a marketplace or similar. My recommendation is to search GitHub for plugins and then hop onto the Hub via their respective READMEs (that's how I'm doing it).
PimplePupper69@reddit
I wish they would work on the UI so it would be more bearable to use, like the proprietary ones
OnlyTodayDeal@reddit
Does anyone have an idea how to query LM Studio with danielsig/duckduckgo over the API?
Specifically, I mean connecting n8n to an LM Studio server to search the net "for free" and extract information from websites and RSS.
Agreeable_Effect938@reddit (OP)
not sure if you're a bot or why you're writing in Polish, but the old duckduckgo plugin is completely obsolete. Judging by the code, the tool calls were only tested before the release; image search wasn't functional.
BustyMeow@reddit
Now they work pretty well for me even with (traditional) Chinese characters, generated by Qwen3.5-35B-A3B. I use my own Chinese system prompt as well.
Agreeable_Effect938@reddit (OP)
Awesome. Glad it works in Chinese
moahmo88@reddit
Amazing! Thanks for sharing!
larrytheevilbunnie@reddit
Wait would this cost anything? I thought search providers charge an arm and a leg for images
Agreeable_Effect938@reddit (OP)
The search is based on DuckDuckGo, which seems to be well funded: https://duckduckgo.com/duckduckgo-help-pages/company/how-duckduckgo-makes-money
The plugin allows AI models to make a request to a search engine directly. So it's basically like letting AI use Google.
Yeah, search providers typically sell API keys that connect to AI backends via MCP. This is simpler: the AI works directly with the search engine. The downside is that some services, such as Reddit, block automated access without an API, so this approach doesn't cover the entire internet.
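Mechanically, "letting the AI use search directly" just means the tool formats a query and fetches a public results page, with no API key involved. A rough, hypothetical sketch of the request-building side (endpoint and parameter names are illustrative, not the plugin's actual code):

```python
from urllib.parse import urlencode

def build_search_url(query: str, images: bool = False) -> str:
    """Build a DuckDuckGo query URL the tool could fetch and parse.
    The public HTML endpoint needs no API key, which is why this stays free."""
    base = "https://duckduckgo.com/html/"
    params = {"q": query}
    if images:
        # illustrative only: image results in practice come from a
        # separate endpoint that the plugin would have to parse
        params["ia"] = "images"
        params["iax"] = "images"
    return base + "?" + urlencode(params)

print(build_search_url("local llama vision models"))
# → https://duckduckgo.com/html/?q=local+llama+vision+models
```

The flip side, as noted above, is that sites which block scrapers (Reddit, for instance) simply won't answer requests like this, so coverage is narrower than a paid search API.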
qubridInc@reddit
Very cool direction. Local multimodal agents only start feeling actually useful once they can see, search, and reason in one loop. LM Studio plugins quietly becoming "poor man's operators" is a bigger deal than it looks.
hack_the_developer@reddit
Multimodal capabilities are great. The challenge is keeping agents from going off the rails when they have more capabilities.
What we built in Syrin is guardrails as explicit constructs enforced at runtime. Every agent has defined boundaries.
Docs: https://docs.syrin.dev
GitHub: https://github.com/syrin-labs/syrin-python
egomarker@reddit
Good job