Frustrating results with product searching
Posted by Gold-Drag9242@reddit | LocalLLaMA | 7 comments
I gave my agent, running on Gemma 4 26B via OpenClaw on llama.cpp, the task of researching products that fulfill my needs. The prompt was a rather long description of the use case, of what I don't want, and so on.
My expectation was that the agent would spend lots of loops searching, analyzing, etc. to find suitable products.
It was done in 1 minute. It found exactly what I don't need and gave me some shallow, general product categories to look into.
That's exactly what I don't want. I wanted my agent to find the products, not to tell me where I should search.
I then tried Claude Sonnet 4.6. It behaved better and searched longer, but it also produced only a very general list of manufacturers that might be interesting.
After I told Sonnet that I don't care about manufacturers who do not have a product in their portfolio that meets my criteria, and that I want concrete products, not just collections or manufacturers, I got a list of candidates.
But this was a bit frustrating. This is the kind of research task that I would love to hand over to my agent, but I don't see that agents are capable of doing it. Why? They can search the internet, interpret pictures, navigate PDF catalogs, etc. What is stopping them?
Gold-Drag9242@reddit (OP)
I took your recommendation to heart and have now described a more sophisticated approach as a skill. I planned it out with Claude Sonnet. Let's see if it works.
Gold-Drag9242@reddit (OP)
The custom skill works much better, but it is no panacea. At least the process is no longer "query once and answer".
Parzival_3110@reddit
I think the missing piece is usually less model intelligence and more the browser loop around it. For product research I would force the agent to keep a candidate table, open every vendor page, reject anything without a concrete matching SKU, save source URLs, then only summarize after it has evidence. Otherwise it will happily stop at category pages.
Disclosure since I build in this area: FSB is the browser layer I made for OpenClaw style agents, with owned Chrome tabs, DOM inspection, screenshots, and MCP tools for real sites: https://github.com/LakshmanTurlapati/FSB
The big practical lesson is to make browsing state observable. If the agent cannot prove which page it checked and why it rejected each product, it is probably just doing web flavored brainstorming.
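A minimal sketch of the candidate table idea, in plain Python and independent of any particular browser layer. Every name here (`Candidate`, `CandidateTable`, the field names, the example URLs) is hypothetical and only illustrates the rules described above: every row carries a source URL, anything without a concrete SKU is rejected, and no summary is produced while rows are still unreviewed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidate:
    """One row of the candidate table (field names are hypothetical)."""
    name: str
    source_url: str              # page the agent actually opened
    sku: Optional[str] = None    # concrete product identifier, if one was found
    status: str = "pending"      # "pending", "accepted", or "rejected"
    reason: str = ""             # recorded evidence for the decision


@dataclass
class CandidateTable:
    rows: List[Candidate] = field(default_factory=list)

    def add(self, candidate: Candidate) -> None:
        self.rows.append(candidate)

    def review(self, candidate: Candidate, meets_criteria: bool, reason: str) -> None:
        # A row without a concrete SKU is rejected no matter what else was found.
        if not candidate.sku:
            candidate.status = "rejected"
            candidate.reason = "no concrete matching SKU on the page"
        else:
            candidate.status = "accepted" if meets_criteria else "rejected"
            candidate.reason = reason

    def ready_to_summarize(self) -> bool:
        # Only summarize once every row has been checked.
        return bool(self.rows) and all(r.status != "pending" for r in self.rows)

    def summary(self) -> List[str]:
        if not self.ready_to_summarize():
            raise RuntimeError("unreviewed candidates remain; keep browsing")
        return [
            f"{r.name} ({r.sku}): {r.source_url}"
            for r in self.rows
            if r.status == "accepted"
        ]


table = CandidateTable()
product = Candidate("Widget A", "https://example.com/widget-a", sku="A-100")
category = Candidate("Widgets overview", "https://example.com/widgets")
table.add(product)
table.add(category)
table.review(product, meets_criteria=True, reason="spec sheet matches all criteria")
table.review(category, meets_criteria=True, reason="looks relevant")
print(table.summary())
```

The category page is rejected even though the caller marked it as matching, which is the behavior that keeps the agent from stopping at category pages.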
BigYoSpeck@reddit
Without knowing what your requirements and prompt were, we can only hazard guesses at the issue
First, LLMs and openclaw aren't magic
Secondly, Sonnet is magnitudes more intelligent than Gemma 4, especially the 26B mixture-of-experts version
Gemma 4 in an autonomous agent harness like this is going to be hit and miss. These things run in huge loops, churning tokens and building large contexts. At a certain point the architecture of Gemma is only going to have a coarse-grained understanding of the task, and specifics are likely to be lost. Qwen3.6 is probably a better agent model, though there's a good chance it will generate long thinking traces
The language you prompt with is important. Use assertive, instructive language. Avoid prompts that are questions, and phrase them as definite instructions
For example, don't prompt "can you find", just instruct it to find. For constraints such as things you don't want, use instructions like "must not"
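As a rough illustration of that phrasing advice, here is a small Python helper that assembles an instruction-style prompt. The function name `build_prompt` and the example constraints are made up for this sketch (the OP's real requirements were never posted); it only shows the shape: an imperative task followed by explicit "must" and "must not" lines.

```python
from typing import List


def build_prompt(task: str, must: List[str], must_not: List[str]) -> str:
    """Assemble an instruction-style prompt with explicit hard constraints."""
    task = task.strip()
    if task.endswith("?"):
        # Questions invite a short answer; require an imperative instead.
        raise ValueError("Phrase the task as an instruction, not a question.")
    lines = [task]
    lines += [f"You must {rule}." for rule in must]
    lines += [f"You must not {rule}." for rule in must_not]
    return "\n".join(lines)


prompt = build_prompt(
    "Find concrete products that meet the requirements below.",
    must=["give a product name and a source URL for every result"],
    must_not=["list manufacturers or product categories without a specific product"],
)
print(prompt)
```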
Final tip: don't spend time going back and forth refining their work. If they get it wrong from your first prompt, start again with a modified prompt. Their work is disposable. Don't fall into the sunk cost trap of guiding them when they make mistakes; adjust the starting prompt so the mistake doesn't happen on the first shot
OutlandishnessIll466@reddit
Yup, agents are not magic. I also had higher hopes, but it turned out you still need to tell them exactly HOW they should complete your task. An agent is still a junior. I asked mine to promote my website, but without being told exactly what to do it's not going to get anywhere.
HelpfulHand3@reddit
have you tried "research" mode on Gemini, Claude or ChatGPT? that's closer to what you want. there are already deep research frameworks on github that you can use for your local models.
YearnMar10@reddit
That’s just the instructions and the framework surrounding it. If you want something more sophisticated you have to instruct precisely how the agent shall search, how to follow up, how to make deep dives etc
To improve it even more, you can program something around it that enforces a few of those research loops.
AI is smart, but to make it work for minutes or hours, you need a sophisticated framework surrounding it.
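One way to picture "programming something around it" is a small outer loop that refuses to accept an answer until a minimum number of research rounds has run. This is only a sketch under assumptions: `step` is a hypothetical stand-in for one agent call (search, open pages, return new findings), and the parameter names and defaults are invented for illustration.

```python
from typing import Callable, List


def run_research(
    step: Callable[[int, List[str]], List[str]],
    min_rounds: int = 3,
    max_rounds: int = 10,
    target: int = 5,
) -> List[str]:
    """Call `step` repeatedly; stop only after min_rounds AND enough findings."""
    found: List[str] = []
    for round_no in range(1, max_rounds + 1):
        # `step` sees what was already found so it can search elsewhere.
        for item in step(round_no, found):
            if item not in found:
                found.append(item)
        if round_no >= min_rounds and len(found) >= target:
            break
    return found


# Stand-in for a real agent call: returns one new finding per round.
rounds_run: List[int] = []


def fake_step(round_no: int, found: List[str]) -> List[str]:
    rounds_run.append(round_no)
    return [f"product-{round_no}"]


results = run_research(fake_step, min_rounds=3, max_rounds=10, target=2)
print(rounds_run, results)
```

Even though the target of two findings is reached in round 2, the loop still runs a third round, so a model that wants to finish in one minute is pushed to keep looking. `max_rounds` caps the cost if the target is never reached.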