A 4B Model That Outperforms 32B on GUI Tasks, Fully Open-Source
Posted by Successful-Bill-5543@reddit | LocalLLaMA | 12 comments
It includes:
- A 4B GUI agent model capable of running on local computers.
- Plug-and-play inference infrastructure that handles ADB connections, dependency installation, and task recording/replay.
ahaw_work@reddit
Anything similar that works with PC programs?
Logical_Vermicelli99@reddit
In theory, this model can operate computers and complete tasks in the OS-World environment.
timedacorn369@reddit
This seems to be a model fine-tuned only for mobile GUI tasks, right?
Xamanthas@reddit
Yep, and mobile phones don't need this. This is most likely for troll farms and such in SEA and Slavic countries (no offence intended, that's just where they are nowadays).
lookwatchlistenplay@reddit
I have a great personal use case in mind for such a thing. Getting 2000+ notes off of a notes app whose developer decided not to include a bulk notes export feature. The only non-hacktastic way to achieve such a bulk export is to individually open each note by hand and click "Send" -> send the single note somewhere... And I'd have to do this manually 2000 times. It's probably possible to use some existing mobile automation app to help with this, but if I can simply tell the LLM what I want in plain language and it does the thing, that is very helpful.
rm-rf-rm@reddit
And that's why, folks, you use Obsidian (file over app).
lookwatchlistenplay@reddit
Indeed! I switched to Obsidian over exactly this headache. I still have 2000 notes I can't easily put anywhere useful (like my PC, for LLM organization or analysis), but going forward, Obsidian it is.
Logical_Vermicelli99@reddit
This is a general VLM with GUI task capabilities. We have tested it on general VLM benchmarks, and it still maintains decent general-task performance for a 4B-parameter model; for instance, it remains proficient at solving mathematical problems.
previse_je_sranje@reddit
They cooked with this one, just wish I could run it fully locally without USB
rm-rf-rm@reddit
what CUA framework are you using to run it?
Muritavo@reddit
I haven't reviewed it yet, but you could theoretically run adb wirelessly with "adb pair" and then "adb connect".
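For anyone who wants to try the wireless route, here is a minimal Python sketch of the two adb steps mentioned above. It only builds and (optionally) runs the standard `adb pair HOST:PORT CODE` and `adb connect HOST:PORT` commands. The helper names (`pair_command`, `connect_command`, `run_adb`) and the example IP, ports, and pairing code are made up for illustration; the real values are shown on the phone under Developer options > Wireless debugging (Android 11+). Whether the project's own inference tooling picks up a wirelessly connected device is an assumption I have not verified.

```python
import shutil
import subprocess


def pair_command(host: str, pairing_port: int, code: str) -> list:
    """Build the one-time pairing command: adb pair HOST:PORT CODE."""
    return ["adb", "pair", f"{host}:{pairing_port}", code]


def connect_command(host: str, port: int) -> list:
    """Build the connection command: adb connect HOST:PORT."""
    return ["adb", "connect", f"{host}:{port}"]


def run_adb(cmd: list) -> str:
    """Run an adb command and return its stdout (hypothetical helper)."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("adb was not found on PATH")
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.strip()


# Example usage (placeholder values; uncomment with the values from your phone):
# print(run_adb(pair_command("192.168.1.20", 37123, "123456")))
# print(run_adb(connect_command("192.168.1.20", 40001)))
# print(run_adb(["adb", "devices"]))  # the phone should now be listed
```

Note that the pairing port and the connection port are usually different, and both are displayed on the Wireless debugging screen. Pairing only needs to be done once per computer; after that, "adb connect" alone should be enough.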
noctrex@reddit
Here's a GGUF:
https://huggingface.co/noctrex/GELab-Zero-4B-preview-GGUF