I'm visually impaired, made a thing to drive Wayland around with agents
Posted by Amonwilde@reddit | linux | 35 comments
Hey folks,
I'm a visually impaired Linux user who typically uses X11 and I wanted to learn about Wayland and port some of my accessibility tools over to Wayland/GNOME but hadn't taken the plunge. A little while back Anthropic released some desktop driver stuff, of course not on Linux, and I was kind of jealous. I thought putting together something that would let agents control a Wayland desktop would let me learn about the Wayland SPI and device stack plus be a cool project.
Tine is a Python CLI plus a small GNOME Shell extension that combines AT-SPI2 accessibility reads, vision fallback via a labeled coordinate grid, and kernel-level /dev/uinput input. It lets an AI coding agent (Claude Code, Codex, anything that can run a shell command) actually use a GNOME Wayland desktop — click buttons, fill forms, read what's on screen — without the Screencast portal throwing a consent dialog on every action.
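To make the grid fallback a bit more concrete, here's a minimal sketch of how a labeled coordinate grid could turn a label picked by a vision model into a click point. This is an illustration only, not Tine's actual code: the function names (`grid_labels`, `resolve`), the letter-column/number-row labeling, and the 8x6 grid on a 1920x1080 screen are all assumptions.

```python
from string import ascii_uppercase


def grid_labels(width, height, cols, rows):
    """Map labels like 'A1' to the pixel center of each grid cell.

    Hypothetical scheme: columns are letters (A, B, ...), rows are
    1-based numbers, so 'C2' is the third column, second row.
    """
    if not 1 <= cols <= len(ascii_uppercase):
        raise ValueError("cols must be between 1 and 26")
    if rows < 1:
        raise ValueError("rows must be at least 1")

    cell_w = width / cols
    cell_h = height / rows
    labels = {}
    for c in range(cols):
        for r in range(rows):
            label = f"{ascii_uppercase[c]}{r + 1}"
            # Center of the cell, rounded to whole pixels.
            labels[label] = (round(cell_w * (c + 0.5)), round(cell_h * (r + 0.5)))
    return labels


def resolve(label, labels):
    """Look up a label the model returned, tolerating case and whitespace."""
    return labels[label.strip().upper()]


if __name__ == "__main__":
    grid = grid_labels(1920, 1080, cols=8, rows=6)
    print(resolve("c2", grid))  # (600, 270)
```

The idea is that the agent only has to name a cell it sees drawn over the screenshot, and the tool does the pixel math before sending the click.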
Repo: https://github.com/smythp/tine
Caveat: use at your own risk. Agents are nondeterministic, etc. With that said, I just put Arch on an old laptop and let agents control it over ssh.
Let me know what you think.
Inside_Secretary3281@reddit
really impressive work!
BansheeBacklash@reddit
Are you my dad?
I kid, but he's been a software dev for a few decades and he's also visually impaired. I feel like he could really use something like this. I'll have to send it to him.
Amonwilde@reddit (OP)
lol probably not unless you're 2 years old :)
Great to hear. There are a lot of VI devs out there but we're usually on our own little islands, not a ton of community.
KlePu@reddit
Wait, I'm confused... It's AI slop so we have to hate. But you're impaired, so we have to be helpful. ^(/s)
Amonwilde@reddit (OP)
lol
You can hate if you want. :) Are folks really hand-coding projects and posting them on here these days? I'm a principal engineer and it's extremely rare to just sit down and write Python or whatever anymore. Or maybe it's the driving the desktop around that you don't like (in principle).
I don't know. It's a weird world now but in these domains (computing, linux, coding) I think it's just different now and it's hard to live in the old world, or impossible. If I could push a button and keep us in the old world I might push it. That said, now that we're here, we may as well do cool things with the bots? And TBH blind folks were always locked out of a lot of stuff, I did manage to break into this world but most can't, and agents are a booster for disabled folks in particular.
Farados55@reddit
lmao imagine thinking people don't type code by hand anymore.
It's a great idea to use agents to automate things but this is a silly take.
Amonwilde@reddit (OP)
I mean I'm sure it happens. But green field?
The stat according to GH is about 50% as of early this year.
Anyway not really a take, just like, aren't you used to it by now?
Farados55@reddit
What’s your source? This is the only source this year that has “50%” in it and it doesn’t refer to the percentage of code that is AI generated lol
https://github.blog/changelog/2026-03-19-copilot-coding-agent-now-starts-work-50-faster/
I use agents sometimes to diagnose problems or rewrite some trivial stuff sure. But most of the time in the LLVM codebase it produces verbose and unnecessary solutions.
Amonwilde@reddit (OP)
OK. I used a stat in a post that said 46% but that's apparently GH saying that stuff on GH is 46% Copilot? Which does sound kinda wacky. If true, the true number would be higher since presumably other tools are also used. But I find it suspect.
Here's some marketing spam that cites some sources ranging from 40-50%:
https://www.netcorpsoftwaredevelopment.com/blog/ai-generated-code-statistics
I'll update if I find more reliable stats.
_angh_@reddit
what I see is: I write all the code, and then AI writes all the unit tests (it's way too boring). There's your 50% ;)
Sure, some small changes or boilerplate can be AI generated. But for any non-trivial case AI tends to overengineer and easily miss things. If a dev has no knowledge, he will accept it anyway; if he has some knowledge, he will ask the AI many times to fix those issues; if he is very knowledgeable, he will write the code faster manually, with some AI for repetitive or boring stuff, and will get better overall quality and performance.
Amonwilde@reddit (OP)
It's definitely good for tests. I get decent code out of it but it takes more than just asking for it; it needs a full process, which most aren't willing to do yet.
Jmc_da_boss@reddit
"Are folks really still hand coding projects"
The competent people are ya. How out of touch can you possibly be?
Straight-Software-89@reddit
You're the one out of touch. You have a 3yo view of AI. The frontier models available to corporate / big tech are very impressive.
Jmc_da_boss@reddit
I'm using millions of Opus 4.6 tokens a week and I am saying you are ludicrously out of touch.
Straight-Software-89@reddit
Sure buddy, stay in denial lol.
And anyways token count is not really a good metric, and on top of that millions is a nothingburger nowadays. 20 mins of codex 5.4 pro results in 8M tokens used.
Jmc_da_boss@reddit
Claude token usage and codex usage are different things. A million Claude is like 50 million codex. Just due to how they show cached and thinking tokens.
Straight-Software-89@reddit
It doesn't matter, as mentioned earlier your argument about tokens in the first place is stupid.
Jmc_da_boss@reddit
It shows usage time and familiarity, it doesn't show outcomes.
But the outcomes don't matter to this conversation just that we both use the LLM regularly. Which means we are both familiar with what they can or can't do.
And you, like so many others, are slop-pushing delusional. Stop shitting loc out, sit down, think and write the important parts by hand before you blast all your coworkers with slop in the "new era".
Straight-Software-89@reddit
No it doesn't.
Clearly not.
The average person like you can only be dismissive.
I have nothing to prove to you. And even if I wanted to, I can't because (1) you are sleeping for the sake of it, and (2) my anonymity would be destroyed due to FOSS ml activity.
Jmc_da_boss@reddit
Yes, using the LLM all the time shows familiarity with the LLM, this is a very basic concept.
You are suffering from the common LLM psychosis.
Or you work in ML research where code never mattered anyway. So by all means slop it all out in that case it doesn't matter.
Straight-Software-89@reddit
It doesn't show your skill in using the LLM. The entire chat you have been dismissive.
Nothing can change your block of a mind. Your comments are nothing more than slop themselves.
Different-Ad-8707@reddit
The competent are using the tools too. They're also just better engineers and by definition _competent_ and so can use those tools better and also bring up the outputs to their standards.
pitiless@reddit
Ehh, my experience working with these people (and the code that "they" produce) does not bear out this assertion.
Different-Ad-8707@reddit
That's fair. I'm only an undergrad about to go into my first job. Will see how my experience pans out in this regard.
For me, personally, LLMs just enabled more of my "how hard could it be?" approach to things/projects I wanted to do. They're now my most used search, research, and self-ed tool.
pitiless@reddit
My recommendation as someone who has 20 years in the industry is to be wary of offloading too much of your cognition to these tools.
I've now encountered a lot of people who are "young" to the field who are simply failing to develop the reasoning skills that make excellent engineers.
At this point I'm almost convinced that the act of struggling with a problem is actually a key part of the learning process, and having an easily accessible way to skip this leaves these people with under-developed problem-solving skills, which harms them as their career progresses (or fails to do so).
Different-Ad-8707@reddit
Thank you for the advice.
I'm worried about the same based on what I'm seeing in my peers. Which is why I'm leaning more into my aforementioned "how hard could it be?" mindset to learn more things, figure things out, and build engineering skills and things of that nature.
`act of struggling with a problem actually a key part of the learning process` I'm pretty sure this has been empirically proven, though I can't quote or reference the literature for it off the top of my head.
I'll keep your words in mind as I work to become a proper engineer, taking care to not offload too much onto LLMs, and try to survive in this shifting landscape of technology.
khsh01@reddit
Unless I'm generating boilerplate I usually end up hand coding stuff. Especially if I am debugging something.
KlePu@reddit
Me personally, I'm ambivalent about AI agents. I see colleagues pumping out PRs (or MRs in our case, we're using GitLab) at an incredible rate - but the quality is really decent nowadays! Including actual descriptions of what a change is supposed to do, which was really rare in the past ;)
Common sense dictates an easy rule: If a tool is beneficial, use it. Haters gonna hate anyway \^\^
Amonwilde@reddit (OP)
Honestly I am ambivalent because I think it has negative effects. But I've just gone in on it because the negative effects apply whether or not I use the tool (i.e. dead internet, slop on socials) and at least I can play with the pretty interesting new thing even if it's not making the world a better place. And like I said, TBH Linux was pretty exclusionary if you were VI and it does help with that.
Capable_Music7299@reddit
/s for what? that's the usual thinking
McDonaldsWitchcraft@reddit
to be fair, the wayland protocol is still very hacky when it comes to accessibility so I can't blame them.
undrwater@reddit
This is cool! I used to work in accessibility, and I'm aware how Windows-centric it tends to be. There was a movement in Linux for a while to develop accessibility APIs, but that seemed to die quickly.
I'll take a look.
Amonwilde@reddit (OP)
Thanks. I actually think that agents could help here. If you could specify a working desktop for VI folks with tests you could keep running against that. Maintenance and regressions killed all those old distros (vinux, talking arch). Wayland does make things hard but this project has helped me migrate a lot of a11y utilities I use in X11 over, so I'm feeling a little more optimistic.
I do think this thing is more broadly useful for those needing desktop automation in Linux, I really wonder if the big frontier labs will even bother with Linux given the desktop situation right now? Cheers
undrwater@reddit
I took a quick look at the project, and it's difficult for me to understand how it works. I'm pretty technical, so I imagine someone who is less so might be lost.
Have you thought about creating a video (or other practical example) of how it all works?
Amonwilde@reddit (OP)
Yes, though probably won't have one out today.
The short version is that it's a CLI you hand an agent. The CLI has tooling to let them work with the desktop and do things like get a screenshot, parse the AT-SPI tree, click stuff, etc.
So basically you install this and hand it to an agent. Realistically the best way to get this going is to take the link to the repo and hand it to Claude Code or whatever and say can we get this going. If it were me I would first ask Claude to audit the code for security issues since I'm a rando on the internet and then ask to set it up.
I'll add some stuff to the start of the readme to make this more clear and do a vid soon.