I really want to switch to Linux fully, but one thing is stopping me.
Posted by Decent-Principle8918@reddit | linux | 63 comments
Hi, everyone
I was an on-and-off Linux user until the Steam Deck came out. My favorite Linux OS is PopOS, with Fedora in second place. At the moment I have all Macs, and I just purchased a MacBook Air 15.
Amazing laptop, and I've always loved the Gnome flavoring it has, but the real issue is that I need dictation (speech to text) due to my disability. I need help with spelling a lot, and it affects my workflow.
I've already tried talking with devs directly in the past, but it looks like the developers of those accessibility channels aren't getting any funding to actually implement those features. If I could afford it, I'd 1000% do it.
If they did get it figured out, I'd most likely sell my Mac for a Panasonic Toughbook FZ-55 with dual battery expansion. I value long battery life more than anything else.
Professional-Crab234@reddit
There is a new, user-friendly, and completely free one:
VOXD - a voice-typing / dictation app for linux
"Out of the box" it sets you up with LOCAL voice transcription, and even LOCAL AI rewriting according to your custom pre-made prompts.
I hope it will help you.
GeekTX@reddit
I have no products or what-not to toss at you for ideas ... yet. I have a few folks over the years that I have taken care of with extreme vision issues. If you could tell me more about your disability, only what you are comfortable with, I might be able to help in finding a solution that is somewhat tailored to your needs. I've been in IT for 40+ years w/ 30+ being pro so I have a little experience to work with. ;)
Fortunately, most modern browsers on Linux allow for direct hardware access just as they do on Windoze and Mac. That means that web-based solutions are available to you ... removing the specific OS dependency.
Decent-Principle8918@reddit (OP)
I am autistic with dyscalculia and dyslexia. I also have some neurological issues that require me to use dictation software because I literally can’t spell the word even though it might be on the tip of my tongue.
It sucks, a LOT. I also can’t read a regular book; it’s either manga, comics, and the like, or audiobooks. I'm just glad to have what I need now to make my job and life better.
GeekTX@reddit
oh wow … you got me there. I wish you the best of luck with that.
These-Accountant6023@reddit
Try out nerd-dictation, and I think Talon works as well
Decent-Principle8918@reddit (OP)
idk if you have used nerd-dictation, but do you know if the app, once installed, can be used outside of the terminal, like in the browser or LibreOffice?
JockstrapCummies@reddit
nerd-dictation is good, but it's really one small piece that you need to build on top of if you want to use it as a universal "input method" outside of the CLI.
But its core is basically VOSK, and with a cursory search there's ibus-speech-to-text, which basically uses the same VOSK backend and makes it an IBus input method (which is usable cross-GUI).
Decent-Principle8918@reddit (OP)
Well, I think once that's set up and integrated into a lot of the distros, I'd be more interested in the prospect of using it on a full-time basis. I will still set it up and try it out, since I am curious. Question: have you ever tried an app called live caption?
The developer made an app that detects a person's speech accurately, I would say around 90% of the time. But the app is not set up to let you copy and paste the text into anything, and the dev refuses to rework his program to be more useful.
I even offered to give him a few hundred, but he said no.
JockstrapCummies@reddit
That's already doable with Whisper. The official Whisper implementation, the Whisper.cpp port, or the faster-whisper implementation via the whisper-ctranslate2 CLI... They all support live transcription from microphone input, extremely accurately, and output plain text that you can use however you want.
There should be plenty of GUI frontends as well but I haven't used those.
mycall@reddit
Whisper works if you like to copy/paste constantly. Too bad it isn't integrated at the OS level, like dictation is on iOS, Android, macOS or Windows.
DatCodeMania@reddit
can always set up automatic copying very easily, even automatic pasting (but I think it would be better to manually paste)
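For anyone wondering what "automatic copying" might look like, here is a minimal Python sketch, not a definitive setup. It assumes `wl-copy` (from wl-clipboard) is installed on Wayland or `xclip` on X11, and the function names `clipboard_command` and `copy_to_clipboard` are made up for illustration.

```python
import os
import subprocess


def clipboard_command(env=None):
    """Pick a clipboard tool for the session.

    Assumption: wl-copy is installed on Wayland, xclip on X11.
    """
    env = os.environ if env is None else env
    if env.get("WAYLAND_DISPLAY"):
        return ["wl-copy"]
    return ["xclip", "-selection", "clipboard"]


def copy_to_clipboard(text, env=None):
    """Feed the transcribed text to the clipboard tool on stdin."""
    subprocess.run(clipboard_command(env), input=text.encode("utf-8"), check=True)
```

You would call `copy_to_clipboard(text)` with whatever your transcriber prints, then paste manually wherever you need it.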
Decent-Principle8918@reddit (OP)
I wonder how long it will take because lord, am i itching for some linux 🤣
VALTIELENTINE@reddit
How long what will take? They just said a solution already exists and is called whisper
Decent-Principle8918@reddit (OP)
When I get up I’m going to do some research on Whisper, and install everything. I’ll let you know how it goes though
VALTIELENTINE@reddit
I’d rather get a good distro agnostic solution than have “the two major DEs” be the ones implementing it
Alfonse00@reddit
What they need to implement properly is not the solution itself, but easy access to it through settings and distros need to add it to the install process.
PermitTenders@reddit
this was a huge concern for me as well and i can attest to how decent nerd-dictation is. it can be triggered and used everywhere and i'm using it to dictate this sentence. to begin dictation:
and to stop dictation:
i've assigned each script to be triggered using a hotkey via pop's system menu and it works like a charm.
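A rough sketch of what a pair of hotkey scripts like that might look like in Python, assuming nerd-dictation's `begin` and `end` subcommands and its `--vosk-model-dir` option. The install and model paths below are hypothetical, and this is not necessarily what the commenter runs.

```python
import subprocess
from pathlib import Path

# Hypothetical locations: adjust to wherever you cloned nerd-dictation
# and unpacked a VOSK model.
NERD_DICTATION = Path.home() / "nerd-dictation" / "nerd-dictation"
VOSK_MODEL = Path.home() / "nerd-dictation" / "model"


def begin_command(script=NERD_DICTATION, model=VOSK_MODEL):
    """Build the command that starts listening."""
    return [str(script), "begin", f"--vosk-model-dir={model}"]


def end_command(script=NERD_DICTATION):
    """Build the command that stops a running dictation session."""
    return [str(script), "end"]


def start_dictation():
    # "begin" keeps running while it listens, so launch it without waiting.
    return subprocess.Popen(begin_command())


def stop_dictation():
    subprocess.run(end_command(), check=True)
```

Saving one small script that calls `start_dictation()` and another that calls `stop_dictation()`, then binding each to a key in the desktop's keyboard settings, would give you a press-to-dictate setup without touching the terminal afterwards.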
Indolent_Bard@reddit
Is this some kind of sick joke? People with accessibility issues are the last people who should be using the command line. Basic accessibility shouldn't have a hurdle like that.
Decent-Principle8918@reddit (OP)
Maybe I can set up an alt text where I assign that code to, let’s say, the word “speak”. I know I did that once for another program
PermitTenders@reddit
sounds good. lmk how you get on with that.
Decent-Principle8918@reddit (OP)
i think it's just a simple command, i will need to look up how to do it again though.
Analog_Account@reddit
I just looked this up because I'm curious... I guess people normally set up a keyboard shortcut to enable or disable it. But yes it should work anywhere.
Would a keyboard shortcut work for you with your disability?
If I have some time then I'll see if it works on my PopOS machine and see if it really does work everywhere.
Decent-Principle8918@reddit (OP)
yeah on a mac, you just press a button
kapitaali_com@reddit
you can create your own keyboard shortcuts https://support.system76.com/articles/keyboard-shortcuts/
Overall-Buy4177@reddit
Out of curiosity, why do you want to be on Linux if the Mac solution works best for you?
Decent-Principle8918@reddit (OP)
I will install both on my steam deck, and if they work I'm going to be very very happy. Just hope it can understand me, but i'll find out.
RemasteredArch@reddit
I don’t have anything productive to say on this topic, but if you’re interested in the tech behind the Linux desktop, you should check out Matt Campbell’s work on Newton. As I understand it, the transition to Wayland has broken the (already not well-designed) existing methods of screen readers and such, so Matt (under contract with the GNOME Foundation) has been working on Newton, a new Wayland-native cross-platform accessibility architecture.
Best way to learn about this and keep an eye on it is to check out Gnome’s accessibility blog: https://blogs.gnome.org/a11y/
Indolent_Bard@reddit
The fact that there are distros that migrated to Wayland by default before this stuff was ready is honestly offensive to humanity itself, and I'm not even disabled.
RemasteredArch@reddit
Agreed to an extent, but in all fairness, X11 will remain supported for quite a bit through LTS distros, which I hope can tide users over until Wayland is as accessible — this is just a blind hope though, I can’t speak to the lived experiences of users. Besides, per the talk Matt gave, reinventing Linux’s accessibility stack has been needed for quite a while, as the current method is quite fundamentally flawed. Besides working with Wayland and sandboxing like Flatpak, Newton also promises (among a variety of other great things) to be much more responsive.
axvallone@reddit
Wayland has actually slowed my progress to port Utterly Voice to Linux. Accessibility applications like this need to be able to control window sizing and positioning, which Wayland apparently does not allow :-( The old Windows API actually provides far more features for window control.
Indolent_Bard@reddit
So basically, you'd need to work with each Wayland compositor individually and add that functionality natively. Yeah, that's a tall order.
RemasteredArch@reddit
Yeah, I don’t have a real comment here, just that it sucks. I imagine the only solution here is in the Wayland spec, which I know too little about to comment upon. Utterly Voice looks very cool though, great work!
Decent-Principle8918@reddit (OP)
I would love to check that out. Honestly, I need to take some computer hardware and software classes. I’m dyslexic though, so the software portion is out of the question unless I can set up alts.
RemasteredArch@reddit
That’s alright! No shame in not wanting to become a software developer just to be able to use your computer (I just like this stuff because I’m a huge nerd about tech).
If you’re still interested in learning more about it, but prefer speeches to articles, I can also recommend Matt’s talk on it: Modernizing Accessibility for Desktop Linux - Matt Campbell, GNOME Foundation.
Decent-Principle8918@reddit (OP)
I am, but idk how far I’ll get. And trust me, if I could become a software or tech expert I would; the pay is amazing!
lynnlei@reddit
fwiw my wife works with someone who uses dictation to code due to disability so it's not restricted at all. they unfortunately work through windows though
Decent-Principle8918@reddit (OP)
Yeah, there’s another big issue: code doesn’t go into long-term memory. It goes into short-term, meaning even if I try I’d never be able to retain it. Unless I make a custom alt text, where I effectively rename the code commands to something I can remember. Then maybe, just maybe, it would work
lynnlei@reddit
im just saying if this is a dream of yours you can do it. your idea there will help for sure. if it's not then don't sweat it. :)
Decent-Principle8918@reddit (OP)
It’s more of a sorta interest. It’d be awesome, but on the other hand I don’t have time, and I love my job
StrikeSpiritual2624@reddit
Try using Ubuntu - that is easier to install and manage and may have better text to speech functionality.
irelephant_T_T@reddit
accessibility imo is a big problem with linux.
sebexyt155@reddit
Try out DeepSpeech or GCP
ShrimpsLikeCakes@reddit
Speech note is what I use personally for tts and stt
6950X_Titan_X_Pascal@reddit
mac is amazingly great
yotties@reddit
Speech-to-text is becoming more and more available in the cloud. That is greatly reducing the dependence on specific hardware. Word Online, Google Docs, ONLYOFFICE etc. all have reasonable speech-to-text available. Software like Dragon is great but it locks you in terribly.
Decent-Principle8918@reddit (OP)
Yeah but I need it in the browser, and other system functions
yotties@reddit
It will be interesting to see if cloud-apps become more suitable for system functions. But I would not wait for it.
gebgebgebgebgeb@reddit
I use/am a developer of Numen and sprec
archontwo@reddit
Kdeconnect works on steamdeck
Decent-Principle8918@reddit (OP)
Kdeconnect doesn’t do speech to text, it does text message syncing. It works so so for iOS
FangLeone2526@reddit
KDE Connect can send input to your device, and your phone likely has speech-to-text input built in. It would work to just input text via speech. It would not be as nice to use as nerd-dictation or similar.
Posiris610@reddit
It’s a couple years old, but the Destination Linux podcast did an episode on accessibility and had some suggestions, if I recall. They also interviewed a couple of people who worked on the development of said services for Linux. It’s episode 284, and may be helpful.
cratercamper@reddit
Be in Linux and have Windows in virtualbox then
CodeMurmurer@reddit
Yep, looks like you aren't switching to Linux based on these responses.
Decent-Principle8918@reddit (OP)
Maybe not right now, but crossing my fingers in the next few years
CodeMurmurer@reddit
I hope so too. Linux deserves more users.
Decent-Principle8918@reddit (OP)
I do still use Linux with my steam deck, Linux is a monster at gaming. I love it!
Zireael07@reddit
not free, but a friend of mine is using Newton Dictate to great effect
heard great things online about speechnote too
Decent-Principle8918@reddit (OP)
I’ll do some research on it
asp174@reddit
My comment won't be of much help right now. But I'm eagerly awaiting openai/whisper for everyday use.
Decent-Principle8918@reddit (OP)
yeah i hope that AI can help manage programming layouts for developers. My brother uses it with his business, and it helps him immensely. Myself, i use it for my work a LOT; without it i'd get overstimulated, since i'm autistic.
BUBBLE-POPPER@reddit
Mac Hardware is not the perfect platform for linux
Decent-Principle8918@reddit (OP)
I know, that’s why I am going for Panasonic, if I can ever get this working