The Harsh Reality of Small-Scale AI Research: A Personal Tale of Frustration and Limitation
Posted by TerryC_IndieGameDev@reddit | programming | View on Reddit | 23 comments
pip25hu@reddit
AI research currently involves throwing absurd amounts of computing resources at the problem. So yes, the barrier of entry is substantial. Still, if you're serious about this, get yourself a 24 GB card, that's still widely available hardware. Below that, I wouldn't bother.
TerryC_IndieGameDev@reddit (OP)
Yes, currently I have a 12 GB VRAM card. I am saving for a box with two 24 GB cards. I can only dream.
Able-Channel7739@reddit
This could've been written in the 60s, when the little guy couldn't work on operating systems and compilers because the computers were still these big mainframes only accessible to large institutions. It would take a couple decades before costs would come down and some random dude from Finland could get anywhere on their own.
LLMs are in the same stage right now. I'm sure they'll become accessible, but it won't be for quite a while. If you want to work on them now, you have to join a larger team with the necessary resources. Or you can pivot to working on smaller components and applications running on top. It's just as valid an approach, and plenty of people have made their fortunes doing just that.
TerryC_IndieGameDev@reddit (OP)
Very good point. Between working to pay the bills and being a single parent, I just don't have time to work with others. I wish I did. Many will say it, but I live it: being a single parent takes up 95% of your life. I have started working on smaller models. It is better than nothing. I feel if I make a truly amazing tiny model as a proof of concept, perhaps someone will notice. Very valid points, and I appreciate the response.
Able-Channel7739@reddit
Keeping the lights on is difficult enough in this economy, can't imagine how hard it must be as a single parent. I hope things will get easier over time.
As an example of something accessible: recently I was reading about GitHub Copilot and its Contextual Filter Model. It's a tiny logistic regression model that figures out when it's a good time to make a code suggestion. Until fairly recently this would've been a messy hand-tuned formula you'd spend weeks tweaking. Now you can throw it together in sklearn in a couple of hours and pop it into your program.
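As a rough sketch of the idea (the features and data here are invented for illustration, not Copilot's actual ones), a filter like that is just a few lines of scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy features, invented for illustration: seconds since the last typing pause,
# whether the cursor is at end of line, whether the previous suggestion was accepted.
X = np.array([
    [0.1, 1, 1],
    [2.0, 0, 0],
    [0.2, 1, 0],
    [1.5, 0, 1],
    [0.3, 1, 1],
    [1.8, 0, 0],
])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = a good moment to show a suggestion

clf = LogisticRegression().fit(X, y)

# Gate the suggestion on the model's predicted probability.
p = clf.predict_proba([[0.2, 1, 1]])[0, 1]
show_suggestion = p > 0.5
```

The real work is in collecting honest labels and picking features, not in the model itself, which is exactly why it's within reach of one person.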
Building blocks like this are where an individual developer can really shine. They won't pay the bills by themselves, but if you're starting a software business, they can become a cornerstone of your tech.
One thing I'll say is: don't hope that "someone will notice." People always seem to make this mistake, and I did as well. You can have the most amazing product in the world, and nobody would care. You have to be very strategic and deliberate about how you present your work. A half-baked "minimum viable product" with adequate marketing will go a lot further than an awesome, polished product with none.
Good luck!
TerryC_IndieGameDev@reddit (OP)
Yes, it is very hard in today's economy. I appreciate you, and you are very correct. Thank you.
slcclimber1@reddit
Well-written article. I have experienced similar situations, and it's frustrating, and I feel limited. I wish I could compete with the big guys, but we just can't.
TomWithTime@reddit
They aren't so great. The big players got nothing on my frog flower fish ai
TerryC_IndieGameDev@reddit (OP)
Yeah, it's horrible. I have an EXCELLENT Tree of Thoughts dataset I made, and I am limited to such small models. I just wish I had some funding. Keep your head up; someday we will be able to compete.
v2thegreat@reddit
If you want, I'd be happy to let you run your training on my 3090. I use it for ml too, but not so much lately, so I'd be happy to have it run on this for a few days.
Ofc, it'd be a lot of trust with the model, and I'd prefer to review the code that I'm running on my machine to make sure you won't hack my computer or anything lol.
On my 3090, I have been able to run some fairly large models.
Ofc, it might just make much more sense for you to rent an older machine from the cloud. I recommend Lambda, but it makes a lot of sense to shop around and compare pricing. A "this will be 2x cheaper per hour, but run for 4x longer" trade-off is something to be aware of.
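To make that trade-off concrete, with made-up prices (check real hourly rates before deciding): a GPU that is 2x cheaper per hour but 4x slower ends up 2x more expensive overall.

```python
# Hypothetical rental prices and runtimes, for illustration only.
fast_gpu = {"usd_per_hr": 2.00, "hours": 10}   # pricier per hour, finishes in 10 h
slow_gpu = {"usd_per_hr": 1.00, "hours": 40}   # 2x cheaper/hr, but 4x slower

def total_cost(gpu):
    """Total bill = hourly rate * hours the job actually runs."""
    return gpu["usd_per_hr"] * gpu["hours"]

# total_cost(fast_gpu) -> 20.0, total_cost(slow_gpu) -> 40.0
```

The per-hour price is meaningless on its own; only price times wall-clock time matters for a fixed training job.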
Realistically, setting a budget is a healthy way for you to approach this, and sentdex has a great video on cloud vs. local that I think you should check out.
blind_disparity@reddit
If you boot an OS from usb, you won't need to worry about exposing your personal computer to someone's code and potential exploits. They could provide an image with the environment configured exactly as needed to do their training.
v2thegreat@reddit
Maybe, but what about other devices on my network? Wouldn't they be exposed as well?
blind_disparity@reddit
Your router may be capable of creating isolated subnets. Apparently some have a 'guest network' feature which sounds like a 1 click version of the same.
But if it doesn't, yes, other devices on the network will be exposed. Ideally none of these would be vulnerable anyway. If you stay current on security updates they're probably fine. If you've got a bunch of smart devices, well, personally I'd want those on a separate network anyway as they're generally not well secured. If you've got home servers running then I guess you'll know if they're secured.
But thinking about it, a live boot USB might not be a good choice if model training maxes out your RAM? If it's running on linux it should be quite lightweight, if it's windows obviously that's going to eat loads of the RAM just for the OS.
Anyway, hacking other network devices is much harder than compromising the machine that the code's already running on.
A VM is another good option and could also be provided as an image fully configured and ready to go. You can expose your graphics card directly to a VM and you'll be able to fully control the networking.
Live boot or VM, if it doesn't need to connect to the internet you can just not connect it to your network at all. But with a VM you can control exactly what it can or can't talk to.
Nice-Offer-7076@reddit
Maybe this is your chance to zig when everyone else is zagging?
TerryC_IndieGameDev@reddit (OP)
You know that is a great way to look at things. I appreciate your unique point of view. Thank you!
Nice-Offer-7076@reddit
You're welcome. Just to clarify, by zagging I had in mind what people like Yann LeCun are saying about LLMs: useful, but ultimately a dead end. Wouldn't it be great if we could take what we learnt from LLMs, reexamine other AI research areas, and maybe find something that either:
a) does the same but faster/with far less resources
b) is better and/or advances us further along the AI highway in a different way to LLMs
https://x.com/ylecun/status/1796982509567180927
I have no doubt that the next big thing is being worked on by a few unknown people as we speak.
TerryC_IndieGameDev@reddit (OP)
I really like the new liquid LLM research, but yet again, I have less than the resources I need. Things will improve or they won't; I feel we just need to stay positive. :) I have been working on AI since GPT-2 and I am still limited by resources. I am now about to start training a 3B model. Cross your fingers. :)
mOjzilla@reddit
Isn't this already known to pretty much anyone working in the field? Small developers have no chance; the AI scene is dead on arrival for the little guy. If we compare our consumer-grade GPUs to 3D printers, they are simply too small with current tech. Maybe when our modeling changes things might be different, and we won't need near-infinite resources to train a good model, but I doubt that day will come; big tech will make sure of it.
TerryC_IndieGameDev@reddit (OP)
Yeah, it is pretty horrible. I feel advances would happen faster if the little guy had more access to be able to make stuff. I can't afford to pay for cloud GPU hosting as I am a single parent. I just can't justify spending the money on a hobby.
mOjzilla@reddit
I feel your pain, brother. Most innovations occur either by accident or from some random nobody just tinkering away in his comfort zone; giant tech just standardizes it later on.
I remember 5-6 years back, OpenAI (or some company, I'm not sure who) was paying gamers to run workloads on their GPUs while they were idle. It was wild, and they paid a good price considering the scale of it, regardless of where you lived. It took me a while to connect the dots, but then I realized how much OpenAI has exploited our free internet, and other models will have to pay way more now that the use case is known.
TerryC_IndieGameDev@reddit (OP)
Salad still pays gamers to rent out their GPU while it's idle. If you don't have backing, you just can't make anything very large. Someday I'll crack this, and I too will make models others can use. Or at least I hope I will.
BeautifulTennis3524@reddit
One big thing in AI is that it's not always usable. So if you can make something usable with a small model, training it in the cloud a few times may incur acceptable costs.
Btw, I think adapters on large models are often very good. So there may not be a real need to train your own 30B from scratch. Ofc, that implies licensing…
TerryC_IndieGameDev@reddit (OP)
I can train adapters fine, but after you train the LoRA adapter you must merge it into the raw model to make a GGUF. The merge is where I lack the VRAM for anything over a 3B model (3B, not 30B).
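For anyone wondering why the merge step is so memory-hungry: merging folds the adapter's two small low-rank matrices back into each full-size base weight, so the whole base model has to be held in memory at once even though the adapter itself is tiny. A toy NumPy sketch of the merge for a single weight matrix (dimensions are made up; a real LLM layer might have d = 4096 and r = 8 or 16, repeated across every layer):

```python
import numpy as np

# Toy dimensions for illustration.
d, r, alpha = 8, 2, 4
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))   # frozen base weight (the big thing)
A = rng.standard_normal((r, d))   # LoRA down-projection (tiny)
B = rng.standard_normal((d, r))   # LoRA up-projection (tiny)

# The merge folds the adapter into the base weight: W' = W + (alpha / r) * B @ A.
# Training only touches A and B, but merging needs the full W in memory,
# which is why merge OOMs where adapter training did not.
W_merged = W + (alpha / r) * (B @ A)
```

One common workaround is doing the merge on CPU with system RAM instead of VRAM; it's a one-time operation, so speed barely matters.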