I made a voice controlled Tic-Tac-Toe game as a learning project
Posted by dabiggmoe2@reddit | LocalLLaMA | View on Reddit | 2 comments
Hi,
First of all, I know this might be a silly project, but I made it specifically as an educational project for me in order to learn about finetuning SLMs and utilizing a full pipeline of ASR (Transcription) -> SLM (Intent Parsing) -> Executing Actions -> TTS (Synthesizing results).
I generated my own \~1000 dataset to finetune Gemma4-4B to parse the input intent and toolcall my custom game functions.
Feel free to clone it and test it out https://github.com/moedesux/voice-tic-tac-toe .
I know this might be basic knowledge for most of you here, but I did learn a lot by doing this concrete project more than watching hours of youtube videos. I would very happy and it would make it worthwhile if it can help anyone else in their learning journey.
P.S. (It works perfectly on machine, YMMV 😉 )
P.P.S. I panic deleted my first post because my friends told me the repo link wasnt working. Turned out I forgot the repo was private lol. Sorry again for the repost. This time it will work
P.P.P.S The 2nd post was mistakenly removed by the mods by the mod u/ttkciar was kind enough to restore it and offered the option to repost it so it can appear in the "New" sorting and I accepted his offer 😄
Certain-Cod-1404@reddit
Really cool project, how did you go about generating the dataset ? And why go with an SLM instead of an encoder classifier ?
dabiggmoe2@reddit (OP)
Thanks. I generated about 30-40 training data by hand for each function and then with the help of Qwen3.6-27B I asked it to generate more synthetic training data based on my sample. I had to review them manually and do some manual data cleaning but it was worth it since I didn't have to write 1000 training data by hand. Just the initial ~40 samples.
Tbh this is the first time I hear about the encoder classifier. That's why I love reddit. Would you be kind enough to tell me the difference and how could it be better than the SLM route I took?