Converted my unused laptop into a family server for gpt-oss 20B

Posted by Vaddieg@reddit | LocalLLaMA | 94 comments

I spent a few hours setting everything up and asked my wife (a frequent ChatGPT user) to help with testing. We're very satisfied so far.

Key specs:
Generation: 46-40 t/s
Context: 20K
Idle power: 2W (around 5 EUR annually)
Generation power: 38W
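The ~5 EUR/year figure checks out from the 2W idle draw; a one-liner to verify (the 0.30 EUR/kWh electricity price is my assumption, not from the post):

```shell
# 2W continuous for a year, at an assumed 0.30 EUR/kWh
awk 'BEGIN { kwh = 2 * 24 * 365 / 1000; printf "%.2f kWh -> %.2f EUR/year\n", kwh, kwh * 0.30 }'
```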

Hardware:
2021 M1 Pro MacBook Pro, 16GB RAM
45W GaN charger
Power meter

Challenges faced:
Extremely tight fit of model + context into 16GB RAM
Avoiding laptop battery degradation in 24/7 plugged-in mode
Preventing sleep and auto-updates
Accessing the service from everywhere
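For the "from everywhere" part: llama.cpp's server exposes an OpenAI-compatible /v1/chat/completions endpoint, so once DynDNS points at the laptop, any client can hit it with plain curl. The hostname and port here are hypothetical placeholders:

```shell
# Hypothetical DynDNS hostname/port; llama-server speaks the OpenAI chat API
curl http://my-family-server.dyndns.org:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss-20b", "messages": [{"role": "user", "content": "Hello!"}]}'
```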

Tools used:
Battery Toolkit
llama.cpp server
DynDNS
Terminal + SSH (logging into the GUI isn't an option due to the RAM shortage)
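A rough sketch of what the headless setup might look like over SSH (the model filename, context size, and port are my guesses, not the OP's exact commands):

```shell
# Disable system sleep and automatic update checks (macOS)
sudo pmset -a sleep 0 disksleep 0
sudo softwareupdate --schedule off

# llama.cpp server: ~20K context, bound to all interfaces so DynDNS access works
./llama-server -m gpt-oss-20b.gguf -c 20480 --host 0.0.0.0 --port 8080
```

Battery Toolkit then caps the charge level so the battery isn't held at 100% around the clock.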

Thoughts on gpt-oss:
Very fast and laconic thinking, good instruction following, precise answers in most cases. But sometimes it spits out very strange factual errors I've never seen even in old 8B models; it might be a sign of intentional weight corruption, or of "fine-tuning" their commercial o3 with some garbage data.