Olares One: miniPC with RTX 5090 mobile (24GB VRAM) + Intel 275HX (96GB RAM)
Posted by CapoDoFrango@reddit | LocalLLaMA | 12 comments
This new product came to my attention: https://one.olares.com. It is not yet available for sale (the Kickstarter campaign is about to start).
The specs:
- Processor: Intel® Ultra 9 275HX 24 Cores, 5.4GHz
- GPU: NVIDIA GeForce RTX 5090 Mobile 24GB GDDR7
- Memory: 96GB RAM (2×48GB) DDR5 5600MHz
- Storage: 2TB NVMe SSD PCIe 4.0
- Ports: 1 × Thunderbolt™ 5, 1 × RJ45 Ethernet (2.5Gbps), 1 × USB-A, 1 × HDMI 2.1
- Wireless Connectivity: Wi-Fi 7, Bluetooth 5.4
- Power: 330W
- Dimensions (L × W × H): 320 × 197 × 55mm
- Weight: 2.15kg (3.1kg with PSU)
The initial price looks like it will be around $4,000, based on the monthly cost calculations where they compare it with rented services ("Stop Renting").
It would come with a special Linux distribution ([Olares](https://github.com/beclab/Olares)) that makes it easier to install containerized apps via an app store and runs Kubernetes under the hood, but since this is a standard Intel chip it should not be difficult to wipe that and install whatever you want instead.
Would this be able to compete with other mini-PCs based on the Ryzen AI Max+ 395 (Strix Halo), or with the NVIDIA DGX Spark?
caiodelgado@reddit
I made a video review here in PT-BR/EN:
https://youtu.be/lQDwKHC81oY
It's an amazing machine. If you guys have any questions or tests you'd like me to run, let me know :)
teckel@reddit
Wouldn't an AMD Ryzen AI Max+ 395 CPU with 128GB LPDDR5X 8000MHz RAM be a faster LLM system? It's also smaller, more efficient, and probably has more uses; personally, I'd also use it as a Plex server, NAS (with external RAID), torrent server, Pi-hole, etc.
Sufficient_Prune3897@reddit
There is no way anything with a mobile GPU is gonna be worth its money.
misterflyer@reddit
That's impossible to determine without knowing the user's use case. My Alienware Area 18" is slowly replacing my need to run models over API services. I'm running 24B and 32B models at acceptable speeds, and I even have an Unsloth 72B Kimi running pretty well on it.
I can take this thing with me anywhere I want to go (i.e., I don't have to be at home just to run local AI, nor do I need to be on the internet), and I didn't even have to spend $5k+.
I'd say this is more than worth it for my use cases. I'm sure other people would feel the same way depending on their particular situation/use case.
Adventurous-Date9971@reddit
Mobile GPUs are worth it if you tune the setup and size models to VRAM. On a 24GB card, 24B in Q5 or 32B in Q4_K_M runs fine; 72B only works with partial offload and a slower KV cache in RAM. What helps a lot: force the dGPU via the MUX, cap the GPU to a stable wattage, undervolt with a gentle curve, keep fans aggressive, and use a cooling pad. In llama.cpp, use Q4_K_M or EXL2, paged KV, and a small draft model for speculative decoding; that alone can shave latency. Keep models on fast NVMe and kill any VRAM-hungry background apps. For longer contexts, offload KV to CPU pinned memory; it's slower but stable on the road. I still hit OpenRouter for huge contexts and spin short A100 bursts on RunPod for finetunes, while DreamFactory exposes my Postgres as REST so local RAG stays clean without the model touching the database. Dialed in, a mobile GPU rig absolutely makes sense.
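As a rough illustration of the "size models to VRAM" point above, here is a minimal back-of-envelope sketch; the 32B / 4.5-bit / 64-layer / 8-KV-head figures are hypothetical placeholder values, not the specs of any particular model.

```python
# Back-of-envelope check: do quantized weights + KV cache fit in 24 GB VRAM?
# All numbers are rough assumptions; real usage also depends on runtime
# overhead, attention implementation, and the exact quantization mix.

def fits_in_vram(params_b, bits_per_weight, n_layers, kv_dim, ctx_len,
                 kv_bytes_per_elem=2, vram_gb=24.0, overhead_gb=1.5):
    weights_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: K and V tensors per layer, kv_dim elements per token each.
    kv_gb = 2 * n_layers * kv_dim * ctx_len * kv_bytes_per_elem / 1e9
    total = weights_gb + kv_gb + overhead_gb
    return total, total <= vram_gb

# Hypothetical ~32B dense model at ~4.5 bits/weight (Q4_K_M-ish), 8k context,
# 64 layers, 8 KV heads x 128 head dim (all assumed values).
total, ok = fits_in_vram(params_b=32, bits_per_weight=4.5,
                         n_layers=64, kv_dim=8 * 128, ctx_len=8192)
print(f"~{total:.1f} GB -> {'fits' if ok else 'needs partial offload'}")
```

By the same arithmetic, a 72B model at ~4 bits is roughly 36 GB of weights alone, which is why it only runs with partial offload as described above.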
FullOf_Bad_Ideas@reddit
Here's a review of sorts from AI influencer Bijan Bowen (pun intended):
Olares One First Look & Testing – A Personal Cloud & Local AI PC
I also asked them about throttling and shipping on their discord. If you want to buy one and have any questions, I think it would be very easy for you to reach out to the founder of the company for more first-hand details.
Some details are on their Kickstarter page which doesn't accept orders yet, but they have charts and info about production timelines - https://www.kickstarter.com/projects/167544890/olares-one-the-local-al-powerhouse-on-your-desk
I think it's a good device for an AI engineer looking to run various AI projects without having a big workstation. It can also be used for gaming after work. It's quite a lot bigger than the Spark, but that's good, since at least you can expect better performance and cooling. The software stack, as shown in Bijan's video, looks genuinely helpful, and I think many people in the community will vibe with the idea of a personal AI cloud.
I don't think it's the best miniPC if you specifically want to run big MoE LLM models like Minimax M2 or GPT-OSS 120B at large contexts; I think the DGX Spark might do better there. But a lot of AI engineering work is testing things on smaller models, or trying out new projects from GitHub, where x86 + CUDA are often required for all dependencies to work, so this machine is so much more compatible with various AI projects than the Spark or Strix Halo that it's hard to communicate. It might be the Compaq of local AI for all we know.
Serprotease@reddit
It's not really a miniPC in my opinion.
Based on the video, it's not at the N100, Mac mini, or Spark level (that thing is incredibly tiny). You know, the kind of hardware that could actually fit in a decent-sized pocket.
If you have one of those mini racks for the Pi, it looks a bit too large to fit. It's more at the OptiPlex/17-inch gaming PC level.
It's still small, and I don't want to be pedantic about it, but it also means one should probably also look at the value proposition of a $3k laptop/OptiPlex (with an A4000 SFF) and other SFF computers if in the market for a CUDA + x86 + SFF mix.
FullOf_Bad_Ideas@reddit
That's a good point.
A laptop with similar specs could perform similarly and possibly be found at lower cost on some deal.
misterflyer@reddit
Or you could do what I did and buy the Alienware Area 18" mobile 5090 version. It's $3200 at Microcenter. The Legion Pro 7i 16" mobile 5090 version is a little cheaper.
Even with the crazy stupid RAM price increases, you might be able to upgrade it beyond 64GB RAM and stay under $4K total.
The 275HX can support up to 256GB RAM (whenever such a laptop kit comes to the market).
Freonr2@reddit
Already a discussion here:
https://www.reddit.com/r/LocalLLaMA/comments/1otveug/a_startup_olares_is_attempting_to_launch_a_small/
My TLDR:
To briefly set the stage: the 5090 mobile is based on the 5080 desktop chip and has a bit less compute and bandwidth than a desktop 5080, owing to being a mobile part, but with 24GB instead of 16GB. The desktop 5080 is already about half the speed of a 5090. Once you run out of VRAM, the system is still a dual-channel DDR5 system, just like a normal desktop. There's nothing special going on here that requires a lengthy rehash of what the performance would be like beyond that.
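To put rough numbers on that, here's a minimal bandwidth-bound sketch of single-stream decode speed; the ~896 GB/s (5090 mobile GDDR7) and ~90 GB/s (dual-channel DDR5-5600) figures are nominal assumptions, not benchmarks.

```python
# Crude estimate: single-stream token generation is roughly memory-bandwidth
# bound, so tokens/s ~ bandwidth / bytes of weights read per token.
# Bandwidth figures below are nominal assumptions, not measurements.

VRAM_BW = 896.0   # assumed GDDR7 bandwidth of a 5090 mobile, GB/s
DDR5_BW = 89.6    # dual-channel DDR5-5600, GB/s

model_gb = 18.0   # e.g. a ~32B model at ~4.5 bits/weight

# Everything resident in VRAM:
print(f"all in VRAM:   ~{VRAM_BW / model_gb:.0f} tok/s")

# 30% of the weights spilled to system RAM: the slow portion dominates.
t = (0.7 * model_gb) / VRAM_BW + (0.3 * model_gb) / DDR5_BW
print(f"30% offloaded: ~{1 / t:.0f} tok/s")
```

Under these assumptions a fully resident model decodes around ~50 tok/s, while spilling 30% of the weights to DDR5 drops it to roughly ~13 tok/s, which is the whole "nothing special once you leave VRAM" point.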
It feels to me like a very narrow market for the particular combination of very tiny form factor, good diffusion and smaller-LLM (<32B) performance, and low power usage. Otherwise, better value is probably found for large (80-120B) MoE LLMs in the 395/Spark, or in a desktop 5090, which you can at least cram into a "smallish" mini-ITX case like the Corsair 2000D.
MaxKruse96@reddit
Just from the specs alone: no, it can't compete with unified-memory machines, since its RAM is 4-8x slower than theirs. If you can fit something in 24GB, it will be a lot better, though.
I guess it's an actual "fast AI homeserver" offering, in which case, given the current market situation, it's probably OK, although you do pay a premium for the form factor and all that.
CapoDoFrango@reddit (OP)
I can see this as an interesting option for developers that want to use an Nvidia stack (CUDA) as well as an x86 stack, so pre-built binaries and containers work without issues. The Nvidia DGX Spark has an ARM CPU, which is incompatible with x86 binaries.
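As a small illustration of that compatibility point, a quick check like the sketch below (torch is optional and only imported if installed) reports "x86_64" on a machine like this, while an ARM box such as the Spark reports "aarch64", which is exactly what trips up prebuilt x86 binaries.

```python
# Quick sanity check before relying on prebuilt x86_64 + CUDA binaries.
import platform

arch = platform.machine()  # 'x86_64' on this box, 'aarch64' on a DGX Spark
print(f"CPU architecture: {arch}")

try:
    import torch  # optional; only used if already installed
    print(f"CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"GPU: {torch.cuda.get_device_name(0)}")
except ImportError:
    print("torch not installed; skipping CUDA check")
```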