Extropic's TPU??
Posted by Excellent_Koala769@reddit | LocalLLaMA | 11 comments
Hey guys, here's a YouTube video by David Shapiro that I recently watched. Didn't really understand most of what was being said... Can anyone translate this for me lol?
What are TPUs and why are they revolutionary?
https://youtu.be/mNw7KLN7raU?si=Z0W7NdScI9yTpQEh
GreenTreeAndBlueSky@reddit
From my understanding of what TSUs do, they'd be useful only for sampling at the very end of all the layers at inference. Not sure how this is supposed to have a significant impact on energy consumption at all.
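To make that concrete: in autoregressive inference, every transformer layer is deterministic matrix math, and randomness only enters at the very last step, when one token is drawn from the final logits. A minimal sketch (toy model, all sizes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, purely illustrative.
HIDDEN, VOCAB = 256, 1000
W_out = rng.standard_normal((VOCAB, HIDDEN))

def forward_all_layers():
    # Stand-in for the expensive part: every transformer layer,
    # all deterministic matrix math, none of it stochastic.
    hidden = rng.standard_normal(HIDDEN)
    return W_out @ hidden  # logits over the vocabulary

def sample_next_token(logits, temperature=0.8):
    # The only stochastic step in the pipeline: softmax plus one
    # categorical draw. This is the slice a sampling chip would touch.
    z = (logits - logits.max()) / temperature
    probs = np.exp(z)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

print(sample_next_token(forward_all_layers()))
```

A chip that only accelerates that final draw leaves the matmul-heavy part (which dominates the energy bill) untouched, hence the skepticism.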
EmperorOfNe@reddit
The tensor array in a TSU doesn't differ from ordinary matrices, which comes with the benefit of using each memory address for both layer loading and KV caching. GPUs waste a lot of resources because they can only do sequential processing, as they have no buffers other than going down the memory tree. At least that is my understanding.
A very simple ASCII-art representation of both shows the difference:
[diagram: GPU processing including KV cache]
[diagram: TSU including KV cache]
I hope this highlights how, despite powerful parallel compute units, GPUs often suffer from memory over-allocation, fragmentation, and bandwidth bottlenecks that leave many resources underutilized in LLM inference workloads.
The streamlined, tiled, and instruction-controlled design of TSUs achieves much higher hardware utilization with less waste compared to the layered and often over-provisioned GPU KV cache systems.
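That over-allocation point does match a real serving problem: many GPU stacks preallocate the KV cache for the maximum context per request whether or not it gets used, which is what paged-attention schemes (e.g. vLLM's) were built to fix. A toy illustration in NumPy, all sizes hypothetical:

```python
import numpy as np

# One request's KV cache, preallocated for the maximum context
# even though the request ends up using far less of it.
max_seq_len, n_heads, head_dim = 4096, 32, 128
kv_cache = np.zeros((2, max_seq_len, n_heads, head_dim), dtype=np.float16)

tokens_actually_used = 300  # the request stopped early
used = 2 * tokens_actually_used * n_heads * head_dim
print(f"KV-cache utilization: {used / kv_cache.size:.1%}")  # ~7.3%, the rest sits idle
```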
Resources sitting idle while still drawing a lot of power, versus tightly condensed access writes with nothing wasted: that difference is where the substantial energy savings come from, measured as watts of draw against tokens processed, i.e. joules per token.
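The watts-versus-joules framing is just power draw divided by throughput; a back-of-the-envelope sketch with made-up numbers:

```python
# Energy per token = average power draw / token throughput.
# Both numbers are invented purely for illustration.
watts = 350.0             # hypothetical accelerator power draw
tokens_per_second = 70.0  # hypothetical decode throughput
print(f"{watts / tokens_per_second:.1f} J/token")  # 5.0 J/token
```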
Double_Cause4609@reddit
Extropic produces extremely efficient processors that operate on physical logic, whereas most processors we use operate on digital logic. Extropic's methods are somewhat speculative (it's still very early days), probably don't offer enough numerical stability for training, and have a limited set of viable implemented functions.
Long story short: very cool, but by the time they're actually viable, we'll probably have competing products available and a reasonable ecosystem of hardware that sounds like sci-fi to us now.
Finanzamt_Endgegner@reddit
TPUs are what Google, for example, uses for AI instead of GPUs. They're basically more specialized for AI and tensor operations than normal GPUs, which makes sense given the names: GPU stands for Graphics Processing Unit, while TPU stands for Tensor Processing Unit. That means they're cheaper and waste less power (;
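For context, the unit a TPU is built around (a systolic array) exists to accelerate exactly one thing: the dense matrix multiply that dominates transformer workloads. A NumPy stand-in, shapes hypothetical:

```python
import numpy as np

# The workhorse op a TPU's matrix unit accelerates: dense matmul.
# In a transformer layer this is e.g. activations @ weight matrix.
batch, d_model, d_ff = 8, 1024, 4096  # toy shapes
x = np.random.randn(batch, d_model).astype(np.float32)
W = np.random.randn(d_model, d_ff).astype(np.float32)
y = x @ W  # on a TPU this single op maps onto the matrix unit
print(y.shape)
```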
GreenTreeAndBlueSky@reddit
Extropic makes TSUs, OP miswrote
Finanzamt_Endgegner@reddit
oh yeah, right, those are a bit more complicated: basically swapping deterministic arithmetic for probabilistic sampling to get more energy-efficient AI hardware, though it will only help with denoising-type AI, no?
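A rough software analogue of that idea: instead of computing probabilities with arithmetic and then sampling from them, the hardware physically relaxes into samples from an energy-based model. Below is a minimal Gibbs-sampling sketch over a tiny Ising-style model, the kind of distribution such hardware is described as sampling natively; the couplings, sizes, and sweep count are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny energy-based model: 8 coupled binary (+1/-1) spins.
# A TSU would realize this sampling in physics; here it is simulated.
n = 8
J = rng.standard_normal((n, n)) * 0.5
J = (J + J.T) / 2        # symmetric couplings
np.fill_diagonal(J, 0.0)
s = rng.choice([-1, 1], size=n)

for _ in range(1000):        # Gibbs sweeps
    for i in range(n):
        field = J[i] @ s     # local field acting on spin i
        p_up = 1.0 / (1.0 + np.exp(-2.0 * field))
        s[i] = 1 if rng.random() < p_up else -1

print(s)  # one (approximate) sample from the Boltzmann distribution
```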
GreenTreeAndBlueSky@reddit
They don't really talk about LLMs, but they do say they can do diffusion and some classifiers.
Finanzamt_Endgegner@reddit
yeah, LLMs generally don't use denoising (at least atm), but image models etc. do, and there are experiments with diffusion models for LLMs
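For the shape of what "denoising-type AI" means here: diffusion models generate by repeatedly subtracting predicted noise and re-injecting a little randomness. The toy loop below is not a faithful DDPM update, just the structure, and the "denoiser" is a fake stand-in:

```python
import numpy as np

rng = np.random.default_rng(2)

def predict_noise(x, t):
    # Stand-in for a trained denoiser network (purely illustrative).
    return 0.1 * x

x = rng.standard_normal(16)       # start from pure noise
for t in reversed(range(50)):     # iterative denoising
    x = x - predict_noise(x, t)   # remove some predicted noise
    if t > 0:
        x += 0.05 * rng.standard_normal(16)  # stochastic re-noising step
print(x.round(2))
```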
SlowFail2433@reddit
Hardware that allows floats between 0 and 1
OkIndependence3956@reddit
My brain read "Ben Shapiro", caught me off guard for a second.
hurtreallybadly@reddit
TPU, LPU, GPU, CPU
the holy grail.
I like Groq's LPU though :)