Llama 405B running locally!
Posted by ifioravanti@reddit | LocalLLaMA | 63 comments
https://preview.redd.it/foqiuzj0ezod1.png?width=3440&format=png&auto=webp&s=602c1dd1c694eb3106331d0cb1fb238873c269c2
https://preview.redd.it/wdp2aw91ezod1.png?width=2008&format=png&auto=webp&s=e4e24938e60fc30e15c40a74ce8f632ab9d68d8e
Here is Llama 405B running on a Mac Studio M2 Ultra + MacBook Pro M3 Max!
2.5 tokens/sec but I'm sure it will improve over time.
Powered by Exo ([https://github.com/exo-explore](https://github.com/exo-explore)), with Apple MLX as the backend engine.
An important trick from the Apple MLX creator in person, u/awnihannun:
Set these on all machines involved in the Exo network:
sudo sysctl iogpu.wired_lwm_mb=400000
sudo sysctl iogpu.wired_limit_mb=180000
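Since the same two settings have to be applied on every machine in the Exo network, here is a rough sketch of a helper that just builds the command strings for each host so they can be reviewed before running. The hostnames are hypothetical placeholders, and wrapping the commands in `ssh -t` is my assumption about how you might reach the other Mac (the `-t` is there so `sudo` can prompt for a password); the post itself only gives the two `sysctl` lines. Nothing is executed here, it only prints.

```python
import shlex

# Keys and values copied from the post above.
WIRED_SETTINGS = {
    "iogpu.wired_lwm_mb": 400000,
    "iogpu.wired_limit_mb": 180000,
}


def sysctl_commands(settings):
    """Return one `sudo sysctl key=value` command string per setting."""
    return [f"sudo sysctl {key}={value}" for key, value in settings.items()]


def ssh_commands(hosts, settings):
    """Wrap each sysctl command in an `ssh -t` call for every host.

    Using ssh is an assumption for illustration; the commands can just as
    well be pasted into a terminal on each machine.
    """
    return [
        f"ssh -t {shlex.quote(host)} {shlex.quote(cmd)}"
        for host in hosts
        for cmd in sysctl_commands(settings)
    ]


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the machines in your own setup.
    for line in ssh_commands(["mac-studio.local", "macbook-pro.local"], WIRED_SETTINGS):
        print(line)
```

Printing instead of executing keeps this a dry run, so you can check the values against each machine's memory before changing anything.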