Wanting to try out L3-70b-euryale on my computer, but don't know what version to choose
Posted by EEEEEEEEEEEEEEEE_Man@reddit | LocalLLaMA | 2 comments
So I'm interested in using l3-70b-euryale as a chatbot, but I don't know which version to choose. I checked guides on what to pick for performance, but they're WAY too confusing to follow; there are basically zero examples tied to an actual PC build. And I'm pretty new to local AI.
Specs:
CPU: AMD Ryzen 7 5700X
RAM: 16GB DDR4
GPU: RTX 3070
OS: Windows 10
Linkpharm2@reddit
You can't really run this model. L3-70b is the model; 70b refers to 70 billion parameters. The smallest you can compress it down to is about 24GB. Your 3070 has 8GB of VRAM, and for usable speed you can only run models that fit in your VRAM.
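Rough math, if you want to sanity-check that (a quick Python sketch; the bits-per-weight figures are approximate averages for common GGUF quants, not exact):

```python
# Rough GGUF size estimate: params * bits-per-weight / 8.
# Bit-widths below are approximate averages for each quant type.
PARAMS = 70e9  # Llama-3 70B

quants = {
    "FP16":   16.0,
    "Q8_0":    8.5,
    "Q4_K_M":  4.8,
    "Q2_K":    2.6,
}

for name, bits in quants.items():
    gb = PARAMS * bits / 8 / 1e9
    print(f"{name}: ~{gb:.0f} GB")
```

That puts even the heavily lobotomized Q2 quant around 23GB, way over 8GB of VRAM.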
alamacra@reddit
By your own (very convenient!) table, it will run in Q2 if the guy uses koboldcpp / llamacpp / any other launcher that can offload layers to RAM. It will be slow though, since it'll be DDR4 dual channel at most, so about 1 token/s if you are lucky.
So I suggest trying it, but I suspect the speed won't be enough. The quality is actually surprisingly decent for a 70b at Q2.
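If you do try the offload route, here's a minimal sketch with llama-cpp-python (the filename and layer count are placeholders I made up; tune n_gpu_layers to whatever fits in 8GB without running out of memory):

```python
# Minimal sketch: push as many transformer layers as fit in the 3070's
# 8GB of VRAM onto the GPU and leave the rest in system RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="L3-70B-Euryale-Q2_K.gguf",  # hypothetical filename; use whatever quant you downloaded
    n_gpu_layers=20,  # rough guess for 8GB VRAM at Q2; lower it if you hit OOM errors
    n_ctx=4096,
)

out = llm("Hello!", max_tokens=64)
print(out["choices"][0]["text"])
```

Either way most of the layers will still be running from dual-channel DDR4, so the ~1 token/s estimate above stands.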