New Local LLM Rig: Ryzen 9700X + Radeon R9700. Getting ~120 tok/s! What models fit best?

Posted by jsorres@reddit | LocalLLaMA | 11 comments

Hi! I just finished building a workstation specifically for local inference and wanted to get your thoughts on the setup, plus some model recommendations.

• GPU: AMD Radeon AI PRO R9700 (32 GB GDDR6 VRAM)

• CPU: AMD Ryzen 7 9700X

• RAM: 64 GB DDR5

• OS: Fedora Workstation

• Software: LM Studio (Vulkan backend); planning to test llama.cpp directly as well

• Performance: currently hitting a steady ~120 tok/s on simple prompts (Qwen3-30B-A3B)

What is the largest model you'd recommend running comfortably on this setup? And should I be focusing on Q4_K_M quantizations?
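As a rough sanity check on what fits in 32 GB, you can estimate a GGUF model's weight footprint from parameter count times effective bits per weight, plus headroom for KV cache and compute buffers. A minimal sketch (the bits-per-weight figures are approximate averages for each quant type since quant blocks carry scale overhead, and the 4 GiB headroom is an assumption, not a measured number):

```python
# Rough VRAM-fit estimator for GGUF quants. Bits/weight values are
# approximate effective averages; headroom for KV cache and compute
# buffers is an illustrative assumption.
BITS_PER_WEIGHT = {"Q8_0": 8.5, "Q6_K": 6.6, "Q5_K_M": 5.7, "Q4_K_M": 4.8}

def model_gib(params_b: float, quant: str) -> float:
    """Estimated weight footprint in GiB for `params_b` billion parameters."""
    return params_b * 1e9 * BITS_PER_WEIGHT[quant] / 8 / 2**30

def fits(params_b: float, quant: str, vram_gib: float = 32.0,
         overhead_gib: float = 4.0) -> bool:
    # Reserve headroom for KV cache, activations, and the compute buffer.
    return model_gib(params_b, quant) + overhead_gib <= vram_gib

if __name__ == "__main__":
    for name, params in [("30B MoE (e.g. Qwen3-30B-A3B)", 30), ("70B dense", 70)]:
        for q in ("Q4_K_M", "Q8_0"):
            print(f"{name} @ {q}: ~{model_gib(params, q):.1f} GiB, "
                  f"fits in 32 GB: {fits(params, q)}")
```

By this estimate a 30B-class model at Q4_K_M (~17 GiB of weights) fits with room for long context, while at Q8_0 it is already borderline, and 70B dense models need more aggressive quantization or CPU offload.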