GPT-OSS-120B vs DGX Spark

Posted by AdamLangePL@reddit | LocalLLaMA

Just curious what your best speeds are with that model. The peak I get using vLLM is 32 tps (output) on, I think, Q4_K_S. Any way to make it faster without losing response quality?