Hugging Face launches a new repo type: Kernels
Posted by clem59480@reddit | LocalLLaMA | View on Reddit | 17 comments
xignaceh@reddit
Am I right in understanding that this means optimized code/instructions for given hardware?
FullOf_Bad_Ideas@reddit
Only if devs actually use it this way.
Until now we had models and datasets, more or less, both of which are just S3 storage under the hood. Now HF will also host repos marked as "kernels", which is likewise S3 storage.
Without dev effort that doesn't change how well anything runs on your hardware; devs have gotten a new place to store their code, but that's it.
brahh85@reddit
i want to vibe code some kernels, and having a selection of good ones in the same place saves me a lot of effort in my experiments https://huggingface.co/kernels-community/kernels
FullOf_Bad_Ideas@reddit
It's basically a hypothetical "Awesome Kernels" GitHub list copied to HF.
brahh85@reddit
It isn't: https://github.com/goabiaryan/awesome-gpu-engineering
On Hugging Face you get the source code in two clicks. The GitHub list is nice as a guide for learning, but the process of feeding that code to an AI is far more tortuous.
ANR2ME@reddit
No Vulkan? (ie. for mobile GPU support)
Individual_Holiday_9@reddit
How do these guys make money?
patricious@reddit
I guess LLM vendors pay them more for the exposure and hosting, might be completely wrong tho 🤷
Alex_L1nk@reddit
They claim ROCm support, but FA2/3, activations, ReLU, and basically every fundamental ML operation lack builds for AMD. Only CUDA, and occasionally XPU and Metal, which is very sad. Unless I'm missing something?
a_beautiful_rhind@reddit
I don't know of any backends with plainly swap-able kernels.
So you get some code and have to put in the work?
woct0rdho@reddit
Transformers does exactly that; see https://huggingface.co/docs/transformers/en/kernel_doc/loading_kernels
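To make the "swap-able kernel" idea concrete, here is a minimal pure-Python sketch of the pattern the docs describe: ops are looked up in a registry, and a portable reference implementation is used when no optimized kernel has been registered. The registry and function names here are illustrative assumptions, not the real `kernels` package API.

```python
# Illustrative sketch of kernel swapping (hypothetical names, not the real
# `kernels` API): a registry maps an op name to an optimized implementation,
# with a portable fallback when no optimized build exists for your hardware.

_KERNEL_REGISTRY = {}

def register_kernel(op_name, fn):
    """Register an optimized implementation for a named op."""
    _KERNEL_REGISTRY[op_name] = fn

def get_kernel(op_name, fallback):
    """Return the registered kernel for op_name, else the fallback."""
    return _KERNEL_REGISTRY.get(op_name, fallback)

def relu_reference(xs):
    """Portable reference ReLU: max(0, x) elementwise."""
    return [x if x > 0 else 0 for x in xs]

# A model calls the op through the registry, so an optimized kernel can be
# swapped in later without touching the model code:
relu = get_kernel("relu", fallback=relu_reference)
print(relu([-2, -1, 0, 3]))  # [0, 0, 0, 3]
```

The point is that the call site never changes; only the registry contents do.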
a_beautiful_rhind@reddit
Neat. TIL.
__JockY__@reddit
Holy shit, I’m stoked for some sm120 love!
FullOf_Bad_Ideas@reddit
Predictable, but I won't be hyped about another label for simple universal data storage.
That's just github releases page but stored on AWS instead of Azure.
I hope they'll follow through and add integrations with pip and community projects.
woct0rdho@reddit
It's more than universal data storage. It works closely with transformers. Transformers practically defines the interfaces of some commonly used operators such as attention and MoE, and now we can easily download kernels that implement these interfaces, such as FlashAttention and SageAttention.
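The "framework defines the interface, kernels implement it" point can be sketched in a few lines: two attention implementations that take the same `(query, keys, values)` arguments and return the same result are interchangeable. This toy example is pure Python for clarity; real hub kernels implement such interfaces in CUDA or ROCm. The function names are illustrative, not from Transformers.

```python
import math

# Toy illustration: both functions satisfy the same interface, so a framework
# could swap one for the other. Names are hypothetical.

def attention_reference(query, keys, values):
    """Naive softmax attention: score every key, then normalize."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                           # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return sum(e / total * v for e, v in zip(exps, values))

def attention_fused(query, keys, values):
    """Drop-in replacement: one streaming pass (online softmax), same result."""
    m, total, acc = float("-inf"), 0.0, 0.0
    for key, v in zip(keys, values):
        s = sum(q * k for q, k in zip(query, key))
        if s > m:                             # rescale running sums to new max
            scale = math.exp(m - s)           # exp(-inf) == 0.0 on first step
            total, acc, m = total * scale, acc * scale, s
        e = math.exp(s - m)
        total += e
        acc += e * v
    return acc / total
```

Because both honor the same signature, a framework that standardizes the interface can download whichever implementation is fastest for your hardware.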
charmander_cha@reddit
Eagerly waiting to understand what this is about.
PotatoQualityOfLife@reddit
I literally, this morning, custom compiled Kernel 7 RC7 because this wasn't a thing. LOL