Hugging Face optimised Segment Anything 2 (SAM 2) to run on-device (Mac/iPhone) with sub-second inference!

Posted by vaibhavs10@reddit | LocalLLaMA | 17 comments

vaibhavs10@reddit (OP)

We're shipping the following:
- Apache-licensed optimised model checkpoints (tiny, small, base and large): https://huggingface.co/collections/apple/core-ml-segment-anything-2-66e4571a7234dc2560c3db26
- An open-source application to annotate any image in under a second: https://github.com/huggingface/sam2-studio
- Conversion guides for SAM2 fine-tunes like Medical SAM, and much more!

P.S. Video support is on its way. What would you like to see next? 🤗

VermicelliNo821@reddit

Hello, any news about video support? 😀

reccehour@reddit

Been hacking away with the demo for a few days and it's awesome. Do you guys plan to add an iOS example?

chibop1@reddit

Is there a SAMAutomaticMaskGenerator that can automatically generate masks for different areas from just an image?

Enough-Meringue4745@reddit

It's essentially just a grid of point prompts passed into SAM, with the resulting masks then passed through an NMS filter.
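A rough sketch of that idea in plain Python, in case it helps. This is not the actual SAM code: `build_point_grid`, `box_iou` and `nms` are hypothetical helpers, the 0.7 IoU threshold is an assumed value, and the boxes and scores at the bottom are made-up stand-ins for what the model would return for each point prompt.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def build_point_grid(points_per_side: int) -> List[Tuple[float, float]]:
    """Return points_per_side**2 evenly spaced (x, y) prompts in [0, 1]."""
    offset = 1.0 / (2 * points_per_side)  # centre each point in its grid cell
    coords = [offset + i / points_per_side for i in range(points_per_side)]
    return [(x, y) for y in coords for x in coords]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.7) -> List[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# One prompt per grid point; each prompt would be sent to the model separately.
prompts = build_point_grid(4)

# Made-up mask bounding boxes and quality scores standing in for model output.
boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first too much and is dropped
```

Neighbouring grid points often land on the same object and produce near-identical masks, so the NMS step is what collapses those duplicates into one mask per region.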

vaibhavs10@reddit (OP)

Perhaps a fine-tune, but SAM2 natively does not support it.

chibop1@reddit

Ah, I could be wrong, but I thought I read SAM1 supported SAMAutomaticMaskGenerator. Too bad it's taken out in SAM2. :(

Calcidiol@reddit

Looks good! I have one question about the optimized aspect -- fairly obviously you've optimized it FOR inference and running under core-ml and your devices, which I assume may be all of the scope of the optimization. But are there any other highly significant noteworthy "generic" aspects of optimization that simply make it more efficient in general that are not particularly platform specific that were encompassed?

MostlyRocketScience@reddit

That looks fantastic! Imagine an image editor where you can just drag objects around with Stable Diffusion automatically inpainting in the missing background. No lasso selection, no manually clone stamping the background, just simple and intuitive dragging.

lordpuddingcup@reddit

Cool to see Apple putting work into really optimized versions.

vaibhavs10@reddit (OP)

We’ll add it to the roadmap! 🤗

Everlier@reddit

The moment when you think the color picker didn't open during a live demo, but it opened on one of the 6 12K monitors you use to read papers about ML.

vaibhavs10@reddit (OP)

Hahahaha! True! And to play Crysis

xlrz28xd@reddit

Quick question - which TUI tool is that showing resource utilisation?

wolttam@reddit

mactop (it says in the title bar)

chibop1@reddit

Is there any example code for using it in Swift?

Eliiasv@reddit

Very cool.