Hugging Face optimised Segment Anything 2 (SAM 2) to run on-device (Mac/iPhone) with sub-second inference!

vaibhavs10@reddit (OP)

We're shipping the following:

- Apache-licensed optimised model checkpoints (tiny, small, base and large): https://huggingface.co/collections/apple/core-ml-segment-anything-2-66e4571a7234dc2560c3db26
- An open-source application to annotate any image in under a second: https://github.com/huggingface/sam2-studio
- Conversion guides for SAM2 fine-tunes like Medical SAM, and much more!

P.S. Video support is on its way. What would you like to see next? 🤗

VermicelliNo821@reddit

Hello, any news about video support? 😀

reccehour@reddit

Been hacking away with the demo for a few days and it's awesome. Do you guys plan to add an iOS example?

chibop1@reddit

Is there a SAMAutomaticMaskGenerator that can automatically generate masks for different areas from just an image?

Enough-Meringue4745@reddit

It's essentially just a grid of points passed into SAM, with the resulting masks then passed through an NMS filter.
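To make that concrete, here is a rough Python sketch of the idea: lay a regular grid of prompt points over the image, run the model once per point, then drop overlapping results with greedy non-maximum suppression on the boxes. Everything named here is an assumption for illustration. `predict` is a hypothetical stand-in for whatever wraps the model and returns a `(box, score)` pair for one point, and `point_grid`, `iou`, `nms` and `generate_masks` are made-up helper names, not the SAM or Core ML API. The real generator likely applies more filtering than this.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def point_grid(n_per_side: int, width: int, height: int) -> List[Point]:
    """Evenly spaced prompt points, offset half a cell from the image border."""
    step = 1.0 / n_per_side
    coords = [(i + 0.5) * step for i in range(n_per_side)]
    return [(x * width, y * height) for y in coords for x in coords]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.7) -> List[int]:
    """Greedy NMS: keep the best-scoring box, skip anything overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def generate_masks(
    predict: Callable[[Point], Tuple[Box, float]],  # hypothetical model wrapper
    n_per_side: int,
    width: int,
    height: int,
    iou_threshold: float = 0.7,
) -> List[Tuple[Box, float]]:
    """Prompt the model at every grid point, then deduplicate with NMS."""
    candidates = [predict(p) for p in point_grid(n_per_side, width, height)]
    boxes = [box for box, _ in candidates]
    scores = [score for _, score in candidates]
    return [candidates[i] for i in nms(boxes, scores, iou_threshold)]
```

In practice `predict` would also return the mask itself. The sketch only tracks boxes and scores because that is all the NMS step needs.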

vaibhavs10@reddit (OP)

Perhaps a fine-tune, but SAM2 natively does not support it.

chibop1@reddit

Ah, I could be wrong, but I thought I read SAM1 supported SAMAutomaticMaskGenerator. Too bad it's taken out in SAM2. :(

Calcidiol@reddit

Looks good! I have one question about the optimization -- fairly obviously you've optimized it for inference under Core ML on your devices, which I assume may be the whole scope of the work. But did it also include any significant "generic" optimizations, not platform-specific, that simply make the model more efficient in general?

MostlyRocketScience@reddit

That looks fantastic! Imagine an image editor where you can just drag objects around with Stable Diffusion automatically inpainting in the missing background. No lasso selection, no manually clone stamping the background, just simple and intuitive dragging.

lordpuddingcup@reddit

Cool to see Apple putting work into really optimized versions.

vaibhavs10@reddit (OP)

We’ll add it to the roadmap! 🤗

Everlier@reddit

The moment when you think the color picker didn't open during a live demo, but it opened on one of the 6 12K monitors you use to read papers about ML.

vaibhavs10@reddit (OP)

Hahahaha! True! And to play Crysis

xlrz28xd@reddit

Quick question - which TUI tool is that showing resource utilisation?

wolttam@reddit

mactop (it says in the title bar)

chibop1@reddit

Is there example code for using it in Swift?

Eliiasv@reddit

Very cool.
