Florence2 + SAM2 🔥

This demo combines Florence2 and SAM2 into a two-stage inference pipeline. In the first stage, Florence2 performs a task such as object detection, open-vocabulary object detection, image captioning, or phrase grounding. In the second stage, SAM2 segments the objects in the image.
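
The two-stage flow described above can be sketched in plain Python. This is a minimal illustration only, not the demo's actual code: `run_pipeline`, `fake_detect`, `fake_segment`, and the `Detection`/`Segment` types are hypothetical names, and the stand-in functions do not call Florence2 or SAM2. The sketch also assumes that the first stage's bounding boxes are handed to the second stage as prompts, which the description does not state explicitly.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]          # toy grayscale image: rows of pixel values
Mask = List[List[bool]]          # True where a pixel belongs to the object


@dataclass
class Detection:
    """Stage-one output: a label and the box it refers to."""
    label: str
    box: Box


@dataclass
class Segment:
    """Stage-two output: the detection plus a per-pixel mask."""
    label: str
    box: Box
    mask: Mask


def run_pipeline(
    image: Image,
    detect: Callable[[Image], List[Detection]],
    segment: Callable[[Image, Box], Mask],
) -> List[Segment]:
    """Run stage one (detection), then stage two (segmentation) per box."""
    detections = detect(image)  # stage 1: where a Florence2-style model would run
    return [
        Segment(d.label, d.box, segment(image, d.box))  # stage 2: SAM2-style step
        for d in detections
    ]


# Hypothetical stand-ins so the sketch runs without any model weights.
def fake_detect(image: Image) -> List[Detection]:
    """Pretend the top-left quarter of the image contains one object."""
    height, width = len(image), len(image[0])
    return [Detection("object", (0, 0, width // 2, height // 2))]


def fake_segment(image: Image, box: Box) -> Mask:
    """Mark every pixel inside the box as foreground."""
    x0, y0, x1, y1 = box
    return [
        [x0 <= x < x1 and y0 <= y < y1 for x in range(len(image[0]))]
        for y in range(len(image))
    ]


image = [[0] * 4 for _ in range(4)]  # 4x4 blank image
segments = run_pipeline(image, fake_detect, fake_segment)
```

Passing the two stages in as callables keeps the pipeline logic independent of either model, so real detection and segmentation wrappers could be swapped in for the stand-ins without changing `run_pipeline`.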
