Florence2 + SAM2 🔥
This demo combines Florence2 and SAM2 in a two-stage inference pipeline. In the
first stage, Florence2 performs a task such as object detection, open-vocabulary
object detection, image captioning, or phrase grounding. In the second stage, SAM2
segments the objects in the image.
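
The two-stage flow can be sketched roughly as below. This is a minimal illustration of the data flow only, not the demo's actual code. It assumes that the first stage yields labeled bounding boxes and that the second stage is prompted with those boxes, which the description above does not state explicitly. The names `Detection`, `Segment`, `run_pipeline`, `fake_detect`, and `fake_segment` are hypothetical; the two `fake_*` functions are stand-ins for real Florence2 and SAM2 inference calls. A captioning-only task would produce no boxes and so is not covered by this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Size = Tuple[int, int]            # (width, height) in pixels
Mask = List[List[bool]]           # mask[y][x] is True inside the object


@dataclass
class Detection:
    """Output of stage 1: a labeled bounding box."""
    label: str
    box: Box


@dataclass
class Segment:
    """Output of stage 2: the detection plus a pixel mask."""
    label: str
    box: Box
    mask: Mask


def run_pipeline(
    image_size: Size,
    prompt: str,
    detect: Callable[[Size, str], List[Detection]],
    segment: Callable[[Size, Box], Mask],
) -> List[Segment]:
    """Run stage 1 (detection), then feed each box to stage 2 (segmentation)."""
    detections = detect(image_size, prompt)
    return [
        Segment(d.label, d.box, segment(image_size, d.box))
        for d in detections
    ]


# Hypothetical stand-in for Florence2: returns one box inset by one pixel.
def fake_detect(image_size: Size, prompt: str) -> List[Detection]:
    width, height = image_size
    return [Detection(label=prompt, box=(1, 1, width - 1, height - 1))]


# Hypothetical stand-in for SAM2: marks every pixel inside the box.
def fake_segment(image_size: Size, box: Box) -> Mask:
    width, height = image_size
    x0, y0, x1, y1 = box
    return [
        [x0 <= x < x1 and y0 <= y < y1 for x in range(width)]
        for y in range(height)
    ]


segments = run_pipeline((4, 3), "cat", fake_detect, fake_segment)
```

Passing the two stages in as callables keeps the orchestration independent of either model, so the stand-ins could be replaced by real inference wrappers without changing `run_pipeline`.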