Referring Expression Segmentation

SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation

We introduce SAM4MLLM, an innovative approach that integrates the Segment Anything Model (SAM) with Multi-Modal Large Language Models (MLLMs) for pixel-aware tasks. Our method enables MLLMs to learn pixel-level location information without requiring …