Meta launches new AI models for advanced image detection and 3D reconstruction


Meta has introduced two new artificial intelligence models, SAM 3 and SAM 3D, as part of its Segment Anything Collection. These models are now available for public use through the Segment Anything Playground platform.

SAM 3 is designed to detect and track objects in images and videos using both text and visual prompts. Unlike earlier versions, which only supported segmentation based on visual cues, SAM 3 allows users to segment objects by entering detailed text descriptions. According to Meta, this enables the model to identify a wider range of concepts, including more specific details such as “red baseball cap,” instead of being limited to generic labels like “bus” or “car.” The company states:

"SAM 3 overcomes this limitation, accepting a much larger range of text prompts. Type in 'red baseball cap' and SAM 3 will segment all matching objects in the image or video. SAM 3 can also be used with multimodal large language models to understand longer, more complex text prompts, like 'people sitting down, but not wearing a red baseball cap.'"

Meta plans to incorporate SAM 3 into its creative media tools. For example, Edits—Meta’s video creation app—will soon allow creators to apply effects to specific people or objects within their videos. Additional features powered by SAM 3 will be added to Vibes on the Meta AI app and meta.ai.

SAM 3D consists of two open-source models that reconstruct three-dimensional objects from a single image. One model focuses on object and scene reconstruction, while the other estimates human body shape. Both are intended for use in fields such as robotics, science, sports medicine, augmented reality/virtual reality (AR/VR), and game asset creation. As stated in the release:

"The SAM 3D release marks a significant step in leveraging large scale data to address the complexity of the physical world. It has the potential to significantly advance critical fields like robotics, science, and sports medicine..."

To support research progress in this area, Meta collaborated with artists on an evaluation dataset called SAM 3D Artist Objects that includes diverse images and objects.

A practical application of these advancements is already visible on Facebook Marketplace with a new feature called View in Room. This tool lets users visualize how home decor items might look in their spaces before making a purchase.

The Segment Anything Playground platform provides access to these models without requiring technical expertise from users. Individuals can upload images or videos and use short text prompts for object segmentation or employ the tools for creative edits such as spotlight effects or pixelating faces.

As part of this launch:

- Meta is releasing model weights for SAM 3 along with an evaluation benchmark dataset for open vocabulary segmentation.

- A research paper detailing how SAM 3 was built is also available.

- Through a partnership with the Roboflow annotation platform, users can annotate data and fine-tune the model for specific needs.

- For SAM 3D, Meta is sharing model checkpoints and inference code along with a new benchmark dataset that sets higher standards for realism compared to existing options.

"We’re excited to share these innovative new models with you, and hope they empower everyone to explore their creativity, build, and push the boundaries of what’s possible," Meta said.

Learn more about SAM 3 and SAM 3D on the AI at Meta blog.
