Announcing a renaissance in computer vision AI with Microsoft's Florence foundation model


Written by Xuedong Huang, Technical Fellow, Cloud and AI


Extract robust insights from image and video content with Azure Cognitive Service for Vision

We are pleased to announce the public preview of Microsoft’s Florence foundation model, trained with billions of text-image pairs and integrated as cost-effective, production-ready computer vision services in Azure Cognitive Service for Vision. The improved Vision Services enable developers to create cutting-edge, market-ready, responsible computer vision applications across various industries. Customers can now seamlessly digitize, analyze, and connect their data to natural language interactions, unlocking powerful insights from their image and video content to support accessibility, drive acquisition through SEO, protect users from harmful content, enhance security, and improve incident response times.


Microsoft was recently named a Leader in the IDC MarketScape: Worldwide General-Purpose Computer Vision AI Software Platforms 2022 Vendor Assessm... (doc #US49776422, November 2022). The new Vision Services improve content discoverability with automatic captioning, smart cropping, classification, background removal, and image search. Furthermore, users can track movements, analyze environments, and receive real-time alerts with responsible AI controls.


Reddit will be using Vision Services to generate captions for hundreds of millions of images on its platform. Tiffany Ong, Reddit Product Manager of Consumer Product, said:


“With Microsoft’s Vision technology, we are making it easier for users to discover and understand our content. The newly created image captions make Reddit more accessible for everyone and give redditors more opportunities to explore our images, engage in conversations, and ultimately build connections and a sense of community.”

