Image Analysis 4.0 now in public preview with a unified API endpoint and new OCR model

Microsoft

Oct 13, 2022

Enterprises and hobbyists alike use Azure Computer Vision’s Image Analysis API to extract insights from their images. These insights power scenarios such as digital asset management, search engine optimization, image content moderation, and alt-text for accessibility.

Newly improved features including Read (OCR)

We are thrilled to announce the public preview of Computer Vision Image Analysis 4.0, which combines existing and new visual features, such as Read (optical character recognition), captioning, image classification and tagging, object detection, people detection, and smart cropping, into one API. A single call can run all of these features on an image.
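To make the "one call" idea concrete, here is a minimal sketch of how a single request asking for several visual features might be assembled. The URL path, the `api-version` and `features` query parameters, the `Ocp-Apim-Subscription-Key` header, and the feature names are assumptions based on common Azure REST conventions, not details taken from this post; check the Image Analysis 4.0 reference for the exact values. The helper name `build_analyze_request` is hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, key, image_url, features, api_version):
    """Build (but do not send) one POST request covering several visual features.

    Assumed, not confirmed by this post: the URL path, the query parameter
    names, and the subscription-key header name.
    """
    # All requested features travel in one comma-separated query parameter,
    # so a single call can cover OCR, tagging, object detection, and so on.
    query = urllib.parse.urlencode({
        "api-version": api_version,
        "features": ",".join(features),
    })
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"

    # The image is passed by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

With a real resource endpoint and key, the request could be sent with `urllib.request.urlopen(request)` and the JSON response parsed with `json.load`; the placeholder values you pass for the features and API version should come from the service documentation.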

The Read (OCR) feature is now more deeply integrated with the Computer Vision service and includes performance improvements optimized for image scenarios, making OCR easier to use in user interfaces and near-real-time experiences. Read now supports 164 languages, including languages written in Cyrillic script, Arabic, and Hindi.

OCR demonstrated for a road sign.

Tested at scale and ready for deployment

Microsoft’s own products, including PowerPoint, Designer, Word, Outlook, Edge, and LinkedIn, use Vision APIs to power design suggestions, alt-text for accessibility, search engine optimization (SEO), document processing, and content moderation.

To get started with the public preview, try the visual features on your own images in Vision Studio. Upgrading from a previous version of the Computer Vision Image Analysis API to version 4.0 is straightforward with these instructions.

We will continue to release breakthrough vision AI through this new API over the coming months, including capabilities powered by the Florence foundation model, which was featured in a keynote at CVPR, this year’s premier computer vision conference.

Object detection showing a cat with a 91.10% confidence score.