Large Language Models (LLMs) in Computer Vision

Computer vision is a branch of Artificial Intelligence (AI) that emulates human visual capabilities by using advanced algorithms. Cutting-edge AI technologies have reached a level of accuracy comparable to human vision in many tasks, such as image classification, object detection, and image captioning.
These results have been possible thanks to the introduction of so-called Large Language Models (LLMs): advanced machine learning models trained on huge amounts of unlabeled data from diverse sources such as books, articles, and websites. This training allows them to capture general patterns and structure in the data and to be adapted to a wide variety of tasks.
Understanding how to integrate computer vision into your applications is crucial for developing modern, smart solutions in domains such as retail and security, and for making your services accessible, for example by providing text descriptions of the images you share on your website.
You'll also see an example of how to embed all these capabilities in your solution through REST APIs.
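To make the REST integration concrete, here is a minimal sketch of how an application might request a caption for an image. The session text does not name a specific endpoint, so the URL path, the `api-version` and `features` query parameters, the `Ocp-Apim-Subscription-Key` header, and the `captionResult` response field are all assumptions modeled on the Azure AI Vision Image Analysis REST API; check them against the reference for the service you actually use. The helper names `build_caption_request` and `extract_caption` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_caption_request(endpoint, api_key, image_url):
    """Build (but do not send) a POST request asking for an image caption.

    Assumption: path, query parameters, and header name follow the
    Azure AI Vision Image Analysis REST API; verify against its reference.
    """
    query = urllib.parse.urlencode(
        {"api-version": "2023-10-01", "features": "caption"}
    )
    url = f"{endpoint.rstrip('/')}/computervision/imageanalysis:analyze?{query}"
    # The image is passed by URL in a small JSON body.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": api_key,  # assumed auth header name
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_caption(response_json):
    """Return the caption text from a JSON response, or None if absent.

    Assumption: the caption lives under captionResult.text.
    """
    payload = json.loads(response_json)
    return payload.get("captionResult", {}).get("text")
```

In an application, the request would be sent with `urllib.request.urlopen(request)` and the response body passed to `extract_caption`; the returned text could then be used as the alt text for an image on your website.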
This session was hosted as part of the Microsoft Learn AI Skills Challenge, designed to give participants a head start on immersive, curated AI training content across Microsoft products and services, including a series of workshops in multiple languages. You can watch them all on demand at AI Skills Challenge - Events | Microsoft Learn.