Computer Vision in Azure AI solutions
Source: My personal notes from Course AI-102T00-A: Develop AI solutions in Azure - Microsoft Learn with labs from Exercises for Develop vision solutions in Azure
Image Analysis
Section titled “Image Analysis”Use cases: create captions for image, image tagging, detect and locate objects and people
Existing models will respond with JSON with the image information. Response will include information like entity and the boundary box of the entity in the image.
Access can be using an SDK or API to submit images and get responses.
Text in Images
Section titled “Text in Images”Use cases: get information from text like addresses, identifiers, numbers, photographer, license plates, and digitize notes
Optical character recognition (OCR) recognizes text and structure. Responses from the service include blocks and text and information for that block.
Example solution: A camera periodically uploads images to blob storage. On an event, the code checks for an update, extracts entities and responds.
Facial Recognition: Detect, Analyze And Recognize Faces
Section titled “Facial Recognition: Detect, Analyze And Recognize Faces”Use cases: detect and locate faces, analyze facial features.
Using Azure AI face services requires approval due to sensitivity and are used for facial recognition solutions. For features, see Computer Vision (CV) Concepts and in Azure - Computer Vision (CV) Concepts and in Azure
Accessing Azure AI Vision Face resources can be done using Face SDK to get detection and face attributes and features responses.
Classify images and detect objects
Section titled “Classify images and detect objects”Use case: classify images (categorization) and detect object(s) and classify them
Classification can be:
- Multi class where each image is tagged with 1 class label from several, for example apple from [apple, orange, pineapple, banana].
- Multi label where each image can be tagged with multiple classes, for example fruit bowl with apple, orange, and banana
Example solutions: Food processing where image classification checks for “good” versions of a food product and remove “bad” versions. Medical imaging checks if a disease is present or not.
See Research Data - Mendeley Data for sample data sets to play with.
Azure AI Custom Vision
Section titled “Azure AI Custom Vision”Models can be trained with custom classifications.
Generative AI and Vision
Section titled “Generative AI and Vision”Multimodel generative AI
Section titled “Multimodel generative AI”Multimodel generative AI model respond to prompts and return created content. Prompts can include text, speech, and images and typically include a text part and media part.
Generate images with AI
Section titled “Generate images with AI”Uses prompt to create images.
Image generation can have responsible AI implications like other generative AI with malicious use.
Exercise: Analyze Images
Section titled “Exercise: Analyze Images”Create Azure AI vision resource and using the SDK, submit images for captions, entity location, get objects, tagging, and people identification.
Exercise: Detect and analyze faces
Section titled “Exercise: Detect and analyze faces”Create a face service and using the SDK, submit images for face detection and boundary box location.
Exercise: Generate images with AI
Section titled “Exercise: Generate images with AI”Use the OpenAI DALL-E generative AI model like dall-e-3 to create
images and OpenAI Python SDK to create a simple app to generate images
based on prompts.