Analyze Images
This section of the Microsoft AI-102: Designing and Implementing a Microsoft Azure AI Solution exam covers analyzing images with Azure AI Vision. Below are study notes for each sub-topic, with links to Microsoft documentation, exam tips, and key facts.
Select Visual Features to Meet Image Processing Requirements
Docs: Azure AI Vision features
Overview
- Azure AI Vision can extract a variety of features from images
- Features include:
- Tags
- Objects
- Categories
- Descriptions
- OCR (printed and handwritten text)
- Spatial analysis (people presence and movement, applied to video rather than still images)
Key Points
- Select features based on requirements
- Multiple features can be combined in a single request
- Feature choice affects cost and performance
Exam Tip
Watch for scenario questions mapping requirements → correct feature
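The requirement-to-feature mapping above can be sketched as a simple lookup. This is a minimal illustration, not an Azure API: the requirement phrases and the `select_features` helper are hypothetical, and the feature labels are the ones used in these notes (the exact parameter values differ between API versions).

```python
# Hypothetical requirement phrases mapped to the feature labels used in these notes.
REQUIREMENT_TO_FEATURE = {
    "label image content": "Tags",
    "locate items with bounding boxes": "Objects",
    "classify into a taxonomy": "Categories",
    "generate a caption": "Descriptions",
    "read printed or handwritten text": "OCR",
}


def select_features(requirements):
    """Return the de-duplicated feature list for a set of requirements."""
    features = []
    for requirement in requirements:
        feature = REQUIREMENT_TO_FEATURE.get(requirement)
        if feature is None:
            raise ValueError(f"No feature mapped for requirement: {requirement!r}")
        if feature not in features:
            features.append(feature)
    return features
```

Requesting only the features a scenario needs keeps the response small and avoids paying for analysis that is never used.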
Detect Objects in Images and Generate Image Tags
Docs: Object detection
Overview
- Object detection identifies entities within an image
- Image tagging generates a set of descriptive labels
Key Points
- Tags include confidence scores
- Object detection provides bounding boxes
- Can identify thousands of common objects
Use Case
Retail solution detecting products on shelves with bounding boxes
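For the retail use case, the detected objects can be counted per label and their boxes converted for drawing. The sketch below works on a hand-written sample modeled on the v3.2 `objects` result shape (`rectangle` with `x`, `y`, `w`, `h`, plus `object` and `confidence`); treat the field names and values as assumptions and check them against the API version you use.

```python
import json

# Hand-written sample, modeled on the v3.2 "objects" result shape (an assumption).
SAMPLE_RESPONSE = json.loads("""
{
  "objects": [
    {"rectangle": {"x": 25, "y": 40, "w": 120, "h": 200}, "object": "bottle", "confidence": 0.91},
    {"rectangle": {"x": 180, "y": 42, "w": 110, "h": 195}, "object": "bottle", "confidence": 0.62},
    {"rectangle": {"x": 320, "y": 60, "w": 90, "h": 150}, "object": "box", "confidence": 0.84}
  ]
}
""")


def count_objects(response, min_confidence=0.0):
    """Count detected objects per label, ignoring detections below min_confidence."""
    counts = {}
    for item in response.get("objects", []):
        if item["confidence"] >= min_confidence:
            counts[item["object"]] = counts.get(item["object"], 0) + 1
    return counts


def to_corners(rectangle):
    """Convert an x/y/w/h rectangle to (left, top, right, bottom) pixel corners."""
    left, top = rectangle["x"], rectangle["y"]
    return (left, top, left + rectangle["w"], top + rectangle["h"])
```

With the sample above, `count_objects(SAMPLE_RESPONSE, min_confidence=0.8)` counts one bottle and one box, because the second bottle falls below the threshold.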
Include Image Analysis Features in an Image Processing Request
Docs: What is Image Analysis?
Overview
- Image processing requests specify which features to analyze
- Request payload includes:
- Image source (URL or binary data)
- List of features
Key Points
- REST API and SDKs available (Python, C#, Java, JavaScript)
- Features requested determine output fields
- Each request analyzes a single image; to process many images, send one request per image
Exam Tip
Remember that URL or binary data can be used as inputs
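The two input styles can be sketched by assembling the parts of a REST call without sending it. This assumes the v3.2 `analyze` path, the `visualFeatures` query parameter, and the `Ocp-Apim-Subscription-Key` header; `build_analyze_request` is a hypothetical helper, and the endpoint and key are placeholders.

```python
import json
from urllib.parse import urlencode


def build_analyze_request(endpoint, key, features, image_url=None, image_bytes=None):
    """Assemble (but do not send) an Image Analysis request for one image."""
    if (image_url is None) == (image_bytes is None):
        raise ValueError("Provide exactly one of image_url or image_bytes")

    # Features are passed as a comma-separated list in the query string.
    query = urlencode({"visualFeatures": ",".join(features)}, safe=",")
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    headers = {"Ocp-Apim-Subscription-Key": key}

    if image_url is not None:
        # URL input: JSON body pointing at a publicly reachable image.
        headers["Content-Type"] = "application/json"
        body = json.dumps({"url": image_url}).encode("utf-8")
    else:
        # Binary input: the raw image bytes are the request body.
        headers["Content-Type"] = "application/octet-stream"
        body = image_bytes

    return {"method": "POST", "url": url, "headers": headers, "body": body}
```

The returned dictionary can be handed to any HTTP client. The only differences between the two input styles are the `Content-Type` header and the body.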
Interpret Image Processing Responses
Docs: Image descriptions
Overview
- Responses contain structured JSON results
- Includes:
- Tags with confidence
- Object bounding boxes
- Category hierarchy
- Text regions for OCR
Key Points
- Always check confidence thresholds
- Low-confidence results may need filtering
- Can integrate results with downstream apps (search, indexing, etc.)
Best Practices
Filter out results below a set confidence threshold for production apps
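Threshold filtering can be sketched in a few lines. The tag list below is invented sample data, and the 0.8 cut-off is an arbitrary illustration; a suitable threshold depends on the application.

```python
def filter_by_confidence(items, threshold=0.5):
    """Keep only results whose confidence is at or above the threshold."""
    return [item for item in items if item.get("confidence", 0.0) >= threshold]


# Invented sample tags in the name/confidence shape described above.
tags = [
    {"name": "outdoor", "confidence": 0.99},
    {"name": "grass", "confidence": 0.87},
    {"name": "frisbee", "confidence": 0.31},
]

kept = filter_by_confidence(tags, threshold=0.8)
print([tag["name"] for tag in kept])  # ['outdoor', 'grass']
```

Items without a `confidence` field are treated as 0.0 here and dropped, which is a conservative design choice rather than service behavior.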
Extract Text from Images Using Azure AI Vision
Docs: OCR - Optical Character Recognition
Overview
- OCR (the Read feature) extracts printed and handwritten text from images and documents
- Works with multiple languages
- Provides text lines and bounding box coordinates
Key Points
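- The Read API (v3.2) is asynchronous: submit the image or document, then poll the operation URL for the result; the Image Analysis 4.0 Read feature returns results synchronously for single images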
- Text can be returned in plain text or structured format
- Often combined with Document Intelligence for advanced extraction
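Because OCR on larger documents runs asynchronously, a client typically submits the document and then polls until the operation finishes. The sketch below shows only the polling loop. `fetch_status` is a hypothetical callable that returns the parsed JSON from the operation URL, and the status values (`notStarted`, `running`, `succeeded`, `failed`) follow the v3.2 Read API as I understand it.

```python
import time


def wait_for_read_result(fetch_status, interval_seconds=1.0, max_attempts=30, sleep=time.sleep):
    """Poll an asynchronous Read operation until it succeeds, fails, or times out.

    fetch_status: hypothetical callable returning the parsed JSON status document.
    sleep: injectable so the loop can be tested without real delays.
    """
    for _ in range(max_attempts):
        result = fetch_status()
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError("Read operation failed")
        # Still notStarted or running: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError(f"Read operation did not finish after {max_attempts} attempts")
```

In real use, `fetch_status` would issue a GET against the URL returned in the submit response's `Operation-Location` header.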
Use Case
Digitizing scanned contracts into searchable text
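Turning an OCR result into searchable text mostly means joining the recognized lines per page. The sample below is hand-written and modeled on the v3.2 Read result shape (`analyzeResult.readResults[].lines[].text`). The field names are an assumption to verify against the API version in use, and both helper functions are hypothetical.

```python
# Hand-written sample, modeled on the v3.2 Read result shape (an assumption).
SAMPLE_READ_RESULT = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [
                {"boundingBox": [10, 12, 210, 12, 210, 40, 10, 40], "text": "SERVICE AGREEMENT"},
                {"boundingBox": [10, 60, 400, 60, 400, 85, 10, 85], "text": "This agreement is made between"},
            ]},
            {"page": 2, "lines": [
                {"boundingBox": [10, 12, 300, 12, 300, 40, 10, 40], "text": "Signed by both parties"},
            ]},
        ]
    },
}


def extract_pages(read_result):
    """Return {page_number: plain text}, one recognized line per text line."""
    pages = {}
    for page in read_result["analyzeResult"]["readResults"]:
        pages[page["page"]] = "\n".join(line["text"] for line in page["lines"])
    return pages


def find_pages_containing(pages, term):
    """Case-insensitive search; returns the page numbers whose text contains term."""
    needle = term.lower()
    return [number for number, text in pages.items() if needle in text.lower()]
```

A production pipeline would usually push the per-page text into a search index instead of scanning it in memory.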
Convert Handwritten Text Using Azure AI Vision
Docs: Vision Portal demo
Overview
- Recognizes handwritten text in images and documents
- Supports cursive and block-style handwriting
- Returns extracted text and bounding boxes
Key Points
- Accuracy depends on handwriting quality
- Works best with clear, high-resolution scans
- Can be used in note-taking or form digitization apps
Exam Tip
Keywords like handwriting or forms with writing → Azure AI Vision handwriting OCR
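Since accuracy depends on handwriting quality, a form digitization app may route uncertain words to a human reviewer. The sketch below assumes a line shaped like the v3.2 Read result, with an `appearance.style` entry and per-word `confidence` values. The sample data, the thresholds, and both helper functions are illustrative assumptions.

```python
# Hand-written sample line, modeled on the v3.2 Read result shape (an assumption).
SAMPLE_LINE = {
    "text": "Deliver by Friday",
    "appearance": {"style": {"name": "handwriting", "confidence": 0.93}},
    "words": [
        {"text": "Deliver", "confidence": 0.97},
        {"text": "by", "confidence": 0.99},
        {"text": "Friday", "confidence": 0.58},
    ],
}


def is_handwritten(line, min_style_confidence=0.5):
    """True when the line is classified as handwriting with enough confidence."""
    style = line.get("appearance", {}).get("style", {})
    return style.get("name") == "handwriting" and style.get("confidence", 0.0) >= min_style_confidence


def words_needing_review(line, min_word_confidence=0.8):
    """Return the words whose recognition confidence is below the threshold."""
    return [word["text"] for word in line.get("words", []) if word["confidence"] < min_word_confidence]
```

For the sample line, only "Friday" would be flagged, so a reviewer checks one word instead of retyping the whole note.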
Quick-fire revision sheet
- Visual features: tags, objects, categories, descriptions, OCR
- Object detection → bounding boxes; tagging → descriptive labels
- Requests can use URL or binary input; features are specified per request
- Responses contain confidence scores, bounding boxes, structured JSON
- OCR extracts printed text; handwriting OCR extracts cursive and block-style text