Implement Custom Vision Models
This section of the Microsoft AI-102: Designing and Implementing a Microsoft Azure AI Solution exam covers building, training, and deploying custom vision models. Below are study notes for each sub-topic, with links to Microsoft documentation, exam tips, and key facts.
Choose Between Image Classification and Object Detection Models
Docs: What is Custom Vision?
Overview
- Image classification: predicts the overall category of an image
- Object detection: identifies and locates multiple objects within an image using bounding boxes
Key Points
- Use classification when one label per image is enough
- Use object detection when multiple items need to be identified and located within the image
- Both require labeled training data
Exam Tip
Scenario question: "Identify multiple items in a photo" → Object detection
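To make the difference concrete: a detection result carries a bounding box next to each tag, while a classification result carries only the tag and its probability. The sketch below assumes the box arrives as normalized `left`, `top`, `width`, and `height` values between 0 and 1, and converts it to pixel coordinates. The function name `to_pixel_box` and the sample values are illustrative, not real service output.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized bounding box to integer pixel coordinates.

    `box` is assumed to be a dict with normalized `left`, `top`, `width`
    and `height` values between 0 and 1.
    Returns (x_min, y_min, x_max, y_max) in pixels.
    """
    x_min = round(box["left"] * image_width)
    y_min = round(box["top"] * image_height)
    x_max = round((box["left"] + box["width"]) * image_width)
    y_max = round((box["top"] + box["height"]) * image_height)
    return x_min, y_min, x_max, y_max


# Illustrative values only, not real service output.
classification_prediction = {"tagName": "apple", "probability": 0.97}
detection_prediction = {
    "tagName": "apple",
    "probability": 0.91,
    "boundingBox": {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25},
}

# For an 800x600 image this prints (200, 300, 600, 450).
print(to_pixel_box(detection_prediction["boundingBox"], 800, 600))
```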
Label Images
Docs: Tag and label images
Overview
- Labeled images are required for supervised learning
- Images must be uploaded and tagged with the correct category
Key Points
- Tags must be consistent across the training set
- At least 15–30 images per tag are recommended
- Balanced datasets improve accuracy
Best Practices
Use diverse images (lighting, angle, resolution) for robust models
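A quick pre-upload check can catch thin or lopsided tags before training. The sketch below counts images per tag and flags tags under a minimum, plus an overall imbalance. The function name `check_tag_counts`, the default `min_per_tag=15` (the lower bound quoted in these notes), and the `max_ratio=2.0` imbalance cutoff are assumptions for illustration, not official thresholds.

```python
from collections import Counter


def check_tag_counts(labels, min_per_tag=15, max_ratio=2.0):
    """Summarize how many images each tag has.

    `labels` is a list with one tag name per image.
    Returns (counts, too_few, balanced) where `too_few` lists tags below
    `min_per_tag` and `balanced` is True when the largest tag has at most
    `max_ratio` times as many images as the smallest.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("no labels supplied")
    too_few = sorted(tag for tag, n in counts.items() if n < min_per_tag)
    balanced = max(counts.values()) / min(counts.values()) <= max_ratio
    return counts, too_few, balanced


# Hypothetical dataset: "bird" is under-represented.
labels = ["cat"] * 30 + ["dog"] * 28 + ["bird"] * 6
counts, too_few, balanced = check_tag_counts(labels)
print(dict(counts), too_few, balanced)
```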
Train a Custom Image Model
Docs: Train custom models
Overview
- Training uses the uploaded and labeled dataset
- Types:
- Image classification
- Object detection
Key Points
- Quick Training: faster, less accurate
- Advanced Training: slower, more accurate
- Tuning usually takes several training iterations
Exam Tip
Training type and dataset size impact accuracy and cost
Evaluate Custom Vision Model Metrics
Docs: Evaluate model performance
Overview
- Evaluation metrics help measure model quality
- Metrics:
- Precision = true positives ÷ (true positives + false positives)
- Recall = true positives ÷ (true positives + false negatives)
- mAP (mean average precision) for object detection
Key Points
- Tradeoff: precision vs recall
- High recall = fewer missed detections
- High precision = fewer false positives
Exam Tip
Expect formula-style questions on precision vs recall
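A worked example helps with formula-style questions. The sketch below implements the two formulas from this section directly. The counts (40 true positives, 10 false positives, 20 false negatives) are made up for illustration, and returning 0.0 when the denominator is zero is a convention chosen here, not something the service defines.

```python
def precision(tp, fp):
    """Share of positive predictions that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up counts for illustration.
tp, fp, fn = 40, 10, 20
print(f"precision = {precision(tp, fp):.2f}")  # 40 / 50 = 0.80
print(f"recall    = {recall(tp, fn):.2f}")     # 40 / 60 = 0.67
```

Here the model is right 80% of the time when it predicts the tag, but it still misses a third of the real instances, which is the precision vs recall tradeoff in numbers.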
Publish a Custom Vision Model
Docs: Train a Computer Vision Model with Azure Custom Vision
Overview
- After training, models must be published to an endpoint
- Published models can be accessed via API calls
Key Points
- Each project can have multiple iterations
- Must publish the correct iteration for inference
- Endpoint includes prediction URL and key
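The prediction URL can be assembled from the resource endpoint, the project ID, and the name the iteration was published under. The sketch below follows the v3.0 path layout used in the request example in these notes. The helper name `prediction_url` and the sample endpoint, project ID, and published name are hypothetical.

```python
def prediction_url(endpoint, project_id, published_name,
                   task="classify", source="image"):
    """Build a Custom Vision v3.0 prediction URL.

    task:   "classify" for image classification, "detect" for object detection.
    source: "image" to send binary image data, "url" to send an image URL.
    """
    if task not in ("classify", "detect"):
        raise ValueError("task must be 'classify' or 'detect'")
    if source not in ("image", "url"):
        raise ValueError("source must be 'image' or 'url'")
    base = endpoint.rstrip("/")
    return (f"{base}/customvision/v3.0/Prediction/{project_id}"
            f"/{task}/iterations/{published_name}/{source}")


# Hypothetical values for illustration.
print(prediction_url("https://example.cognitiveservices.azure.com/",
                     "1234", "prod-v1", task="detect"))
```

Building the URL from the published name rather than hard-coding it makes it harder to call a stale iteration after retraining.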
Consume a Custom Vision Model
Docs: Call the Prediction API
Overview
- Use REST API or SDKs to submit images for prediction
- Input formats: URL or binary image data
Key Points
- Predictions include label + confidence score
- Results returned in JSON format
- Each prediction request carries one image; process a batch by sending one request per image
Use Case
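```python
import requests

# Replace the <...> placeholders and KEY with your own values.
url = "https://<endpoint>/customvision/v3.0/Prediction/<project-id>/classify/iterations/<iteration>/image"
headers = {"Prediction-Key": "KEY", "Content-Type": "application/octet-stream"}
# Send the raw image bytes as the request body.
with open("test.jpg", "rb") as f:
    resp = requests.post(url, headers=headers, data=f)
resp.raise_for_status()  # surface HTTP errors before parsing
print(resp.json())
```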
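Once the JSON comes back, the usual next step is to keep only predictions above a confidence threshold. The sketch below assumes the response has a `predictions` list whose items carry `tagName` and `probability` fields. The helper name `top_predictions`, the default threshold of 0.5, and the sample data are illustrative choices.

```python
def top_predictions(result, threshold=0.5):
    """Return predictions at or above `threshold`, most confident first."""
    kept = [p for p in result.get("predictions", [])
            if p["probability"] >= threshold]
    return sorted(kept, key=lambda p: p["probability"], reverse=True)


# Illustrative response body, not real service output.
sample = {
    "predictions": [
        {"tagName": "banana", "probability": 0.12},
        {"tagName": "apple", "probability": 0.93},
        {"tagName": "pear", "probability": 0.58},
    ]
}

for p in top_predictions(sample):
    print(f'{p["tagName"]}: {p["probability"]:.0%}')
```

Raising the threshold trades recall for precision: fewer results come through, but those that do are more likely to be correct.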
Build a Custom Vision Model Code First
Docs: Quickstart: Custom Vision SDK
Overview
- Custom Vision models can be built programmatically
- SDKs available for Python, C#, Java, JavaScript
Key Points
- Code-first approach automates image upload, labeling, training, and publishing
- Useful for MLOps pipelines
- Enables integration with CI/CD workflows
Best Practices
Use code-first when automating retraining or scaling projects
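The upload, train, and publish steps can be sketched with the Python SDK (`azure-cognitiveservices-vision-customvision`) roughly as follows. This is a minimal outline modeled on the SDK quickstart, not a production pipeline. The function names, the `tagged_files` mapping, and the ten-second polling interval are assumptions. Uploads are split into chunks of 64 because the quickstart uploads images in batches of that size. The SDK imports sit inside the function so the `chunked` helper can be used without the package installed.

```python
import time


def chunked(items, size=64):
    """Yield successive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def train_and_publish(endpoint, training_key, prediction_resource_id,
                      project_name, tagged_files, publish_name):
    """Create a classification project, upload images, train, and publish.

    `tagged_files` is a hypothetical dict mapping a tag name to a list of
    local image paths. Returns the ID of the published iteration.
    """
    # Imported lazily; requires azure-cognitiveservices-vision-customvision.
    from azure.cognitiveservices.vision.customvision.training import (
        CustomVisionTrainingClient)
    from azure.cognitiveservices.vision.customvision.training.models import (
        ImageFileCreateBatch, ImageFileCreateEntry)
    from msrest.authentication import ApiKeyCredentials

    credentials = ApiKeyCredentials(in_headers={"Training-key": training_key})
    trainer = CustomVisionTrainingClient(endpoint, credentials)
    project = trainer.create_project(project_name)

    # Create one tag per key and attach it to every image listed under it.
    entries = []
    for tag_name, paths in tagged_files.items():
        tag = trainer.create_tag(project.id, tag_name)
        for path in paths:
            with open(path, "rb") as image:
                entries.append(ImageFileCreateEntry(
                    name=path, contents=image.read(), tag_ids=[tag.id]))

    # Upload in chunks of at most 64 images.
    for batch in chunked(entries):
        result = trainer.create_images_from_files(
            project.id, ImageFileCreateBatch(images=batch))
        if not result.is_batch_successful:
            raise RuntimeError("at least one image in the batch was rejected")

    # Start training and poll until the iteration finishes.
    iteration = trainer.train_project(project.id)
    while iteration.status != "Completed":
        if iteration.status == "Failed":
            raise RuntimeError("training failed")
        time.sleep(10)
        iteration = trainer.get_iteration(project.id, iteration.id)

    # Publish the trained iteration so the Prediction API can serve it.
    trainer.publish_iteration(project.id, iteration.id,
                              publish_name, prediction_resource_id)
    return iteration.id
```

In a CI/CD job, a function like this would typically run after new labeled images land, with the keys and resource ID supplied from pipeline secrets rather than source code.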
Quick-fire revision sheet
- Classification = one label per image, Detection = multiple objects with bounding boxes
- Label images consistently, minimum ~15–30 per tag
- Training: Quick = fast, Advanced = accurate
- Metrics: Precision, Recall, mAP
- Models must be published to be consumed
- Predictions return JSON with confidence scores
- Code-first approach enables automation and CI/CD integration