Implement Custom Vision Models

This section of the Microsoft AI-102: Designing and Implementing a Microsoft Azure AI Solution exam covers building, training, and deploying custom vision models. Below are study notes for each sub-topic, with links to Microsoft documentation, exam tips, and key facts.


Choose Between Image Classification and Object Detection Models

📖 Docs: What is Custom Vision?

Overview

  • Image classification: predicts the overall category of an image
  • Object detection: identifies and locates multiple objects within an image using bounding boxes

Key Points

  • Use classification when one label per image is enough
  • Use object detection when multiple items need identifying
  • Both require labeled training data
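
The practical difference shows up in the shape of each prediction. The sketch below uses hypothetical sample data: a classification result carries only a tag and a probability, while a detection result adds a bounding box. It assumes the box is expressed as fractions of the image size (left, top, width, height in the range 0 to 1), and `to_pixel_box` is an illustrative helper, not part of any SDK.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a normalized box (values in 0..1) to integer pixel coordinates."""
    left = round(bounding_box["left"] * image_width)
    top = round(bounding_box["top"] * image_height)
    width = round(bounding_box["width"] * image_width)
    height = round(bounding_box["height"] * image_height)
    return left, top, width, height


# Hypothetical sample predictions, shaped like the output of each model type
classification_prediction = {"tagName": "apple", "probability": 0.97}
detection_prediction = {
    "tagName": "apple",
    "probability": 0.91,
    "boundingBox": {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.25},
}

# Only the detection result can be located inside the image
print(to_pixel_box(detection_prediction["boundingBox"], 640, 480))
# → (160, 240, 320, 120)
```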

Exam Tip

Scenario question: “Identify multiple items in a photo” → Object detection


Label Images

📖 Docs: Tag and label images

Overview

  • Labeled images are required for supervised learning
  • Images must be uploaded and tagged with the correct category

Key Points

  • Tags must be consistent across training set
  • At least 15–30 images per tag recommended
  • Balanced datasets improve accuracy
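
Before uploading, it can help to check the per-tag counts locally. The following is a minimal sketch, assuming the labels are available as a plain list of tag names; `tags_below_minimum` and the default threshold of 15 (the lower end of the range above) are illustrative choices.

```python
from collections import Counter


def tags_below_minimum(labels, minimum=15):
    """Return {tag: count} for every tag with fewer than `minimum` images."""
    counts = Counter(labels)
    return {tag: n for tag, n in counts.items() if n < minimum}


# Hypothetical dataset: one tag name per labeled image
labels = ["cat"] * 30 + ["dog"] * 12 + ["bird"] * 4

print(tags_below_minimum(labels))
# → {'dog': 12, 'bird': 4}
```

Tags flagged here are candidates for collecting more images before training, which also brings the dataset closer to balanced.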

Best Practices

Use diverse images (lighting, angle, resolution) for robust models


Train a Custom Image Model

📖 Docs: Train custom models

Overview

  • Training uses the uploaded and labeled dataset
  • Types:
    • Image classification
    • Object detection

Key Points

  • Quick Training → faster, less accurate
  • Advanced Training → slower, more accurate
  • Requires multiple iterations for tuning
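
Training runs asynchronously, so code that starts a run usually polls until the iteration finishes. The sketch below assumes a `trainer` object that behaves like the Python SDK's `CustomVisionTrainingClient`, exposing `train_project(project_id)` and `get_iteration(project_id, iteration_id)`, and that a finished iteration reports the status "Completed". The `max_polls` timeout is an added safeguard, not something the service requires.

```python
import time


def train_and_wait(trainer, project_id, poll_seconds=10, max_polls=360):
    """Start training and poll until the iteration reports "Completed".

    `trainer` is assumed to expose train_project() and get_iteration()
    in the same way as the SDK's CustomVisionTrainingClient.
    """
    iteration = trainer.train_project(project_id)
    polls = 0
    while iteration.status != "Completed":
        if polls >= max_polls:
            raise TimeoutError(f"Training not finished after {max_polls} polls")
        time.sleep(poll_seconds)
        # Refresh the iteration to read its latest status
        iteration = trainer.get_iteration(project_id, iteration.id)
        polls += 1
    return iteration
```

Because the client is passed in, the function can be exercised with a stub in unit tests and with a real training client in a pipeline.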

Exam Tip

Training type and dataset size impact accuracy and cost


Evaluate Custom Vision Model Metrics

📖 Docs: Evaluate model performance

Overview

  • Evaluation metrics help measure model quality
  • Metrics:
    • Precision = true positives ÷ (true positives + false positives)
    • Recall = true positives ÷ (true positives + false negatives)
    • mAP (mean average precision) for object detection

Key Points

  • Tradeoff: precision vs recall
  • High recall = fewer missed detections
  • High precision = fewer false positives
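
A worked example makes the two formulas easier to recall under exam pressure. The counts below are made up for illustration: 8 correct detections, 2 false alarms, and 4 missed objects.

```python
def precision(true_positives, false_positives):
    """Share of predicted positives that were actually correct."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


def recall(true_positives, false_negatives):
    """Share of actual positives that the model found."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


# Hypothetical counts: 8 true positives, 2 false positives, 4 false negatives
print(f"precision = {precision(8, 2):.2f}")  # → precision = 0.80
print(f"recall    = {recall(8, 4):.2f}")     # → recall    = 0.67
```

In this example the model rarely raises false alarms (high precision) but misses a third of the real objects (lower recall).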

Exam Tip

Expect formula-style questions on precision vs recall


Publish a Custom Vision Model

📖 Docs: Train a Computer Vision Model with Azure Custom Vision

Overview

  • After training, an iteration must be published to a prediction resource before it can serve predictions
  • Published models can be accessed via API calls

Key Points

  • Each project can have multiple iterations
  • Must publish the correct iteration for inference
  • Endpoint includes prediction URL and key
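
The prediction URL is assembled from the resource endpoint, the project ID, and the name given to the iteration when it was published. The helper below is a sketch that follows the v3.0 URL pattern used elsewhere in these notes; `build_prediction_url` and the sample endpoint are hypothetical, and the `detect` and `url` path segments are assumed to mirror the `classify` and `image` ones.

```python
def build_prediction_url(endpoint, project_id, published_name,
                         task="classify", source="image"):
    """Build a v3.0 prediction URL for a published iteration.

    task:   "classify" (image classification) or "detect" (object detection)
    source: "image" (binary body) or "url" (JSON body with an image URL)
    """
    if task not in ("classify", "detect"):
        raise ValueError("task must be 'classify' or 'detect'")
    if source not in ("image", "url"):
        raise ValueError("source must be 'image' or 'url'")
    base = endpoint.rstrip("/")  # tolerate a trailing slash on the endpoint
    return (f"{base}/customvision/v3.0/Prediction/{project_id}"
            f"/{task}/iterations/{published_name}/{source}")


# Hypothetical endpoint, project ID, and published name
print(build_prediction_url("https://example.cognitiveservices.azure.com/",
                           "1234", "prod"))
```

Publishing a different iteration under the same name keeps this URL stable, so client code does not need to change when the model is retrained.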

Consume a Custom Vision Model

📖 Docs: Call the Prediction API

Overview

  • Use REST API or SDKs to submit images for prediction
  • Input formats: URL or binary image data

Key Points

  • Predictions include label + confidence score
  • Results returned in JSON format
  • Each prediction call takes one image; process batches by looping over images client-side

Use Case

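import requests

# <endpoint>, <project-id>, and <iteration> (the published iteration name) are placeholders
url = "https://<endpoint>/customvision/v3.0/Prediction/<project-id>/classify/iterations/<iteration>/image"
headers = {"Prediction-Key": "KEY", "Content-Type": "application/octet-stream"}

# Send the raw image bytes as the request body
with open("test.jpg", "rb") as f:
    resp = requests.post(url, headers=headers, data=f, timeout=30)
resp.raise_for_status()  # stop on HTTP errors such as a bad key or wrong iteration name
print(resp.json())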
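
Once the JSON comes back, the usual next step is to keep only the confident predictions. The sketch below assumes the response contains a `predictions` list whose items carry `tagName` and `probability` fields; the sample payload and the `confident_predictions` helper are hypothetical, and the 0.5 threshold is an arbitrary starting point to tune per scenario.

```python
def confident_predictions(response_json, threshold=0.5):
    """Return (tag, probability) pairs at or above `threshold`, best first."""
    predictions = response_json.get("predictions", [])
    kept = [(p["tagName"], p["probability"])
            for p in predictions
            if p["probability"] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical response body, trimmed to the fields used here
sample = {
    "predictions": [
        {"tagName": "banana", "probability": 0.12},
        {"tagName": "apple", "probability": 0.93},
        {"tagName": "pear", "probability": 0.61},
    ]
}

print(confident_predictions(sample))
# → [('apple', 0.93), ('pear', 0.61)]
```

Raising the threshold trades recall for precision: fewer false positives are reported, but more real matches are dropped.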

Build a Custom Vision Model Code First

📖 Docs: Quickstart: Custom Vision SDK

Overview

  • Custom Vision models can be built programmatically
  • SDKs available for Python, C#, Java, JavaScript

Key Points

  • Code-first approach automates image upload, labeling, training, and publishing
  • Useful for MLOps pipelines
  • Enables integration with CI/CD workflows
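
Automated uploads usually need to split a large image set into smaller batches, because the upload call accepts only a limited number of images at a time. The sketch below shows just that batching step in plain Python. The default of 64 images per batch is an assumption based on the service's documented per-call limit and should be checked against the current limits page; `chunked` and the file names are illustrative.

```python
def chunked(items, batch_size=64):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical list of 150 local image files waiting to be uploaded
image_paths = [f"img_{i:03d}.jpg" for i in range(150)]

# Each batch would be passed to one upload call in a real pipeline
print([len(batch) for batch in chunked(image_paths)])
# → [64, 64, 22]
```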

Best Practices

Use code-first when automating retraining or scaling projects


Quick-fire revision sheet

  • 📌 Classification = one label per image, Detection = multiple objects with bounding boxes
  • 📌 Label images consistently, minimum ~15–30 per tag
  • 📌 Training: Quick = fast, Advanced = accurate
  • 📌 Metrics: Precision, Recall, mAP
  • 📌 Models must be published to be consumed
  • 📌 Predictions return JSON with confidence scores
  • 📌 Code-first approach enables automation and CI/CD integration