Using Copilot Studio for Image Recognition Tasks

Introduction

Copilot Studio is primarily designed for building AI-powered conversational bots, but it can also be integrated with AI models for image recognition tasks using Azure AI Services, Vision APIs, and Power Automate workflows.

By connecting Copilot Studio to Microsoft Azure Cognitive Services (Computer Vision API) or custom-trained AI models, we can enable chatbots to:
✅ Analyze images and extract details.
✅ Recognize objects, faces, and text (OCR).
✅ Classify and tag images automatically.
✅ Detect image anomalies or defects.

Step 1: Understanding Image Recognition in Copilot Studio

1.1 What Can Image Recognition Do?

With Azure Computer Vision API, Copilot Studio can:
✔ Analyze images – Detect objects, colors, categories.
✔ Extract text from images (OCR) – Read text from scanned documents.
✔ Detect faces – Locate faces in an image (the age, gender, and emotion attributes have since been retired by Microsoft).
✔ Classify images – Categorize images into predefined labels.
✔ Identify brands and landmarks – Recognize logos and famous places.

1.2 Use Cases for Image Recognition in Copilot Studio

📌 Retail & E-commerce → Identify products from customer-uploaded images.
📌 Banking & Finance → Process scanned documents and extract text.
📌 Healthcare → Flag possible skin conditions for clinician review using a custom-trained model.
📌 Manufacturing → Detect defects in machinery and production lines.

Step 2: Setting Up Image Recognition in Copilot Studio

Since Copilot Studio does not have built-in image recognition, we integrate it with Azure Cognitive Services or Custom AI Models.

2.1 Prerequisites

✅ Microsoft Azure Account
✅ Azure Computer Vision API Key and Endpoint URL
✅ Copilot Studio Access (formerly Power Virtual Agents)
✅ Power Automate License (for API integration)

2.2 Create an Azure Computer Vision Resource

1️⃣ Go to Azure Portal.
2️⃣ Search for “Computer Vision” in the Azure Marketplace.
3️⃣ Click Create and enter:

Resource Group → Select/Create a group.
Region → Choose the closest data center.
Pricing Tier → Choose a free or paid tier.
4️⃣ Click Review + Create → Wait for deployment.
5️⃣ Go to Resource → Copy the API Key and Endpoint URL.

✅ Azure Computer Vision API is now ready!
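
If you also want to call the API from a script while testing, it helps to keep the key and endpoint out of the source code. The sketch below reads them from environment variables. The variable names VISION_ENDPOINT and VISION_KEY and the VisionConfig class are assumptions made for this example, not names defined by Azure.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionConfig:
    endpoint: str  # e.g. https://<your-region>.api.cognitive.microsoft.com
    api_key: str


def load_vision_config(env=os.environ) -> VisionConfig:
    """Read endpoint and key from environment variables.

    VISION_ENDPOINT and VISION_KEY are hypothetical names chosen for this sketch.
    """
    endpoint = env.get("VISION_ENDPOINT", "").strip().rstrip("/")
    api_key = env.get("VISION_KEY", "").strip()
    if not endpoint.startswith("https://"):
        raise ValueError("VISION_ENDPOINT must be an https:// URL")
    if not api_key:
        raise ValueError("VISION_KEY is not set")
    return VisionConfig(endpoint=endpoint, api_key=api_key)
```

Passing a plain dictionary instead of os.environ makes the function easy to try without touching your shell settings.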

2.3 Enable Image Upload in Copilot Studio

1️⃣ Log in to Copilot Studio (formerly Power Virtual Agents).
2️⃣ Open your bot and go to Settings → Click Enable File Uploads.
3️⃣ Set Accepted File Types:

.jpg, .png, .pdf (for document scans).
4️⃣ Save changes.

✅ Users can now upload images via chat!
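
The accepted file types above can also be checked in your own code before an image is forwarded for analysis. This is a minimal sketch of such a check. The function name is_accepted_upload is hypothetical, and the extension list simply mirrors the three types listed in this step.

```python
from pathlib import PurePath

# Mirrors the accepted file types configured in step 2.3.
ACCEPTED_EXTENSIONS = {".jpg", ".png", ".pdf"}


def is_accepted_upload(filename: str) -> bool:
    """Return True when the file extension is on the accepted list (case-insensitive)."""
    return PurePath(filename).suffix.lower() in ACCEPTED_EXTENSIONS
```

Only the final extension is inspected, so a name like archive.tar.gz is rejected and a name with no extension is rejected as well.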

Step 3: Connecting Copilot Studio with Image Recognition API

We use Power Automate to send uploaded images to Azure Computer Vision API and retrieve recognition results.

3.1 Create a Power Automate Flow

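1️⃣ Open Power Automate → Click Create.
2️⃣ Select Instant cloud flow → Name it “Analyze Image with AI”.
3️⃣ Choose the Copilot Studio trigger:

Select the trigger that lets the bot call the flow (named “When Power Virtual Agents calls a flow” in older environments; the label varies by version) and pass the uploaded image URL as an input.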

3.2 Add an HTTP Request to Azure Computer Vision API

1️⃣ Click + New Step → Choose “HTTP”.
2️⃣ Set the Method to POST.
3️⃣ Enter Request URL:

https://<your-region>.api.cognitive.microsoft.com/vision/v3.2/analyze?visualFeatures=Tags,Description,Objects

4️⃣ Click Headers → Add:

Ocp-Apim-Subscription-Key → Paste your API Key.
Content-Type → application/json.
5️⃣ In Body, enter:

{
  "url": "@{triggerOutputs()?['body/url']}"
}

✅ This sends the uploaded image URL to Azure Vision API!
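
To see what the HTTP action sends, the same request can be assembled outside Power Automate. The sketch below builds it with Python's standard library and does not send anything. The function name build_analyze_request and its default feature list are assumptions for illustration; the URL path, header names, and body shape follow the steps above.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, api_key, image_url,
                          features=("Tags", "Description", "Objects")):
    """Build (but do not send) a POST request for the v3.2 analyze endpoint."""
    # Keep commas unescaped so the URL matches the one shown in the steps above.
    query = urllib.parse.urlencode({"visualFeatures": ",".join(features)}, safe=",")
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": api_key,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send it. The image URL has to be reachable by the Azure service for the call to succeed.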

3.3 Process the API Response

1️⃣ Click + New Step → Select “Parse JSON”.
2️⃣ Use Dynamic Content → Select the API Response Body.
3️⃣ Define JSON schema:

{
  "type": "object",
  "properties": {
    "description": {
      "type": "object",
      "properties": {
        "captions": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "text": { "type": "string" }
            }
          }
        }
      }
    }
  }
}

✅ This extracts AI-generated image descriptions!
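
The schema above only describes the part of the response that holds the captions. For comparison, here is a small sketch of the same extraction in Python, written so that a response with no captions does not raise an error. The function names and the fallback message are assumptions for this example.

```python
from typing import Any, Optional


def first_caption_text(response: dict) -> Optional[str]:
    """Return the text of the first caption, or None when no caption is present."""
    captions: Any = (response.get("description") or {}).get("captions") or []
    if not captions:
        return None
    return captions[0].get("text")


def format_reply(response: dict) -> str:
    """Build the chatbot message; the fallback wording is a placeholder."""
    caption = first_caption_text(response)
    if caption is None:
        return "AI Analysis: no description available."
    return f"AI Analysis: {caption}"
```

Handling the empty case matters because the captions array can come back empty for images the service cannot describe.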

3.4 Send Results Back to Copilot Studio

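1️⃣ Click + New Step → Select the Copilot Studio response action (named “Return value(s) to Power Virtual Agents” in older environments; the label varies by version).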
2️⃣ Enter Dynamic Response Message:

"AI Analysis: @{body('Parse_JSON')?['description']?['captions']?[0]?['text']}"
3️⃣ Save and Publish the Flow.

✅ Now, the chatbot can describe uploaded images!

Step 4: Testing the Image Recognition Bot

1️⃣ Open Copilot Studio → Click Test Bot.
2️⃣ Upload an image (e.g., a dog picture 🐶).
3️⃣ The chatbot should respond:
“AI Analysis: A brown dog sitting on the grass.”

Step 5: Expanding Image Recognition Capabilities

5.1 Extract Text from Images (OCR)

OCR uses the separate Read endpoint rather than a visualFeatures value:
🔹 Change the Request URL in the HTTP action to:

https://<your-region>.api.cognitive.microsoft.com/vision/v3.2/read/analyze

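🔹 The Read call is asynchronous: it returns 202 Accepted with an Operation-Location header. Send GET requests to that URL until the status is succeeded, then extract the text lines from analyzeResult.readResults.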
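
Because the Read endpoint returns its result in a second call, the processing has two parts: wait until the operation finishes, then collect the recognized lines. The sketch below shows both. The function names are hypothetical, and fetch stands in for whatever performs the GET request against the Operation-Location URL, so the example runs without network access. The status values and the analyzeResult.readResults layout follow the v3.2 Read response format.

```python
import time
from typing import Any, Callable


def wait_for_read_result(fetch: Callable[[], dict],
                         attempts: int = 10,
                         delay_seconds: float = 1.0) -> dict:
    """Call fetch() until the Read operation reports succeeded or failed."""
    for _ in range(attempts):
        result = fetch()
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError("Read operation failed")
        time.sleep(delay_seconds)  # status is still notStarted or running
    raise TimeoutError("Read operation did not finish in time")


def extract_text_lines(result: dict) -> list:
    """Flatten the recognized lines of every page into one list of strings."""
    pages: Any = (result.get("analyzeResult") or {}).get("readResults") or []
    return [line["text"] for page in pages for line in page.get("lines", [])]
```

In Power Automate the same pattern is usually built with a Do until loop around a second HTTP action and a Delay step.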

5.2 Detect Faces and Emotions

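Integrate Azure Face API to detect faces and return their positions in the image. Age estimation, gender detection, and emotion recognition (happy, sad, angry, etc.) were part of earlier versions of the service, but Microsoft has since retired these attributes, so check the current Face API documentation before relying on them.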

5.3 Identify Objects in Images

Modify the Request URL so that it asks only for object detection:

https://<your-region>.api.cognitive.microsoft.com/vision/v3.2/analyze?visualFeatures=Objects

The response now includes an objects array that lists each detected object with its bounding rectangle and confidence score.
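
To turn that response into a chatbot message, the detected objects can be filtered by confidence and joined into one sentence. This is a rough sketch. The function names, the 0.5 threshold, and the message wording are assumptions for this example; the objects, object, and confidence fields follow the v3.2 response format.

```python
def list_objects(response: dict, min_confidence: float = 0.5) -> list:
    """Return (name, confidence) pairs at or above the threshold, best first."""
    found = []
    for item in response.get("objects") or []:
        if item.get("confidence", 0.0) >= min_confidence:
            found.append((item["object"], item["confidence"]))
    return sorted(found, key=lambda pair: pair[1], reverse=True)


def describe_objects(response: dict, min_confidence: float = 0.5) -> str:
    """Build a one-line summary suitable for a chat reply."""
    names = [name for name, _ in list_objects(response, min_confidence)]
    if not names:
        return "No objects detected."
    return "Objects detected: " + ", ".join(names)
```

A lower threshold returns more objects at the cost of more false positives, so it is worth tuning against your own images.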

Final Thoughts

🚀 Key Takeaways:

✔ Set up an Azure Computer Vision resource (Step 2.2).
✔ Enable image uploads in Copilot Studio (Step 2.3).
✔ Use Power Automate to send images to the Azure Vision API (Step 3).
✔ Process API responses and return results in chatbot messages (Steps 3.3 and 3.4).
✔ Expand features with OCR, face detection, and object detection (Step 5).
