Visual verification is a feature that uses computer vision AI to assist you during your tasks. The system, trained on a custom dataset of images, recognizes specific objects and their states within your workspace. It simplifies and streamlines the process, putting machine learning and computer vision into the hands of everyday content creators.
Audience
The following information is written with the content creator in mind. Tenant administrators will also benefit, particularly those who create Instruct Experiences that use visual verification. Regardless of your role, you should already be familiar with CareAR’s Experience Builder capability.
How it Works for the End User
Visual verification is integrated seamlessly into the digital work instructions created in Experience Builder. This feature acts as a checkpoint for the end user. As you progress through the instructions, the system prompts you to point your device's camera at the object to be analyzed. The camera images are then processed by the computer vision AI, which verifies the presence and condition of the designated objects. Well-designed procedures detect errors and guide you toward successful completion. If the step is configured to require verification, you can proceed to the next stage only after verification succeeds, confirming the correct completion of the step.
How it Works for the Content Creator
With an optimized workflow, content creators can quickly gather a custom dataset of scans (images) that represent the object and states they wish to use in their content. Upon completion of the dataset creation, the content creator adds labels to the various object and state combinations. From here, the content creator launches the machine learning (ML) training session. Once the training session completes (generally from 30 to 90 minutes), an ML model is generated that the content creator can then use in their Experience.
Benefits
- Enhanced Safety and Compliance: visual verification safeguards you by ensuring adherence to critical safety protocols and regulatory requirements
- Accurate Part Detection: The AI's sharp eye guarantees precise identification of the correct parts needed for each step, minimizing errors
- Streamlined Quality Control: Achieve consistent quality by leveraging visual verification as a reliable tool to confirm completion according to defined standards
- Error Reduction: Oversight and data capturing with visual verification help reduce job errors and allow for quick problem identification and resolution
Example Applications
Visual verification finds application in a diverse range of scenarios, including:
- Product Assembly and Installation: Guide proper component placement and assembly sequence.
- Field Service: Ensure technicians execute procedures accurately at customer sites.
- Quality Assurance: Verify the finished product meets all quality specifications.
Definitions
You’ll hear frequent reference to objects and states throughout this discussion. When we talk of an object, we are referring to an individual physical device, a part, a piece of equipment, a tool, or some other physical object or item. Where visual verification really shines is when an object has one or more parts that are articulated, meaning it has a part that moves in space – for example, a lever, an on/off switch, a drawer, a door, etc. If you think about the unique combination of the object and the parts in each position (e.g., a safety switch locked in the off position), this represents a unique state of the object.
As you’ll soon learn, we need to capture scans of each unique object state and give that state a name, or label, that will be used in our Instruct Experience. More advanced applications can certainly involve two or more objects that frequently (or always) occur together as a collection.
When we mention dataset, this refers to a collection of images that will be the basis for the machine learning training process. CareAR provides a scan tool that is used to quickly create and add to your project’s dataset.
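To make these terms concrete, here is a minimal, hypothetical planning sketch (purely illustrative; not CareAR code or a CareAR data format) showing how an object, its states, and the label names you plan to use might be written down, using the safety switch example:

```python
# Hypothetical planning sketch only -- not CareAR code or a CareAR data format.
# One physical object (a safety switch) with articulated parts (the power lever
# and the lockout tab) yields several unique states; each state gets a label
# that is reused later in the Instruct Experience.

safety_switch_labels = {
    # label name     -> the unique object/state combination it describes
    "Power On":      "lever up, power flowing",
    "Power Off":     "lever down, power cut",
    "Switch Locked": "lockout tab engaged",
}

# The dataset is simply the collection of scan images captured for each of
# these states (plus negative scans of the surroundings, which get no label).
# A switch locked in the off position is the combination of the "Power Off"
# and "Switch Locked" states occurring together.
```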
Prerequisites
To make use of visual verification, the following are prerequisites:
- A CareAR account with AI Services enabled (or the legacy CareAR Platform plan)
- A user account with the role of content creator or tenant administrator
- CareAR app 24.02 or later
- Compatible iOS device for Visual Verification ML model creation
How to enable visual verification within an experience
Enabling visual verification is a multistep process. Each of those steps is outlined here:
- Build your plan for visual verification
- Scan object for visual verification
- Label Scans for visual verification
- Train the Machine Learning model
- Incorporate ML model into an Experience
- Publish and Test
Build your plan for visual verification
Before you get started with scanning and creating your model, it’s important to create a plan for all the steps and various states you will need to account for in the visual verification procedure.
To help you manage your work, we recommend the following:
- Write down the procedure you plan to have your user follow, from start to finish, with each distinct step numbered.
- Review the procedure and document each unique physical state of the process. This should correspond to the object in a particular state.
- Make an explicit list of all the items that you want to label up front so that you can keep track of your progress. Depending on project complexity, you may need more than one scanning session.
- If multiple states must co-exist before the user proceeds to the next step, be sure to write that down as well. You will use this information when you build your Instruct experience.
- Document the combination of states that must hold simultaneously for the procedure to be considered fully complete. This is important because it is the success condition you will use to tell the user that they’ve completed the job.
- As you make your scans, you will need to collect sufficient images for each combination that you listed. For each state, you’ll want to gather scans under different lighting conditions. Having the list prepared in advance will help you track your progress as you gather scans (a sample plan sketch follows this list).
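As a purely illustrative example (a planning aid only, not CareAR functionality), a work plan for the safety switch procedure might be captured like this, with hypothetical step descriptions and label names:

```python
# Hypothetical work plan sketch -- step text and label names are examples only.

work_plan = [
    # (step, instruction,                       state label(s) that must be detected)
    (1, "Locate the safety switch on the wall", ["Power On"]),
    (2, "Move the lever to the off position",   ["Power Off"]),
    (3, "Apply the lockout device",             ["Power Off", "Switch Locked"]),
]

# Success condition: the states that must hold simultaneously at the end.
success_states = ["Power Off", "Switch Locked"]

for step, instruction, labels in work_plan:
    print(f"Step {step}: {instruction} -> verify {', '.join(labels)}")
```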
Scan object for visual verification
Let’s get started creating our dataset using the plan you created in the previous step. The objective here is to build a dataset of images that will then be labeled and sent for machine learning model creation. To do this, you need the CareAR app 24.02 or later, a supported iOS device (see System Requirements and Supported Web Browsers – CareAR Help Center), an account in a CareAR tenant with the role of Content Creator or Tenant Admin, and a network connection.
Step 1: Create New Project
Using the CareAR Scan app, tap Create new project, give your project a unique name, and tap CREATE AND START SCANNING.
Step 2: Start Scanning
Tap the red button and begin scanning the object in the initial state you wrote down in your work plan. Move your iPhone/iPad’s camera to capture the object from every angle at which you expect your user to approach or view it.
You’ll note a counter at the top of the scan app. Try to collect a few hundred images before you stop scanning.
To stop scanning, tap the STOP button on the right. Alternatively, you can pause and resume your scan activity.
Step 3: Name and Save Your Scan
Give your scan a name by tapping the pen icon at the top of the app, then tap Save. Your images will be sent to the CareAR platform under your account. Congratulations! You’ve completed your first scan!
Step 4: Repeat!
You’ll want a large number of images for each label; ensure you have several hundred (ideally 1000) scanned per label. Work your way down the list you created in your work plan, repeating steps 1-3.
Once you’ve captured enough images, you’ll proceed on to the next phase: creating labels for your scans.
Best Practices for Scanning
- A minimum of 200 images per label is needed to perform detections. It is recommended to have closer to 1000 images per label for optimal results.
- Consider the intended angle and distance of the user when performing scans. For example, if the user will be standing a set distance from the object, scans should be taken from that same distance to optimize for the use case.
- If the object is mobile, take scans in different environments and lighting conditions to improve the model’s ability to detect in diverse conditions. Scan under low lighting, bright artificial lighting, natural lighting, and moderate lighting conditions.
- Ideally, include scans of the object in the typical setting or settings in which a user will find it. For example, a safety switch is nearly always mounted on a wall; include scans of the switch mounted on a wall. If unboxing the safety switch is part of your procedure, then it makes sense to also capture scans of the switch before it’s mounted – perhaps placed on a table. Likewise, if you want to be able to recognize individual loose parts, then include scans of the parts as well.
- False positives can occur when a similar appearing object is detected by the camera. To reduce false positives, it is recommended to include a negative scan. This is a scan which does not contain the object expected to be detected. It can include images of known false positive objects. A quick way to help reduce false positives is to include the surrounding environment in scans without the object itself. Note that the negative scan will not have any labels created.
- Another good strategy for negative scans is to scan the environment without the object. For example, if you’re scanning an object on a table, scan the table without the object and include it in the dataset without any labels. The logic here is that the ML training will see images of the background without a label and understand that the images of the background should be excluded when determining the state of the target object.
- If you know that you’ll have a large number of individual scans to gather, you can always break up your work into multiple sessions. This is where the work plan can help you track your progress (a simple checklist sketch follows this list).
- Keep in mind that ML model training teaches the model how to identify your object and its states without knowing anything about the object. The more images you have (both positive and negative), the better.
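Putting these recommendations together, a simple scan checklist (again, a hypothetical planning sketch rather than anything the Scan app requires) might look like this:

```python
# Hypothetical scan checklist -- a planning aid only, not part of the Scan app.

scan_sessions = [
    # (scan name,                 label it will receive, target image count)
    ("switch_on_wall_daylight",   "Power On",            350),
    ("switch_on_wall_low_light",  "Power On",            350),
    ("switch_off_wall_daylight",  "Power Off",           350),
    ("switch_off_locked_wall",    "Switch Locked",       350),
    # Negative scan: surroundings only, receives no label at all.
    ("wall_and_table_no_switch",  None,                  300),
]

TARGET_PER_LABEL = 1000   # ideal; 200 images per label is the working minimum
```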
Label Scans for visual verification
Now that you’ve created your dataset, we’ll move on to the labeling phase of the process. Labeling is done in the CareAR web portal. Let’s go!
Step 1: Log into the CareAR Portal
Using your CareAR credentials, log into the CareAR portal (https://carear.app/#/login). This part of the process requires that your role within your CareAR tenant is either content creator or tenant admin. If you have a different role, then you will not be able to proceed with this step and you’ll need to contact your tenant administrator to reassign your role.
Step 2: Accessing your Project
Using the menu on the left of the portal screen, click AI Services > Object Detection. You’ll be presented with a list of all the projects created for your tenant, sorted from newest to oldest. Find the project with the name you provided in step 1 of the Scanning process. Click the row of your project. You’ll be presented with a list of all of your scans relating to your project.
Recall in the scanning session, you gave your scan a name and saved it. The scanned images were sent to the CareAR platform, and this is how you access them to perform your labeling.
Step 3: Initial Cleanup
When you click your first scan, you’ll find all the images are collected together and composited in a way that creates a 3D scene of the object and very likely parts of the surrounding area.
To navigate the scan, click and hold with your mouse to move, pan, tilt, and orient the object to your liking. It’s a good idea to crop the 3D scene down so it sits close and tight around the object. Click the Crop button and you’ll be presented with four different views: Top, Left, Front, Right. For each view, adjust the four corners of the crop target so they sit fairly close to the perimeter of the object, leaving a bit of space around it. Repeat this for each view and click Save.
Step 4: Create Labels
Now that the 3D object is nicely cropped, let’s apply the first label to the portion of the object that you’re interested in. Again, reference the work plan that you created at the start of the process.
You’ll notice that the Label counter (in the upper right of the labeling tool) reads Labels(0). This is because you haven’t created any labels yet. In this first step, you’ll create one.
To do this, follow these steps:
- Click + Add Label
- Click + Add New
- Give the label a short, readable name that reflects the essence of the part and its state. For example, Light Switch On if you’ve scanned a light switch in the on position. Picking an appropriate name is important, as this is the name you’ll see when you configure visual verification in your Experience.
- You can optionally set the color of the label by entering the RGB values or using the color picker (shown when clicking the color chip) to select your desired color
- Click SAVE
Now you have created a title for the label. You’ll observe that the Label counter will have incremented by one to Labels(1). Note that you can re-use this label, now that it exists for your project.
Step 5: Adjust the Bounding Box
Now we need to properly position, size, and orient the label. The label is presented visually as a cube (a “bounding box”) which will encompass the entire area of the part you wish to name with a label.
Click the label and notice that four tool icons are shown on the labeling canvas: position, orientation, dimension, and delete.
Excluding the delete shortcut, each of these controls lets you adjust the size, position, or orientation of the bounding box. Click one of the tools and you’ll see three lines displayed (one each for the x-axis, y-axis, and z-axis); click a line and drag it in the dimension you want to adjust. It may take a bit of experimenting to get the hang of, but once you’ve done one bounding box, the others will go quickly.
The objective here is to fully contain the part or area you want to label within the bounding box. Give yourself a bit of cushion in each direction. If the object has a distinctive feature that helps set context for the state (in the example above, the ON/OFF sticker), it can be helpful to include that feature within your bounding box.
Once you’ve applied your bounding box, click outside the label and rotate the object in different directions to ensure the position, orientation, and dimensions are correct. Keep in mind that this label will be applied across all of the images you collected in the scanning phase, so take the time to make it accurate from as many positions as you can.
Once you’re satisfied with your work, click SAVE. Now you can label your next part/state.
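Conceptually, the bounding box is just a position plus a set of dimensions (and an orientation) in 3D space. The following sketch is purely illustrative of the "fully contain the part" idea and is not how CareAR stores labels; orientation is omitted for brevity:

```python
from dataclasses import dataclass

@dataclass
class BoundingBox:
    """Illustrative 3D bounding box: a center point plus extents along x, y, z.
    Orientation is omitted here for brevity; the labeling tool also lets you
    rotate the box about each axis."""
    cx: float  # center position
    cy: float
    cz: float
    sx: float  # full size along each axis
    sy: float
    sz: float

    def contains(self, x: float, y: float, z: float) -> bool:
        # A point is "inside" when it lies within half the extent of the box
        # along every axis.
        return (abs(x - self.cx) <= self.sx / 2
                and abs(y - self.cy) <= self.sy / 2
                and abs(z - self.cz) <= self.sz / 2)

# Give yourself a bit of cushion: size the box so the whole part (and any
# distinctive sticker or marking) sits comfortably inside it.
box = BoundingBox(cx=0.0, cy=0.0, cz=0.0, sx=0.22, sy=0.12, sz=0.08)
print(box.contains(0.05, 0.03, 0.01))  # True: the point lies inside the box
```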
Prepare labels for each of the unique parts/states that you listed in your work plan. Once you’ve completed all the necessary labels, we’re ready to move to the training phase. Click EXIT to leave the labeling session.
Best Practices for Labeling
- After labeling all scans, look at how many images exist for each label. We recommend a minimum of 200 images per label; the best performance can be expected when the image count per label approaches 1000. If you find that a label lacks enough images, add additional scans to reach that number (a small counting sketch follows this list).
- Keep the cube as small as possible around the object while still including the entire object. Do not leave a lot of empty space within the cube.
- If there are reference marks (e.g., stickers) or other irregular contours or markings that can serve as a reference, encompass those marks in the bounding box.
- Check your bounding box’s position, dimensions, and orientation from as many different perspectives within the labeling canvas as you can. Adjust any of those three attributes as needed so that the bounding box cleanly encompasses the subject of interest.
- Include so-called negative scans (images that do not contain the target object) and ensure there are no labels for these scans. Again, negative scans improve the overall training process and lessen the incidence of false positives.
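As a purely illustrative aid for the first point above (this is your own bookkeeping, not a CareAR feature), you could tally your per-label image counts and flag any label that falls short of the recommended minimums:

```python
MIN_IMAGES = 200      # minimum needed to perform detections
IDEAL_IMAGES = 1000   # best results as the count per label approaches this

# Hypothetical tally compiled from your own scan notes.
images_per_label = {
    "Power On": 940,
    "Power Off": 610,
    "Switch Locked": 150,
}

for label, count in images_per_label.items():
    if count < MIN_IMAGES:
        print(f"{label}: {count} images -- below minimum, add more scans")
    elif count < IDEAL_IMAGES:
        print(f"{label}: {count} images -- usable; more scans may improve results")
    else:
        print(f"{label}: {count} images -- good")
```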
Train the Machine Learning model
Step 1: Create New ML Model
To create the machine learning model (the “ML model”), click ML Models. Then click the CREATE NEW ML MODEL button at the bottom or the + CREATE NEW MODEL link to the right.
You’ll see a popup wizard presented.
Enter the ML model name and an optional description. Next, select the scans that you want to include in the model training; these scans must have labels created. Now you’re ready to launch the training: click CREATE.
You’ll be shown another popup that confirms the model training process has started. You can optionally provide your email to be notified when the process has completed.
Now we wait. The process can take anywhere from 30 to 90 minutes to complete depending on the number of images being used. As training is underway, you’ll see the status reflected in the portal as follows:
Step 2: Complete Training
Once training has completed, you’ll receive a notification email, and you’ll see that the portal indicates completion.
Now it’s time to proceed to using the ML model in your Instruct Experience.
Suggested Best Practices
Unless you are very comfortable with making changes to the training parameters, it is highly recommended you use the default settings.
Incorporate ML model into an Experience
The final phase of the visual verification build process is to incorporate the ML model into your Instruct Experience. Let’s assume you’ve already created your base level experience or you’re modifying an existing experience. The object detection function is a property associated with the button element and forms the start of this final phase.
Step 1: Create a New Button for visual verification
Within the Experience, create a new button for triggering the visual verification.
Set the On click properties of the button (see the panel to the right of the canvas) to Start detection. Click SAVE.
Step 2: Map the ML Model
Note that a Detection object has been created and connected to the button. Click the Detection object. To the right, the properties panel will prompt you to select the ML model to use for detection. Click + Select ML model. The available models will be displayed; pick the model that you named in the training process.
Step 3: Configure Detection
You’ll now have various properties, attributes, and actions that you can set that relate to the ML model and detection, generally. This is the crux of the visual verification feature.
For Visual Verification, you’ll need to build a set of actions that correspond to each of the states that you captured in your work plan and scanning sessions.
In the example below, we examine a safety switch and verify whether the power is on, the power is off, or if the power is off AND the switch is locked.
Step 3a: Visual Guide
Let’s first decide whether to include a visual guide for the end user. A visual guide is a simple outline overlaid on the camera display that nudges the user to align their camera in the right position for visual verification. It works much like the QR code reader in your smartphone’s camera, which overlays small corner marks around a QR code when it detects one in the field of view.
In the example below, the visual guide contains rounded corner segments. But the guide can be whatever shape best suits your application – regular/irregular geometric shapes or perhaps a stylized outline of the object.
You can either upload a .PNG file of the guide or point to a URL where the file is located.
Step 3b: Set the Detection Logic
You can control whether the user has the option to skip the detection if it fails to detect the expected object. To do this, enable the Allow user to skip switch, as shown in the illustration below. You can also set where the user should proceed if the object detection fails or the user decides to skip. Options include Navigate to page, Navigate to URL, Navigate to Previous Page, or Navigate to Search.
Step 3c: Define Actions
Now we will set the actions that correspond to the states that visual verification encounters. Add your first action by clicking + ADD ACTION.
Experience Builder creates Action #1 for you. You’ll now need to tell the Experience what to do if it detects a certain state or combination of states. (Each state corresponds to a label.)
Once again, this is where your work plan from the very beginning of the process will help you! Under the If system detects heading, use the pull-down selection tool to find the correct state to use.
NOTE: The options presented are the labels you created in your labeling session that are associated with the ML model.
If you need to have a combination of two or more states, then click the (+) to add another state.
Next, indicate what you want to happen if the system detects the state (or states) you defined. Options include Navigate to page, Navigate to URL, Navigate to Previous Page, or Navigate to Search.
Click SAVE. Now you’ve completed your first action.
See the following illustration to help navigate through the steps.
You’ll now create additional actions for your experience based on the list of states (and combinations) that you captured in your work plan.
If you need to delete an action, simply click the small garbage can icon.
Once you’ve captured all the states from your work plan and implemented the success criteria, you’re ready to publish your experience and start testing.
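To summarize the logic you have just configured, here is a hedged, conceptual sketch (plain Python, not Experience Builder’s configuration format) of how actions map detected states, and combinations of states, to navigation targets, using the safety switch example with hypothetical page names:

```python
# Conceptual sketch only -- in practice you configure this visually in
# Experience Builder; label and page names here are hypothetical.
actions = [
    ({"Power On"},                   "page_power_still_on"),
    ({"Power Off"},                  "page_apply_lockout"),
    ({"Power Off", "Switch Locked"}, "page_success"),
]
fallback = "page_troubleshooting"   # when detection fails or the user skips

def next_page(detected_labels: set[str]) -> str:
    # Check the most specific combinations first so that "Power Off AND
    # Switch Locked" wins over plain "Power Off" when both are detected.
    for required, target in sorted(actions, key=lambda a: len(a[0]), reverse=True):
        if required <= detected_labels:
            return target
    return fallback

print(next_page({"Power Off", "Switch Locked"}))  # -> page_success
```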
Publish and Test
If you haven’t already published (or re-published) your experience, click PUBLISH.
Use the QR code for the newly published experience to start testing out your visual verification.
You’ll want to recreate each of the states that map to distinct actions and confirm that the flow navigates as you intended.
As the content creator, you’ll want to do as much testing as you can before you share the experience with others.
Best Practices for Testing
- Test under a variety of lighting conditions.
- If you’re getting false positive identifications, you may not have enough scans. You can always add scans to your project and re-train the ML model. A large number of false positives generally indicates that you need additional training data (i.e., more scans). Ensure that you include plenty of scans under different lighting and placement conditions. It can help to vary the distance of your scans, too.
- Recruit additional users to help test your experience. Once you’re happy with the results, you can then share your experience more broadly with others.
- A useful tool for testing is the Scan app. Within your project, you can download the ML model directly to your device. Once it’s downloaded, tap Test; the app will turn on your camera and show you the state of the object with visual verification in real time. Testing the ML model directly, outside the context of an experience, helps simplify the troubleshooting process.