Image Annotation

By Keshav Kumar | Published on May 17, 2024

How Companies Use Image Annotation to Produce High-Quality Training Data

 

Image annotation is the foundation behind many of the Artificial Intelligence (AI) products you interact with and is one of the most essential processes in Computer Vision (CV). In image annotation, data labelers use tags, or metadata, to identify the characteristics you want your AI model to learn to recognize. These tagged images are then used to train the computer to identify those characteristics when presented with fresh, unlabeled data.

Think about when you were young. At some point, you learned what a dog was. Eventually, after seeing many dogs, you began to understand the different breeds and how a dog differed from a cat or a pig. Like us, computers need many examples to learn how to categorize things. Image annotation provides those examples in a form the computer can understand.

With the increased availability of image data for companies pursuing AI, the number of projects relying on image annotation has grown exponentially. Creating a comprehensive, efficient image annotation process has become increasingly important for organizations working in this area of machine learning (ML).


Applications of Image Annotation

To compile a complete list of current applications that leverage image annotation, you’d have to read through thousands of pages. For now, we’ll highlight some of the most compelling use cases across major industries.

 

Agriculture

Using drones and satellite imagery, farmers leverage AI for countless benefits: estimating crop yield, evaluating soil, and more. An exciting example of image annotation in practice comes from John Deere. The company annotates camera images to differentiate between weeds and crops at the pixel level. It then uses this data to apply pesticides only where weeds are growing rather than across the entire field, saving tremendous amounts on pesticide use each year.

 

Healthcare

Doctors are supplementing their diagnoses with AI-powered solutions. For instance, AI can examine radiology images to identify the likelihood that certain cancers are present. In one example, teams train a model on thousands of scans labeled with cancerous and non-cancerous spots until the machine learns to differentiate on its own. While AI isn't intended to replace doctors, it can serve as a gut check and add accuracy to crucial health decisions.

 

Manufacturing

Manufacturers are discovering that image annotation can help them capture information on inventory in their warehouses. They're training computers to evaluate sensor image data to determine when a product is about to go out of stock and needs additional units. Certain manufacturers are also using image annotation projects to monitor infrastructure within the plant: their teams label image data of equipment, which is then used to train computers to recognize specific faults or failures, driving faster fixes and better maintenance overall.

 

Finance

While the finance industry is far from fully harnessing the power of image annotation, there are still several companies making waves in this space. Caixabank, for example, uses face recognition technology to verify the identity of customers withdrawing money from ATMs. This is done through an image annotation process known as pose-point annotation, which maps facial features like the eyes and mouth. Facial recognition offers a faster, more precise way of determining identity, reducing the potential for fraud. Image annotation is also critical for annotating receipts for reimbursement or checks to deposit via a mobile device.

 

Retail

Image annotation is critical for many retail AI use cases. Want to use AI to deliver the right results for a specific item, such as someone searching for jeans? Image annotation is required to build a model that can look through a product catalog and serve the results the user wants. Several retailers are also piloting robots in their stores. These robots collect images of shelves to determine if a product is low or out of stock and needs reordering. They can also scan barcode images to gather product information using a process known as image transcription, one of the image annotation methods described below.

 

Types of Image Annotation

There are three popular types of image annotation, and the right one for your use case depends on the complexity of the project. With each type, the more high-quality image data used, the more accurate the resulting AI predictions will be.

 

Classification

The easiest and fastest method of image annotation, classification applies just one tag to an entire image. For example, you might want to look through a series of images of grocery store shelves and classify which ones contain soda. This method is well suited to capturing abstract information, such as the example above, the time of day, or whether cars appear in a picture, and to filtering out images that don't meet your qualifications from the start. While classification is the fastest type, giving a single, high-level label, it's also the vaguest of the three types we highlight, as it doesn't indicate where the object is within the image.
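As a minimal sketch, single-label classification annotations can be represented as simple records pairing an image with one tag. The field names and labels below are illustrative, not a standard schema:

```python
# Toy classification annotations: one label per image.
# "image_id" and "label" are illustrative field names, not a standard format.
annotations = [
    {"image_id": "shelf_001.jpg", "label": "has_soda"},
    {"image_id": "shelf_002.jpg", "label": "no_soda"},
    {"image_id": "shelf_003.jpg", "label": "has_soda"},
]

def filter_by_label(records, label):
    """Return the ids of images tagged with the given class label."""
    return [r["image_id"] for r in records if r["label"] == label]

print(filter_by_label(annotations, "has_soda"))
# ['shelf_001.jpg', 'shelf_003.jpg']
```

A filter like this is also how classification is used for triage: images tagged as irrelevant can be dropped before the more expensive annotation steps below.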

 

Object Detection

With object detection, annotators are given specific objects to label in an image. Where classification only tells you that an image contains soda, object detection takes it one step further by showing where the soda is within the image, or even where a specific item, like orange soda, appears. There are several techniques used for object detection, including:

  • 2D Bounding Boxes: Annotators apply rectangles and squares to define the location of the target objects. This is one of the most popular techniques in the image annotation field.
  • Cuboids, or 3D Bounding Boxes: Annotators apply cubes to the target object to define the location and the depth of the object.
  • Polygonal Segmentation: When target objects are asymmetrical and don’t easily fit into a box, annotators use complex polygons to define their location.
  • Lines and Splines: Annotators identify key boundary lines and curves in an image to separate regions. For example, annotators may label the various lanes of a highway for a self-driving car image annotation project.

Because object detection allows boxes or lines to overlap, it's still not the most precise method. What it does provide is the object's general location while remaining a relatively fast annotation process.
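To quantify the overlap that bounding boxes allow, annotation pipelines commonly compute intersection-over-union (IoU) between two boxes. A minimal sketch, assuming `(x_min, y_min, x_max, y_max)` coordinates:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two 2D bounding boxes.

    Boxes are (x_min, y_min, x_max, y_max) tuples. IoU of 1.0 means the
    boxes are identical; 0.0 means they don't overlap at all.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping boxes share 25 of 175 total units of area:
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 0.14285714285714285
```

A score like this is often used to check annotator agreement: two labelers drawing boxes around the same object should produce a high IoU.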

 

Semantic Segmentation

Semantic segmentation solves object detection's overlap problem by ensuring every component of an image belongs to exactly one class. Usually done at the pixel level, this method requires annotators to assign a category (such as pedestrian, car, or sign) to each pixel. This helps teach an AI model to recognize and classify specific objects even when they are obstructed. For example, if a shopping cart obstructs part of the image, semantic segmentation can identify what orange soda looks like down to the pixel level, so the model can recognize that it is still, in fact, orange soda.

It's worth noting that the three image annotation methods outlined above are by no means the only ones. Other types you may have heard of include those used specifically for facial recognition, such as landmark annotation, where the annotator plots characteristics (think eyes, nose, and mouth) using pose-point annotation. Image transcription is another standard method, used when there's multimodal information in the data, that is, when the image contains text that requires extraction.
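To make the pixel-level idea concrete, here is a toy semantic-segmentation mask as a small 2D grid. Real masks are typically per-pixel integer arrays over full-resolution images; the class names here are illustrative:

```python
from collections import Counter

# Toy semantic-segmentation mask: each cell is one pixel's class label.
# Every pixel belongs to exactly one class -- no overlap is possible.
mask = [
    ["background", "background", "soda"],
    ["background", "cart",       "soda"],
    ["cart",       "cart",       "soda"],
]

def class_pixel_counts(mask):
    """Count how many pixels belong to each class in a label mask."""
    return Counter(pixel for row in mask for pixel in row)

counts = class_pixel_counts(mask)
print(counts)  # Counter({'background': 3, 'soda': 3, 'cart': 3})
```

Per-class pixel counts like these are a common sanity check on segmentation labels, for example to spot classes that are rare or missing from a dataset.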

 

How to Make Image Annotation Easier

 

Broadly, image annotation is difficult for many of the same reasons that building any AI model is challenging. AI requires large amounts of high-quality data to work properly (the more examples a computer can learn from, the better it will perform), a diverse team to annotate that data, and comprehensive data pipelines for execution. For many organizations, the time, money, and effort required may not be feasible. For those that don't have the internal resources to accomplish an end-to-end image annotation project, turning to third-party vendors for assistance is a valid option. These vendors can provide the image data, annotators, tooling, and expertise such a massive endeavor requires.

Image annotation, specifically, often comes with a whole host of problems. An image may have poor lighting, the target object may be occluded, or parts of the image may be unrecognizable even to a human eye. Teams must decide how to handle these cases before beginning an annotation project. Teams also need to be careful when naming labels and differentiating classes, as these factors can confuse the annotator, and ultimately the machine. Classes that are too similar, for instance, create unnecessary confusion.

Solve these problems, and you can expect to build an AI solution with greater accuracy and speed. When done correctly and with precision, image annotation yields high-quality training data, an essential component of any effective AI model.
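As a hypothetical sanity check on class naming, a short script can flag label pairs similar enough to confuse annotators before a project starts. The class names, similarity metric, and threshold below are all illustrative:

```python
from difflib import SequenceMatcher

def flag_confusable_classes(labels, threshold=0.8):
    """Flag pairs of class names so similar they may confuse annotators.

    Uses stdlib SequenceMatcher string similarity; the 0.8 threshold
    is an arbitrary illustrative choice, not a standard value.
    """
    flagged = []
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
                flagged.append((a, b))
    return flagged

print(flag_confusable_classes(["soda", "sodas", "shopping_cart", "pedestrian"]))
# [('soda', 'sodas')]
```

Catching a near-duplicate pair like this early is far cheaper than re-labeling a dataset after annotators have split one concept across two classes.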
