# What Are AprilTags?

AprilTags are a system of visual tags developed by researchers at the University of Michigan to provide low-overhead, high-accuracy localization for many different applications.

Additional information about the tag system and its creators can be found on their website. This document attempts to summarize that content for FIRST Robotics-related purposes.

## Application to FRC

In the context of FRC, AprilTags are useful for helping your robot know where it is on the field, so it can align itself to some goal position.

AprilTags have been in development since 2011, and have been refined over the years to increase the robustness and speed of detection.

Starting in 2023, FIRST is providing a number of tags, scattered throughout the field, each at a known pose.

All of the tags are from the 16h5 family.

> **Note:** Many of the pictures in this documentation are from the 36h11 family, which are similar (but not identical) to the 16h5 family actually in use for FRC. All the underlying concepts are the same.

### What is the 16h5 family?

The AprilTag library implementation defines standards on how sets of tags should be designed. Some of the possible tag families are described here.

FIRST has chosen the 16h5 family for 2023. This family of tags is made of a 4x4 grid of pixels, each representing one bit of information. An additional black and white border must be present around the outside of the bits.

While there are $$2^{16} = 65536$$ theoretically possible tags, only 30 are actually used. These are chosen to:

1. Be robust against some bit flips (i.e., issues where a bit's color is incorrectly identified).

2. Not involve “simple” geometric patterns likely to be found on things which are not tags (e.g., squares, stripes, etc.).

3. Ensure the geometric pattern is asymmetric enough that you can always figure out which way is up.

All tags will be printed such that the tag’s main “body” is 6 inches in length.

For home usage, tag files may be printed off and placed around your practice area. Mount them to a rigid backing material to ensure the tag stays flat, as the processing algorithm assumes the tags are flat.

## Software Support

The main repository for the source code that detects and decodes AprilTags is located here.

WPILib has forked the repository to add new features for FRC. These include:

1. Building the source code for common FRC targets, including the roboRIO and Raspberry Pi.

2. Adding Java Native Interface (JNI) support to allow invoking its functionality from Java (illustrated below).

3. Gradle & Maven publishing support.
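
As a brief illustration of the JNI support mentioned above, the sketch below constructs a detector from WPILib's `edu.wpi.first.apriltag` package (assuming WPILib 2023 or later) and runs it on a grayscale OpenCV image. In real robot code you would create the detector once and reuse it every frame.

```java
import edu.wpi.first.apriltag.AprilTagDetection;
import edu.wpi.first.apriltag.AprilTagDetector;
import org.opencv.core.Mat;

public class DetectExample {
  public static AprilTagDetection[] detectTags(Mat grayFrame) {
    // Create a detector and tell it which tag family to look for.
    AprilTagDetector detector = new AprilTagDetector();
    detector.addFamily("tag16h5");

    // Run the full detection pipeline on one grayscale frame.
    AprilTagDetection[] detections = detector.detect(grayFrame);

    detector.close(); // the detector holds native resources
    return detections;
  }
}
```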

## Processing Technique

While most FRC teams should not have to implement their own code to identify AprilTags in a camera image, it is useful to know the basics of how the underlying libraries function.

An image from a camera is simply an array of values, corresponding to the color and brightness of each pixel.
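
For example, a frame can be pulled from a USB camera with WPILib's CameraServer classes and reduced to a single-channel grayscale `Mat`, since the detector operates on brightness values only. This is a sketch; the resolution is a placeholder.

```java
import edu.wpi.first.cameraserver.CameraServer;
import edu.wpi.first.cscore.CvSink;
import edu.wpi.first.cscore.UsbCamera;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

public class FrameGrabExample {
  public static Mat grabGrayFrame() {
    UsbCamera camera = CameraServer.startAutomaticCapture();
    camera.setResolution(640, 480); // placeholder resolution

    CvSink sink = CameraServer.getVideo();
    Mat frame = new Mat(); // BGR color values, one triple per pixel
    Mat gray = new Mat();  // single-channel brightness values

    if (sink.grabFrame(frame) != 0) {
      // Drop the color channels; the detector only needs brightness.
      Imgproc.cvtColor(frame, gray, Imgproc.COLOR_BGR2GRAY);
    }
    return gray;
  }
}
```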

## Usage

### 2D Alignment

A simple strategy for using targets is to move the robot until the target is centered in the image. Assuming the field and robot are constructed such that the gamepiece, scoring location, vision target, and camera are all aligned, this strategy provides a straightforward way to automatically align the robot to the scoring position.

Using a camera, identify the centroid of the AprilTags in view. If the tag’s ID is correct, apply drivetrain commands to rotate the robot left or right until the tag is centered in the camera image.

This method does not require calibrating the camera or performing the homography step.
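
A minimal sketch of this centering strategy follows; the tag ID and proportional gain are placeholders to tune for your robot.

```java
import edu.wpi.first.apriltag.AprilTagDetection;
import edu.wpi.first.math.MathUtil;

public class AlignExample {
  static final int kTargetId = 1;  // placeholder: ID of the tag to center on
  static final double kP = 0.005;  // placeholder: rotation gain, tune on your robot

  /** Returns a rotation command in [-1, 1] that turns the robot toward the tag. */
  public static double rotationCommand(AprilTagDetection[] detections, double imageWidth) {
    for (AprilTagDetection detection : detections) {
      if (detection.getId() == kTargetId) {
        // Positive error means the tag is right of center, so turn right.
        double errorPixels = detection.getCenterX() - imageWidth / 2.0;
        return MathUtil.clamp(kP * errorPixels, -1.0, 1.0);
      }
    }
    return 0.0; // tag not visible; don't rotate
  }
}
```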

### 3D Alignment

A more advanced usage of AprilTags is to use their corner locations to help perform on-field localization.

Each image is searched for AprilTags using the algorithm described on this page. Using assumptions about how the camera’s lens distorts the 3D world onto the 2D array of pixels in the camera, an estimate of the camera’s position relative to the tag is calculated. A good camera calibration is required for the assumptions about its lens behavior to be accurate.
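
WPILib wraps this estimation step in an `AprilTagPoseEstimator`. Below is a sketch; the focal lengths and optical center (fx, fy, cx, cy) are placeholder numbers that must come from your own camera calibration.

```java
import edu.wpi.first.apriltag.AprilTagDetection;
import edu.wpi.first.apriltag.AprilTagPoseEstimator;
import edu.wpi.first.math.geometry.Transform3d;
import edu.wpi.first.math.util.Units;

public class TagPoseExample {
  public static Transform3d cameraToTag(AprilTagDetection detection) {
    // Tag size in meters, then fx, fy, cx, cy in pixels (placeholder calibration).
    var config = new AprilTagPoseEstimator.Config(
        Units.inchesToMeters(6.0), 699.4, 699.4, 320.0, 240.0);
    var estimator = new AprilTagPoseEstimator(config);

    // Transform from the camera's coordinate frame to the tag's frame.
    return estimator.estimate(detection);
  }
}
```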

The tag’s ID is also decoded from the image. Given each tag’s ID, the position of the tag on the field can be looked up.

Knowing the position of the tag on the field, and the position of the camera relative to the tag, the 3D geometry classes can be used to estimate the position of the camera on the field.

If the camera’s position on the robot is known, the robot’s position on the field can also be estimated.
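
A sketch of that transform chaining, assuming the published 2023 field layout and a robot-to-camera transform measured on your own robot:

```java
import java.io.IOException;
import java.util.Optional;
import edu.wpi.first.apriltag.AprilTagFieldLayout;
import edu.wpi.first.apriltag.AprilTagFields;
import edu.wpi.first.math.geometry.Pose3d;
import edu.wpi.first.math.geometry.Transform3d;

public class FieldPoseExample {
  public static Optional<Pose3d> robotPose(
      int tagId, Transform3d cameraToTag, Transform3d robotToCamera) throws IOException {
    AprilTagFieldLayout layout =
        AprilTagFieldLayout.loadFromResource(AprilTagFields.k2023ChargedUp.m_resourceFile);

    // Walk the transforms backward: field -> tag -> camera -> robot.
    return layout.getTagPose(tagId)
        .map(tagPose -> tagPose
            .transformBy(cameraToTag.inverse())
            .transformBy(robotToCamera.inverse()));
  }
}
```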

These estimates can be incorporated into the WPILib pose estimation classes.
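
For instance, with a `DifferentialDrivePoseEstimator` (the other pose estimator classes work the same way), a vision-derived pose is fused in with `addVisionMeasurement`. A sketch:

```java
import edu.wpi.first.math.estimator.DifferentialDrivePoseEstimator;
import edu.wpi.first.math.geometry.Pose3d;
import edu.wpi.first.wpilibj.Timer;

public class FusionExample {
  /** Feeds a vision-derived robot pose into an existing WPILib pose estimator. */
  public static void fuseVision(
      DifferentialDrivePoseEstimator estimator, Pose3d robotPose, double latencySeconds) {
    // Timestamp the measurement with when the frame was captured (not when
    // processing finished) so the filter can apply it at the right moment.
    estimator.addVisionMeasurement(
        robotPose.toPose2d(), Timer.getFPGATimestamp() - latencySeconds);
  }
}
```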

## 2D to 3D Ambiguity

The process of translating the four known corners of the target in the image (two-dimensional) into a real-world position relative to the camera (three-dimensional) is inherently ambiguous. That is to say, there are multiple real-world positions that result in the target corners ending up in the same spot in the camera image.

Humans can often use lighting or background clues to understand how objects are oriented in space. However, computers do not have this benefit. They can be tricked by similar-looking targets.

Resolving which position is “correct” can be done in a few different ways:

1. Use your odometry history from all sensors to pick the pose closest to where you expect the robot to be.

2. Reject poses which are very unlikely (ex: outside the field perimeter, or up in the air)

3. Ignore pose estimates which are very close together (and hard to differentiate); see the sketch after this list.

4. Use multiple cameras to look at the same target, such that at least one camera can generate a good pose estimate

5. Look at many targets at once, using each to generate multiple pose estimates. Discard the outlying estimates, use the ones which are tightly clustered together.
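
For example, option 3 can be implemented with an ambiguity metric: WPILib's pose estimator can return both candidate solutions along with their fit errors, and a detection whose two candidates fit almost equally well can simply be discarded. This is a sketch; the threshold is a placeholder to tune for your setup.

```java
import java.util.Optional;
import edu.wpi.first.apriltag.AprilTagDetection;
import edu.wpi.first.apriltag.AprilTagPoseEstimate;
import edu.wpi.first.apriltag.AprilTagPoseEstimator;
import edu.wpi.first.math.geometry.Transform3d;

public class AmbiguityExample {
  static final double kMaxAmbiguity = 0.2; // placeholder threshold

  /** Returns a pose only when the two candidate solutions are clearly distinguishable. */
  public static Optional<Transform3d> unambiguousPose(
      AprilTagPoseEstimator estimator, AprilTagDetection detection) {
    // Solve for both candidate camera-to-tag transforms and their fit errors.
    AprilTagPoseEstimate estimate = estimator.estimateOrthogonalIteration(detection, 50);

    // getAmbiguity() is the ratio of the two object-space errors; values near
    // 1.0 mean both candidates fit equally well and neither can be trusted.
    if (estimate.getAmbiguity() > kMaxAmbiguity) {
      return Optional.empty();
    }

    // Keep whichever candidate has the lower fit error.
    return Optional.of(
        estimate.error1 <= estimate.error2 ? estimate.pose1 : estimate.pose2);
  }
}
```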

## Adjustable Parameters

- **Decimation factor** impacts how much the image is down-sampled before processing. Increasing it will increase detection speed, at the cost of not being able to see tags which are far away.
- **Blur** applies smoothing to the input image to decrease noise, which increases speed when fitting quads to pixels, at the cost of precision. For most good cameras, this may be left at zero.
- **Threads** changes the number of parallel threads which the algorithm uses to process the image. Certain steps may be sped up by allowing multithreading. In general, you want this to be approximately equal to the number of physical cores in your CPU, minus the number of cores which will be used for other processing tasks.
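
These three knobs map onto fields of WPILib's `AprilTagDetector.Config` (a sketch with placeholder values):

```java
import edu.wpi.first.apriltag.AprilTagDetector;

public class ConfigExample {
  public static AprilTagDetector makeDetector() {
    AprilTagDetector detector = new AprilTagDetector();
    detector.addFamily("tag16h5");

    var config = new AprilTagDetector.Config();
    config.quadDecimate = 2.0f; // decimation factor: faster, but shorter detection range
    config.quadSigma = 0.0f;    // blur: leave at zero for most good cameras
    config.numThreads = 2;      // threads: roughly physical cores minus other work
    detector.setConfig(config);

    return detector;
  }
}
```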