AdaFace: Quality Adaptive Margin for Face Recognition

Recognition in low-quality face datasets is challenging because facial attributes are obscured and degraded. Advances in margin-based loss functions have enhanced the discriminability of faces in the embedding space. Further, previous studies have explored adaptive losses that assign more importance to misclassified (hard) examples. In this work, we introduce another aspect of adaptiveness in the loss function, namely the image quality. We argue that the strategy to emphasize misclassified samples should be adjusted according to their image quality. Specifically, the relative importance of easy or hard samples should be based on the sample's image quality. We propose a new loss function that emphasizes samples of different difficulties based on their image quality. Our method achieves this in the form of an adaptive margin function by approximating the image quality with feature norms. Extensive experiments show that our method, AdaFace, improves the face recognition performance over the state-of-the-art (SoTA) on four datasets (IJB-B, IJB-C, IJB-S and TinyFace). Code and models are released at https://github.com/mk-minchul/AdaFace

tl;dr

We show that margin functions (additive or angular) differ in how they control the emphasis placed on samples during training. We adapt the margin function so that unidentifiable images are not emphasized during training.

An illustration of the AdaFace feature space. We vary the margin function based on the feature norm.

Problem Definition

Figure 1. Examples of face images with different qualities and recognizabilities. Both high and low quality images contain variations in pose, occlusion and resolution that sometimes make the recognition task difficult, yet achievable. Depending on the degree of degradation, some images may become impossible to recognize. By studying the different impacts these images have in training, this work aims to design a novel loss function that is adaptive to a sample’s recognizability, driven by its image quality.

Comparison of Different Margin Functions

Figure 2. Illustration of different margin functions and their gradient scaling terms on the feature space. B0 and B1 show the decision boundary with and without margin m, respectively. The yellow arrow indicates the shift in the boundary due to margin m. In the arc, a well-classified sample will be close (in angle) to the ground truth class weight vector, W_{y_i}. A misclassified sample will be close to W_j, the negative class weight vector. The color within the arc indicates the magnitude of the gradient scaling term g (Eq. 12). Samples in the dark red region will contribute more to learning. Note that the additive margin shifts the boundary toward W_{y_i} without changing the gradient scaling term. The positive angular margin, however, not only shifts the boundary but also makes the gradient scale high near the boundary and low away from the boundary. This behavior de-emphasizes very hard samples, and MagFace behaves similarly. The negative angular margin, on the other hand, induces the opposite behavior. CurricularFace adapts the boundary based on the training stage. Our work adaptively changes the margin function based on the feature norm. With a high norm, we emphasize samples away from the boundary, and with a low norm we emphasize samples near the boundary. Circles and triangles in the arc show example scenarios in the rightmost plot (AdaFace).
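The gradient scaling behavior described in the Figure 2 caption can be checked numerically. The sketch below is a minimal illustration, not the released implementation. It assumes the usual forms of the additive margin, s(cos θ - m), and the angular margin, s cos(θ + m), and a hypothetical two-class setting in which W_{y_i} and W_j are 90 degrees apart. The function names, the scale s = 64 and the choice of angles are assumptions made for illustration. By the chain rule, a margin multiplies the plain softmax term (P - 1) by the derivative of the margin logit with respect to s cos θ, which is 1 for the additive margin and sin(θ + m) / sin θ for the angular margin.

```python
import math

S = 64.0  # feature scale; an assumed, commonly used value


def margin_logit(theta, m, kind, s=S):
    """Target-class logit for a sample at angle theta (radians) from W_{y_i}."""
    if kind == "additive":  # CosFace-style: s * (cos(theta) - m)
        return s * (math.cos(theta) - m)
    if kind == "angular":   # ArcFace-style: s * cos(theta + m); m may be negative
        return s * math.cos(theta + m)
    raise ValueError(f"unknown margin kind: {kind}")


def margin_factor(theta, m, kind):
    """Derivative of the margin logit w.r.t. s * cos(theta), for 0 < theta < pi."""
    if kind == "additive":
        return 1.0  # the additive margin leaves the gradient scale unchanged
    if kind == "angular":
        return math.sin(theta + m) / math.sin(theta)
    raise ValueError(f"unknown margin kind: {kind}")


def gradient_scale(theta, m, kind, class_angle=math.pi / 2, s=S):
    """|(P_true - 1) * margin_factor| in a two-class toy problem (hypothetical setup)."""
    pos = margin_logit(theta, m, kind, s)
    neg = s * math.cos(class_angle - theta)      # logit of the negative class W_j
    p_true = 1.0 / (1.0 + math.exp(neg - pos))   # two-class softmax probability
    return abs((p_true - 1.0) * margin_factor(theta, m, kind))


if __name__ == "__main__":
    for deg in (30, 45, 60, 80):
        theta = math.radians(deg)
        row = [gradient_scale(theta, m, kind)
               for m, kind in ((0.4, "additive"), (0.5, "angular"), (-0.5, "angular"))]
        print(deg, ["%.3f" % g for g in row])
```

Printing gradient_scale over a range of angles for a positive and a negative angular margin gives a rough numerical counterpart to the coloring of the arcs in Figure 2. Practical ArcFace implementations also guard against θ + m exceeding π, which this sketch omits.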

Overall Pipeline (Loss Function)

Figure 3. Conventional margin-based softmax loss vs. our AdaFace. (a) An FR training pipeline with a margin-based softmax loss. The loss function uses the margin function to induce smaller intra-class variations. Some examples are SphereFace, CosFace and ArcFace [4,20,35]. (b) Proposed adaptive margin function (AdaFace) that is adjusted based on the image quality indicator. If the image quality is indicated to be low, the loss function emphasizes easy samples (thereby avoiding unidentifiable images). Otherwise, the loss emphasizes hard samples.
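A minimal sketch of how a quality-adaptive margin of the kind shown in (b) could be computed. The specific formulas are assumptions that this page does not state: the feature norm is normalized with batch statistics and clipped to [-1, 1] to give a quality indicator z_hat, the angular margin is taken as g_angle = -m * z_hat, and the additive margin as g_add = m * z_hat + m. The defaults m = 0.4, h = 0.33 and s = 64, and all function names, are likewise illustrative. The released repository remains the authoritative reference.

```python
import math


def quality_indicator(feature_norm, batch_mean, batch_std, h=0.33):
    """Normalize a feature norm with batch statistics and clip it to [-1, 1].

    Assumes batch_std > 0. The hypothetical parameter h controls how quickly
    the indicator saturates at -1 (low quality) or +1 (high quality).
    """
    z_hat = (feature_norm - batch_mean) / (batch_std / h)
    return max(-1.0, min(1.0, z_hat))


def adaptive_margins(z_hat, m=0.4):
    """Map the quality indicator to an (angular, additive) margin pair (assumed form)."""
    g_angle = -m * z_hat     # positive angular margin for low-norm samples
    g_add = m * z_hat + m    # additive margin grows with the norm
    return g_angle, g_add


def adaptive_margin_logit(theta, feature_norm, batch_mean, batch_std,
                          m=0.4, h=0.33, s=64.0):
    """Target-class logit s * (cos(theta + g_angle) - g_add) for one sample."""
    z_hat = quality_indicator(feature_norm, batch_mean, batch_std, h)
    g_angle, g_add = adaptive_margins(z_hat, m)
    return s * (math.cos(theta + g_angle) - g_add)


if __name__ == "__main__":
    # Hypothetical batch statistics: mean norm 20, standard deviation 5.
    for norm in (5.0, 20.0, 35.0):
        z_hat = quality_indicator(norm, 20.0, 5.0)
        print(norm, z_hat, adaptive_margins(z_hat))
```

Under these assumptions, a low-norm sample (z_hat = -1) receives a purely angular margin (g_angle = m, g_add = 0), while a high-norm sample (z_hat = 1) receives a negative angular margin plus an additive shift (g_angle = -m, g_add = 2m). This is in line with the two extremes described in the Figure 2 caption.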

CVPR 2022 Oral Presentation

The source code can be downloaded from https://github.com/mk-minchul/AdaFace
