Face alignment is the process of applying a supervised learned model to a face image to estimate the locations of a set of facial landmarks, such as eye corners and mouth corners. Face alignment is a key module in the pipeline of most facial analysis algorithms, typically applied after face detection and before subsequent feature extraction and classification. It is therefore an enabling capability with a multitude of applications, such as face recognition, expression recognition, and face de-identification. Despite continuous improvements in alignment accuracy, face alignment remains a very challenging problem, due to non-frontal face poses, low image quality, occlusion, etc. Among all these challenges, we identify pose-invariant face alignment as the one deserving substantial research effort, for a number of reasons.

Motivated by the need to address pose variation, and by the lack of prior work in handling large poses, as shown in Fig. 1, we propose a novel regression-based approach for pose-invariant face alignment, which estimates the 2D and 3D locations of facial landmarks, as well as their visibilities in the 2D image, for a face with arbitrary pose (e.g., -90° < yaw < +90°).

PIFA Introduction

Figure 1: Given a face image with an arbitrary pose, our proposed algorithm automatically estimates the 2D locations and visibilities of facial landmarks, as well as 3D landmarks. The displayed 3D landmarks are estimated for the image in the center. Green/red points indicate visible/invisible landmarks.

Proposed Method

The overall architecture of our proposed PIFA method is shown in Fig. 2. We first learn a 3D Point Distribution Model (3DPDM) from a set of labeled 3D scans, so that a set of 2D landmarks on an image can be viewed as the projection of a 3DPDM instance (i.e., 3D landmarks). For each 2D training face image, we assume that manually labeled 2D landmarks and their visibilities are available, along with the corresponding ground truth 3D landmarks and the camera projection matrix. Given the training images and 2D/3D ground truth, we train a cascaded coupled-regressor composed of two regressors at each cascade layer, which estimate the updates of the 3DPDM coefficients and the projection matrix, respectively. Finally, the visibilities of the projected 3D landmarks are automatically computed via the domain knowledge of the 3D surface normals, and incorporated into the regressor learning procedure.
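To make the 3DPDM-to-2D relationship concrete, the following is a minimal sketch of projecting a 3DPDM instance (mean shape plus a linear combination of shape bases) to 2D landmarks with a weak-perspective projection matrix. The function and variable names are illustrative, not taken from the released code, and the exact model parameterization in the paper may differ.

```python
import numpy as np

def project_3dpdm(mean_shape, bases, p, M):
    """Project a 3DPDM instance to 2D landmark locations.

    mean_shape: (3, N) mean 3D landmarks
    bases:      (K, 3, N) shape basis vectors
    p:          (K,) 3DPDM shape coefficients
    M:          (2, 4) weak-perspective camera projection matrix

    Returns a (2, N) array of projected 2D landmarks.
    """
    # Instantiate the 3D shape: mean plus coefficient-weighted bases.
    shape_3d = mean_shape + np.tensordot(p, bases, axes=1)      # (3, N)
    # Homogeneous coordinates, then apply the projection matrix.
    homo = np.vstack([shape_3d, np.ones((1, shape_3d.shape[1]))])  # (4, N)
    return M @ homo                                             # (2, N)
```

At each cascade layer, one regressor would predict an update to `p` and the other an update to `M` from local image features, and the projection above would be re-evaluated to obtain the refined 2D landmarks.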

PIFA Overview

Figure 2: Overall architecture of our proposed PIFA method, with three main modules (3D modeling, cascaded coupled-regressor learning, and 3D surface-enabled visibility estimation). Green/red arrows indicate surface normals pointing toward/away from the camera.
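The surface-normal-based visibility check illustrated in Fig. 2 can be sketched as a simple sign test: a landmark is deemed visible when its 3D surface normal points toward the camera. This sketch assumes a weak-perspective projection matrix whose viewing axis is the cross product of its two rotation rows; the sign convention and any refinements in the actual implementation may differ.

```python
import numpy as np

def landmark_visibility(normals, M):
    """Estimate landmark visibilities from 3D surface normals.

    normals: (3, N) unit surface normals at the 3D landmarks
    M:       (2, 4) weak-perspective projection matrix

    Returns a boolean array of length N: True for visible landmarks.
    """
    # Camera viewing axis from the two projection rows (assumed convention).
    axis = np.cross(M[0, :3], M[1, :3])
    axis /= np.linalg.norm(axis)
    # Visible if the normal points toward the camera (positive dot product).
    return (axis @ normals) > 0
```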

Qualitative results

As shown in Fig. 3, despite the large pose range of -90° < yaw < +90°, our algorithm does a good job of aligning the landmarks and correctly predicts the landmark visibilities. These results are especially impressive considering that the same mean shape (2D landmarks) is used as the initialization for all testing images, which therefore exhibits very large deformations with respect to the final landmark estimates.

AFLW results

Figure 3: Testing results on the AFLW database. As shown in the top row, we initialize face alignment by placing a 2D mean shape in the given bounding box of each image. Note the disparity between the initial landmarks and the final estimated ones, as well as the diversity in pose, illumination and resolution among the images. Green/red points indicate visible/invisible estimated landmarks.
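The initialization step described in the caption, i.e., placing a 2D mean shape in a given face bounding box, can be sketched as below. The helper name and the assumption that the mean shape is normalized to the unit square are illustrative, not taken from the released code.

```python
import numpy as np

def init_from_bbox(mean_shape_2d, bbox):
    """Place a normalized 2D mean shape inside a face bounding box.

    mean_shape_2d: (2, N) mean landmarks, normalized to [0, 1] x [0, 1]
    bbox:          (x, y, w, h) face bounding box

    Returns the (2, N) initial landmark locations in image coordinates.
    """
    x, y, w, h = bbox
    init = np.empty_like(mean_shape_2d, dtype=float)
    init[0] = x + mean_shape_2d[0] * w   # scale/shift x-coordinates
    init[1] = y + mean_shape_2d[1] * h   # scale/shift y-coordinates
    return init
```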

PIFA Source Code

The PIFA implementation may be downloaded from here. The portion of the AFLW database used for training and testing can be found here.

If you use the PIFA code, please cite the following papers:

Publications

  • Pose-Invariant Face Alignment via CNN-based Dense 3D Model Fitting
    Amin Jourabloo, Xiaoming Liu
    International Journal of Computer Vision, Apr. 2017 (in press)
    Bibtex | PDF
  • @article{ pose-invariant-face-alignment-via-cnn-based-dense-3d-model-fitting,
      author = { Amin Jourabloo and Xiaoming Liu },
      title = { Pose-Invariant Face Alignment via CNN-based Dense 3D Model Fitting },
      journal = { International Journal of Computer Vision },
      month = { April },
      year = { 2017 },
    }
  • Large-pose Face Alignment via CNN-based Dense 3D Model Fitting
    Amin Jourabloo, Xiaoming Liu
    Proc. IEEE Computer Vision and Pattern Recognition (CVPR 2016), Las Vegas, NV, Jun. 2016
    Bibtex | PDF | Poster
  • @inproceedings{ large-pose-face-alignment-via-cnn-based-dense-3d-model-fitting,
      author = { Amin Jourabloo and Xiaoming Liu },
      title = { Large-pose Face Alignment via CNN-based Dense 3D Model Fitting },
      booktitle = { Proc. IEEE Computer Vision and Pattern Recognition },
      address = { Las Vegas, NV },
      month = { June },
      year = { 2016 },
    }
  • Pose-Invariant 3D Face Alignment
    Amin Jourabloo, Xiaoming Liu
    Proc. International Conference on Computer Vision (ICCV 2015), Santiago, Chile, Dec. 2015
    Bibtex | PDF | Poster
  • @inproceedings{ pose-invariant-3d-face-alignment,
      author = { Amin Jourabloo and Xiaoming Liu },
      title = { Pose-Invariant 3D Face Alignment },
      booktitle = { Proc. International Conference on Computer Vision },
      address = { Santiago, Chile },
      month = { December },
      year = { 2015 },
    }