How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)

Adrian Bulat and Georgios Tzimiropoulos

Abstract

This paper investigates how far a very deep neural network is from attaining close to saturating performance on existing 2D and 3D face alignment datasets. To this end, we make the following five contributions: (a) We construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset, and finally evaluate it on all other 2D facial landmark datasets. (b) We create a network, guided by 2D landmarks, which converts 2D landmark annotations to 3D and unifies all existing datasets, leading to the creation of LS3D-W, the largest and most challenging 3D facial landmark dataset to date (~230,000 images). (c) Following that, we train a neural network for 3D face alignment and evaluate it on the newly introduced LS3D-W. (d) We further look into the effect of all "traditional" factors affecting face alignment performance, such as large pose, initialization and resolution, and introduce a "new" one, namely the size of the network. (e) We show that both 2D and 3D face alignment networks achieve remarkable accuracy, which is probably close to saturating the datasets used.

Paper and code

Paper: [arxiv] [pdf]

Torch7 Code: [github]

Pytorch Code: [github]

Download models for 2D and 3D face alignment:

2D-FAN

3D-FAN

2D-to-3D-FAN

3D-FAN-depth

Dataset

LS3D-W is a large-scale 3D face alignment dataset constructed by annotating the images from AFLW[2], 300VW[3], 300W[4] and FDDB[5] in a consistent manner with 68 points using the automatic method described in [1].

To gain access to the dataset please enter your email address in the form located at the bottom of this page. You will shortly receive an email at the specified address containing the download link.

If you encounter any issue please contact us at adrian [at] adrianbulat [dot] com. The old address adrian.bulat@nottingham.ac.uk is no longer monitored.

Update: The entire LS3D-W dataset has now been released. In addition, we have made the pretrained 2D-to-3D-FAN model available to allow conversion of existing 2D points to 3D (the 2D points must be annotated consistently with the training set used, i.e. with the same 68-point markup).
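As a rough illustration of the consistency requirement above, the sketch below checks that a set of 2D points has the 68-point layout before it is paired with per-point depth values. It is only a sketch under stated assumptions: the function names `check_2d_landmarks` and `attach_depth` are hypothetical and not part of the released code, the depth values are placeholders supplied by the caller rather than model output, and a point count check cannot confirm that the points follow the same semantic ordering as the training annotations.

```python
from typing import List, Sequence, Tuple

# LS3D-W images are annotated with 68 points (stated on this page).
NUM_LANDMARKS = 68

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def check_2d_landmarks(points: Sequence[Sequence[float]]) -> List[Point2D]:
    """Return the points as (x, y) tuples, or raise ValueError on a layout mismatch.

    Hypothetical helper: it verifies only the number of points and their
    dimensionality, not the semantic order of the landmarks.
    """
    if len(points) != NUM_LANDMARKS:
        raise ValueError(
            f"expected {NUM_LANDMARKS} landmarks, got {len(points)}"
        )
    cleaned: List[Point2D] = []
    for index, point in enumerate(points):
        if len(point) != 2:
            raise ValueError(f"landmark {index} is not a 2D point: {point!r}")
        cleaned.append((float(point[0]), float(point[1])))
    return cleaned


def attach_depth(
    points_2d: Sequence[Sequence[float]], depths: Sequence[float]
) -> List[Point3D]:
    """Pair validated 2D points with one depth value each, giving (x, y, z).

    The depths are assumed to come from elsewhere (for example a depth
    model); this function performs no prediction of its own.
    """
    validated = check_2d_landmarks(points_2d)
    if len(depths) != NUM_LANDMARKS:
        raise ValueError(
            f"expected {NUM_LANDMARKS} depth values, got {len(depths)}"
        )
    return [(x, y, float(z)) for (x, y), z in zip(validated, depths)]
```

A caller would pass its own 68 annotated points and 68 depth values; any other count raises `ValueError` instead of silently producing misaligned 3D points.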

Publication

@inproceedings{bulat2017far,
  title={How far are we from solving the 2D \& 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)},
  author={Bulat, Adrian and Tzimiropoulos, Georgios},
  booktitle={International Conference on Computer Vision},
  year={2017}
}

References:
[1] A. Bulat and G. Tzimiropoulos. How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks). arXiv, 2017.
[2] M. Kostinger, P. Wohlhart, P. M. Roth, and H. Bischof. Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization. In ICCVW, 2011.
[3] J. Shen, S. Zafeiriou, G. G. Chrysos, J. Kossaifi, G. Tzimiropoulos, and M. Pantic. The first facial landmark tracking in-the-wild challenge: Benchmark and results. In ICCVW, 2015.
[4] C. Sagonas, G. Tzimiropoulos, S. Zafeiriou, and M. Pantic. 300 faces in-the-wild challenge: The first facial landmark localization challenge. In ICCVW, 2013.
[5] V. Jain and E. Learned-Miller. FDDB: A Benchmark for Face Detection in Unconstrained Settings. UMass Amherst Technical Report, 2010.