The utility of synthetic images for face modelling and its applications

Date

2013

Authors

Asthana, Akshay

Abstract

In recent years, approaches based on deformable face models have been a very active area of research. The Active Appearance Model (AAM) has been by far the most popular approach for generating face models and has been used in several applications based on facial image analysis. This thesis investigates in detail the utility of synthetically generated facial images for face modelling and its applications, such as facial performance transfer and pose-invariant face recognition, with a particular focus on the AAM. Beginning with a detailed overview of the AAM framework, an extensive application-oriented review is presented. This includes a comparative study of various existing 2D variants of AAM fitting methods for the tasks of automatic facial expression recognition (FER) and auditory-visual automatic speech recognition (AV-ASR). For FER, the experiments were performed under both person-dependent and person-independent scenarios. In contrast, for AV-ASR, the experiments were performed under the person-dependent scenario, where the main focus was on accurately capturing the lip movements. Overall, the best results were obtained by using the Iterative Error Bound Minimisation method, which consistently resulted in accurate face model alignment even when the initial face detection used to initialise the fitting procedure was poor. Furthermore, to demonstrate the utility of the existing AAM framework, a novel approach of learning the mapping between the parameters of two completely independent AAMs is presented to facilitate realistic facial performance transfer from one subject to another, a problem of particular interest to the computer graphics community. The main advantage of modelling this parametric correspondence is that it allows a meaningful transfer of both the non-rigid shape and the texture across faces, irrespective of the speaker's gender, the shape and size of the faces, and the illumination conditions.
Although this application-oriented review shows the potential benefits of the AAM framework, its usability is limited by the requirement of a pseudo-dense annotation of landmark points for every training image, which typically has to be produced in a tedious and error-prone manual process. Therefore, a method for automatic annotation of face images with arbitrary expressions and poses, and for automatic model building, is presented. Firstly, an approach is proposed that utilises an MPEG-4 based facial animation system to generate a set of synthetic frontal face images, with different facial expressions, from a single annotated frontal face image. Secondly, a regression-based approach is presented for automatic annotation of non-frontal face images with arbitrary expressions that uses only the annotated frontal face images. This approach employs the idea of initially learning the mapping between the landmark points of frontal face images and the corresponding non-frontal face images at various poses. Using this learnt mapping, synthetic images of unseen faces at various poses are generated by predicting the new landmark locations and warping the texture from the frontal image. These synthetic face images are used for generating a synthetic deformable face model that is used to perform fitting on unseen face images and, hence, annotate them automatically. This drastically reduces the problem of automatic face annotation and deformable model building to a problem of annotating a single frontal face image. The generalisability of the proposed approach is demonstrated by automatically annotating the face images from five publicly available databases, and the results are verified by comparing them to the ground truth obtained from manual annotations. Furthermore, a fully automatic pose-invariant face recognition system is presented that can handle continuous pose variations, is not database specific, and can achieve high accuracy without any manual intervention.
The main idea is to explore the problem of pose normalising each gallery and probe image (i.e. synthesising a frontal view of each face image) before performing the face recognition task. Firstly, to achieve full automation, a robust and fully automatic view-based AAM system is presented for locating the facial landmark points and estimating the 3D head pose from a single unseen 2D face image. Secondly, novel 2D and 3D pose normalisation methods are proposed that leverage the accurate 2D facial feature points and head pose information extracted by the view-based AAM system. The current pose-invariant face recognition system can handle pose variation up to ±45° in yaw angle and ±30° in pitch angle. Extensive face recognition experiments were conducted on five publicly available databases. The results clearly show excellent generalisability of the proposed system, which achieves high accuracy on all five databases and convincingly outperforms other automatic methods, with the proposed 3D pose normalisation method outperforming the proposed 2D pose normalisation method.
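
The regression-based annotation step described in the abstract, learning a mapping from frontal landmark points to the corresponding landmark points at a given pose, can be illustrated with a minimal sketch. The abstract does not state which regression model the thesis uses, so the ridge-regularised linear regression below is an assumption made purely for illustration, as are the function names `fit_landmark_mapping` and `predict_landmarks` and the `ridge` parameter. Texture warping and AAM fitting are not shown.

```python
import numpy as np


def fit_landmark_mapping(frontal, posed, ridge=1e-3):
    """Fit a linear map from frontal landmarks to landmarks at one pose.

    frontal, posed: arrays of shape (n_faces, n_points, 2).
    Ridge regression is an illustrative assumption, not the thesis's
    stated method.
    """
    n = frontal.shape[0]
    # Flatten each face to one row and append a bias column.
    X = np.hstack([frontal.reshape(n, -1), np.ones((n, 1))])
    Y = posed.reshape(n, -1)
    # Regularised normal equations: (X^T X + ridge * I) W = X^T Y.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ Y)


def predict_landmarks(W, frontal_face):
    """Predict posed landmarks (n_points, 2) for one unseen frontal face."""
    x = np.append(frontal_face.reshape(-1), 1.0)
    return (x @ W).reshape(-1, 2)
```

In this sketch one mapping would be fitted per target pose; the predicted landmark locations would then serve as targets for warping the frontal texture to produce the synthetic non-frontal image.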

Type

Thesis (PhD)

Access Statement

Open Access
