Neural aggregation network for video face recognition
Authors
Yang, Jiaolong
Ren, Peiran
Zhang, Dongqing
Chen, Dong
Wen, Fang
Li, Hongdong
Hua, Gang
Publisher
IEEE
Abstract
This paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes as input a face video or face image set of a person, containing a variable number of face images, and produces a compact, fixed-dimension feature representation for recognition. The network is composed of two modules. The feature embedding module is a deep Convolutional Neural Network (CNN) that maps each face image to a feature vector. The aggregation module consists of two attention blocks that adaptively aggregate the feature vectors into a single feature lying inside the convex hull they span. Because of the attention mechanism, the aggregation is invariant to the image order. NAN is trained with a standard classification or verification loss without any extra supervision signal, and we found that it automatically learns to favor high-quality face images while down-weighting low-quality ones such as blurred, occluded, and improperly exposed faces. Experiments on the IJB-A, YouTube Face, and Celebrity-1000 video face recognition benchmarks show that it consistently outperforms naive aggregation methods and achieves state-of-the-art accuracy.
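The aggregation idea in the abstract can be illustrated with a minimal sketch of a single attention block. This is not the authors' implementation: the abstract does not specify the scoring function or how the two blocks are connected, so the sketch assumes dot-product scoring against a hypothetical query vector followed by a softmax, and the names `softmax`, `attention_aggregate`, `features`, and `query` are illustrative only. Because softmax weights are non-negative and sum to one, the output is a convex combination of the inputs, and because the weighted sum does not depend on input order, the result is the same for any ordering of the frames.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(features, query):
    """Aggregate a variable-length list of feature vectors into one vector.

    Assumption (not stated in the abstract): each feature is scored by its
    dot product with `query`, scores are normalized with softmax, and the
    output is the weighted sum of the features.
    """
    scores = [sum(q * x for q, x in zip(query, f)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    aggregated = [
        sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)
    ]
    return aggregated, weights

# Hypothetical 2-D features for three frames of one face video.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, -1.0]  # hypothetical learned query vector

r, a = attention_aggregate(features, query)

# Reversing the frame order changes nothing but floating-point rounding.
r_shuffled, _ = attention_aggregate(features[::-1], query)
```

In this toy setup the first frame receives the largest weight because it aligns best with the query. In a trained network the query would be learned, which is how frames of differing quality could end up with different weights.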
Source
IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Restricted until
2037-12-31