Recent work reports disparate performance across intersectional racial groups in both face recognition tasks: face verification and identification. However, the definition of those racial groups has a significant impact on the underlying findings of such racial bias analyses. Previous studies define these groups based on either demographic information (e.g. African, Asian) or skin tone (e.g. lighter or darker skin). The use of such sensitive or broad group definitions has disadvantages both for bias investigation and for the design of subsequent counter-bias solutions. By contrast, this study introduces an alternative racial bias analysis methodology for face recognition based on facial phenotype attributes: the set of observable characteristics of an individual face, where a race-related facial phenotype is specific to the human face and correlated with the racial profile of the subject. We propose categorical test cases to investigate the individual influence of those attributes on bias within face recognition tasks. We compare our phenotype-based grouping methodology with previous grouping strategies and show that phenotype-based groupings uncover hidden bias without reliance upon any potentially protected attributes or ill-defined grouping strategies. Furthermore, we contribute corresponding phenotype attribute category labels for two face recognition benchmark datasets: RFW for face verification and the VGGFace2 test set for face identification.
Ambiguous Definition of Race: The historical and biological definitions of race vary and racial context is not fixed over time [1].
Privacy of Protected Attributes: Exposing demographic origin within face recognition studies may identify the representation of a particular group, leading to the potential for racial profiling and associated targeting [2].
Confined Groupings: Skin-tone or racial grouping strategies limit the scope of any study, as they fail to capture the full extent of the racial bias problem within face recognition, which must also consider multi-racial and less stereotypical members of such groups.
Racial Appearance Bias: Studies [3,4] show that individuals with a more stereotypical racial appearance suffer poorer outcomes than those with a less stereotypical appearance for their race. A better understanding of the role of phenotypic variation complements solutions for both racial and racial appearance bias.
We propose using race-related facial (phenotype) characteristics within face recognition to investigate racial bias, by categorising representative racial characteristics on the face and exploring the impact of each such phenotype attribute: skin type, eyelid type, nose shape, lip shape, hair colour and hair type.
Facial phenotype attributes and their categorisation.
We annotate the phenotype attributes for each subject of the RFW [5] and VGGFace2 [6] benchmark datasets. For both datasets, we observe that the dominant phenotype attribute categories are Skin Type 2/3, Straight Hair, Narrow Nose, Other (non-monolid) Eyes and Small Lips, which correlates with the dominant presence of Caucasian faces, as can be seen in the figure below.
The distribution of facial phenotype attributes of RFW (left) and VGGFace2 Test (right) datasets.
Face verification, also known as one-to-one verification, is the task of comparing two different facial images to estimate whether they belong to the same individual subject. We follow two pairing strategies to explore the impact of a single attribute (attribute-based) and of appearance-based facial groups (subgroup-based) on face verification performance. We also test both pairing strategies under two different training setups: Setup 1 (imbalanced training data: VGGFace2) and Setup 2 (racially balanced training data: BUPT-Balanced).
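As an illustrative sketch (not the released evaluation code; the function names and threshold value here are assumptions), one-to-one verification reduces to thresholding the cosine similarity between two face embeddings produced by a recognition model:

```python
import numpy as np

def cosine_similarity(a, b):
    # Normalise both embeddings to unit length, then take the dot product.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_subject(emb1, emb2, threshold=0.5):
    # Declare a match when the similarity exceeds a fixed threshold.
    return cosine_similarity(emb1, emb2) >= threshold
```

In practice the threshold is tuned on a validation set of labelled pairs; a different-identity pair that nevertheless exceeds the threshold counts as a false match.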
We pair each attribute category with all other attribute categories to assess cross-attribute pairing performance. We clearly show that Type 5, Type 6 and monolid eyes pairings have higher false positive matching rates than others.
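The cross-attribute grid can be enumerated as in the following sketch (the skin-type category names reflect the six-point scale used above; the pairing scheme shown is illustrative):

```python
from itertools import combinations_with_replacement

# Six skin-type categories, as in the attribute categorisation above.
skin_types = [f"Type {i}" for i in range(1, 7)]

# Pair every category with every other category (including itself),
# yielding one pair list per cell of the cross-attribute grid.
cross_pairings = list(combinations_with_replacement(skin_types, 2))
```

Each resulting category pair then gets its own set of different-identity image pairs, over which a false match rate is measured.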
Cross-attribute pairing false match rate: each cell depicts FMR on a logarithmic scale, i.e. log10(FMR), with lower (more negative) values encoding superior (lower) false match rates.
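Each heatmap cell value can be computed as in this minimal sketch, assuming the count of falsely matched different-identity pairs is available:

```python
import math

def log10_fmr(false_matches, total_negative_pairs):
    # False Match Rate: fraction of different-identity pairs that are
    # incorrectly accepted as the same subject.
    fmr = false_matches / total_negative_pairs
    # Log scale as in the heatmap; more negative values mean a lower FMR.
    return math.log10(fmr)
```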
We create various subgroups with different phenotypic attribute combinations in the dataset. For example, one such subgroup consists of subjects with skin type 3, monolid eyes, straight hair, wide nose and small lips. The main purpose of such pairing is to show the effect of a single attribute change within a group: for instance, what changes when only the skin gets darker while all other attributes remain the same? Furthermore, whilst the average accuracy of subgroups with Type {5,6} skin is 86.97%, that of subgroups with Type {1,2} skin is 92.56%, although this notably includes the effects of other attributes.
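The comparison above amounts to averaging per-subgroup accuracies; a sketch with hypothetical per-subgroup values (the real figures come from the evaluation scripts below):

```python
def mean_accuracy(accuracies):
    # Average verification accuracy (%) over a set of subgroups.
    return sum(accuracies) / len(accuracies)

# Hypothetical per-subgroup accuracies, for illustration only.
type_1_2_subgroups = [93.1, 92.0, 92.6]  # lighter-skin (Type {1,2}) subgroups
type_5_6_subgroups = [87.5, 86.4, 87.0]  # darker-skin (Type {5,6}) subgroups

accuracy_gap = mean_accuracy(type_1_2_subgroups) - mean_accuracy(type_5_6_subgroups)
```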
Subgroup-based face verification performance on RFW, sorted in descending order of accuracy. We create various subgroups, where each subgroup shares the same phenotypic attribute combination.
To start working with this project you will need to take the following steps:
Install Python packages using conda env create --file environment.yaml
For face verification, please download the RFW dataset; for face identification, download VGGFace2.
Download pre-trained models and annotations from here. After installation, please place model.ckpt under models/ folder and place FDA files under test_assets/ folder.
To reproduce the performance reported in the paper: first, align all face images to 112x112 pixels.
python face_alignment.py --dataset_name RFW --data_dir datasets/test/data/African/ --output_dir datasets/test_aligned/African --landmark_file datasets/test/txts/African/African_lmk.txt
python face_alignment.py --dataset_name VGGFace2 --data_dir datasets/VGGFace2/ --output_dir datasets/test_aligned/VGGFace2_aligned --landmark_file datasets/VGGFace2/bb_landmark/loose_bb_test.csv
python face_atribute_verification.py --data_dir datasets/test_aligned/ --model_dir models/setup1_model/model --pair_file test_assets/AttributePairs/setup1/skintype_type1_6000.csv --batch_size 32
python face_cross_atribute_verification.py --input_predictions test_assets/AttributeCrossPairs/skintype_type2.csv --dist_name 'vgg_dist' --output_path test_assets/AttributeCrossPairs
The distribution of race-relevant phenotype attributes of the RFW and VGGFace2 test datasets.
If you are making use of this work in any way (including our pre-trained models or datasets), please reference the following article in any report, publication, presentation, software release or any other associated materials:
@InProceedings{yucermeasuring,
author = {Yucer, S. and Tektas, F. and Al Moubayed, N. and Breckon, T.P.},
title = {Measuring Hidden Bias within Face Recognition via Racial Phenotypes},
booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2022},
publisher = {IEEE},
arxiv = {http://arxiv.org/abs/2110.09839},
}