This note is a brief summary of the 3DMM paper A Morphable Model For The Synthesis Of 3D Faces.
Note: I have no idea why a “|” follows each math environment or why “\bold” has no effect. For clarity, see the original version, 3D Morphable Model Method.
The model construction process consists of two steps: computing correspondence and constructing the model. Note that these are TRAINING steps; once the model is constructed, we can apply it to new faces and scans through a matching algorithm.
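A face has two major properties: geometry, represented as a shape vector $S = (X_1, Y_1, Z_1, X_2, \ldots, Y_n, Z_n)^T \in \mathbb{R}^{3n}$ that contains the $X, Y, Z$ coordinates of its $n$ vertices; and texture, represented as a texture vector $T = (R_1, G_1, B_1, R_2, \ldots, G_n, B_n)^T \in \mathbb{R}^{3n}$ that contains the $R, G, B$ color values of the $n$ corresponding vertices. An arbitrary new shape $S_{model}$ and new texture $T_{model}$ can be expressed as linear combinations of the $m$ exemplar faces:
$$S_{model} = \sum_{i=1}^{m} a_i S_i, \qquad T_{model} = \sum_{i=1}^{m} b_i T_i, \qquad \sum_{i=1}^{m} a_i = \sum_{i=1}^{m} b_i = 1.$$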
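As a minimal sketch of this linear combination, assuming the weights on the exemplars sum to one (the function name `combine_faces`, the weight vectors `a` and `b`, and the random stand-in data are all hypothetical):

```python
import numpy as np

def combine_faces(exemplars, weights):
    """Return the weighted sum of exemplar vectors (rows of `exemplars`)."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    return weights @ exemplars

rng = np.random.default_rng(0)
m, n = 5, 4                                       # 5 exemplar faces, 4 vertices each
shapes = rng.normal(size=(m, 3 * n))              # stand-in shape vectors S_i
textures = rng.uniform(0, 255, size=(m, 3 * n))   # stand-in texture vectors T_i

a = np.full(m, 1.0 / m)                           # equal weights give the average shape
b = np.array([0.5, 0.5, 0.0, 0.0, 0.0])           # blend of the first two textures

S_model = combine_faces(shapes, a)
T_model = combine_faces(textures, b)
```

Shape and texture use separate weight vectors, so the geometry of one blend can be paired with the coloring of another.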
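The construction process can be described as a PCA procedure, i.e., the principal components (eigenvectors of the covariance matrices of shape and texture) are used to represent the model:
$$S_{model} = \bar{S} + \sum_{i=1}^{m-1} \alpha_i s_i, \qquad T_{model} = \bar{T} + \sum_{i=1}^{m-1} \beta_i t_i,$$
where $\bar{S}$ and $\bar{T}$ are the average shape and texture, and $s_i$, $t_i$ are the eigenvectors of the shape and texture covariance matrices.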
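The PCA step might look like the sketch below. It assumes the exemplar vectors are stacked as rows of a matrix and that the covariance is normalized by $m$ (normalizing by $m - 1$ is also common); `build_pca_model`, `synthesize`, and the random stand-in data are hypothetical names, not taken from the paper.

```python
import numpy as np

def build_pca_model(data):
    """PCA of exemplar vectors stored as rows of `data` (shape m x 3n).

    Returns the mean, the principal components as rows, and the standard
    deviation of the data along each component.
    """
    m = data.shape[0]
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data avoids forming the 3n x 3n covariance matrix.
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    # Centering removes one degree of freedom, so keep at most m - 1 components.
    components = components[: m - 1]
    # Assumption: covariance normalized by m, so eigenvalue = singular_value^2 / m.
    sigma = singular_values[: m - 1] / np.sqrt(m)
    return mean, components, sigma

def synthesize(mean, components, coeffs):
    """Return mean + sum_i coeffs[i] * components[i]."""
    return mean + coeffs @ components

rng = np.random.default_rng(0)
shapes = rng.normal(size=(6, 30))        # 6 stand-in faces, 10 vertices each
S_mean, s, sigma_S = build_pca_model(shapes)

# Project the first exemplar onto the model and rebuild it.
alpha = (shapes[0] - S_mean) @ s.T
S_rebuilt = synthesize(S_mean, s, alpha)
```

Because the centered exemplars span at most $m - 1$ dimensions, every exemplar is reproduced exactly by its projection; new faces are obtained by choosing other coefficients.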
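To quantify how plausible a result is as a face, the authors fit a multivariate normal distribution to the data set of 200 faces. The probability of the coefficients $\vec{\alpha}$ is then given by
$$p(\vec{\alpha}) \sim \exp\left[-\frac{1}{2} \sum_{i=1}^{m-1} \left(\frac{\alpha_i}{\sigma_i}\right)^2\right],$$
where $\sigma_i^2$ are the eigenvalues of the shape covariance matrix; the probability of $\vec{\beta}$ is computed in the same way.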
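A small sketch of how such a prior scores coefficient vectors, assuming an unnormalized Gaussian with one standard deviation per principal component (the function name `log_prior` and the three sigma values are made up for illustration):

```python
import numpy as np

def log_prior(coeffs, sigma):
    """Unnormalized log-probability: -1/2 * sum_i (coeffs_i / sigma_i)^2."""
    coeffs = np.asarray(coeffs, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return -0.5 * np.sum((coeffs / sigma) ** 2)

sigma = np.array([3.0, 2.0, 1.0])             # hypothetical standard deviations

average_face = np.zeros(3)                    # all coefficients zero
typical_face = np.array([3.0, -2.0, 1.0])     # one sigma along every component
unlikely_face = np.array([9.0, 6.0, 3.0])     # three sigma along every component

print(log_prior(average_face, sigma))
print(log_prior(typical_face, sigma))
print(log_prior(unlikely_face, sigma))
```

The average face scores highest, and the score drops quadratically with the distance from the average measured in standard deviations: -1.5 at one sigma per component, -13.5 at three.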
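To map facial attributes (gender, fullness of faces, darkness of eyebrows, double chins, hooked and concave noses) defined by a hand-labeled set of example faces to the parameter space of the morphable model, first define shape and texture vectors that manipulate a specific attribute when added to or subtracted from a face:
$$\Delta S = \sum_{i=1}^{m} \mu_i (S_i - \bar{S}), \qquad \Delta T = \sum_{i=1}^{m} \mu_i (T_i - \bar{T}),$$
where $\mu_i$ is the hand-assigned label describing how strongly face $i$ shows the attribute.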
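One way to sketch such an attribute vector, assuming it is the label-weighted sum of each face's deviation from the average face. The binary labels, the planted attribute in coordinate 0, and the name `attribute_vector` are hypothetical.

```python
import numpy as np

def attribute_vector(faces, labels):
    """Label-weighted sum of deviations from the average face."""
    labels = np.asarray(labels, dtype=float)
    return labels @ (faces - faces.mean(axis=0))

rng = np.random.default_rng(0)
m, dim = 8, 12
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])   # hypothetical hand-assigned labels
shapes = 0.1 * rng.normal(size=(m, dim))      # stand-in shape vectors
shapes[:, 0] += 1.0 * labels                  # plant the attribute in coordinate 0

delta_S = attribute_vector(shapes, labels)

# Adding a multiple of delta_S to a face strengthens the attribute;
# subtracting it weakens the attribute.
scale = 0.5
stronger = shapes[4] + scale * delta_S
weaker = shapes[4] - scale * delta_S
```

In this toy data the recovered direction is dominated by coordinate 0, the only place where labeled and unlabeled faces differ systematically.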
Matching a morphable model to an image means optimizing the coefficients of the 3D model ($\vec{\alpha}, \vec{\beta}$) along with a set of rendering parameters ($\vec{\rho}$) such that they produce an image as close as possible to the input image.
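From the parameters $(\vec{\alpha}, \vec{\beta}, \vec{\rho})$, a colored image $\mathbf{I}_{model}(x, y)$ is rendered, and its distance to the input image is measured by the sum of squared differences
$$E_I = \sum_{x,y} \left\| \mathbf{I}_{input}(x, y) - \mathbf{I}_{model}(x, y) \right\|^2.$$
The parameters are then chosen to maximize the posterior probability
$$p(\alpha, \beta, \rho \mid \mathbf{I}_{input}) \sim p(\mathbf{I}_{input} \mid \alpha, \beta, \rho) \cdot P(\alpha) \cdot P(\beta) \cdot P(\rho).$$
Here, $P(\alpha)$ and $P(\beta)$ can be estimated from the multivariate normal prior over the coefficients (Eq. (2)), and $P(\rho)$ is a normal distribution that uses the starting values for $\bar{\rho}_j$ and ad hoc values for $\sigma_{R,j}$. The likelihood is $p(\mathbf{I}_{input} \mid \alpha, \beta, \rho) \sim \exp\left(-\frac{1}{2\sigma_I^2} \cdot E_I\right)$. In [2], the reason given for this distribution is: “For Gaussian pixel noise with a standard deviation $\sigma_I$, the likelihood of observing $\mathbf{I}_{input}$, given $\alpha, \beta, \rho$, is a product of one-dimensional normal distributions, with one distribution for each pixel and each color channel.” I still cannot understand this sentence (explanations from readers are welcome). The posterior probability is then maximized by minimizing
$$E = \frac{1}{\sigma_I^2} E_I + \sum_{j} \frac{\alpha_j^2}{\sigma_{S,j}^2} + \sum_{j} \frac{\beta_j^2}{\sigma_{T,j}^2} + \sum_{j} \frac{(\rho_j - \bar{\rho}_j)^2}{\sigma_{R,j}^2},$$
where $\sigma_{S,j}$ and $\sigma_{T,j}$ are the standard deviations of the shape and texture coefficients.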
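One possible reading of the quoted sentence, as a sketch: assume that every pixel and every color channel carries independent zero-mean Gaussian noise with the same standard deviation $\sigma_I$, and that $E_I$ denotes the sum of squared color differences between the input and the model image. The channel index $c$ and the per-channel notation $I_{c}$ are introduced here only for illustration.

```latex
\begin{align*}
p(\mathbf{I}_{input} \mid \alpha, \beta, \rho)
  &= \prod_{x,y} \prod_{c \in \{r,g,b\}} \frac{1}{\sqrt{2\pi}\,\sigma_I}
     \exp\left( -\frac{\left( I_{c,input}(x,y) - I_{c,model}(x,y) \right)^2}{2\sigma_I^2} \right) \\
  &\sim \exp\left( -\frac{1}{2\sigma_I^2} \sum_{x,y} \sum_{c \in \{r,g,b\}}
     \left( I_{c,input}(x,y) - I_{c,model}(x,y) \right)^2 \right) \\
  &= \exp\left( -\frac{1}{2\sigma_I^2} E_I \right).
\end{align*}
```

The product of exponentials becomes the exponential of a sum, and the constant normalization factors are dropped. Taking $-2\log$ of the posterior then turns each Gaussian factor (likelihood and priors) into a sum of squared terms, which is why maximizing the posterior is equivalent to minimizing a cost.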
The above is the procedure for matching a 3D morphable model to images. To apply it to scans, we just need to replace $I(x, y)$ with $I(h, \phi)$.
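The trade-off between data error and prior in the matching cost can be illustrated with a toy case. The sketch below swaps in a purely linear "renderer" (the observation is the mean plus a linear combination of basis vectors), which is not the rendering used in the paper and is chosen only because minimizing $\|\text{observation} - \text{model}\|^2 / \sigma_I^2 + \sum_j \alpha_j^2 / \sigma_j^2$ then has a closed-form solution. The paper's actual fit is nonlinear and solved iteratively. All names and data are hypothetical.

```python
import numpy as np

def fit_map_linear(observation, mean, basis, sigma, sigma_noise):
    """MAP coefficients for a linear model with a Gaussian prior.

    Minimizes ||observation - (mean + basis @ alpha)||^2 / sigma_noise^2
              + sum_j alpha_j^2 / sigma_j^2.
    """
    residual = observation - mean
    # Normal equations of the regularized least-squares problem.
    lhs = basis.T @ basis / sigma_noise**2 + np.diag(1.0 / sigma**2)
    rhs = basis.T @ residual / sigma_noise**2
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(0)
dim, k = 60, 4
basis, _ = np.linalg.qr(rng.normal(size=(dim, k)))   # orthonormal columns
mean = rng.normal(size=dim)
sigma = np.array([3.0, 2.0, 1.0, 0.5])               # prior standard deviations
alpha_true = sigma * rng.normal(size=k)
observation = mean + basis @ alpha_true + 0.1 * rng.normal(size=dim)

# Trusting the data (small sigma_noise) recovers the true coefficients;
# distrusting it (large sigma_noise) pulls the estimate toward the average face.
alpha_low_noise = fit_map_linear(observation, mean, basis, sigma, sigma_noise=0.1)
alpha_high_noise = fit_map_linear(observation, mean, basis, sigma, sigma_noise=10.0)
```

The noise standard deviation acts as the knob that balances fidelity to the observation against plausibility under the prior.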
All of the processes described above are based on the assumption that all exemplar faces are in full correspondence. This section describes two algorithms for computing correspondence.
Optic flow was first proposed to estimate corresponding points in images $I(x, y)$. A gradient-based optic flow algorithm is modified so that it applies to 3D scans $I(h, \phi)$, taking into account color and radius values simultaneously [3].
Since optic flow does not incorporate any constraints on the set of solutions, it fails on some of the more unusual faces in the database. The modified bootstrapping method improves the correspondence iteratively.
The process is as follows:
1. Use optic flow to compute preliminary correspondences between the faces and a reference face.
2. Compute a morphable model based on the correspondences, and use the average face as the new reference face.
3. Match the model to the 3D scans; we now have the original scans and the approximated scans.
4. Compute the correspondences between each original scan and its approximation using optic flow.
5. Iterate the steps above.
[1] P.J. Burt and E.H. Adelson. Merging images through pattern decomposition. In Applications of Digital Image Processing VIII, number 575, pages 173–181. SPIE The International Society for Optical Engineering, 1985.
[2] V. Blanz and T. Vetter. Face recognition based on fitting a 3D morphable model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(9):1063–1074, 2003.
[3] T. Vetter and V. Blanz. Estimating coloured 3D face models from single images: An example based approach. In Burkhardt and Neumann, editors, Computer Vision – ECCV’98 Vol. II, Freiburg, Germany, 1998. Springer, Lecture Notes in Computer Science 1407.