The methodology to morph one face into another involves two main operations: warping the first image's shape toward the second image's shape and then cross-dissolving the colors. If we were to simply cross-dissolve the colors, we would get ghosting artifacts, such as the first face's ear slowly fading away. To avoid this, we transform the shape first. Doing so requires a way to define shape, which we achieve by defining corresponding key points on each face.
For the two images, I define a distinct set of key points such that each point has a corresponding point in the other photo. For example, the bottom red key point represents the bottom of the chin in both faces.
To find the midway face, we first morph the shape and then cross-dissolve the color. Finding the midway shape is trivial: the key points of the midway face are the averages of the corresponding key points in both photos. But once we have the midway shape, how do we fill in the color so that we even have colors to cross-dissolve?
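As a minimal sketch of the shape-averaging step (the point coordinates below are made up for illustration):

```python
import numpy as np

# Hypothetical (x, y) key points; row i of each array corresponds
# to the same facial feature in both photos.
pts_a = np.array([[120.0, 310.0], [240.0, 305.0], [180.0, 420.0]])
pts_b = np.array([[130.0, 300.0], [250.0, 310.0], [185.0, 430.0]])

# The midway shape is just the element-wise average of corresponding points.
midway_shape = (pts_a + pts_b) / 2.0
```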
We solve this problem through a series of operations. The first step is to create a triangulation of our key points. Using this triangulation, we can calculate an affine transformation for each triangle, and with those transformations we can fill in the color of the midway face. One important detail: to fill in color, we actually use an inverse warp to look up the color of each pixel in the midway face. This lets us use interpolation, rather than sampling, to fill in the color.
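A condensed sketch of the per-triangle inverse warp, using SciPy's Delaunay triangulation. Function names are my own, and for brevity this version does a nearest-neighbour lookup where the actual morphs use interpolation:

```python
import numpy as np
from scipy.spatial import Delaunay

def affine_matrix(src_tri, dst_tri):
    """3x3 affine A with A @ [x, y, 1]^T mapping src vertices to dst vertices."""
    src_h = np.hstack([src_tri, np.ones((3, 1))])
    dst_h = np.hstack([dst_tri, np.ones((3, 1))])
    return np.linalg.solve(src_h, dst_h).T

def inverse_warp(img, src_pts, dst_pts, out_shape):
    """Warp img so that src_pts land on dst_pts, one triangle at a time."""
    tri = Delaunay(dst_pts)                      # triangulate the target shape
    H, W = out_shape
    ys, xs = np.mgrid[0:H, 0:W]
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    simplex = tri.find_simplex(coords)           # which triangle owns each pixel
    out = np.zeros((H, W) + img.shape[2:], dtype=img.dtype)
    for t in range(len(tri.simplices)):
        mask = simplex == t
        if not mask.any():
            continue
        # Inverse map: destination triangle -> source triangle.
        inv = affine_matrix(dst_pts[tri.simplices[t]], src_pts[tri.simplices[t]])
        pix = coords[mask]
        src_xy = np.hstack([pix, np.ones((len(pix), 1))]) @ inv.T
        # Nearest-neighbour lookup keeps this sketch short; the write-up
        # uses interpolation here instead.
        sx = np.clip(np.round(src_xy[:, 0]).astype(int), 0, img.shape[1] - 1)
        sy = np.clip(np.round(src_xy[:, 1]).astype(int), 0, img.shape[0] - 1)
        out[pix[:, 1].astype(int), pix[:, 0].astype(int)] = img[sy, sx]
    return out
```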
Pictured above are the triangulations of the key points from George Clooney, Norman Karr, and the average shape of the two faces.
**Note: the actual meshes used for morphs also include key points at the 4 corners of the image to capture the backgrounds**
Norman morphed into the average shape
Clooney morphed into the average shape
Cross-dissolve of the two averaged faces (The Midway Face)
If we can calculate the midway face, we can calculate the 1/4- or 3/4-way faces using the same method with different weights for the averages. Doing this for many weights gives us the in-between faces, from which we can create an elegant morph video.
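Each in-between frame can be produced with a single weight `t`. A sketch, assuming `warp(img, src_pts, dst_pts)` is an inverse-warp routine like the one described earlier (the signature is hypothetical):

```python
import numpy as np

def morph_frame(img_a, img_b, pts_a, pts_b, t, warp):
    """Frame at fraction t in [0, 1]: warp both faces to the weighted
    average shape, then cross-dissolve with the same weight."""
    shape_t = (1.0 - t) * pts_a + t * pts_b        # weighted average shape
    warped_a = warp(img_a, pts_a, shape_t)
    warped_b = warp(img_b, pts_b, shape_t)
    return (1.0 - t) * warped_a + t * warped_b     # weighted cross-dissolve

# A morph video is just many frames with t swept from 0 to 1.
ts = np.linspace(0.0, 1.0, 45)
```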
The morph sequence is pretty good, but there are some obvious shortcomings. The first is that the teeth of my smile seem to ghost in. Second, there is no space between George Clooney's hair and the image boundary, so as this part morphs, the background inherits black pixels. The result is an obvious remnant of Clooney's hair being ghosted away in the background.
The ability to transform a face into a different shape is a remarkably powerful tool with many applications. The next idea we experiment with is using it to create a mean face from a dataset of faces. From a corpus of faces and key points, I calculated an average shape from all the key points and then transformed each face into that average shape. Finally, I averaged together every transformed photo to create a single average face.
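The mean-face computation can be sketched as follows, again assuming a `warp(img, src_pts, dst_pts)` routine like the earlier inverse warp (hypothetical signature):

```python
import numpy as np

def mean_face(images, keypoint_sets, warp):
    """Warp every image to the average shape, then average the pixels."""
    mean_shape = np.mean(keypoint_sets, axis=0)    # average all key points
    warped = [warp(img, pts, mean_shape)
              for img, pts in zip(images, keypoint_sets)]
    return np.mean(warped, axis=0), mean_shape
```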
In my case, I used The IMM Face Database. I created a total average face as well as some additional subpopulation averages.
**Note that the total average faces look predominantly male, likely because this database contains 33 males and 7 females**
Pictured below is a miscellaneous collection of faces morphed into the average shape, along with the average face morphed into other faces.
Norman warped onto average shape
Average face morphed onto Norman's face
Average face with first male's shape
Although some of these morphs definitely lean toward creepy, there are some good things happening. For example, most faces morphed to the average shape become more uniformly round and symmetrical. The same can be said for my face, except that my dimples make the distortion look a little too apparent.
Once we have an average shape, we can also create caricatures of individuals. Caricatures are representations of people in which their distinct features become more pronounced, e.g., enlarging Obama's ears. We can perform a similar operation with our key-point representations of faces. We subtract the mean face's key points from our own to get a vector representing how our face deviates from the mean. We then add a scaled version of this vector back to our face to make our unique features more pronounced.
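As a sketch, the caricature shape is just an extrapolation of the key points away from the mean (the function name and `alpha` parameter are my own):

```python
import numpy as np

def caricature_shape(face_pts, mean_pts, alpha=0.5):
    """Exaggerate how a face's key points deviate from the mean shape.

    alpha = 0 returns the original face; larger alpha exaggerates more.
    """
    deviation = face_pts - mean_pts          # how the face differs from the mean
    return face_pts + alpha * deviation      # push features further from the mean
```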
**Note: I defined my own new set of key points because the key points defined in the Danes' dataset were not sufficient to capture hair properly.**
This was a simple addition for bells and whistles. I put together a morph of the BAIR Computer Vision faculty by following the same process as before, but for multiple faces. An extension would be to include the entire BAIR faculty, but that would involve manually selecting many key points. I'll take suggestions for a background song/sound to add to the video.
Since this process requires only key points and images, I thought it would be interesting to apply it to objects other than faces. I decided to apply it to objects with a hilt. I selected key points corresponding to hilts and blades so that the morphs would sync up based on the different pieces of the objects.
The Danes dataset did not have enough faces to produce an interesting set of principal components from PCA. So to experiment with PCA, I used "The Extended Yale Face Database B", a dataset with 16128 images of 28 different faces under various lightings and poses. I filtered out images with very dark lighting or very low contrast but otherwise kept all images for PCA.
Pictured below are the first 16 eigenvectors of the dataset.
The idea of PCA is to calculate a specific set of vectors that capture the most information about the dataset. Thus, we should be able to compress each face into the eigenvector basis produced by PCA and then reconstruct it.
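A sketch of that compress/reconstruct round trip via SVD. Random data stands in for the face matrix here, and 16 components are kept to match the eigenfaces above:

```python
import numpy as np

# Random matrix stands in for the face dataset: each row is a flattened image.
rng = np.random.default_rng(0)
faces = rng.normal(size=(40, 64))               # 40 "faces", 64 "pixels" each

mean = faces.mean(axis=0)
centered = faces - mean

# SVD of the centered data: rows of Vt are the principal components
# (the eigenfaces), ordered by how much variance they capture.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = Vt[:16]                            # keep the first 16

def compress(face):
    return (face - mean) @ eigenfaces.T         # 16 coefficients per face

def reconstruct(coeffs):
    return mean + coeffs @ eigenfaces           # back to pixel space

approx = reconstruct(compress(faces[0]))
```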
(Image was present in dataset)
Reconstructed image from compressed vector
(Image not present in dataset)
Reconstructed image from compressed vector
The results shown below use the 16 previously calculated eigenvectors. The first pair of images is for an image that was present in the dataset used for PCA. The second pair is an attempt to compress an out-of-distribution image of George Clooney. The reconstruction of the in-distribution image is not perfect, but the 16 eigenvectors were enough to recapture the original expression and general face structure. Unfortunately, the reconstruction of George Clooney is pretty underwhelming. I suspect this is because the image is out of distribution, the image may not be aligned as well with the data, or 16 eigenvectors were simply not enough to represent George Clooney.