3D seems to be everywhere. Three-dimensional printers promise to replicate everything from cars to body parts, while virtual reality (VR) relies on 3D content entirely. Scanning objects in 3D, or capturing 3D pictures and video for VR or augmented reality, normally requires a special 3D camera. Such cameras are being developed, but deep learning algorithms may offer another way.
Computer scientists at the University of Nottingham and Kingston University have solved a complex problem that has, until now, defeated experts in vision and graphics research. They have developed technology capable of producing 3D facial reconstruction from a single 2D image – the 3D selfie.
Their new web app allows people to upload a single colour image and receive, in a few seconds, a 3D model showing the shape of their face.
People are queuing up to try it, and so far more than 400,000 users have had a go. You can do it yourself by taking a selfie and uploading it to their website.
The research – ‘Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression’ – was led by PhD student Aaron Jackson and carried out with fellow PhD student Adrian Bulat, both based in the Computer Vision Laboratory in the School of Computer Science. Both students are supervised by Georgios (Yorgos) Tzimiropoulos, Assistant Professor in the School of Computer Science.
The work was done in collaboration with Dr Vasileios Argyriou from the School of Computer Science and Mathematics at Kingston University.
The results will be presented at the International Conference on Computer Vision (ICCV) 2017 in Venice next month.
Technology at a very early stage
The technique is far from perfect, but this is the breakthrough computer scientists have been looking for.
It has been developed using a Convolutional Neural Network (CNN) – a type of artificial neural network used in machine learning, the area of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed.
The research team, supervised by Dr Yorgos Tzimiropoulos, trained a CNN on a huge dataset of 2D pictures and 3D facial models.
With all this information their CNN is able to reconstruct 3D facial geometry from a single 2D image. It can also take a good guess at the non-visible parts of the face.
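As a rough illustration of what regressing a 3D shape directly can look like, the sketch below stores a shape as a grid of voxels (small cubes), each marked as inside or outside the surface, and scores a predicted grid against it. It is not the team's implementation: the ellipsoid standing in for a face, the grid size, the cross-entropy score and every function name are assumptions made purely for illustration.

```python
import math

def make_target_volume(size=16):
    """Voxelise a stand-in shape (an ellipsoid, NOT a real face) into a 0/1 grid."""
    c = (size - 1) / 2.0
    rx, ry, rz = size * 0.40, size * 0.45, size * 0.30  # hypothetical radii
    vol = [[[0] * size for _ in range(size)] for _ in range(size)]
    for z in range(size):
        for y in range(size):
            for x in range(size):
                d = ((x - c) / rx) ** 2 + ((y - c) / ry) ** 2 + ((z - c) / rz) ** 2
                vol[z][y][x] = 1 if d <= 1.0 else 0
    return vol

def voxel_cross_entropy(pred, target, eps=1e-7):
    """Mean per-voxel binary cross-entropy between probabilities and a 0/1 grid."""
    total, count = 0.0, 0
    for pred_plane, target_plane in zip(pred, target):
        for pred_row, target_row in zip(pred_plane, target_plane):
            for p, t in zip(pred_row, target_row):
                p = min(max(p, eps), 1.0 - eps)  # keep log() finite
                total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
                count += 1
    return total / count

def threshold(pred, level=0.5):
    """Turn per-voxel probabilities into a hard inside/outside grid."""
    return [[[1 if p >= level else 0 for p in row] for row in plane] for plane in pred]

def fill_like(target, inside, outside):
    """Build a fake 'network output': one probability for inside, one for outside."""
    return [[[inside if t else outside for t in row] for row in plane] for plane in target]

target = make_target_volume()
good = fill_like(target, 0.9, 0.1)  # confident and correct everywhere
bad = fill_like(target, 0.5, 0.5)   # completely undecided everywhere

loss_good = voxel_cross_entropy(good, target)
loss_bad = voxel_cross_entropy(bad, target)
print("loss for confident prediction:", round(loss_good, 4))
print("loss for undecided prediction:", round(loss_bad, 4))
print("thresholded prediction matches target:", threshold(good) == target)
```

In a real system the probabilities would come from the trained network rather than from `fill_like`, and training would adjust the network to lower a score of this kind across many pairs of 2D images and 3D models.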
Simple idea, complex problem
Dr Tzimiropoulos said:
The main novelty is in the simplicity of our approach which bypasses the complex pipelines typically used by other techniques. We instead came up with the idea of training a big neural network on 80,000 faces to directly learn to output the 3D facial geometry from a single 2D image.
This is a problem of extraordinary difficulty. Current systems require multiple facial images and face several challenges, such as dense correspondences across large facial poses, expressions and non-uniform illumination.
Aaron Jackson said:
Our CNN uses just a single 2D facial image, and works for arbitrary facial poses (e.g. front or profile images) and facial expressions (e.g. smiling).
Adrian Bulat said:
The method can be used to reconstruct the whole 3D facial geometry including the non-visible parts of the face.
Their technique demonstrates some of the advances possible through deep learning, a form of machine learning that uses artificial neural networks to mimic the way the brain makes connections between pieces of information.
Dr Vasileios Argyriou, from Kingston University’s Faculty of Science, Engineering and Computing, said:
What’s really impressive about this technique is how it has made the process of creating a 3D facial model so simple.
What applications could be developed from deep learning algorithms?
Aside from the more standard applications, such as face and emotion recognition, this technology could be used to personalise computer games, improve augmented reality, and let people try on online accessories such as glasses.
It could also have medical applications – such as simulating the results of plastic surgery or helping to understand medical conditions such as autism and depression.
Aaron Jackson’s PhD is funded by the University of Nottingham. His research focuses on deep learning applied to the human face, including 3D reconstruction and segmentation of the face and body.
Adrian Bulat is a PhD student in the Computer Vision Lab. His main research interests are in the area of face analysis, human pose estimation and neural network quantization/binarization.