
3D Avatar Diffusion: Microsoft Unveils Neural Network Capable of Creating 3D Human Avatars from Photographs


Microsoft has unveiled a neural network that can generate incredibly realistic 3D human avatars from images. 

The 3D Avatar Diffusion project, created by a team of Microsoft Research experts, is an artificial intelligence system that automatically produces 3D avatars that can be viewed in high resolution from any angle across a full 360 degrees. An avatar’s appearance, including hairstyle, facial expression, and clothing, can also be changed.

This technology has the potential to greatly accelerate the traditionally complicated process of 3D modelling while also providing new options for 3D artists. The system’s avatars could be used in virtual and augmented reality, and as realistic 3D character models for video games.

The 3D Avatar Diffusion neural network is based on a diffusion model, a machine learning method described as a “state-of-the-art generative technique.”


Because diffusion models are generative, they can produce new data that resembles their training data. The researchers trained the model on a collection of over 200,000 3D face models, allowing it to construct a convincing 3D avatar from a single 2D photograph. According to Microsoft, “Once the generative model is trained, one can control the avatar generation based on the latent code derived from either an input image, text prompt or random noise.”
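
For readers curious about what "diffusion" means in practice, here is a minimal sketch of the noising step that diffusion models are generally trained to reverse. It is a toy illustration on a three-number vector, not Microsoft's code: the function names, the linear noise schedule, and the 1,000-step count are assumptions chosen for the example, and the neural network and the image or text conditioning are left out entirely.

```python
import math
import random


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear noise schedule (assumed values)."""
    alpha_bar = []
    running = 1.0
    for t in range(num_steps):
        # Interpolate beta linearly; guard against division by zero for one step.
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def add_noise(x0, t, alpha_bar, rng):
    """Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * n for x, n in zip(x0, noise)]
    return xt, noise


rng = random.Random(0)          # fixed seed so the run is repeatable
alpha_bar = make_alpha_bar(1000)
x0 = [1.0, -0.5, 0.25]          # stand-in for real data such as avatar features

x_early, noise_early = add_noise(x0, 10, alpha_bar, rng)    # still close to x0
x_late, noise_late = add_noise(x0, 999, alpha_bar, rng)     # almost pure noise
```

During training, a network sees the noisy sample and learns to predict the noise that was added; generation then starts from random noise and removes it step by step, optionally guided by a latent code such as the one Microsoft describes.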

The prohibitive memory and processing costs of working in 3D make it difficult to generate the fine detail that high-quality avatars require. To overcome this problem, the developers propose a roll-out diffusion network (Rodin).

Microsoft’s 3D Avatar Diffusion project represents a significant leap in artificial intelligence and 3D modelling, providing faster and more efficient methods of creating highly complex and lifelike avatars. This technology has the potential to transform virtual and augmented reality, as well as play an important part in the coming metaverse.

Earlier, Microsoft announced VALL-E, a speech synthesis AI model that can simulate a human voice from a three-second audio sample.