3D Avatar Diffusion: Microsoft Unveils Neural Network Capable of Creating 3D Human Avatars from Photographs


Microsoft has unveiled a neural network that can generate highly realistic 3D human avatars from photographs.

The 3D Avatar Diffusion project, created by a team of Microsoft Research experts, is an artificial intelligence system that automatically produces 3D avatars that can be viewed in high resolution from any angle across a full 360 degrees. The avatars’ attributes, such as hairstyle, facial expression, and clothing, can also be edited.

This technology has the potential to greatly accelerate the traditionally slow and complicated process of 3D modelling while also opening new options for 3D artists. The system’s avatars may be used in virtual and augmented reality, or to build realistic 3D character models for video games.

The 3D Avatar Diffusion neural network is based on a diffusion model, a state-of-the-art generative machine learning technique.
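A diffusion model is trained by gradually corrupting training data with Gaussian noise and teaching a network to reverse that corruption. The forward (noising) step can be sketched in a few lines; the variance schedule, step count, and sample shape below are illustrative choices, not Microsoft's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative linear variance schedule over T steps (values are arbitrary).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_cumprod = np.cumprod(1.0 - betas)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0): blend the clean sample with Gaussian noise."""
    noise = rng.standard_normal(x0.shape)
    a = alphas_cumprod[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

x0 = rng.standard_normal(64)   # stand-in for a clean training sample
x_early = q_sample(x0, 10)     # early step: still mostly signal
x_late = q_sample(x0, T - 1)   # final step: almost pure noise
```

At generation time, the trained network runs this process in reverse, starting from pure noise and denoising step by step into a new sample.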

Microsoft 3D Avatar Diffusion

As generative models, diffusion models learn to produce new data resembling their training data. The researchers trained the model on a collection of over 200,000 3D face models, allowing it to construct a convincing 3D avatar from a single 2D photograph. According to Microsoft, “Once the generative model is trained, one can control the avatar generation based on the latent code derived from either an input image, text prompt or random noise.”
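The quoted control pattern can be illustrated with a toy sketch: a latent code is derived from some source (image, text, or noise), and the denoising loop is steered by that code. Everything here is a stand-in for illustration only; the real Rodin encoder and denoiser are deep networks, and none of these function names come from Microsoft's system:

```python
import numpy as np

rng = np.random.default_rng(1)

def encode_to_latent(source, dim=32):
    """Stand-in for an encoder mapping an image or text prompt to a latent code.
    Here we just seed a generator deterministically from the input bytes."""
    seed = sum(source.encode())
    return np.random.default_rng(seed).standard_normal(dim)

def denoise_step(x, z, step_size=0.1):
    """Toy 'denoiser': nudge the noisy sample toward the conditioning latent z.
    A real diffusion model would instead predict the noise with a neural net."""
    return x + step_size * (z - x)

def generate_avatar_features(z, steps=50):
    x = rng.standard_normal(z.shape)   # start from pure random noise
    for _ in range(steps):
        x = denoise_step(x, z)         # each step is guided by the latent code
    return x

z = encode_to_latent("portrait photo of a person")  # latent from an input
features = generate_avatar_features(z)              # converges toward z
```

The point of the sketch is only the control flow: the same sampling loop produces different avatars depending on which latent code conditions it.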

The prohibitive memory and processing requirements of 3D representations make it difficult to capture the rich detail required to produce high-quality avatars. To overcome this problem, the developers propose a roll-out diffusion network (Rodin).
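A quick back-of-the-envelope calculation shows why 3D memory costs are prohibitive: a dense 3D feature grid grows cubically with resolution, while a 2D representation grows only quadratically. The resolutions and channel counts below are made up for illustration, not taken from Microsoft's system:

```python
# Illustrative memory-footprint comparison (resolution and channel count
# are invented for this example, not Rodin's actual configuration).
def mib(num_floats):
    """Convert a float32 element count to MiB."""
    return num_floats * 4 / 2**20

res, channels = 256, 32
voxel_grid = res**3 * channels       # dense 3D grid: grows with res**3
feature_plane = res**2 * channels    # single 2D plane: grows with res**2

print(f"3D grid:  {mib(voxel_grid):,.0f} MiB")     # 2,048 MiB
print(f"2D plane: {mib(feature_plane):,.0f} MiB")  # 8 MiB
```

At this toy resolution the 3D grid already needs 2 GiB for a single avatar's features, which is why diffusing in a compact 2D layout rather than a full 3D volume is so attractive.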

Microsoft’s 3D Avatar Diffusion project represents a significant leap in artificial intelligence and 3D modelling, providing faster and more efficient methods of creating highly complex and lifelike avatars. This technology has the potential to transform virtual and augmented reality, as well as play an important part in the coming metaverse.

Earlier, Microsoft announced VALL-E, a new speech synthesis AI model that can simulate a human voice from just a three-second audio sample.

Vishak is Editor-in-Chief at Code and Hack with a passion for AI and coding. He has a deep understanding of the latest trends and advancements in both fields, and he creates engaging, informative content on topics such as machine learning, natural language processing, and programming, helping his readers stay informed and engaged.
