Shutterstock has launched its new generative 3D API, giving enterprises a fast and ethical way to produce realistic 3D models with AI. At GTC in March, NVIDIA and Shutterstock announced early plans for the Edify text-to-3D GenAI NIM; today at SIGGRAPH in Denver, the Shutterstock 3D service launched its first commercial offering. This Edify tool enables creators and designers to generate 3D assets from just a text or image prompt.
Dade Orgeron, VP of Innovation at Shutterstock and formerly the CCO of Shutterstock’s TurboSquid, points out it has been a long time coming. “We feel like it’s ready for us to bring it to market. And as with any generative offering, it’s a constant evolution,” he explains. “Part of that evolution is the feedback loop that we can get from users to improve and make the model better and better.” The SIGGRAPH launch is for enterprise customers first: “We’re going to be working with some of the high-touch enterprise customers during our beta phase and then reaching out to further partners as well.”
Two of those early customers are HP and WPP (see the video below). “In September, we will release a more open self-serve option, which is going to be a credit-based system, starting at a really low price so people can start to try to use it.” The initial offering will only be an API product, he adds: “It’s not something that’s going to live on the TurboSquid site or live on the Shutterstock site as an e-comm experience.” The team plans to have a plugin for Blender, plus “we’re going to have it running on a Gradio instance, and we’ll have a Jupyter Notebook.” Gradio is an open-source Python package that allows users to quickly build a demo or web application, and it is often used for machine learning models. Similarly, Jupyter Notebook is a web-based interactive platform often used to teach and explain code. Both make it easy to share a link to a demo, so there will be a variety of ways for people to try the Text-to-3D, even if they don’t know how to use an API.
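To give a sense of how that might look, here is a minimal sketch of a Gradio demo wrapping a text-to-3D service; the endpoint URL, request fields, and credentials are placeholders, not Shutterstock’s documented API.

```python
# Minimal Gradio wrapper around a (hypothetical) text-to-3D endpoint.
# The URL, payload fields, and key below are placeholder assumptions.
import gradio as gr
import requests

API_URL = "https://api.example.com/v1/text-to-3d"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"                           # placeholder credential

def generate_3d(prompt: str) -> str:
    """Send a text prompt to the service and return a path to the GLB it produces."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "format": "glb"},
        timeout=300,
    )
    resp.raise_for_status()
    out_path = "asset.glb"
    with open(out_path, "wb") as f:
        f.write(resp.content)  # assumes the response body is the GLB itself
    return out_path

# gr.Model3D renders GLB/OBJ files interactively in the browser, and
# launch(share=True) produces a shareable public link to the demo.
demo = gr.Interface(
    fn=generate_3d,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Model3D(label="Generated asset"),
)
demo.launch(share=True)
```

Running a script like this prints a public share link anyone can open to type a prompt and inspect the resulting mesh in the browser, with no API knowledge required.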
The Shutterstock 3D assets include ready-to-edit meshes, UVs, and 4K PBR materials. They can be imported directly into popular DCC tools such as Blender or Maya and used for prototyping, set dressing, or previz ideation.
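For example, a generated GLB can be pulled into Blender with a few lines of its Python API; the file name below is just a placeholder for whatever the service returned.

```python
# Sketch: importing a generated GLB into Blender via its Python API.
import bpy

# glTF/GLB import ships with Blender; PBR materials arrive as
# Principled BSDF node trees, ready to edit.
bpy.ops.import_scene.gltf(filepath="asset.glb")

# The importer leaves the new objects selected, so they are easy to
# reposition for set dressing or previz.
for obj in bpy.context.selected_objects:
    obj.location = (0.0, 0.0, 0.0)
```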
Preview images of the 3D assets can be generated in under 10 seconds each. In addition to generating 3D assets to populate a scene, NVIDIA and Shutterstock also provide the ability to generate lighting and backgrounds for those scenes via a full floating-point 360 HDRI Edify tool. The 360 HDR images provide realistic image-based lighting (IBL) at up to 16K resolution with high dynamic range.
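On the Blender side, wiring one of these generated HDRIs into the world environment for image-based lighting takes only a few lines; the sketch below assumes a downloaded file named generated_360.hdr.

```python
# Sketch: using a generated 360 HDRI for image-based lighting in Blender.
import bpy

world = bpy.context.scene.world
world.use_nodes = True
nodes = world.node_tree.nodes
links = world.node_tree.links

# Feed the HDRI into the default Background shader so it both lights
# the scene and fills the backdrop.
env = nodes.new("ShaderNodeTexEnvironment")
env.image = bpy.data.images.load("generated_360.hdr")
links.new(env.outputs["Color"], nodes["Background"].inputs["Color"])
```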
This Edify solution represents the first text-to-3D and 360 HDR generators to be trained on licensed 3D content. Building on Shutterstock’s vast catalogue, Edify 3D and 360 HDRI are a major advance in generative AI. Built on NVIDIA’s NIMs, they give developers a comprehensive solution for deploying generative AI and building applications that run on DGX Cloud.
The system is a multi-modal 3D generator model. It will specifically generate a 3D mesh with simple PBR materials. “If you want to feed it an image, you can feed it an image, or if you want to feed it a prompt, you can feed it a prompt. It’s your choice on how you want to start,” Dade explains. “It’ll accept multiple inputs.”
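In practice, that multi-modal choice could be expressed in a request helper like the sketch below, where the endpoint and field names are illustrative assumptions rather than the published API.

```python
# Illustrative only: a helper reflecting the multi-modal input described
# above -- a text prompt, an image, or both can start a generation.
import base64
import requests

def generate_asset(prompt=None, image_path=None) -> bytes:
    """Start a generation from a text prompt, an image, or both."""
    payload = {}
    if prompt:
        payload["prompt"] = prompt
    if image_path:
        with open(image_path, "rb") as f:
            payload["image"] = base64.b64encode(f.read()).decode("ascii")
    if not payload:
        raise ValueError("Provide a prompt, an image, or both")
    resp = requests.post("https://api.example.com/v1/edify-3d",  # placeholder
                         json=payload, timeout=300)
    resp.raise_for_status()
    return resp.content  # e.g. the generated mesh bytes
```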
At launch, there are flags to specify exactly how many polygons the user wants, and there is a strong meshing option that does a great job of generating quads, something which had not previously been discussed as an option. There is also adaptive sampling on a mesh, “so users can reduce polygons where they’re not needed and increase them where they are,” Dade explains. “So it’s really getting close to that point where it outputs a mesh that people can then actually immediately manipulate and use.” The output formats are currently OBJ, GLB, PLY and, importantly, USDZ.
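Put together, a request using those mesh controls might look something like this; every flag name here is an assumption standing in for whatever the real API exposes.

```python
# Assumed flag names mirroring the controls described above.
payload = {
    "prompt": "weathered wooden barrel",
    "target_polycount": 20000,   # cap on output polygons
    "quad_mesh": True,           # prefer quads over triangles
    "adaptive_sampling": True,   # denser geometry only where detail needs it
    "format": "usdz",            # alternatives: "obj", "glb", "ply"
}
```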
A key concern for many artists is the provenance of, and rights to, the training data. “That’s something that we’re really proud of,” Dade explains. “We are one of the very few GenAI tools that can say that all of our models are ethically sourced. That means all the data that’s gone into those models is licensed data. Any artist or contributor to that dataset that’s been used for training is compensated for that usage. There’s not a lot of people that can say that. We’re really proud to say it.” The two GenAI tools were trained using 500M+ ethically sourced 3D models and 650M images.
In addition, there are guardrails to stop the generation of IP-infringing 3D models. Of course, it is still possible to create something that resembles a known IP, but at the prompt level the system stops the user from directly referring to common IP. “We have a pretty extensive list of IP to make sure that you can’t specifically ask for those types of things,” Dade explains, adding that 3D object and prop creation is a little different. “We have a lot of safeguards in place for our e-commerce solution for generative images on Shutterstock, and we stop certain things from being generated through that system. But when you’re doing 3D, it’s a little bit different. Perhaps I really do need a gun or a blood splatter or something gory because I’m building a video game that requires items like that.” Shutterstock knows there is a threshold it has to account for, “and so a big part of what we see as part of our responsibility is making sure that we’re being mindful of that and crafting that in the best and most responsible way that we can.” For example, there are fully anatomically correct human models, often needed for medical or even art (statue) applications. These are things that might be blocked for stock footage GenAI prompts but are handled differently for Shutterstock’s 3D models.
All 3D models can be used commercially but not resold as part of a rival 3D library, in much the same way as any TurboSquid assets might be licensed. However, the 3D GenAI capability will not initially be offered as part of TurboSquid. “To be honest, right now, it’s such a new product; there’s not many options out there, so this will kind of revolutionise the entire industry,” Dade explains. “We want to learn, we want to make sure that we’re making the right decisions and that we’re bringing a tool to artists in the way that they want to use it.” He points out that such a move is not ‘off the table’, and they would be open to it if users signal that they want such an option. “But for right now, we see the API as a better option for providing it in the places where the artists want it currently.”