At the Eurographics Symposium on Rendering last year, Brent Burley and a team from Walt Disney Animation Studios presented a paper called ‘Sorted Deferred Shading for Production Path Tracing’. This was the first major signal that Disney Animation was developing its own new production renderer from scratch, based around a new idea Burley had come up with. Burley had pitched the idea in 2011, ultimately to Disney Animation President Ed Catmull, who gave the team his encouragement to explore and build the new style of path tracer. The reception at ESR2013 was not so universally glowing. “People said, ‘Oh well that’s not going to work in practice,’” says Hank Driskill, Technical Supervisor on Big Hero 6 (BH6), at Disney Animation. “We were told that a lot!” adds Andy Hendrickson, Disney Animation’s Chief Technology Officer, who oversaw the creation of BH6 working with directors Don Hall and Chris Williams. Both are veteran Disney animators.
The central innovation described in the paper, and at the heart of the new renderer – called Hyperion – is a way to path trace a scene with a lot of geometry. Unlike most approaches, it does two things differently. Firstly, the renderer sorts large, potentially out-of-core ray batches, effectively lumping similar ray bounces together. Secondly, and most importantly, the renderer does not do the actual shading of the ray hits until the rays are sorted and grouped. This allows for a cache-free system of doing large model global illumination, irradiance, radiosity and/or physically based lighting inside the memory constraints of a practical renderer on a chip.
The system avoids the caching and data management burden associated with large point clouds and deep shadow maps. For many other renderers without caches, it is important to reduce the ray bounces and shade fewer points. Hyperion allows complex ray bounces and shading on extremely large geo, exactly what was needed for the complex world of BH6, the first feature film to use the new renderer. BH6 is the 54th animated feature from Disney and it is a superhero-comedy film inspired by the Marvel Comics superhero team of the same name. The film is set in a fictional metropolis called San Fransokyo. The city was actually based on a caricature of real county assessor property data from the city’s Assessor-Recorder’s office maps of San Francisco. The city was altered for effect, for example the hills are steeper and higher than the real San Fran. The city was an artistic and technical challenge because it was three times as complex as any Disney film so far.
The city of San Fransokyo has 83,000 buildings built procedurally, plus a similar number of street props and trees. There are 216,000 street lights in the city, instanced off six base designs. The pipeline and Hyperion support instancing, and that was used in these shots, as well as for the micro-bots featured in the film. One of the most complex challenges for a production ray tracing renderer is coping with the vast amounts of lights and light signage in a city at night. “There are a couple of hundred thousand lights in the city,” says Driskill. “It was a challenge, we had to work hard to get that to all render noise-free and look great, but we took it on, and I think the imagery speaks for itself.”
In terms of lighting, not only does the system have area lights, the hallmark of physically based lighting approaches, but the renderer also supports illuminating volumes and arbitrarily shaped light emitting geometry. This is a general tool but it was used for such things as one of the characters. Fred breathes fire – “the fire illuminates the environment, it illuminates the smoke coming off the fire, it is really really pretty, yeah anything can be an emitter,” notes Driskill.
Hyperion took two years to build, and as the team was making BH6, “we had to remind ourselves we were making the movie on a beta,” says Driskill, “it really was just a clever idea that made true multi-bounce illumination more palatable with more complex environments, which is something that a lot of global illumination renderers struggle with ie. lots and lots and lots of geometry. Our movies were getting more complex and we felt it was an idea worth pursuing.”
Driskill is hesitant to say that, even today, the software is really a 1.0 version, but of course it has already proven its value and image quality with not only BH6 but also the short film Feast, which will play before the feature when it premieres in November worldwide. Driskill feels the imagery set in the vast and complex San Fransokyo just “wouldn’t have been able to have been made any other way.”
But what about RenderMan? According to Hendrickson the attractive technique for Disney Animation was the use of global illumination, not that it is remarkably new across the industry. But each content creation group (ILM, Pixar, etc) has their own pipelines, and this helps “innovate forward as fast as we can.” Each group does individual “computer graphics experiments and then shares the outcomes with the other groups,” he explains.
One of these innovative explorations was led by Burley. It is the group’s intention to continue with Hyperion’s development as a separate renderer, but with benefits spreading to other areas. “We are going to continue with this development but the interesting things we discover and the really cool stuff – other groups will grab and put in their code bases too,” says Hendrickson. A good example is the shader descriptions that appeared in PrMan v19. The PxrDisney BxDfs (BRDFs) that ship with the new RenderMan are from Disney Animation and are exactly the same type that Hyperion uses.
The renderer is, strictly speaking, a uni-directional path tracer. There are certainly aspects that pass information back, but as bi-directional techniques would require full light paths, the current Hyperion would not be able to explore bi-directional path tracing without serious modification. The aspect that is not dissimilar to a bi-directional approach is that a ray stops when it reaches a light source (clearly, it has no more bounces), and it hands back the results “weighted by everything they have bounced off to get there,” says Driskill. “Once a ray has done that, that last ray remembers where it came from so you start populating backwards where your first bounce light sources will be and so on.” This means that the next ray that happens to land in the same place going in roughly the same direction will be informed where the illumination sources will be. It is not a bi-directional system, but it is perhaps more than a naïve uni-directional tracer too. Hyperion does caustics, and tracing back just a couple of bounces “produced results we are really happy with, including some beautiful caustics we had,” adds Driskill.
Hyperion was faced with the task of rendering BH6 with not only a large urban environment with a lot of reflective surfaces and light sources, but also complex character rendering with detailed sub-surface scattering (SSS). “Our next film is entirely furry characters,” comments Driskill. As such, the team needed to solve SSS, fur and organic material rendering as well as the hard surfaces that are traditionally easier to ray trace.
The team’s approach on SSS has not been published yet, but they produced their own new SSS algorithm for BH6. “We looked through all the current methods,” says Hendrickson, “and we decided that none of them actually worked as well as we’d like them so we did our own.”
As the team was building the renderer from the ground up, they got to examine each aspect of a modern renderer and decide if they wanted to explore new options or deploy more known algorithms. For example, volumetric rendering for transmissive volumes, something that clearly was going to feature in the form of fog – given the film’s location. The team decided not on a traditional approach and worked heavily with Disney Zurich Research on a hybrid approach, thought to be not dissimilar from the paper Zurich Research published at SIGGRAPH this year, but Disney is yet to publish this publicly.
Hyperion uses multiple importance sampling (MIS) extensively. Even though an estimated 700,000+ lines of new C++ code were primarily written in the last two years, the renderer was still built on a lot of key lessons learned from films like Wreck-It Ralph and Frozen, which in the case of Frozen produced the BRDF framework and “our first pass at importance sampling with area lights – which we did in Ralph,” says Driskill. “It was all built on a lot of things we learnt over those two movies.”
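MIS combines samples drawn by different strategies (for example, sampling a light versus sampling the BRDF) by weighting each sample according to how likely each strategy was to produce it. The article does not say which weighting Hyperion uses, so the following is only a minimal sketch of the widely used power heuristic with an exponent of 2; the function names are illustrative and not taken from Disney's code.

```python
def power_heuristic(pdf_used: float, pdf_other: float, beta: float = 2.0) -> float:
    """MIS weight for a sample drawn with density pdf_used, when a second
    strategy could have produced the same sample with density pdf_other.
    Assumes one sample is taken per strategy."""
    a = pdf_used ** beta
    b = pdf_other ** beta
    if a + b == 0.0:
        return 0.0
    return a / (a + b)


def mis_contribution(value: float, pdf_used: float, pdf_other: float) -> float:
    """Weighted Monte Carlo contribution of a single sample."""
    if pdf_used == 0.0:
        return 0.0
    return power_heuristic(pdf_used, pdf_other) * value / pdf_used
```

For any pair of densities, the weights of the two strategies sum to one, so energy is neither lost nor double counted when the light sample and the BRDF sample are added together.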
The renderer and pipeline can render with deep data. It was not used on all shots, however. “We use it sparingly just because of the data management concerns,” says Driskill, referring to the huge data sets complex deep data rendering can produce.
Stereo: bent rays
Disney has a very long and proud history, especially in recent years with stereo production pipelines. One of the advanced techniques used in say a film such as Tangled, was to split the image planes depth wise for some character shots. Since a hero character close to camera felt attractive with a stereo convergence, that would mean the background in the same shot would be too extreme and unpleasing on the eye. Instead of dialing down the stereo effect overall, Disney pioneered the use of special camera rigs that would render the foreground with one ‘stereo camera’ or solution, while rendering the background with another. Thus the foreground character would appear round and with more fullness, but their background would appear more relaxed, less stereo pronounced. So successful were these stereo camera rigs that Disney Animation provided them to Pixar, and they were adopted in Pixar features also.
With BH6, the team went further, developing bent ray cameras. Initially this means that the rays change within a single render: instead of segmenting the renders in depth, the renderer alters the rays’ characteristics for the left and right eye as the rays move in z space into the scene. The first implementation for BH6 does this with an abrupt switch at a threshold point, but the team is exploring actually bending the rays, in the sense that the transition from foreground to background would feather or transition smoothly from one stereo setup to another. “It was a huge success – the lighting guys love us,” says Driskill. “Now they just say render both eyes and they get the effect for free with what we call the multi-rig all encapsulated inside that one render process in a really fun way.” This is one of several new stereo optimizations and enhancements the team plans to explore soon, and this is also one of the many areas Disney has patented.
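One way to picture the difference between the abrupt switch and the feathered transition is as a function that maps depth along a ray to the stereo camera separation (interaxial distance) in effect at that depth. This is purely an illustrative sketch: the function names, the parameters and the choice of a smoothstep blend are assumptions, and the article does not describe Disney's actual formulation.

```python
def smoothstep(edge0: float, edge1: float, x: float) -> float:
    """Standard smoothstep: 0 below edge0, 1 above edge1, smooth in between."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def interaxial_abrupt(depth: float, near_ia: float, far_ia: float,
                      threshold: float) -> float:
    """Foreground rig up to the threshold depth, background rig beyond it."""
    return near_ia if depth < threshold else far_ia


def interaxial_feathered(depth: float, near_ia: float, far_ia: float,
                         start: float, end: float) -> float:
    """Blend smoothly from the foreground rig to the background rig
    over the depth range [start, end] (hypothetical feathering)."""
    t = smoothstep(start, end, depth)
    return near_ia + (far_ia - near_ia) * t
```

With the abrupt version a point just in front of the threshold and a point just behind it are rendered with different stereo setups; the feathered version spreads that change over a depth range.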
Hyperion is not a spectral renderer like Maxwell or Weta’s new Manuka, as it was not needed for BH6, but Hendrickson admits that they are interested in exploring spectral rendering especially for “fluorescents, chromatic effects and iridescents and such things” in the future.
Rendering, CODA and the farm
As a production renderer Hyperion was always designed for use on a hard core render farm, and Disney has a serious render farm.
Disney Animation’s render farm is actually a cloud spread across four physical sites, according to Electronic Design magazine. It ranks about 75th compared to other supercomputers with over 55,000 Intel cores. There is some custom acceleration support “but GPUs and FPGAs are something for the future,” notes the magazine. The system has 400 Tbytes of memory and uses about 1.5 MW of power. That is actually a reasonable amount of power compared to the render farm size, and normal power/air conditioning requirements for big farms. “The system is built around 1U COTS servers. It is linked via 10 Gigabit Ethernet links. All non-volatile storage is solid state disks (SSD). The Disney archives are currently 4 Pbytes”.
Disney’s in-house CODA job distribution system makes the system appear as a single virtual system. It handles a range of chores from rendering to asset management. The system typically performs 1.1 million render hours per day. This equates to hundreds of thousands of tasks per day. The system is so automated that there is no overnight staff, explained Driskill to fxguide.
The CODA system was built three years ago and, like the Bundled Rays idea, it was initially thought to be something that may not work in practice. The system works with the concept of “speculative execution,” notes Driskill. “You launch jobs to fill the queue all the time, and individual groups have a section of the queue that belongs to them but anyone can be using any of it – with the notion that if they actually need it, it boots every other job out of the way.” That idea of launching jobs but then killing jobs to make room if needed was originally thought to be an odd idea. But Driskill points out that the system works so well, it helped push to make a lot of code re-entrant (able to be picked up and continued if interrupted). “It keeps our queue utilization really high – we operate in the 90+% range in keeping the queue busy,” he says. The team clocked over a million render hours a day on BH6.
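The scheduling idea Driskill describes can be sketched in a few lines. This is a toy model, not CODA: the class names, the slot-based notion of a "share" and the eviction policy are all assumptions made for illustration. Any group may fill idle slots, but when the farm is full and a group is below its own share, a job from a group that is over its share is booted and requeued with its progress intact (the re-entrant part).

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    group: str
    progress: int = 0  # checkpointed work; kept when the job is preempted


@dataclass
class Farm:
    shares: dict  # group name -> number of slots that group owns
    running: list = field(default_factory=list)
    waiting: list = field(default_factory=list)

    def capacity(self) -> int:
        return sum(self.shares.values())

    def usage(self, group: str) -> int:
        return sum(1 for job in self.running if job.group == group)

    def submit(self, job: Job) -> None:
        if len(self.running) < self.capacity():
            self.running.append(job)  # speculative: use any idle slot
            return
        # Farm is full: an owner below its share may reclaim a slot.
        if self.usage(job.group) < self.shares.get(job.group, 0):
            for victim in self.running:
                if self.usage(victim.group) > self.shares.get(victim.group, 0):
                    self.running.remove(victim)
                    self.waiting.append(victim)  # resumes later from progress
                    self.running.append(job)
                    return
        self.waiting.append(job)
```

In this model the queue stays full whenever there is any work at all, which is the property Driskill credits for the high utilization, at the cost of requiring jobs that can be interrupted and resumed.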
Above is an original test render done, pre-Hyperion, and released as a teaser for the film.
How it works
(This section is based on interviews and the ESR2013 paper: Sorted Deferred Shading for Production Path Tracing by Christian Eisenacher, Gregory Nichols, Andrew Selle and Brent Burley.)
In explaining the innovation in Hyperion, Driskill says that other global illumination renderers “loads up its geometry and then it starts throwing out its rays from the camera, and those rays hit objects and then they bounce.” As with the logic of calculating the rendering solution, at the intersection of a ray and an object, based on its material, the renderer then fires off a series of secondary rays. “And you do that for one bounce you get a whole lot of scattering of a bunch of rays – going a bunch of different directions, when you try that for 2,3,4, or 5 or 15 bounces it gets astronomically expensive.”
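The growth Driskill describes is geometric. As a rough illustration only, assume every hit spawns the same fixed number of secondary rays (the branching factor here is a made-up parameter, and real renderers vary it per material and per bounce); the total number of rays traced for a single camera ray is then a geometric sum.

```python
def rays_per_camera_ray(branching: int, bounces: int) -> int:
    """Total rays traced for one camera ray if every hit spawns
    `branching` secondary rays, up to `bounces` bounces deep."""
    return sum(branching ** depth for depth in range(bounces + 1))


# Example with a hypothetical branching factor of 4.
for bounces in (1, 2, 5, 15):
    print(bounces, rays_per_camera_ray(4, bounces))
```

Even with a modest branching factor, going from one bounce to fifteen takes the count per camera ray from a handful to over a billion in this model.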
Coupled with the complexity is that as the rays spread out, they need the scene geometry in memory for all those intersections, “because any given ray you are processing at any given moment is going somewhere,” says Driskill.
“But the clever idea (in Hyperion),” adds Driskill, “was the idea of ray bundling. If you treat the render as a two step process:
- you are going to throw rays
- you are going to resolve what they do when they hit something
And if you bundle all the rays that are more or less heading in the same direction, you grab all of those and treat each ‘wave’ of rays as their own thing.”
In the ESR paper, this was explained as “we sort large, potentially out-of-core ray batches to ensure coherence. Working with large batches is essential to extract coherent ray groups from complex scenes. Second, we sort ray hits for deferred shading with out-of core textures. For each batch we achieve perfectly coherent shading with sequential texture reads, eliminating the need for a texture cache.” Few renderers consider out-of core texture access.
In the overview above, the process starts with primary rays sent from the camera. Hyperion performs ray sorting, and as seen in the expanded view it bins rays by direction and groups them into large, sorted ray batches of fixed size, typically about 30-60M rays per batch, according to the ESR2013 paper. The system then streams inactive ray batches to a local solid state ‘hard’ drive until the system is ready to sort and trace the next batch.
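The binning and batching step can be sketched as follows. This is a simplified illustration under stated assumptions: rays are plain (origin, direction) tuples, the bin is chosen from the dominant axis and sign of the direction (six bins), and the batch size is tiny for readability. The further sorting inside each batch and the streaming of inactive batches to disk are left out.

```python
from collections import defaultdict


def direction_bin(direction) -> int:
    """Bin index 0..5 from the dominant axis and sign of (x, y, z):
    0=+x, 1=-x, 2=+y, 3=-y, 4=+z, 5=-z."""
    axis = max(range(3), key=lambda i: abs(direction[i]))
    return 2 * axis + (0 if direction[axis] >= 0 else 1)


def batch_rays(rays, batch_size: int):
    """Group rays into per-direction batches of at most batch_size rays.
    Returns a list of (bin_index, rays_in_batch) in emission order."""
    bins = defaultdict(list)
    batches = []
    for ray in rays:
        b = direction_bin(ray[1])
        bins[b].append(ray)
        if len(bins[b]) == batch_size:
            # A full batch is ready to trace (or to be parked on disk).
            batches.append((b, bins.pop(b)))
    # Flush the partially filled bins at the end.
    batches.extend((b, rs) for b, rs in sorted(bins.items()) if rs)
    return batches
```

Because every ray in a batch heads in roughly the same direction, the rays in a batch tend to visit the same parts of the scene, which is what makes the subsequent traversal coherent.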
Next, the system performs a scene traversal, using one sorted ray batch at a time. The team exploit the fact that their ray batches are directionally coherent to perform approximate front-to-back traversal at each node. The result of traversal is a list of hit points (one per ray).
Next, hit point sorting organizes ray hits by shading context. Each subdivision mesh is associated with one or more texture-files containing a per-face texture for each layer; thus, a full shading context consists of a mesh ID and a face ID. They then group hit points by mesh ID, and then sort each group by face ID for coherent texturing and shading.
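The grouping and sorting just described amounts to a two-level sort key. A minimal sketch, assuming each hit is represented as a (mesh_id, face_id, ray_id) tuple (a representation chosen here for illustration, not taken from the paper):

```python
from itertools import groupby
from operator import itemgetter


def sort_hits(hits):
    """Group hit points by mesh ID, with each group sorted by face ID.
    hits: iterable of (mesh_id, face_id, ray_id) tuples.
    Returns {mesh_id: [(face_id, ray_id), ...]}."""
    ordered = sorted(hits, key=itemgetter(0, 1))
    return {
        mesh: [(face, ray) for _, face, ray in group]
        for mesh, group in groupby(ordered, key=itemgetter(0))
    }
```

After this step, all hits that need the same per-face texture sit next to each other, so the texture file for a mesh can be read sequentially.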
Shading happens in parallel with each thread processing a different mesh. If a shading task has many hit points, it is partitioned into sub-tasks, further increasing parallelism. If an object or hit point is found to be emissive, its emission is splatted into the image buffer. The shader also feeds secondary rays back into ray sorting to continue ray paths. Because each mesh face is touched at most once when shading a ray batch, all shader inputs, including texture maps, are only accessed once. – ESR2013 paper
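Why sorted hits remove the need for a texture cache can be seen in a toy model that counts texture loads. Everything here is an assumption for illustration: one texture per face, an LRU cache measured in faces, and randomly distributed hits. It is not a model of Hyperion's actual I/O.

```python
import random
from collections import OrderedDict


def texture_loads(face_sequence, cache_size: int) -> int:
    """Count texture loads when faces are shaded in the given order
    through an LRU cache that holds cache_size face textures."""
    cache = OrderedDict()
    loads = 0
    for face in face_sequence:
        if face in cache:
            cache.move_to_end(face)  # mark as most recently used
        else:
            loads += 1
            cache[face] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return loads


rng = random.Random(0)
hits = [rng.randrange(1000) for _ in range(20000)]  # face ID per hit
unsorted_loads = texture_loads(hits, cache_size=10)
sorted_loads = texture_loads(sorted(hits), cache_size=1)
```

With the hits sorted, a cache of a single face is enough and the number of loads equals the number of distinct faces hit; in arrival order, even a larger cache reloads the same textures over and over.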
This core concept “is extremely powerful,” says Driskill, “as it lets you have just a huge amount of geometry in the scene because you are not having to keep all of it ‘handy’ in case a ray makes it over there. You treat all the rays in bundles according to where they are heading, it’s one of those clever ideas where you go, ‘Oh wow..’. It’s different – and we decided it was worth trying. We did a little science experiment with it, the experiment looked really promising and then we just decided to go for it, and we made a movie with it – which was kinda crazy!”
The team published results in the same paper comparing performance between the Hyperion approach and both the then current versions of RenderMan and Arnold, over a range of bounces (2,3,4 bounces of indirect). Textures were stored in a compressed format on a texture server. They tested the renders using 12 threads on a 12-core Xeon 5675 system with 48GB memory.
Without textures, both Hyperion and the standard approaches were dominated by ray traversal time, and there was little difference between the three renderers (9.8, 10.7, 8.7 minutes for Arnold, PRMan, and the Disney Animation method, respectively). But of course textures are key to any scene, and when textures were added back in, the renderers without sorting performed less well, exhibiting a “superlinear increase in cost for each additional texture layer. With our method, each additional texture layer added a modest linear cost of roughly 52 seconds. For four texture layers, render times were 1094, 214, and 11.2 minutes for Arnold, PRMan, and our method. Without sorting, our render times degenerated to 819 minutes, around 70x slower,” the paper’s published results stated. It also noted that while Disney is, of course, a company that uses Ptex, the use of texturing with Ptex is not required. “Our method can be formulated equally well in the context of conventional texture storage methods. In particular, atlased models could be accessed in groups sorted by subimage and UV order”.
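The ratios behind these figures are easy to check. The snippet below uses only the four-texture-layer render times quoted above; the dictionary keys and the helper name are just labels chosen for this example.

```python
# Render times in minutes for four texture layers, as quoted from the paper.
times_min = {
    "Arnold": 1094.0,
    "PRMan": 214.0,
    "Hyperion (sorted)": 11.2,
    "Hyperion (unsorted)": 819.0,
}


def slowdown(name: str, baseline: str = "Hyperion (sorted)") -> float:
    """How many times slower `name` is than the baseline."""
    return times_min[name] / times_min[baseline]


for name in times_min:
    print(f"{name}: {slowdown(name):.1f}x")
```

The unsorted Hyperion run comes out at about 73 times the sorted one, consistent with the paper's "around 70x", while Arnold and PRMan are roughly 98 and 19 times slower than sorted Hyperion on this test.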
Comparing Hyperion and commercial production renderers, especially in the area of texturing, is difficult as different systems use different approaches. To make the test meaningful, “Ray counts and sampling strategies were matched as closely as possible, and all advanced features such as adaptive sampling, radiosity caching, etc., were turned off to isolate raw intersection and shading performance. The built-in ptexture shadeop was used in PRMan and a comparable shader node was implemented in Arnold using the same libPtex library.”
Also both RenderMan and Arnold were configured to “cache 1000 texture files and 100MB per thread, a generous cache size for 2710 texture files per layer.”
Note: In the early days of the renderer, and still in some of the preliminary press on BH6, the renderer is referred to as a streaming renderer. This term has been all but dropped. It referred to the way data or geo is streamed into the render pipeline, dealing with only a sub-set of the geometry at a time (‘streaming the geometry in and out’), but it is not thought to be an accurate description of the renderer moving forward, although the term survives from the early publicity. The term was used for the code name of the initial prototype project but is no longer used internally at Disney.
Hyperion was also used on the short Feast. Watch for special interviews and coverage of this and BH6 here at fxguide.com when the film is released in November.
All images © Disney. All rights reserved. Used with permission.