As part of the recent Nuke 13.0 release, Foundry has integrated a new suite of machine learning tools, including CopyCat (available in NukeX and Nuke Studio), a native node that lets artists train neural networks to create custom effects for their own image-based tasks. CopyCat reproduces a Nuke VFX solution via a trained network rather than by scripting the process directly. Having learned the desired treatment of an image sequence from just a few frames, CopyCat allows the artist to infer the same result across an entire clip. The process is remarkable, but it clearly requires a problem that justifies the time it takes to set up and train on the sample material.
One such application is soft color segmentation, a technique first published a few years ago but generally too expensive in processing time for production use. It is so powerful that compositing supervisor Rafael Silva at Soho VFX could not help being drawn to it. With CopyCat, he found an approach that justifies the setup effort: it produces enormous precision for keying and color grading, and a remarkably temporally stable result when inferred in Nuke.
Unmixing-Based Soft Color Segmentation
In 2017, Disney Research Zurich presented a paper on Unmixing-Based Soft Color Segmentation for Image Manipulation. The central idea was to decompose an image into a set of soft color segments, analogous to the color layers with alpha channels one might find in Photoshop.
Above, the image (a) is decomposed into a set of soft segments (b). These soft segments can be treated as layers, with which one can achieve compelling results in color editing (c), compositing (d), and many other VFX tasks.
The process can be thought of as energy preserving: the image is decomposed into a set of layers that add back exactly to the original image. Given this equation-like approach, the operator can favor more or less of any specific aspect of the image, isolating elements in the process.
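The additive model described above can be sketched in a few lines of NumPy. This is a minimal illustration of the "layers sum back to the image" property only; the layer contents here are synthetic stand-ins, not the output of the paper's actual unmixing optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))  # stand-in for an input frame (RGB, 0-1)

# Fake a 3-layer soft decomposition: per-pixel alphas that sum to 1,
# each layer holding its premultiplied share of the pixel color.
alphas = rng.random((3, 4, 4, 1))
alphas /= alphas.sum(axis=0, keepdims=True)
layers = alphas * image  # premultiplied soft segments

# The energy-preserving property: summing the layers reconstructs the input.
reconstructed = layers.sum(axis=0)
print(np.allclose(reconstructed, image))  # True
```

Because the layers account for the whole image, scaling one layer up or down before re-summing shifts the balance toward or away from that segment without leaving holes in the result.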
AOVs for live action.
Intrinsic image decomposition is a special class of layer decomposition. The main goal is to decompose an input photograph into albedo and irradiance layers. This allows for relighting and relates to work now being extended by a range of other research teams. There are many combinations of color-blend modes and blended colors, which create various effects and allow for different types of control. There is no single correct way to perform such a layered decomposition, but that is the process's main strength: a Nuke artist can choose how the image is decomposed.
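Unlike the additive soft segments above, the intrinsic model is multiplicative: the image is the per-pixel product of an albedo (reflectance) layer and an irradiance (shading) layer. A hedged sketch of the forward model follows; the layers here are synthetic, since recovering them from a real photograph is the hard research problem.

```python
import numpy as np

rng = np.random.default_rng(1)
albedo = rng.random((4, 4, 3))   # surface color, lighting-free
shading = rng.random((4, 4, 1))  # grayscale irradiance

# Forward model: image = albedo * shading, per pixel.
image = albedo * shading

# "Relighting": swap in a brighter shading layer while keeping the albedo,
# an edit the additive layer model cannot express as directly.
relit = albedo * np.clip(shading * 1.5, 0.0, 1.0)
```

This is why relighting falls out of the decomposition: once albedo and irradiance are separated, the lighting term can be replaced or graded independently of the surface color.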
The original Disney paper offers tremendous control not only for color grading but also for sky and background replacement and green-screen keying. It proved so powerful that, at Pacific Graphics 2018, Yuki Koyama and Masataka Goto from the National Institute of Advanced Industrial Science and Technology extended it so that layers could be combined not only additively but also with advanced color-blend modes such as hard-light and multiply. A clip or photograph can be reduced to something similar to a set of AOVs, much like the 3D-generated images from a renderer such as V-Ray. The Nuke artist is presented with a set of semi-transparent compositing layers that can use advanced color-blend modes, such as color-burn, multiply, or screen, which also allows for very interesting and powerful non-linear color effects.
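The layer-with-blend-mode idea can be sketched as follows. The blend formulas are the standard Photoshop-style definitions; the `blend_over` helper and the layer data are illustrative assumptions, not code from the Koyama and Goto paper.

```python
import numpy as np

def multiply(base, top):
    return base * top

def screen(base, top):
    return 1.0 - (1.0 - base) * (1.0 - top)

def color_burn(base, top, eps=1e-6):
    return 1.0 - np.minimum(1.0, (1.0 - base) / (top + eps))

def blend_over(base, layer, alpha, mode):
    # Apply the blend mode, then mix with the base by the layer's alpha,
    # so a layer only affects the result where it is opaque.
    return alpha * mode(base, layer) + (1.0 - alpha) * base

rng = np.random.default_rng(2)
base = rng.random((4, 4, 3))   # running composite so far
layer = rng.random((4, 4, 3))  # one decomposed color layer
alpha = rng.random((4, 4, 1))  # that layer's soft alpha

out = blend_over(base, layer, alpha, screen)  # brightens; multiply darkens
```

Stacking several such layers, each with its own mode, is what makes the result behave like a set of AOVs: grading one layer (say, a multiply layer carrying shadows) re-composites non-linearly into the final image.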
As powerful as the process is, it is also extremely expensive and time-consuming to compute. Rafael Silva, a skilled Nuke compositor, was fascinated by the control the process offered, and he struck upon a novel approach: instead of solving every frame, solve just a few frames and use CopyCat in Nuke to process the entire clip. CopyCat uses machine learning to first learn and then infer a result. This new approach provides the enormous power of soft segmentation without the vast processing time.
In the image above, the two distinct stages of the process can be seen: the training stage, where CopyCat effectively learns what is wanted, and the inference stage, where the result is applied to a whole clip. Training takes significant time, but once trained, the inference stage can be relatively fast.
In this shot, the landscape outside the window of the car is composited, or rather inferred, based on just a couple of sample frames. CopyCat uses none of the Nuke processes that originally solved the shot; it is not re-running a Nuke script. Rather, the new final frames are inferred directly from the example training frames. The inferred/composited shot uses the tremendous power of the CopyCat node inside Foundry's Nuke; for more on CopyCat itself, see our previous fxguide story.
Rafael Silva is a lead compositor at Soho VFX. The studio was founded in 2002 and has grown from a boutique visual effects house into an innovative and talented team of artists, supervisors, and developers. Silva has just finished work on a major, as-yet-unreleased film, and he is keen to try this new approach in production. The approach is not appropriate for every shot, but its spectacular ability to separate the most complex imagery into useful, controllable layers makes it a viable option for incredibly complex work, offering far greater control than roto or chroma-based keying.
More CopyCat examples: Foundry panel discussion.
CopyCat is proving so significant that Foundry is hosting a panel to discuss its use in production. The panel consists of Sam Hodge, founder of Kognat; Hendrik Proosa, founder of Kaldera OÜ; and Mads Hagbarth Damsbo, owner of Higx. These Nuke experts will go through their own experiments using CopyCat in Nuke 13.0, best practices, and what they think the future holds for machine learning.
The event is Tuesday, June 8th, 2021, at 5:00 PM (AEST). For more info, click here.