THE FIRST REAL TIME LIVE @ SIGGRAPH ASIA
For the first time, Real Time Live (RTL) was included in SIGGRAPH Asia. The event differed slightly from the main North American conference: participation was by invitation, and the curated content included some projects that had already been shown in Vancouver.
The event closed out the SIGGRAPH Asia 2018 conference in Tokyo last week and included eight real time interactive projects.
The selection criteria for RTL at SIGGRAPH Asia require that the presentations be entertaining, real-time/interactive and innovative.
SIGGRAPH requires that the teams present work that:
- Is innovative and pushes real-time or immersive techniques,
- Is interactive, with both controlled and quality interaction,
- Exhibits creativity and originality,
- Is of broad interest and entertainment value, and finally
- Has high production values.
The program of eight teams was made up of some experienced RTL teams, such as Pinscreen and Dreamflux, and some new local teams, such as GREE, which was showing its Virtual YouTubers, or VTubers.
To prepare for the event, the teams needed to bring their own specialist equipment. The event did not go off without a couple of technical AV hiccups, but the audience was incredibly supportive. There is something about a live event that makes any minor problems ‘amusing’, rather than annoying.
The teams each had about 15 minutes to present, which was longer than in Vancouver and will most likely be reduced for next year’s Real Time Live at SIGGRAPH Asia 2019 in Brisbane (Nov 2019).
The first presentation was the Dreamflux project. This had been seen in Vancouver and involved placing CGI items into natural 360 video scenes with plausible lighting, shadows and reflections.
Dreamflux blends 3D virtual objects into live-streamed 360° videos in real time, providing the illusion of interacting with objects in the video. Starting from a standard 360° video, the team showed how they automatically extract important lighting details, such as the Sun, and then use them to illuminate virtual objects composited into the video.
Their MR360 toolkit runs in real time and is integrated into game engines, enabling content creators to conveniently build interactive mixed reality applications. They also demonstrated “augmented teleportation” using the 360° videos, an application that allows VR users to travel between different 360° videos. Using a live-streaming 360° camera, the team traveled across the Real-Time Live! stage. The project comes from the Computational Media Innovation Centre at Victoria University of Wellington in New Zealand.
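To give a feel for the lighting step described above, here is a minimal sketch of one simple way to estimate a Sun direction from a 360° frame and use it to shade a surface. It is not the MR360 toolkit's method, which the article does not detail. It assumes an equirectangular frame supplied as a grid of luminance values, treats the single brightest pixel as the Sun, and uses plain Lambert shading. All function names are hypothetical.

```python
import math

def brightest_pixel(luma):
    """Return (row, col) of the brightest value in a 2D grid of luminance.

    Assumption: the brightest pixel stands in for the Sun. A production
    system would be far more robust than this.
    """
    best_row, best_col = 0, 0
    for r, row in enumerate(luma):
        for c, value in enumerate(row):
            if value > luma[best_row][best_col]:
                best_row, best_col = r, c
    return best_row, best_col

def pixel_to_direction(row, col, height, width):
    """Map an equirectangular pixel centre to a unit direction (x, y, z), y up."""
    lon = ((col + 0.5) / width) * 2.0 * math.pi - math.pi    # -pi .. +pi
    lat = math.pi / 2.0 - ((row + 0.5) / height) * math.pi   # +pi/2 (top) .. -pi/2
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )

def lambert(normal, light_dir, intensity=1.0):
    """Diffuse term for a unit surface normal lit from unit direction light_dir."""
    cos_angle = sum(n * l for n, l in zip(normal, light_dir))
    return intensity * max(0.0, cos_angle)

# Tiny 4x8 stand-in for a video frame, with one bright pixel near the top.
frame = [[0.1] * 8 for _ in range(4)]
frame[0][4] = 5.0

sun_row, sun_col = brightest_pixel(frame)
sun_dir = pixel_to_direction(sun_row, sun_col, height=4, width=8)
ground_shade = lambert((0.0, 1.0, 0.0), sun_dir)  # upward-facing surface
```

The same direction could also be used to cast a shadow from the virtual object. The sketch only covers the diffuse term to keep the idea visible.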
Square Enix presented interaction in a virtual environment with natural language speech. They presented an architecture for immersive VR interaction with an animated emotional AI character. Their live demo introduced a method that they believe leads to more expressive and lively agents that can interact with a player. The live demo had an alien help repair a rocket ship and was presented by Gautier Boeda and Yuta Mizuno, both from Square Enix. This project currently only works in Japanese, but the team are planning to expand it to also work with English.
Reality : Be yourself that you want to be
Gree presented two lively, interactive anime characters talking and singing live with the audience.
Gree’s virtual YouTubers are a very new enterprise, just 6 months old. The ‘broadcasting’ studios in Tokyo have multiple stages where the presenters in IKINEMA Orion suits ‘narrowcast’ live over the internet and to Gree’s own app. Unfortunately the app is not yet widely available (especially outside Japan). The service is free, but it has an active economy that allows viewers to buy props and presents for the presenters in real time during the sessions.
The Gree presenters normally use the new IKINEMA Orion low-cost motion capture system. The Orion uses hardware technology by Vive. With just 6 or 8 tracking points and minimal calibration, Orion users are able to capture realistic, full-body movements in real time with a very low-cost setup. The product was released quietly a little while ago and Gree have adopted it as part of their standard pipeline. At RTL the Gree team decided to use the Xsens suits (see below), and then stream and retarget that data via the IKINEMA LiveAction plugin.
In addition to showing the relatively new IKINEMA suit (more on this in an upcoming fxguide story), Atbin Ebrahimpour, technical artist from IKINEMA, was on hand to have the two suits running in real time at 30 fps, producing the body motion capture data for the two Virtual YouTubers on screen.
Part of the GREE presentation was a demonstration of the StretchSense Smart Gloves. These gloves (~$3000 a pair) provide articulated finger movement for the performers. Each glove contains a stretch sensor sewn into each finger sleeve, along with two additional sensors around the thumb for multi-axial capture of thumb movement. The sensors communicate with a sensing circuit to transmit data wirelessly to a mobile device running the motion capture app.
The GREE technology is similar to that of the winner of SIGGRAPH Vancouver’s Real Time Live, in that it uses an Apple iPhone X on a head-mounted rig as the facial input device. The company has built a very successful business in Tokyo in a short amount of time, and is a world leader in Virtual YouTubers (VTubers) and online real-time digital influencers.
Mimic Productions in Germany had planned to produce a full-body person for RTL, but instead chose to focus just on the face of their character, based on the actual face of Mimic Productions CEO Hermione Mitford. In addition to running the company, Mitford is a performance artist in her own right and she seeks to explore both the artistic and technical possibilities of virtual humans.
The Mimic double of Mitford, nicknamed EM, was somewhat androgynous due to the lack of digital hair and to being voiced by Ben Scott, one of the male Mimic team members. Like Gree, the Mimic team use an iPhone’s front-facing depth camera to drive the face, but they go beyond the standard 51 blendshapes that Apple supports and use hundreds of corrective blendshapes to get the best match to the original performance.
The MIMIC character is rendered in Unreal Engine 4 (UE4), and it is driven by an iPhone using ARKit. But the phone is not on a head-mounted rig: the input iPhone was clamped to the table. While the team was not showing anything beyond the head, they do also use Xsens suits and other body motion capture solutions for full-body mocap. Each frame of EM rendered in 12 milliseconds on two GTX 1080 graphics cards from NVIDIA.
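For readers unfamiliar with corrective blendshapes, the sketch below shows the general idea in a few lines. It is an illustration only, not Mimic's rig, which the article does not describe. It assumes each shape is stored as per-vertex offsets from the neutral face, and that a corrective shape fires in proportion to the product of its driver weights, which is one common convention. The shape names `jawOpen` and `mouthSmileLeft` are borrowed from ARKit's coefficient names purely as examples, and the function name and data are hypothetical.

```python
def apply_blendshapes(neutral, shapes, weights, correctives=()):
    """Return deformed vertex positions for one frame.

    neutral:      list of (x, y, z) vertex positions for the neutral face
    shapes:       dict of shape name -> list of per-vertex (dx, dy, dz) offsets
    weights:      dict of shape name -> weight in 0..1 (e.g. from face tracking)
    correctives:  iterable of (driver_names, offsets); each corrective is
                  weighted by the product of its drivers' weights (assumption)
    """
    result = [list(v) for v in neutral]

    def add_offsets(offsets, weight):
        if weight == 0.0:
            return
        for vertex, delta in zip(result, offsets):
            for axis in range(3):
                vertex[axis] += weight * delta[axis]

    # Standard linear blend: neutral + sum(weight * offset).
    for name, offsets in shapes.items():
        add_offsets(offsets, weights.get(name, 0.0))

    # Correctives repair artefacts that appear only when shapes combine.
    for driver_names, offsets in correctives:
        combined = 1.0
        for name in driver_names:
            combined *= weights.get(name, 0.0)
        add_offsets(offsets, combined)

    return [tuple(v) for v in result]

# One-vertex toy face: two base shapes plus one corrective for their overlap.
neutral = [(0.0, 0.0, 0.0)]
shapes = {
    "jawOpen": [(0.0, -1.0, 0.0)],
    "mouthSmileLeft": [(1.0, 0.0, 0.0)],
}
correctives = [(("jawOpen", "mouthSmileLeft"), [(0.0, 0.0, 0.5)])]
weights = {"jawOpen": 1.0, "mouthSmileLeft": 0.5}

posed = apply_blendshapes(neutral, shapes, weights, correctives)
```

With hundreds of correctives the same loop applies. The cost is in authoring the shapes, which is where scan data like Mimic's comes in.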
MIMIC Productions was co-founded by David Bennett. A well-known facial expert, Bennett’s facial experience extends back to Beowulf, Monster House and Polar Express at SPI. He was also facial capture manager at Weta Digital for over 5 years, working on Avatar, Tintin and Rise of the Planet of the Apes.
The MIMIC model was built on some 300 scans and (unpolarised) photos of Mitford’s face, captured with the team’s own multiple-DSLR rig. The MIMIC face pipeline is a FACS-based approach, and the scans are done ear to ear inclusively, also covering the subject’s neck. The correspondence between static scans was solved with a combination “of optical flow and alignment / optimisation software that combines any two photogrammetry poses together”, explained Bennett. From the initial photogrammetry, the model then goes into ZBrush for fine details such as pores to be added. The team built their own facial animation rig and they are directly driving the model in UE4.
“I think what is interesting about our project is we have a great base, right now – and with Hermione being part of the company, she is always available” comments Bennett. “So while we are not perfect yet, it is always improving”.
The video above is a pre-SIGGRAPH test and has no audio.
Mitford explained that the team built hair for the model but rejected using it, both for technical reasons “and we think she just looks cooler”, adds Bennett.
Leading the move to real-time applications on mobile devices was the team from Pinscreen in LA. While they had previously demonstrated at Real Time Live in Vancouver, their performance in Tokyo centred around a new version of their app (which is now available in the iTunes Store).
The Pinscreen technology is a combination of advanced markerless facial tracking and solving with their paGAN engine for real-time face simulation and manipulation. The app lets users make their own avatar from a single JPEG image and then not only drive the face but also place their avatar on a range of male and female bodies and in a variety of locations, from exteriors to udon noodle bars!
Researcher Koki Nagano drove the presentation on his iPhone, both animating himself and impersonating various world leaders.
See our major fxguide story on Pinscreen here.
Bandai Namco Studios: BanaCAST
One of the strongest presentations on the night was from Bandai Namco Studios. They also presented manga-style characters, but their anime characters were driven from a temporary motion capture volume on stage, which the company has designed to be portable so that it can be taken on location.
Led by Naohiko Morimoto with Jun Ohsone and Shoko Doi, the all-female presenters showed nearly flawless motion-captured Japanese animated characters singing and entertaining, or ‘BanaCASTing’.
The two real-time characters were identical but driven by two different performers. The system can cope with up to 6 characters in the capture volume at once.
In addition to dancing and singing, the performers interacted with various props such as stairs and chairs.
Jean-Colas Prunier demonstrated collaborative film making, in a live demonstration of real-time cinematic collaboration between Tokyo and Paris. In a practical demonstration, three artists worked together, interactively, on a single animated sequence. During the demo, models and animation were imported, blocked, lit and edited into a sequence. The team had the local machine running the new NVIDIA RTX card, allowing for a very impressive final animation, all done in less than 15 minutes. Of course the demo was ‘canned’ and rehearsed, but the interaction was genuine and impressive. The team could all be seen working on different things at the same time, and very effectively.
The final presentation of the night was a live replay game movie using the recorded play data from Gran Turismo. The team’s real-time technologies enabled movie editing with creative camerawork and visual effects while reproducing the race scene with high quality graphics from the play data.
The work was presented by Masamichi Sugihara, Hiroki Kashiwagi, and Tatsuya Matsue, from Polyphony Digital, Japan. The team produced an impressive cinematic based on original game play, in a way that is becoming increasingly important in high-end gaming.