The Lytro Immerge is the key to live action acting in VR…and here’s why.
…but first some background.
Most of us know Lytro from its revolutionary stills camera which allowed for an image to be adjusted in post as never before – it allowed focus to be changed. It did this by capturing a Lightfield and it seemed to offer a glimpse into the future of cameras built on a cross of new technology and the exciting field of computational photography.
Why then did the camera fail? Heck, we sold ours about 8 months after buying it.
Lightfield technology did allow for the image to be adjusted in terms of depth or focus in post, but many soon found that this was just delaying a decision from on location. If you wanted to send someone a Lytro image you almost always just picked the focus and sent a flat .jpeg. The only alternative was to send them a file which required a special viewer. The problem with the latter was simple: someone else ‘finished’ taking your photo for you – you had no control. It was delaying an on set focus decision to the point that you never decided at all! The problem with the former, i.e. rendering a jpeg, was that the actual image was not better than one could get from a good Canon or Nikon; actually it was a bit worse, as the optics for Lightfield could not outgun your trusty Canon 5D.
In summary: the problem was that we had no reason to avoid locking down the image. Lightfield was a solution looking for a problem. We needed somewhere it made sense to not ‘lock down’ the image and keep it ‘alive’ for the end user.
Enter VR – it is the problem that Lightfield solves.
Currently much of the VR that is cutting edge is computer generated – rigs that incorporate head movement can tell that you are moving your head to the side and render the right pair of images for your eyes. While a live action capture will allow you to spin on the spot and see in all directions, a live action capture did not (until now) allow you to lean to one side to miss a slow motion bullet traveling right at you the way a CG scene could.
Live action was stereo and 360 but there was no parallax. If you wanted to see around a thing…you couldn’t. There are some key exceptions such as 8i which have managed to capture video from multiple cameras and then allow a live action playback with head tracking, parallax and the full six degrees of motion, thus becoming dramatically more immersive. However, 8i is a specialist rig which is effectively a concave wall or bank of cameras around someone, a few meters back from them. The new Immerge from Lytro is different – it is a ball of cameras on a stick.
Lytro Immerge seems to be the world’s first commercial professional Lightfield solution for cinematic VR, which will capture ‘video’ from many points of view at once and thereby provide a more lifelike presence for live action VR through six degrees of freedom. It is built from the ground up as a full workflow: camera, storage and even a NUKE compositing to color grading pipeline. This allows the blending of live action and computer graphics (CG) using Lightfield data, although details on how you will render your CGI to match the Lightfield captured data are still unclear.
With this configurable capture and playback system, any of the appropriate display head rigs should support the new storytelling approach, since at the headgear end, there is no new format, all the heavy lifting is done earlier in the pipeline.
How does it work?
The only solution for dynamic six degrees of freedom is to render the live action and CGI as needed, in response to the head unit’s render requests. In effect you have a render volume. Imagine a meter square box within which you can move your head freely. Once the data is captured the system can solve for any stereo pair anywhere in the 3D volume. Conceptually, this is not that different from what happens now for live action stereo. Most VR rigs capture images from a set of cameras and then resolve a ‘virtual’ stereo pair from the 360 overlapping imagery. It is hard to do, but if you think of the level 360 panorama view as a strip, like a 360 degree mini-cinema screen that sits around you as a level ribbon of continuous imagery, then you just need to find the right places to interpolate between camera views.
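To make the idea of interpolating around the ribbon concrete, here is a minimal sketch in Python. It is not how Lytro or any particular rig actually does it: it assumes a hypothetical ring of evenly spaced cameras and a plain linear blend between the two cameras on either side of the direction you are looking. The function name `ring_blend_weights` and its parameters are made up for this illustration.

```python
import math


def ring_blend_weights(view_angle_deg, num_cameras):
    """Pick the two ring cameras bracketing a viewing direction.

    Hypothetical helper: assumes camera i points outward at
    i * 360 / num_cameras degrees. Returns
    ((left_index, left_weight), (right_index, right_weight)),
    where the two weights sum to 1.
    """
    if num_cameras < 2:
        raise ValueError("need at least two cameras to interpolate")
    spacing = 360.0 / num_cameras          # angular gap between neighbours
    angle = view_angle_deg % 360.0         # wrap negative or >360 angles
    position = angle / spacing             # fractional camera index
    base = math.floor(position)
    left = int(base) % num_cameras
    right = (left + 1) % num_cameras       # wraps from last camera to first
    right_weight = position - base         # 0 at left camera, 1 at right
    return (left, 1.0 - right_weight), (right, right_weight)


# Looking halfway between camera 0 and camera 1 on an 8-camera ring.
print(ring_blend_weights(22.5, 8))
```

A real system would warp the images using depth or optical flow before blending, rather than simply cross-fading them; the sketch only shows where on the ring the virtual view sits.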
Of course, if the cameras had captured the world as a nodal pan there would be no stereo to see. But no camera rig does this – given the physical size of cameras all sitting in a circle… a camera to the left of another sees a slightly different view and that offset, that difference in parallax, is your stereo. So if solving off the horizontal offset around a ring is the secret to stereo VR live action, then the Lytro Immerge does this not just around the outside ring but anywhere in the cube volume. Instead of interpolating between camera views it builds up a vast set of views from its custom lenses and then virtualizes the correct view from anywhere.
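The point that the offset between neighbouring cameras ‘is your stereo’ can be put in numbers with the standard pinhole stereo relation: disparity equals focal length times baseline divided by depth. The sketch below assumes two idealized, rectified pinhole cameras; the function name `disparity_px` and the sample numbers are purely illustrative and have nothing to do with the Immerge’s real optics.

```python
def disparity_px(baseline_m, depth_m, focal_length_px):
    """Horizontal shift (in pixels) of a point between two side-by-side views.

    Illustrative only: assumes idealized, rectified pinhole cameras.
    baseline_m      -- distance between the two camera centres, in meters
    depth_m         -- distance from the cameras to the point, in meters
    focal_length_px -- focal length expressed in pixels
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_length_px * baseline_m / depth_m


# A nodal pan has no offset between views, so there is no stereo at all.
print(disparity_px(0.0, 2.0, 1000.0))

# Cameras offset by roughly an eye-width see a nearby object shift far more
# than a distant one, which is the parallax cue the viewer reads as depth.
print(disparity_px(0.065, 2.0, 1000.0))
print(disparity_px(0.065, 20.0, 1000.0))
```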
Actually it even goes further. You can move outside the ‘perfect’ volume, but at this point it will start to not have previously obstructed scene information. So if you look at some trees, and then move your head inside the volume, you can see perfectly around one to another. But if you move too far there will be some part of the back forest that was never captured and hence can’t be used or provided in the real time experience. In a sense you have an elegant fall off in fidelity as you ‘break the viewing cube’.
VR was already a lot of data, but once you move to Lightfield capture it is vastly more, which is why Lytro has developed a special server, which will feed into editing pipelines and tools such as NUKE and which can record and hold one hour of footage. The server has a touch-screen interface, designed to make professional cinematographers feel at home. PCmag reports that it allows for control over camera functions via a panel interface, and “even though the underlying capture technology differs from a cinema camera, the controls—ISO, shutter angle, focal length, and the like—remain the same.”
Doesn’t this seem like a lot of work just for head tracking?
The best way to explain this is to say, it must have seemed like a lot of work to make B/W films become color…but it added so much there was no going back. You could see someone in black and white and read a good performance, but in color there was a richer experience, closer to the real world we inhabit.
With six degrees of freedom, the world comes alive. Having seen prototype and experimental Lightfield VR experiences all I can say is that it does make a huge difference. A good example comes from an experimental piece done by Otoy. Working with USC-ICT and Dr Paul Debevec they made a rig that effectively scanned a room. Instead of rows and rows of cameras in a circle and stacked on top of one another virtually, the team created a vast data set for Lightfield generation by having the one camera swung around 360 at one height – then lifted up and swung around again, and again all with a robotic arm. This sweeping meant a series of circular camera data sets that in total added up to a ball of data.
Unlike the new Lytro approach, this works only on a static scene, a huge limitation compared to the Immerge, but still a valid data set. This ball of data is however conceptually similar to the ball of data that is at the core of the Lytro Immerge, but unlike the Lytro this was an experimental piece and as such was completed earlier this year. What is significant is just how different this experience is from a normal stereo VR experience. For example, even though the room is static, as you move your head the specular highlights change and you can much more accurately sense the nature of the materials being used. In a stereo rig, I was no better able to tell you what a bench top was made of than looking at a good quality still, but in a Lightfield you adjust your head, see the subtle spec shift and break up and you are immediately informed as to what something might feel like. Again, spec highlights seem trivial but they are one of the key things we use to read faces. And this brings us to the core of why the Lytro Immerge is so vastly important: people.
VR can be boring. It may be unpopular to say so but it is the truth. For all the whizz bang uber tech, it can lack storytelling. Has anyone ever sent you a killer timelapse show reel? As a friend of mine once confessed, no matter how technically impressive, no matter how much you know it would have been really hard to make, after a short while you fast forward through the timelapse to the end of the video. VR is just like this. You want to sit still and watch it but it is not possible to hang in there for too long as it just gets dull – after you get the set up…amazing environment, wow…look around…wow, ok I am done now.
What would make the difference is story, and what we need for story is actors – acting. There is nothing stopping someone from filming VR now, and most VR is live action, but you can’t film actors talking and fighting, punching and laughing – and move your head to see more of what is happening – you can only look around, and then more often than not, look around in mono.
The new Lytro Immerge and the cameras that will follow it offer us professional kit that allows professional full immersive storytelling.
Right now an Oculus Rift DK2 is not actually that sharp to the eye. The image is OK but the next generation of headset gear has vastly better screens and this will make the Lightfield technology even more important. Subtle but real spec changes are not relevant when you can’t make out a face that well due to low res screens, but the prototype new Sony, Oculus and Valve systems are going to scream out for such detail.
Sure they’ll be expensive, but then an original Sony F900 HDCAM was $75,000 when it came out and now my iPhone does better video. Initially, you might only even think about buying one if you had either a stack of confirmed paid work, or a major rental market to service, but hopefully the camera will validate the approach and provide a much needed professional solution for better stories.
How much and when?
No news on when the production units will actually ship, and many of the images released for the launch are actually concept renderings, but the company has one of the only track records for shipping actual Lightfield cameras, so expectations are very positive about them pulling the Immerge off technically and delivering.
In The Verge, Vrse co-founder and CTO Aaron Koblin commented that “light field technology is probably going to be at the core of most narrative VR.” When a prototype version comes out in the first quarter of 2016, it’ll cost “multiple hundreds of thousands of dollars” and is intended for rental.
Lytro CEO Jason Rosenthal says the new cameras actually contain “multiple hundreds” of cameras and sensors and went on to suggest that the company may upgrade the camera quarterly.