S4C

We are rarely presented with something truly new and inventive, but the work of Minivegas is genuinely worthy of the term ‘groundbreaking’. For client S4C they have created a set of promos that automatically animate in realtime to the exact pitch of the presenter’s voice. Every promo is unique, every day. This fully automated solution is both technically and creatively impressive, and often a load of laughs.

08Mar/s4c/93_carpark
a still from an actual, dynamically generated promo

The promise is simple enough: every promo is different, every promo is unique, and everything is driven purely by the recorded script. Given that S4C is a bilingual broadcaster, each promo also reacts differently to the language and even the sex of the presenter.

To prove the point we made three sample promos for fxguide.com:

movielink(08Mar/s4c/lights_small.mov, Here is fxguide promo1)
movielink(08Mar/s4c/museum_small.mov, Here is fxguide promo2)
movielink(08Mar/s4c/snowdon_small.mov, Here is fxguide promo3)

The idents are all real scenes and not synthesized graphics. “The biggest challenge was to make things move to the voice, especially the animation,” explains Minivegas co-founder Luc Schurgers. The idents were shot on 35mm using an ARRI 535B over the space of five days. Post-wise, “we worked on this with 3 people. We spent 1 year on coding and R&D and a couple of months on 2D and 3D. A lot of time in 2D was spent on removing the cinch marks from the 1st day of shooting, caused by a dodgy magazine.”

The system is installed at the Welsh broadcaster S4C and can be controlled just like a Digital Betacam deck. “We wanted to make it easy for them to integrate, so instead of them adapting to our kit we made it act like something they are familiar with,” says Schurgers. The machine can operate as a server or be controlled completely via its UI. It could record, but it now actually goes out completely live. “It’s patched into the voice-over booth and goes on air straight away. The voice-over artists are allowed to use unconventional ways of delivering their lines, so that’s pretty good fun,” he adds.

The company Minivegas is a collective of live action and animation directors. They met at university, then moved to London and created their first video, for Plaid, while working long nights at Glassworks, a leading UK post house. They have done videos for Bloc Party, John Cale and Elektros, and were nominated for best new directors at the 2006 CADS. Key members worked on the ill-fated Sony Socratto project.

What makes the clips so amazing is the subject matter: each spot is a seemingly normal slice of life, such as someone changing a light bulb in a lighting shop – which is then brought to life, in real time, by the announcer’s voice.

08Mar/s4c/93_lightshop
Some promos, such as this lighting store ident, used HDR lighting compositing approaches

fxg: What is the destination target format of the idents?

Minivegas: They are played out at 16×9 PAL full-height anamorphic. Most of them are 20 seconds long, but the Ice cream one is 30 secs. Initially, we discussed having infinite, continuously playing idents. But we were using real filmed footage, so we stuck to a 20 second format. It’s also not too much of a challenge for the announcer – they can just treat them like a normal 20 second ident and say as little or as much as they want. All the idents stand on their own without voice input, but speech breathes life and a degree of strangeness into the scenes.

fxg: How long did it take to develop?

Minivegas: It took 13 months from concept to completion. A lot of time was spent researching and testing. In the beginning we had no clue how to tackle this job so we had to do a lot of tests and mock-ups. We can do this much quicker now.

fxg: What are your backgrounds?

Minivegas: Our backgrounds are VFX and programming. We both worked at a software development company called Nucoda. We worked on the Socratto project there and later on Film Master, but we were doing totally different things: Luc was a product specialist and Dan was an engineer.

After leaving Nucoda, Luc started making videos and animations and Dan started working on his own application. So before Dan joined Minivegas he had pretty much already written his own real-time compositing application. There was obviously a lot of material there that would be useful for the S4C job – video decoding, compositing, colour correction etc. Some of that knowledge became the base for the software. But apart from the open-source libraries we’ve used, the S4C application is written completely from scratch.

fxg: How many are there in total?

Minivegas: In total we made 10 idents, but each ident has quite a few different elements. Especially the Ice cream and Houses idents. Here are some more details on all 10 of them:

Houses:
This was the very first ident that we shot. We had just been commissioned for this job and were still working out how we were going to do it. As a result we went completely overboard. We created way too many audio-responsive elements, so the result was too busy and you couldn’t really see what was driven by the voice and music. We even created an audio sequencer that generated a unique music clip each time the ident was played. The sequencer also triggered a balloon rising from the bottom each time a certain frequency in the music was hit. In the beginning we had the rainbow, 4 types of balloons, seagulls and the grading of the houses hooked up to the voice. This was way too much and we had to strip things down considerably.
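To make the trigger idea concrete, here is a rough Python sketch of a frequency-band trigger of this kind. The sample rate, band, threshold and the `spawn_balloon` callback are all invented for illustration and are not Minivegas’ actual code:

```python
import numpy as np

SAMPLE_RATE = 48000            # assumed audio rate
BLOCK = SAMPLE_RATE // 25      # one PAL frame's worth of samples
BAND = (200.0, 400.0)          # hypothetical trigger band in Hz
THRESHOLD = 0.05               # hypothetical energy threshold

def band_energy(samples, lo, hi, rate=SAMPLE_RATE):
    """Mean spectral magnitude of one audio block inside a frequency band."""
    windowed = samples * np.hanning(len(samples))
    spectrum = np.abs(np.fft.rfft(windowed)) / len(samples)
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    mask = (freqs >= lo) & (freqs <= hi)
    return spectrum[mask].mean()

def update(samples, spawn_balloon):
    """Called once per frame: fire the callback when the band lights up."""
    if band_energy(samples, *BAND) > THRESHOLD:
        spawn_balloon()
```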

Cows:
We shot this in the very early morning behind the steel factory in Port Talbot. We had to do a serious amount of cow wrangling in order to get the cows to do what we wanted.

Again here there are various different backplates. The first one is the lonely cow that wakes up as soon as the announcer starts speaking. The cow is held in a loop and we’ve added tail movement and ear twitching back on top. Once the voice comes in, the cow’s head starts moving and looking around. The second one is the stretchy cow: a cow that stretches its body as it walks in front of the screen. The last one is a bit of an Easter-egg as it only plays every 25th play-out. In this one the steel factory walks off, as if it’s an Imperial Walker (from Star Wars), and the cows in the foreground freak out and run off. Putting in Easter-eggs is another advantage of dynamic idents, as you can tell the program what to do at certain times. On top of this ident we do a dynamic grade that is linked to the computer clock and gives a different result depending on what time of day the ident is played out.
movielink(08Mar/s4c/mv_s4c_cows-w_h264.mp4, Download the cows promo)
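The every-25th-play-out Easter-egg needs little more than a counter. A toy sketch, with invented plate names based on the description above:

```python
import random

class CowIdent:
    """Chooses a background plate per play-out; the names and interval
    follow the description above, the code itself is illustrative."""
    EASTER_EGG_EVERY = 25

    def __init__(self):
        self.playouts = 0

    def next_plate(self):
        self.playouts += 1
        if self.playouts % self.EASTER_EGG_EVERY == 0:
            return "walking_steel_factory"          # the rare Easter-egg plate
        return random.choice(["lonely_cow", "stretchy_cow"])
```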

Lighthouse:
As you can see in the movement of the grass, there was a very strong wind on the day of shooting. It was filmed on location at South Stack in gale force 6 conditions. This is the time when seagull shit can become lethal, and we experienced that first hand. It flew everywhere. We had to stabilize all the plates and add the poles and the wires. We created a physical model for the wires that reacts to gravity, air friction and wire friction. The amplitude of the voice ripples through the wires into the distance.
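A wire model like that can be sketched as a damped 1-D wave over a chain of points, with the voice amplitude injected at the near end so it ripples into the distance. This is only a guess at the flavour of their model, with invented constants:

```python
import numpy as np

class WireRipple:
    """Damped wave over a chain of wire points; the voice drives the near end."""
    def __init__(self, points=64, coupling=0.25, damping=0.96):
        self.y = np.zeros(points)   # vertical offset per point
        self.v = np.zeros(points)   # velocity per point
        self.coupling = coupling    # how strongly neighbours pull on each other
        self.damping = damping      # stands in for air and wire friction

    def step(self, voice_amplitude):
        self.y[0] = voice_amplitude                      # inject the voice
        pull = np.zeros_like(self.y)
        pull[1:-1] = self.y[:-2] + self.y[2:] - 2 * self.y[1:-1]
        self.v = (self.v + self.coupling * pull) * self.damping
        self.y += self.v                                 # ripple onward
        return self.y
```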

Museum:
Shot in the Museum of Wales on our favorite rectilinear wide angle lens. In this ident the power cord wires wiggle in response to the audio. We plotted the wires at specific points, painted them out and then ran voice data through the points to allow the software to draw the wires in real-time. We experimented with different physical models in order to make the wires both responsive and believable, and finally a real-time motion-blur algorithm was added to enhance the integration with the backplate.

Funfair:
Shot in an abandoned funfair in Rhyl. The problem we ran into with this one is that when you’re creating a real scene instead of something abstract, the viewer expects it to obey the laws of physics. Whereas a human animator can animate slightly ahead of an event, with real-time animation we never know what is going to happen until just after the event has happened. So in the animation you always need to cheat and pretend that you knew about it. You’ll always be slightly behind, and it’s a real balancing act to get this to look plausible.

Although this ident was based on a simple concept, the technical problems of compression schemes and data throughput made it a tricky one. From a technical point of view we had to cram 12GB of data into 2GB of RAM! We had shot the individual elements at high speed so that we could speed-ramp the footage. We ended up with 1500 frames per lift, plus the roto plates, that needed to play with completely random access, because there is no way you can stream that off the disk fast enough.

We had to create a custom compression scheme for this. We tried various open source ones, but the problem is that you can’t compress too much, otherwise it puts too much of a strain on the CPU. This took a couple of months to get right. We’re pretty pleased that this heavy ident now plays flawlessly on an old Dell laptop. Putting the lens flares back on top of the composite was also pretty tricky, as the client insisted on shooting right into the sun.
movielink(08Mar/s4c/mv_s4c_funfair-w_h264.mp4, Download the funfair promo)
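Whatever the details of their custom codec, the core idea is keeping every frame individually decodable in RAM, trading compression ratio against per-frame CPU cost. A toy stand-in using zlib:

```python
import zlib
import numpy as np

class FrameStore:
    """Holds a whole element compressed in RAM, one buffer per frame,
    so playback can jump anywhere without touching the disk."""
    def __init__(self, level=1):       # low level: cheap to decompress per frame
        self.level = level
        self.frames = []
        self.shape = self.dtype = None

    def add(self, frame):
        self.shape, self.dtype = frame.shape, frame.dtype
        self.frames.append(zlib.compress(frame.tobytes(), self.level))

    def get(self, index):
        """Random access: decompress just the requested frame."""
        raw = zlib.decompress(self.frames[index])
        return np.frombuffer(raw, dtype=self.dtype).reshape(self.shape)
```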

08Mar/s4c/Ice1
As the promos are procedural, entirely different solutions can be used in any one promo based on frequency

Ice-cream:
On this ident we did some heavy 3D and some serious compositing. It was a good exercise as we learned a lot about animation and how to interpret and affect pre-rendered animation in real time. Again here there were 3 different versions of the ident: the ice-cream van on tracks, one with a digger arm and a full-CG one that acts like a crab.

The final comps were rendered as one long sequence of 1500 frames and then we play back the full sequence in our software. The playhead is rocked between preset keyframes of the composite so that it looks like there is movement even when the announcer is taking a pause.
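The rocking playhead is simple to sketch; the rocking rate and the blending here are invented:

```python
import math

def playhead(frame_time, key_a, key_b, speech_level, rate=2.0):
    """Playhead position within a pre-rendered sequence: rock gently
    between two keyframes so the scene never looks frozen, and let the
    voice (0..1) push the head toward the far keyframe."""
    mid = (key_a + key_b) / 2.0
    span = (key_b - key_a) / 2.0
    rock = mid + span * math.sin(rate * frame_time)   # idle back-and-forth
    return rock + speech_level * (key_b - rock)       # voice biases it forward
```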

The big problem is that we’re dealing with pre-rendered animation and, on top of that, the real-time animation. It took a lot of testing to get something decent out of this. The animation stops at a different point depending on when the announcer stops speaking. The real-time composite also adds floor dents, clouds and dust. Appropriate sound effects are triggered in real-time: for example the caterpillar’s pitch changes when it moves faster, and you can also hear the sound of the impact of the claws on the floor.

We are now experimenting with importing proper 3d geometry and animating that in real-time via proper animation curves.

movielink(08Mar/s4c/mv_s4c_icecream-w_h264.mp4, Download the ice cream promo)

08Mar/s4c/Ice_2
As the promos are procedural, entirely different solutions can also be used in any one promo based on time of recording
08Mar/s4c/ice3
This promo uses 3D and live action and a host of inventive approaches

Lights:
This was shot in Cardiff at the department store Howells. We shot a clean plate and then, without moving things in the scene, we shot each light individually and calculated the spill by subtracting the individual lights from the unlit clean plate. Once we had created all the maps of the spill, we were able to add these light maps in any combination to the unlit clean plates. Again we created various different backplates, such as a walking security guard, sleeping security guard, man on a ladder, caretaker going up the ladder and a completely empty scene. For all the moving elements a hold-out matte was created so that the spill wouldn’t affect the action. It was a very boring shoot as each individual light had to be switched on and off. Fortunately there were some of those massage chairs in the shop that we extensively tested during the shoot!

movielink(08Mar/s4c/mv_s4c_lights-w_h264.mp4, Download the Lights promo)
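The relighting arithmetic itself is easy to express. A sketch with numpy, assuming floating-point plates; the hold-out handling follows the description above:

```python
import numpy as np

def spill_maps(clean_plate, lit_plates):
    """One spill map per light: lit plate minus the unlit clean plate."""
    return [np.clip(lit - clean_plate, 0.0, None) for lit in lit_plates]

def relight(clean_plate, spills, weights, holdout=None):
    """Add any weighted combination of light maps back onto the clean plate.
    `holdout` is a 0..1 matte keeping spill off the moving foreground."""
    frame = clean_plate.copy()
    for spill, weight in zip(spills, weights):
        contribution = weight * spill
        if holdout is not None:
            contribution *= (1.0 - holdout)
        frame += contribution
    return frame
```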

Wedding:
A lighting-based ident. We created light maps that could be added on top of various backplates with different actions. There are 6 variations in total, including different ways of picking up the glass and spilling it, and various dog actions.

The lights are actually moving and are not single frames. We have lasers, gobos and other animated lights in this ident. The action is excluded from the light spill with hold-out mattes. We had to roto all the humans and cheat certain light maps in order to create a realistic effect. We shot this in the Canton Liberal club. It’s a place with very cheap beer and a striking interior. As a matter of fact, not much set dressing had to be done for this ident.

08Mar/s4c/93_welding
The live action would sometimes take over a day to shoot to get enough material

Welding:
The welding and grinding ident is another lighting-based ident, but it differs in that the welding and grinding actions respond to the fluctuations, peaks and drops of the voice, and the animation of the lights is faster, more random and more irregular than in the other lighting-based ones. The workmen in the backplates go through a loop of the movements they would be making if they were welding. There were no sparks when they were doing the loops.

During the shoot we captured hundreds of welding and grinding sparks at different speeds, which we then keyed out and made into little loops. Each action had a start, middle and end. The middle bit is loopable, and once we put this back on top of the action it looks like they are welding. Again we used a rectilinear wide angle lens for this, so that we got a nice looking image out of an ugly welding shed.
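That start/loop/end structure maps naturally onto a tiny state machine. The frame ranges below are invented:

```python
class SparkClip:
    """Plays a keyed spark element: the start once, the loopable middle
    while the voice is active, then the end."""
    def __init__(self, loop_start=10, loop_end=40, last_frame=50):
        self.loop_start, self.loop_end, self.last_frame = loop_start, loop_end, last_frame
        self.frame = 0
        self.finishing = False

    def next_frame(self, voice_active):
        if not voice_active:
            self.finishing = True          # let the clip play out to its end
        self.frame += 1
        if not self.finishing and self.frame >= self.loop_end:
            self.frame = self.loop_start   # rewind to the loopable middle
        return min(self.frame, self.last_frame)
```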


Carpark:
Again we have a couple of different plates on this one. There is one with a van driving, another with buses etc. The cool thing here is that the van moves on top of the animated text and markings, and the cones fall over once a certain volume level is reached.

We had to do some serious clean-up work, as it was raining pretty badly on the day and the floor was very uneven. We had to remove all the wet patches and all the original writing so that we could recreate it in software. This ident was definitely the hardest one to crack, as we really wanted the markings to look realistic. The first thing we did was make a tool that would let you trace all the words and markings on the original frame. The tool then created a vector drawing from this by un-projecting the trace so that it fitted onto a square and was easier to see. The un-projected vectors could then be transformed and re-projected in a pleasing way, with anti-aliasing and hold-out areas where the water and holes were.

We now had an image that looked exactly like the original, but we could do anything we wanted to the text and the markings as we had them as vector information. We then analyze the parameters in the voice and use this information to modulate certain aspects of the vectors. It works a bit like an Illustrator file in which you can rotate and scale things, but here in our software the transformations happen in real-time.
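The un-project/re-project step they describe is essentially a planar homography. A self-contained sketch, with invented corner coordinates:

```python
import numpy as np

def homography(src, dst):
    """3x3 projective map sending four src points to four dst points (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def apply_h(H, pts):
    """Apply a homography to an (N, 2) array of points."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

tarmac = [(210, 420), (530, 415), (600, 560), (120, 570)]  # traced corners
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

unproject = homography(tarmac, square)        # film frame -> flat plane
reproject = np.linalg.inv(unproject)          # flat plane -> film frame

marking = np.array([[0.3, 0.5], [0.7, 0.5]])  # a marking on the flat plane
wiggled = marking + [0.0, 0.02]               # a voice-driven nudge
print(apply_h(reproject, wiggled))            # back into the scene
```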

The following stages happen in realtime on the gfx card:
analyze the voice > transformation of vectors > project vectors into scene > composite operations of projected vectors > add car > add cones > on top of backplate

fxg: Could you tell us how the workflow is structured for production of a new clip? How do you prepare?

08Mar/s4c/s4c_powerlines
a frame from an early test

Minivegas: At the start of the project we were very enthusiastic, but didn’t really have a clue how to actually execute the idents. Once the locations were locked down we sat down with Proud and S4C and made a huge list of ideas for locations, objects and animals that could be voice-reactive in each ident.

It was a fairly inclusive list, with entries such as “cow pooing with dung falling in time to voice”. I think there were 20-something pages. When we shot the first three idents we went a little crazy. The first ident had 6 different elements, some reacting to the voice, some to the music, some CG, some using filmed elements. The music was generated randomly via our custom-made sequencer… it was pretty ambitious. But the voice-reactivity was somewhat drowned out. So for the final seven idents, we stripped things down to just one strong reactive visual element for each ident, and a selection of background action plates for variation and mood.

The first three shoots were a case of “get as many cool-looking plates as possible”. Nobody really knew what to expect and we didn’t really know how we were going to do this. By the time we got to the second set of shoots, we were more surgical. We did animation tests before the shoot and we had a pretty clear idea of what we needed to make the clips work.

The most involved shoots were the lighting-based ones – the Lighting Store, the Welding Shop and the Wedding DJ. They involved turning hundreds of lights on and off, trying to keep the background conditions as fixed as possible. With the Welding Shop, we needed to get enough footage of each welder so that we could comfortably build a loop. These idents took most of a day to shoot. Some were a lot simpler. Some idents involved placing 3D elements in the scene – the Lighthouse, the floor-polisher wires in the Museum and the Car Park. We took plenty of measurements for these, scrubbed out the original elements from the plates and then just worked at it until we got code which was fast enough to replace those elements in real-time and still look realistic.

fxg: Could you tell us how the station makes a new promo?

Minivegas: We’ve built a box that behaves like a digi deck and installed it at the broadcaster’s office. It can communicate with their scheduling system, switch audio feeds from a number of sources, and output synchronized digital video and audio using broadcast-standard signals. It’s a turnkey system that can be cued via their IBIS scheduling system or manually via the UI on the machine. They can select what ident they want to play and what elements they want to sync to the voice.

This was actually quite a tough part of the job. You can imagine there’s been a large team of people focused on the creative aspects of the idents; planning, production, grading, compositing, writing the code for each ident and getting it signed off. But after all that, we still needed to deliver something that could sit in a broadcast environment, be stable and run without crashing, have a user interface that’s easy enough for the presentation team to use, and a technical interface that’s extensive enough for the engineers to use. Getting a piece of software to that point where you can deliver it and leave it somewhere without blowing up takes a long time and a lot of effort.

System
HP xw8400 Workstation, Dual-Core Intel® Xeon® processor 5110 1.60 GHz 4 MB L2 cache 1066 MHz front side bus with SATA array and 2GB RAM
Video: NVIDIA Quadro FX 4500 SDI 512MB Memory PCI Express
Audio: The Marian TRACE D4 SRC with 4 AES/EBU stereo inputs and 4 stereo AES/EBU outputs (balanced XLR)

fxg: How have they been received by the audience?

Minivegas: Not too sure really… We’ve been told that the initial reports suggest they have been well received. That said, it is early days, as the idents are being played out gradually in and around their previous idents. We’ve had incredibly good feedback from the industry though. We’ve been swamped with odd interactive projects from various different advertising agencies, so that’s been pretty good for us.

fxg: Does the audience ever fail to notice?

Minivegas: The idents have been specifically designed to surprise. For example, with a female announcer the reactive element can be very subtle – then you see the same piece with a more pronounced male voice and the elements appear to go wild. At that point the viewer may just go: did that just move with the voice? English and Welsh also affect the animation of the elements completely differently, as their rhythmical qualities are completely different. This was a big selling point for the channel, as they are a bilingual broadcaster. The idea is that these idents are subtle and that the viewer gets a feeling of surprise at the point that they do notice the visuals reacting to the voice; this may be the first viewing, but equally it may be the fourth.

fxg: How does the imagery connect with the audience? Most station idents don’t film car parks and cleaners.

Minivegas: The concept is about showing everyday scenes of Wales in a new light. The strategy was clear: younger viewers see their Wales remixed in a new and vibrant way, and older audiences see an inner vision of the real Wales they know and love. The carpark, for example, has the famous Mount Snowdon in the background, and the cleaners are in the reception of Cardiff’s National Museum, so to Welsh viewers these places are very significant.

08Mar/s4c/p3
Some approaches started with 35mm film

fxg: Is the use of “a moment of everyday life shot wide angle” a theme that might expand?

Minivegas: Life seems sweeter through a rectilinear wide angle lens so who knows 😉 We have just started a project for a different client that involves moving cameras and cuts. This is quite cool as we’re able to move away from the static wide shots.


08Mar/s4c/p4
Minivegas guys joking around on set

fxg: How many frames are there in each promo package to draw on to make the final promo – space must have been an issue?

Minivegas: Yep, this was a huge challenge. We were developing on a machine with 2GB of RAM and we also wanted the application to run on our laptops, so there were certainly some technical limitations. For example, the Ice Cream van and scissor lifts involved cramming over 12GB of slow-motion footage into 2GB of RAM, then compositing 4 or 5 streams of video in real-time. We devised several cunning compression and playback schemes until we finally managed to crack it.

Most of the other idents stream an uncompressed AVI as backplate and holdout matte and on top of that we draw geometry or we play back compressed elements.


fxg: Could you discuss the HDR for the lighting shop promo and the bit depth of the files for output?

Minivegas: We generated HDRI lighting maps in the Open EXR file format from the log-density film footage we shot around Wales.

We shot a clean plate and then, without moving things in the scene, we shot each light individually and calculated the spill by subtracting the individual lights from the unlit clean plate. Once we had created all the maps of the spill, we were able to add these light maps in any combination to the unlit clean plates. These 16-bit floating-point image files (in some cases over 50 of them) were then applied on top of the backplates with various animation curves to make the lighting scenarios come alive. Since we’re dealing with live action footage, you need to store, shift and calculate HUGE amounts of data. For every frame, the software needs to quickly analyze the audio data, read in the backplate from disk, decompress and animate the foreground elements and then finally composite everything, all within 4 hundredths of a second. Again, in this clip we had various different backplates (walking security guard, sleeping security guard, man on ladder, concierge going up the ladder, empty scene). For all the moving elements a hold-out matte was created so that the spill wouldn’t affect the action.

fxg: It must have been hard to animate for procedural production?

Minivegas: One of the challenges of live action as opposed to graphics – something we didn’t realize at the start – was moving things believably. If you’re creating a real scene then the viewer expects it to obey the laws of physics. Graphically you’d be free to do anything, but the real world imposes many constraints – how fast things can move, how they move and so on. Initially we were worried about pulling off realistic composites in realtime, but by the end the animation had proved equally daunting.

For example, if a forklift responds to a loud cough, a human animator can animate slightly ahead of the event, giving the forks the momentum to rise to the top at the exact time of the cough. With real-time animation, you don’t know about the cough until just after it’s happened. If the forks move to the top instantly, it looks wrong. You need to cheat and pretend you knew about it, rising as fast as you can. You’ll always be slightly behind, and it’s a real balancing act to get this to look plausible.
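As a sketch, this “always slightly behind” animation is a clamped chase toward wherever the voice says the element should already be; the per-frame step limit is the balancing act they mention:

```python
def chase(current, target, max_step):
    """Move toward the target as fast as still looks plausible."""
    delta = target - current
    return current + max(-max_step, min(max_step, delta))

# The forks react to a cough one frame late, then rise as fast as allowed.
height, max_rise = 0.0, 0.08          # invented units, 25 fps
for loudness in [0.0, 0.0, 0.9, 0.9, 0.4, 0.1]:
    height = chase(height, loudness, max_rise)
    print(round(height, 2))           # lags the cough, then catches up
```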

fxg: How were the shots graded?

Minivegas: The first shots we graded at a big London post company, but that was wrong, as we couldn’t do much to the shots afterwards – there was little latitude left to process the graded footage. Grading typically happens at the very end of the post-production pipeline, but we were this extra element that happened before it. So after the second shoot, we just grabbed all the data we could from the film scans.

The grade is then applied in the software, in real-time, using our basic colour grading setup. This let us do things like darken or lighten the sky in response to the voice. In one ident we use the computer clock to give a dynamic grade to the sky: when it’s played in the morning it’s graded blue-ish, in the evening red-ish, and various shades in between.
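A minimal version of such a clock-driven grade, interpolating RGB gains across the day (the actual colours and curve are guesses):

```python
import datetime

def sky_gain(now=None):
    """Blend from a blue-ish morning gain to a red-ish evening gain."""
    now = now or datetime.datetime.now()
    t = (now.hour * 60 + now.minute) / (24.0 * 60.0)   # 0..1 through the day
    morning = (0.85, 0.95, 1.20)                       # blue-ish RGB gain
    evening = (1.20, 0.95, 0.85)                       # red-ish RGB gain
    return tuple(m + t * (e - m) for m, e in zip(morning, evening))
```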

It also gave us the latitude to composite over 50 different lighting plates in one shot – they are loaded onto the computer as high-dynamic-range elements and modified in real-time.

fxg: Were items such as the power cords procedurally generated?

Minivegas: We plotted the wires at specific points in time, painted them out and then ran voice data through the points to allow the software to draw the wires in real-time. Then on top of this we added the voice responsiveness.
At first we drew the wires using proper physics, but the physical models didn’t work – their response to the nuances in the voice was too slow. To solve this we did some research and came across a paper from Pixar on how they animated the Slinky toy in Toy Story. It was done in a clever way: aesthetically pleasing, but not physically correct. We did the same thing and basically multiplied a bunch of sine waves on top of each other, with the parameters of the voice affecting the sine waves.
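In the spirit of that trick, here is a sketch of voice-modulated stacked sine waves; the interview says “multiplied”, but a weighted sum gives the same non-physical, pleasing flavour, and all constants are invented:

```python
import math

WAVES = [(1.0, 2.0, 1.3),    # (amplitude, spatial frequency, speed)
         (0.5, 5.0, 2.1),
         (0.25, 11.0, 3.7)]

def wire_offset(x, t, voice_level, voice_pitch):
    """Displacement of the wire at position x and time t: aesthetically
    pleasing rather than physically correct, scaled by the voice."""
    y = sum(a * math.sin(f * x + s * t * (1.0 + voice_pitch))
            for a, f, s in WAVES)
    return voice_level * y
```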
During the shoot we shot some reference plates and we wiggled the wires ourselves. This way we had a good reference to work to. It quickly became apparent that we had to create some motion blur in order to integrate the wires into the scene. For this we created a real-time motion blur algorithm.
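One cheap way to get real-time motion blur on elements you draw yourself is to rasterise a few in-between positions and average them. In this sketch `draw_wire` stands in for the real renderer and the wire shapes are numpy-style arrays:

```python
def draw_blurred(draw_wire, shape_prev, shape_now, samples=4):
    """Average the wire drawn at sub-frame positions between its previous
    and current shape. `draw_wire(shape)` must return a float image."""
    acc = None
    for i in range(samples):
        a = (i + 0.5) / samples
        img = draw_wire(shape_prev * (1.0 - a) + shape_now * a)
        acc = img if acc is None else acc + img
    return acc / samples
```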

08Mar/s4c/p5
tests were done early and used during development

fxg: What was the development environment?

Minivegas: Writing code is like writing text. You write it once, it’s messy. Then you re-write it again and again, perfect it, chuck stuff out etc. The source code repository says the first line of code was checked in on June 1. There are about 50,000 lines of code in the final product, but there were many unrecorded revisions and throwaway prototypes before that. At the beginning, Dan took a lot of the code he’d written previously for his compositor and wrapped it up so it could be used from a scripting language.

We used Python to create prototypes for most of the idents using these wrapped modules. This allowed a very fast turnaround, with the ability to tweak things and try out new techniques easily. We did this as early on as possible, working with whatever we had available. So for Car Park, for instance, at the beginning we just took some photos of a car park. Then when the rushes came in we used them. Then when the film plates came in we used them. Finally we dropped in the polished final elements. So the coding process started way up front and followed the production and post process until the end.

A fair amount of time was taken writing tools – tools to process the scanned film files, tools to compress the elements. We even built a Quicktime compressor into the application so that we could quickly generate sample renders for client review.

As the prototypes were signed off, we re-coded them in C, inserting them into the final, broadcast-spec application. This was painful, but needed to be done for real-time performance. We don’t write in any “software” as such. Apart from Python, a C compiler, a text editor and a terminal, the only other components are a bunch of open-source libraries. We developed the software on Linux, but delivered it on Windows. It could run on a Mac fairly easily too – it’s all pretty agnostic.

At least half of the code was written to support the integration aspects of the project, so we’d really like to see an open platform for this kind of thing. Maybe a piece of software that handles all the integration aspects, the audio and video input/output etc. Then people could just drop in modules – a bunch of media and some code – so they could concentrate on the creative aspect. We could be making fully interactive ads!

Oh yeah. If somebody who knows their gfx coding ends up reading this and is excited by these types of projects, they should get in touch with us, as we could do with an extra pair of hands.
