The VFX and games industries have long sought to solve digital humans for everything from stunt double replacements to virtual characters. But as this technology expands and matures, other sectors are adopting it and using it across a wide variety of industries. Ex-ILM Kiran Bhat is CTO of Loom.ai, which was recently bought by Roblox. Ex-Weta Digital Mark Sagar formed Soul Machines, which produces digital humans for a range of industries from financial advisors to WHO health agents, and ex-ILM Hao Li is CEO of Pinscreen, which is extensively exploring digital humans, including building a range of virtual people for fashion e-retailer ZOZO in Japan.
The online fashion retailer (ZOZOTOWN) is part of Yahoo Japan Corp (Z Holdings Corp) following a 2019 $3.7 billion (400 billion yen) deal. As of 2018, ZOZO controlled around 50% of the market for mid to high-end online fashion e-commerce in Japan. The total Japanese online fashion space is estimated to be close to 1.8 trillion yen, according to Reuters. ZOZOTOWN established itself as a go-to, user-friendly website over ten years ago, at a time of scepticism about whether Japanese consumers would buy clothes online.
The company is now looking at an extensive rollout of digital humans across various use cases: as virtual influencers, clothing models, and shop assistants. While the team believes digital humans can increase traffic and user engagement, at its core the company believes they will reduce costs, and that is fundamentally what drives the business decision to move in this direction.
ZOZO has a strong history of innovation. The ZOZOSUIT was offered as a special marker-based body suit to allow computer vision clothes fittings that would let consumers order custom-fit clothes. The company has since scaled back that effort, but a shoe version, called ZOZOMAT, is still in use. Buying shoes online has always been challenging, as customers want to try on shoes before they buy for fear of choosing the wrong size. The free printed ZOZOMAT was launched 12 months ago and lets users scan their feet accurately in 3D with a smartphone app for precise shoe fittings. The scan has an audio user interface that guides the user through the scanning process. The ZOZOMAT generates a 3D model of the user’s foot with millimeter accuracy using numerous printed fiducial markers. Once the scan is complete, an interactive 3D model of the foot is produced along with many detailed measurements, including foot length, width, and girth. This technology was developed completely in-house. As an example of reducing costs, the ZOZOMAT has dramatically reduced returns of online shoe sales. Accurate and easy-to-use shoe measuring technology has reduced the company’s online shoe returns by around a third. Virtual try-ons hold the potential for major cost savings on more than shoes: Adweek estimates that online returns cost US businesses $550 billion a year. Items bought online are returned 25% of the time versus 8% for in-store purchases. The most returned goods, in addition to shoes, are clothing and accessories. According to CNBC in 2019, online returns of these goods in the US market can run as high as 30% to 40%, and both categories have the potential to be virtually ‘tried on’ before buying.
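The return figures quoted above can be put side by side with a little arithmetic. The sketch below is illustrative only: it assumes the general 25% online return rate as a baseline (ZOZO’s actual shoe return rate is not given in this article) and applies the roughly one-third reduction attributed to the ZOZOMAT. The function and constant names are hypothetical.

```python
def rate_after_reduction(baseline_rate: float, relative_reduction: float) -> float:
    """Return rate that remains after cutting a baseline rate by a relative fraction."""
    return baseline_rate * (1.0 - relative_reduction)


def returns_per_orders(rate: float, orders: int = 1000) -> float:
    """Expected number of returned orders for a given return rate."""
    return rate * orders


# Figures quoted in the article.
ONLINE_RATE = 0.25      # items bought online are returned 25% of the time
IN_STORE_RATE = 0.08    # versus 8% for in-store purchases
# "Around a third" reduction in shoe returns; applying it to the general
# 25% online rate is an ASSUMPTION made purely for illustration.
ZOZOMAT_REDUCTION = 1 / 3

if __name__ == "__main__":
    reduced = rate_after_reduction(ONLINE_RATE, ZOZOMAT_REDUCTION)
    scenarios = [
        ("online (baseline)", ONLINE_RATE),
        ("online (one-third fewer returns)", reduced),
        ("in-store", IN_STORE_RATE),
    ]
    for label, rate in scenarios:
        print(f"{label}: {returns_per_orders(rate):.0f} returns per 1,000 orders")
```

Under these assumptions, a one-third cut takes 250 returns per 1,000 orders down to about 167, which is still roughly double the in-store figure of 80.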
Kazuma Takahashi is the VP of the Innovation Division at ZOZO Technologies, and Yudai Tamamura is the in-house Creative Director driving and directing the project. The mission of ZOZO Technologies is to provide operational services and to develop technology that will help improve the entire ZOZO Group, to ‘scientifically define and predict fashion’. The Technologies group is separate from the main e-commerce site and employs between 400 and 450 people, having doubled in size in the last two years alone. “The digital technologies group is basically the DevOps team for ZOZOTOWN... it is responsible for maintaining the platform, updating and developing those new features, such as textiles innovation, recommendation algorithms and also longer-term developments such as virtual humans or digital cloths fittings,” explains Takahashi. Before he joined ZOZO, Takahashi worked at Amazon, where he helped to launch Amazon Fresh Japan and worked in Japanese logistics. He had studied in the UK, then moved from fintech into broader e-commerce at key Paris-based startups, one of which led to him joining Amazon. “I learned a lot from working at Amazon. When I left, I moved to ZOZO Technologies last year and started talking with more and more top technology people, and I found digital humans really interesting and something I thought we should invest in and build,” he adds. The actual virtual humans team inside ZOZO works with outside tech startups such as Pinscreen in LA for the core digital human, AI algorithms, and innovations. The company is only now revealing its first public demonstration of this new form of retail innovation. Tamamura was also trained and educated in the West, having attended university in both Iowa and Hawaii.
Takahashi thinks e-commerce sites can do a much better job of appealing individually to their customers. Too many sites display clothes in isolation, much like an old-fashioned catalog. His vision is for consumers to be presented with clothes and accessories in an aspirational way, similar to how fashion magazines present their clothes, but with the models and settings personalized to the consumer. “When I want to buy shoes they should be presented using my personal preferences. The models and scenes should be based on what I like,” he explains. “When I see products, for example with shoes, I should see a cool Asian guy wearing those pair of shoes, running seaside or something, that maps to my highest conversion rate. That’s going to increase the customer’s engagement.”
Providing personalized digital humans also lets companies compete on more than just price. Increasingly, companies around the world are building brand apps as e-commerce destinations, with specialist services and innovations. The economic logic is compelling. When shopping in a brand-specific e-commerce app, the customer can browse a variety of items but stays within that one company’s bounded range. For that customer experience to be ‘sticky’, there need to be special features or individual aspects that keep the customer from drifting away. That could be an understanding of size and fit, but this alone is not enough, since many styles, such as t-shirts, can be worn fitted or equally oversized and loose. Of course, in fashion, the brand itself and the buzz around it are a major factor, and this is why ZOZO is also very focused on digital humans as social influencers.
There are already many digital influencers, such as Imma, Shudu Gram, Blawko, and others. Lil Miquela has her own clothing line and has appeared in brand campaigns alongside Bella Hadid, Millie Bobby Brown, and Steve Aoki. Carefully curated online virtual influencers, such as Lil Miquela, have allowed Brud to raise over $125M from investors. Virtual mentors such as Digital Deepak Chopra have helped the AI Foundation raise over $10M. Companies such as Didimo and Malivar Studios offer a technological pathway to creating one’s own influencers. But in the case of ZOZO, the team is being more insightful than just building a brand mascot that it hopes will trend on social media. ZOZO has teamed up with Pinscreen in LA, and together the companies are working on opening up the ZOZO digital humans to the public. While it is possible for any company to explore using digital humans as influencers, Pinscreen CEO Hao Li is working on building a gateway to support brand evangelists. At the heart of many fashion styles and decisions is not just the garment and the person wearing it, but how it is styled with other matching pieces of clothing and accessories. It is common for someone such as a sales assistant to remark that an item would look great when paired with something else, for example, “that shirt would look great with jeans and a blue jacket”. Personalized digital e-commerce environments can allow the mixing and matching of a range of items. ZOZO and Pinscreen are exploring not only algorithmically offering matching clothing, but also allowing regular people to build looks and ensembles and then post these to social media. This would allow people to build their own social media currency while orchestrating fully digital fashion shoots: mixing multiple pieces of clothing to their own taste, on models and in locations that suit the narrative of the fashion-forward social posts of ZOZO power users.
This would allow ZOZO to support other people’s social media activities and to benefit from the referral traffic generated. “I think the idea of having the ability to customize everything the way you want... and styling clothes in combinations for a specific person is a really interesting framework that will allow us to do basically anything in terms of like creating fashion models.”
Hao Li defines the direction of their virtual human fashion research as having three distinct components. First is the virtual fashion companion, “which is basically the virtual assistant that interacts with customers, based on various personalized preferences to make product recommendations,” he explains. Next is virtual try-on, the idea of people digitizing themselves to create their own avatars. Pinscreen researched this heavily two years ago, “but we have decided that we’re still a few years from fully being able for people to create an avatar at this quality that we want,” Li explains. The third is creating high-quality dedicated fashion models that “we spend more time digitizing and creating, but that have the ability to accurately model clothing.” A varied group of models of different ethnic backgrounds, genders, and ages is being explored to allow for proper customer segmentation and customer engagement. Pinscreen is producing a database of different avatars or digital agents, and each is designed for movement to really showcase how the fabrics in the clothes move and respond. All of the digital humans will have a common underlying structure and rigging, allowing for standardization, but most importantly they are based on digitizing real models and not just producing generic digital humans. The team is focused on producing extremely high-quality models with accurate and interesting faces. The Pinscreen team is using its proprietary PaGAN technology to get a faithful likeness.
As the models need to move, all of the digital humans are being created for real-time rendering and are being produced in the Unreal Engine. “Real-time is really important, so everything we do is in Unreal,” explains Li. Takahashi agrees that real-time viewing of how the fabric moves is critical, commenting that “Real-time is the ultimate goal, hundred percent it is super important.” For the cloth simulation to work, not only does the rendering need to be in real-time, but the digital simulation needs to accurately map to the actual garments. Interestingly, many if not all modern high street fashions are designed and tailored on computers, so the team is able to interface with a series of common formats with standardized forms, textures, and material weights. Many companies use CLO 3D, Marvelous Designer, and similar tools for ePattern delivery using the DXF-AAMA/ASTM file format developed by the American Apparel Manufacturers Association. ZOZO’s supply chain is therefore already digital, although it is unlikely that the data is perfectly formatted for digital human fittings. However, this data combined with machine learning is expected to be able to faithfully reproduce the correct impression of the real tailoring and stitching. The main addition that the team is implementing is a standardized fabric weight or cloth thickness.
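To give a feel for what interfacing with such pattern data can involve, here is a minimal sketch of reading an ASCII DXF file, where every value is preceded by a numeric group code on its own line. It assumes, for illustration only, that each pattern piece is stored as a DXF block whose name follows group code 2. The sample data, piece names, and helper functions are hypothetical, and this is not a description of ZOZO’s or Pinscreen’s actual pipeline.

```python
from typing import Iterator, List, Tuple


def dxf_pairs(text: str) -> Iterator[Tuple[int, str]]:
    """Yield (group_code, value) pairs from ASCII DXF text.

    ASCII DXF alternates lines: a numeric group code, then its value.
    """
    lines = text.splitlines()
    for code_line, value_line in zip(lines[0::2], lines[1::2]):
        yield int(code_line.strip()), value_line.strip()


def block_names(text: str) -> List[str]:
    """Collect the name (group code 2) of every BLOCK record."""
    names: List[str] = []
    in_block = False
    for code, value in dxf_pairs(text):
        if code == 0:
            # Group code 0 starts a new record; remember whether it is a BLOCK.
            in_block = value == "BLOCK"
        elif code == 2 and in_block:
            names.append(value)
            in_block = False  # take only the first name per block
    return names


# Hypothetical, heavily trimmed sample with two pattern pieces.
SAMPLE_DXF = "\n".join([
    "0", "SECTION", "2", "BLOCKS",
    "0", "BLOCK", "8", "1", "2", "FRONT_PANEL", "70", "0",
    "0", "ENDBLK",
    "0", "BLOCK", "8", "1", "2", "SLEEVE", "70", "0",
    "0", "ENDBLK",
    "0", "ENDSEC",
    "0", "EOF",
])

if __name__ == "__main__":
    print(block_names(SAMPLE_DXF))
```

A real pattern file would also carry the outline geometry, grain lines, and notches for each piece, which a cloth simulation would need in addition to the fabric weight or thickness mentioned above.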
“We use PaGAN to build a pipeline that allows us to go to individual brands and say, ‘Hey, if you want digital humans, this is the fastest way to create virtual influencers and to put them into scenes that you want.’” Each digital human has a background persona, so effectively every digital ‘employee’ has a persona matched to the ‘job description’, but they can then be art directed in the way that one would direct a real model on a shoot.
The Pinscreen team did a proof of concept with the ZOZOSUIT, but its subsequent research and development has moved away from needing any physical type of suit to build a body profile in the ZOZO ecosystem. The lead machine learning researcher is Dr. Cosimo Wei, CTO and pipeline architect of the ZOZO digital humans. Wei and Li worked together previously at the USC Viterbi School of Engineering, where Li also ended up running the USC ICT Vision & Graphics Lab. While Pinscreen is known for its machine learning algorithmic research, it also has great respect for the artistry of this type of work, and Aviral Agarwal is the project’s digital artist and technical art lead.
ZOZO continues to innovate across the company. Building on the ZOZOMAT, the company is now addressing skincare and cosmetics. ZOZOCOSME is ZOZO’s new platform for beauty and cosmetics, which will launch on ZOZOTOWN in March and will include over 500 curated Japanese and overseas brands. It is powered by ZOZOGLASS, a skin tone capturing device, and it aims to remove the traditional hurdles to buying cosmetics online. Roughly 70% of ZOZOTOWN’s active members are female, and ZOZOTOWN members’ average spend is 43,809 JPY / year (roughly US$420). ZOZOTOWN’s core customers are Gen Z (ages 16-24) and Millennials (ages 28-38), with exceptionally high engagement and a 78% repeat purchase rate for both groups. New technology and innovations are at the core of the company’s plans moving forward, and its digital human research is still only in its infancy for the multi-billion dollar company.