AI TV Reporters Invade Japan; OpenAI & Alibaba's High-Stakes LLM Showdown
November Week 4: Nov 26 - Dec 2
Hi friends 👋,
In this week’s edition of Coconut Capitalists, we’re diving into:
The Dawn of Japan’s AI Television Reporters
Alibaba’s Open-Source version of OpenAI’s #1 LLM
Quick-Fire Startup News from Around Asia
Startup of the Week with Shein, Singapore’s Top E-Commerce Powerhouse
Let’s get into it.
The Dawn of Japan’s AI Television Reporters
The Scoop
This week, the South Korean startup DeepBrain AI struck a multi-million dollar deal to bring AI Television Reporters to one of Japan's largest weather broadcasting companies - a company that reaches over 1.46 million viewers per week.
The deal, which kicks off in January 2025, highlights an emerging use case for AI that goes by several names: “B2B Deepfakes”, “Digital Twins”, and, most often, “AI Avatars”. Put simply, platforms like DeepBrain AI specialize in creating videos of photorealistic but fake humans that can talk & move just like real people.
Looking at DeepBrain AI’s funding history, they’ve raised over $44 million in venture capital (primarily from Korean & Japanese investors). Yet they’re far from the only venture-backed startup in this category: in the UK, a strikingly similar startup known as Synthesia has raised over $150 million; in the US, HeyGen (which is backed by Benchmark Capital, Conviction, Thrive Capital, & Elad Gil) has raised $74 million; and in Southeast Asia, Gan AI (backed by Peak XV) has raised $6 million.
But in a sea of AI companies, many of which burn more cash than they bring in as revenue, why are companies in the “B2B Deepfakes” category breaking out from the pack? Well, if we take a step back: large enterprises like Hyundai, T-Mobile, & Samsung (DeepBrain AI’s customers) buy from startups for one of two reasons. The first is to increase revenue (think products like Salesforce, Shopify, or Google Adwords that drive new customers), and the second is to save on costs (think AWS, UiPath, or Snowflake that replace expensive infrastructure). It’s really that simple.
With that in mind, let's jump back to the weather broadcasting example and do some back-of-the-envelope math to see why the broadcaster took the bait: AI Avatars offer significant cost savings. In Japan, the average salary for a TV weather presenter is $100,000 USD. Meanwhile, producing a 10-minute video using DeepBrain’s software costs only $24. Going a step further with the numbers: let’s say a TV anchor does 5 shows per week, at 10 minutes per show; that's 260 shows per year. 260 shows times $24 USD per show (the cost of DeepBrain AI) amounts to $6,240 USD per year, versus $100,000 USD per year for hiring a real person. Based on these rough calculations, adopting DeepBrain AI saves this weather broadcasting company about 94% in annual expenses.
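For anyone who wants to rerun the math with their own numbers, here's a minimal sketch of the calculation above. The figures are the ones quoted in this section; the function name and the 52-week year are my own assumptions for illustration.

```python
# Back-of-the-envelope comparison using the figures quoted in the text.
ANCHOR_SALARY_USD = 100_000   # annual salary of a human weather presenter (from the text)
COST_PER_VIDEO_USD = 24       # one 10-minute AI avatar video (from the text)
SHOWS_PER_WEEK = 5            # from the text
WEEKS_PER_YEAR = 52           # assumption: the anchor broadcasts every week of the year


def annual_savings(salary, cost_per_video, shows_per_week, weeks_per_year=WEEKS_PER_YEAR):
    """Return (shows per year, annual AI cost, savings as a percentage of salary)."""
    shows = shows_per_week * weeks_per_year
    ai_cost = shows * cost_per_video
    savings_pct = (salary - ai_cost) / salary * 100
    return shows, ai_cost, savings_pct


shows, ai_cost, savings_pct = annual_savings(
    ANCHOR_SALARY_USD, COST_PER_VIDEO_USD, SHOWS_PER_WEEK
)
print(f"{shows} shows/year, AI cost ${ai_cost:,}/year, savings of about {savings_pct:.0f}%")
```

Note that this only compares salary against per-video software cost; it leaves out things like studio time, editing, and whoever reviews the AI output before it airs.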
And in the eyes of the company, this AI broadcaster comes with many perks: they never get sick, never ask for a raise, and there’s the presumption that many Japanese viewers might still believe they're watching a real person...
The Technology
If we get very slightly nerdy for a second, we can break down just how these "B2B deepfake" videos are made. Essentially, it's a mix of three different types of AI models (or systems of models) all working in tandem: language, text-to-speech, & image-to-avatar. Let’s run through the broadcasting example:
First, we take an off-the-shelf LLM (like OpenAI GPT-4 or Llama 3) and prompt the model to write a broadcasting script based on today's weather data. This data can be autonomously pulled from the Bing, Google, or Baidu News APIs.
Second, we pass this LLM script into a text-to-speech model (like Eleven Labs or Speechify) that turns the script into a natural-sounding MP3 audio file.
Third, we take the audio file and a reference image of a real person (a full body photo), and make an API inference call against a large "Frankenstein" system of models to create our Virtual AI Avatar. This is essentially a mashup of "photo to animation" models, "lip movement" models (similar to Wav2Lip), and "motion generation" models that are typically based on a reference video (you can find many reference videos on Hugging Face). The final output is a "virtual twin" that looks and sounds just like the actual human.
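The three steps above can be sketched as a simple pipeline. To be clear, this is only an illustration of how the stages chain together, not DeepBrain AI's actual implementation: every name here (AvatarPipeline, fake_llm, fake_tts, fake_renderer) is hypothetical, and the three stages are placeholder functions standing in for real LLM, text-to-speech, and avatar-rendering API calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AvatarPipeline:
    """Chains the three stages: script -> audio -> avatar video."""
    write_script: Callable[[str], str]            # stage 1: LLM turns weather data into a script
    synthesize_speech: Callable[[str], bytes]     # stage 2: text-to-speech turns script into audio
    render_avatar: Callable[[bytes, str], bytes]  # stage 3: audio + reference image -> video

    def run(self, weather_data: str, reference_image_path: str) -> bytes:
        script = self.write_script(weather_data)
        audio = self.synthesize_speech(script)
        return self.render_avatar(audio, reference_image_path)


# Placeholder stages (assumptions for illustration only). In a real system each
# of these would wrap a call to a hosted model or vendor API.
def fake_llm(weather_data: str) -> str:
    return f"Good morning! Here is today's forecast: {weather_data}"


def fake_tts(script: str) -> bytes:
    return script.encode("utf-8")  # pretend these bytes are MP3 audio


def fake_renderer(audio: bytes, reference_image_path: str) -> bytes:
    return b"VIDEO:" + reference_image_path.encode("utf-8") + b":" + audio


pipeline = AvatarPipeline(fake_llm, fake_tts, fake_renderer)
video = pipeline.run("Tokyo, sunny, 18C", "anchor.jpg")
print(len(video), "bytes of (pretend) video")
```

Keeping each stage as a swappable function mirrors the point made above: the pieces are largely off-the-shelf, so a vendor can change the LLM or the voice model without touching the rest of the pipeline.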
Why it Matters
So is the weather broadcasting example just a one-off instance of AI replacing human jobs? Or is this real-life example foreshadowing the future of the entertainment industry at large? Will actors, singers, and influencers all just become AIs? Well, if our answer to this question relies solely on cost savings data, the answer is a resounding YES. Every industry wants 94% in cost savings; it’s a no-brainer.
But, if we steel-man the case against generative AI in entertainment, and propose that the broadcasting example is just a one-off, there are many compelling arguments to be made. One argument is that we can split the entertainment industry into two distinct categories: The first category is pure information consumption, like the weather, where the actual entertainment value of the person communicating isn’t incredibly relevant. In this case, the viewer just wants to obtain the information quickly & efficiently, to then move on with their day.
The second category (which many would argue is less likely to be disrupted by AI) is “parasocial” entertainment. This is where the relationship between the audience and the presenter (no! not that kind of relationship) goes beyond just the information being communicated. Here, the value lies in the emotional or personal bond that fans develop with real-life individuals. For example, if you’re a fan of Blackpink, it’s not just about the music; it’s about following their Instagram and TikTok updates, engaging with their personalities, and dreaming of seeing them live. This blend of online interactions and in-person experiences, such as attending a concert, creates a uniquely human connection that is challenging for AI to replicate.
It’s important to note that companies like DeepBrain AI, HeyGen, & Synthesia are currently making most of their revenue from B2B use cases, such as product demos & onboarding videos for new employees, and that most consumer-oriented use cases have yet to be demonstrated to a broad audience.
Alibaba’s Open-Source version of OpenAI’s #1 LLM
The Scoop
A well-known internal team at Alibaba, the International Digital Commerce Group, has released an open-source competitor to OpenAI's leading model, o1. And you can download this Alibaba model right now on Hugging Face! It's called Marco-o1, and its logo is a strawberry emoji ("🍓"). If you followed the OpenAI Twitter 'lore,' 'strawberry' was the internal codename for OpenAI's chain-of-thought-based LLM.
Okay, so what is Marco-o1? It's a fine-tuned LLM within Alibaba's family of models known as Qwen. Think of the Qwen family the same way you'd think of Anthropic's Claude models (Sonnet, Opus, and Haiku) or OpenAI's GPT models (3.5, 4, 4o, o1, etc.). Each model in a family makes different tradeoffs between inference speed and performance.
It’s worth noting that Marco-o1 uses a similar chain-of-thought mechanism to OpenAI's o1 model, where it breaks down complex reasoning tasks into smaller sequential steps. Essentially, this is where the model verbalizes the answer in each step back to itself, similar to how humans utilize their inner monologue to think through complicated problems step by step.
Now, it's important to highlight that this 7B Qwen fine-tune doesn’t (as of today) match the performance of OpenAI’s o1 model. This is primarily because it’s a much smaller model, with only 7 billion parameters. OpenAI hasn't disclosed o1's parameter count; the often-quoted 1.7 trillion figure is an unconfirmed estimate that originally circulated for GPT-4, but o1 is widely assumed to be far larger than 7B. Given this, we can assume that Alibaba released Marco-o1 as an example of what could be possible if the same fine-tuning technique were applied to a much larger & more sophisticated Foundation Model.
The Performance
For the ML researchers who read this newsletter (and anyone else who’s “AI curious”): in the performance evaluations for Marco-o1, the team was able to increase the accuracy rate from 85% to 92%. And these results came solely from the new in-house chain-of-thought fine-tuning approach. For more context, the evaluations were conducted on the MGSM LLM benchmark, which measures mathematical reasoning ability through word problems.
It’s worth noting that the 7-percentage-point improvement (in isolation) might not feel like a step-change in model capabilities. But given that this was simply a fine-tuning approach, and there was no overhaul of the architecture code, 7 points is actually quite significant.
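To put the jump from 85% to 92% in perspective, here's a quick sketch of three ways to read the same two numbers. The accuracy figures are the ones quoted above; the variable names and the three framings are just my own illustration.

```python
# Accuracy figures quoted in the text, in percent.
baseline_acc = 85
tuned_acc = 92

# 1) Absolute gain, in percentage points.
gain_points = tuned_acc - baseline_acc

# 2) Relative gain over the baseline accuracy.
relative_gain_pct = gain_points / baseline_acc * 100

# 3) Share of the baseline's wrong answers that the fine-tune eliminated.
error_reduction_pct = gain_points / (100 - baseline_acc) * 100

print(f"Absolute gain:   {gain_points} percentage points")
print(f"Relative gain:   {relative_gain_pct:.1f}%")
print(f"Error reduction: {error_reduction_pct:.1f}%")
```

The last framing is the one that makes the result look significant: going from 15% wrong to 8% wrong means nearly half of the baseline's mistakes disappeared.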
Why it Matters
There’s a healthy debate going on in the AI industry regarding open-source versus closed-source language models - especially as the industry begins to consolidate around a few large players that continue to keep their models quite secretive.
Just look at the US as a case study. There were previously dozens of large language model companies with valuations far into the billions, including Adept, Character AI, & Inflection, each competing fiercely against the others; now it’s down to xAI, Anthropic, and OpenAI as the big three. Technically Adept, Character AI, & Inflection aren’t dead (and still have a few 100x researchers at the companies), but they’re operating in a zombie-like state given that their founders have left. If you need a quick refresher on the undead, I’d recommend “Train to Busan”!
Anyways, the point is: as these AI companies become increasingly powerful, it’s crucial to maintain a path for startups & researchers to compete. Providing an open-source option for these groups increases competition, which will be positive for users in the long run. That's why publishing findings like Marco-o1 is critical for the industry.
If you don’t see a downside scenario of only closed-source models (I haven’t highlighted one so far), let’s paint a hypothetical picture: Imagine for a second you’re a founder that’s attempting to “make the world a better place”, and you create a startup that provides AI legal services. You decide to build your product on top of OpenAI’s Foundation Models (think GPT-4). At first glance, this seems like a perfectly fine strategy; and when your startup begins to gain momentum & eventually move into hyper-scale, you realize that you need to increase your OpenAI API rate limit (request more inference compute power) to serve all these new customers flowing in. But instead of a cheerful email reply from the support team, you receive a message stating: “Sorry your request has been denied…”
So what happened? Well, what you weren’t aware of is that OpenAI is actually negatively impacted by your startup’s success. And it’s not that OpenAI itself competes in your category - they don’t. But OpenAI’s Venture Fund does (yes, OpenAI does have a VC fund). And given that in our scenario the fund has a significant stake in a competing AI legal startup, they now see your startup as a threat to one of their portfolio companies.
I don't mean to pick on OpenAI specifically; the example is just a hypothetical, for illustration purposes only. But it demonstrates the point that, unfortunately, a platform company has the ability to throttle startups that are building on top of its infrastructure and its closed-source models. And conflicts of interest, like the one highlighted above, can bubble up to the surface when there’s a financial stake at risk.
There's nothing illegal about this (though some might argue it’s a little unethical). In fact, it's a very common practice in China, where platforms (think WeChat/Tencent, for example) act as kingpins and invest in a select few of their customers (e.g., Pinduoduo, Meituan, and DiDi). Tencent, for instance, invested billions of dollars in each of these startups - while also giving them exclusive distribution channels to millions of users.
Please don't come after me OpenAI friends - this was just for fun. A few of y'all do read this newsletter (and are close pals of mine), so let me off with a warning - haha!
🇯🇵 Japan News
The SoftBank Vision Fund announced its plan to buy $1.5 billion worth of shares from OpenAI employees by the end of December 2024. This comes just weeks after Masayoshi Son's SoftBank Vision Fund invested $500 million into OpenAI alongside Joshua Kushner's Thrive Capital (the firm led by Donald Trump's son-in-law's brother), Microsoft, Nvidia, and Khosla Ventures.
In other OpenAI news, this investment announcement came just a day after a group of artists leaked OpenAI's secretive video model "Sora". Now when we say leaked, this requires additional context. The actual model weights and architecture source code (aka the crown jewels) were not leaked. Instead, a group of artists who were initially given API access by OpenAI to provide feedback on the product decided to release the endpoints on a public internet forum. In a statement, the group claimed that the leak was an act of rebellion - that OpenAI has allegedly been leveraging its collaborations with the artist community solely for positive publicity and branding, despite the artists' strong belief that Sora will ultimately replace their jobs.
Bluesky, the social media app that Jack Dorsey (the founder of Twitter/X) originally helped launch, saw a 500% increase in Japanese user downloads during the month of November 2024.
It’s worth noting that, although Jack Dorsey was the original founder of Bluesky, he officially cut all ties to the app in May 2024. As a bit of history, Bluesky was developed through a small incubation within Twitter/X's San Francisco HQ before being spun out as a separate company. Moreover, in Japan, Twitter/X currently has 6.7 million DAUs, Threads has 1.5 million DAUs, and Bluesky has 500,000 DAUs.
🇨🇳 China News
ByteDance has filed a $1.1 million lawsuit against a 19-year-old intern who had previously worked as a software engineer at its Beijing HQ. Reportedly, the intern diverted computing resources from 8,000 GPUs, which were originally meant to train ByteDance's latest LLM. Why did a 19-year-old need 8,000 GPUs? Perhaps an AI to slide into his crush's TikTok DMs? Who knows - but hopefully nothing more sinister…
Jack Ma, the founder of Alibaba with a net worth of $25.6 billion, whose wife has been buying up much of the real estate within a 1-mile radius of my apartment in Singapore (thank you, Mrs. Ma...), has returned for a visit to Alibaba's Hangzhou HQ.
It's difficult to know the reason for the visit. Is Jack returning to Alibaba full-time, much like Sergey Brin's recent return to Google as a technical lead (I heard Sergey missed the free breakfast), or did Jack stop by simply to boost company morale, given that he's one of China's top entrepreneurial stars?
It's worth noting that his visit comes at a time when the Chinese government is trying to boost confidence in the country's private sector, as the world's second-largest economy faces an onslaught of upcoming challenges: including (many) venture capitalists moving their money out of China, Chinese entrepreneurs leaving to start companies in Japan & Singapore, as well as the recent geopolitical headwinds, in the form of import tariffs, that are being threatened by U.S. President-elect Donald Trump.
🇰🇷 Korea News
Seoul will be hosting two of the world's largest startup events over the next two weeks. The first is "COMEUP 2024," which will be hosted on December 11th-12th and is expected to draw over 70,000 attendees. In the words of the organizers, "The event is a global startup festival to meet early-stage startups as well as the people who are nurturing these early-stage startups towards success, including investors & corporate partners." The second event is "AI SUMMIT SEOUL 2024," which is geared towards AI Startups, Researchers, Professors, and anyone who is "AI Curious". This second event is taking place on December 10th-11th. Conveniently, both events are being held at the iconic COEX complex near the Gangnam District.
The programming & speaker lineup for both events is quite stellar. Many people get a bit icked out by conferences geared towards "networking only", but both of these events stand out by offering a strong focus on technical education for attendees. The speakers include Cristóbal Valenzuela, the CEO of RunwayML ($4 billion valuation), Peggy Johnson (an ex-Microsoft EVP & my former boss), as well as Sunghyun Park, the CEO of Rebellions ($700 million valuation).
🇹🇼 Taiwan News
TSMC's founder Morris Chang released a memoir on November 29th. The biggest news from the book was that more than 10 years ago, Morris Chang asked Nvidia's Jensen Huang to be the CEO of the Taiwanese chip giant, but Huang turned him down within 10 minutes.
This conversation between Chang and Huang occurred shortly after Chang had returned as the TSMC CEO (after Rick Tsai stepped down). Tsai served as TSMC CEO from 2005 to 2009 and is now the CEO of MediaTek, which is worth $61 billion USD.
🇸🇬 Singapore News
Figma, a $20 billion Silicon Valley darling that creates design software, is now suing Motiff, a Singaporean design startup, claiming it has stolen large chunks of Figma's proprietary source code.
This week, a closed-door hearing (where matters like witness details, confidentiality orders, and documents to be produced are discussed before a trial) was held at Singapore's High Court. In public court documents, Figma claims that Motiff's products don't merely "share similar interfaces and commands," but "perform identically" to previous versions of the Figma Platform.
A spokesperson for Motiff said that these accusations were "unfounded" and rejected any suggestions that it had copied Figma's source code. The spokesperson added that "Our code was developed in-house and independently of other organizations."
While it’s not publicly disclosed how much Figma is suing Motiff for in Singapore, looking across the pond to the US, court records show Figma is seeking compensation of at least $75,000. Huh, only $75,000 for a company you claim has stolen your source code? That's the price tag of a one-day corporate retreat for the Figma Engineering Team. Based on this low figure for what is such a serious allegation, it's likely we don’t have the full story, and many more documents are yet to be disclosed.
🇮🇳 India News
Andrew Ng, the world-renowned Stanford AI professor, who (fun fact) also spent much of his childhood living in Hong Kong and Singapore, has made his first venture investment in India via an AI healthcare company. This came in the form of a reported investment of $3 million through his $120 million venture fund, AI Fund II.
The company, Jivi AI, creates language models for India, with the goal of expanding into AI-powered co-pilots for doctors and consumers. And for a startup that has only been around a little over a year, their technology is incredibly impressive. For instance, their latest model scored an average of 91.65 on a healthcare leaderboard. In comparison, both OpenAI's GPT-4 and Google's Med-PaLM 2 scored in the low 80s.
Shein, Singapore’s Top E-Commerce Powerhouse
The Scoop
Many of us, especially those in the USA, have bought more clothes on Shein than we'd ever like to admit publicly. The Shein app is like the digital love child of TikTok and fast fashion: it resembles companies like H&M, Zara, and Forever 21, but offers even lower prices and on-trend styles, and instead of a brick & mortar retail experience, users make purchases with a few taps in an iPhone app. For the year 2024, the company is on track to hit $40 billion in revenue, yet for a company of its size, the business maintains a low profile. As just one example, Shein’s CEO has no social media & there’s only a single photo of him on the internet.
If we go back in history, Shein was founded in 2008 in Nanjing, China, as a small online retailer that sourced clothing from local wholesale markets in Guangzhou and resold the clothing on its online store at a 20-30% markup.
A turning point came in 2014, when Shein became large enough to acquire a smaller retailer that had its own sophisticated manufacturing and supply chain capabilities. This acquisition transformed the company from an e-commerce storefront selling "other people's stuff" into a vertically integrated online retailer.
The most drastic shift in Shein’s strategy came when it entered the US market in 2017, recognizing that it was American consumers who most wanted to refresh their style every week with what’s currently trendy (versus other markets, which prefer to reuse the same clothes for years at a time). With their built-up supply chain and manufacturing advantage, they put their foot on the pedal, adding $300 million in US revenue in just their first year.
Their advantage, the supply chain, is best illustrated through a real-life example of how the company can go from concept to production to clothing in the hands of consumers in a matter of days. Imagine the influencer Kim Kardashian attends the Oscars on a Monday night, wearing a trendy outfit that cost around $2,000. Now, just like magic, by Tuesday morning, a designer at Shein has already recreated the entire look, with a price tag of $40. And by Tuesday night, the outfit has been pushed out and is available for purchase on the Shein app.
Behind the scenes, Shein’s strategy is to initially launch new products in small batches of 100 to 200 units to gauge consumer interest. And once orders start pouring in, the company can quickly scale up production to meet the surging demand. This allows Shein to go from The Oscars to a Person’s Wardrobe in a matter of days.
Singapore Washing
It’s worth noting that in 2022, Shein moved its headquarters from Nanjing to Singapore, in a practice that many have dubbed "Singapore washing". This was meant to distance itself from China amid geopolitical tensions. During this time, Shein faced scrutiny over labor practices, environmental impact, and alleged copyright infringement, leading to calls for boycotts and bans in certain Western markets. By shifting its corporate base to business-friendly Singapore, an ally of both Western countries and China, Shein aimed to sidestep negative sentiment and improve its chances of accessing Western capital markets (via an IPO)…
Furthermore, in 2023, Shein confidentially filed for a US IPO, but SEC regulators intervened, citing alleged unethical labor practices in China. Shein quickly withdrew its IPO paperwork and shifted its sights to the London Stock Exchange, targeting a 2025 IPO timeline.
On another note, it’s important to highlight that in 2025, the Texas Stock Exchange (TSE), which is endorsed by Elon Musk, is planned to launch. The TSE will require less reporting to access capital markets, potentially providing a new method for global companies, including those in China and other markets, to achieve liquidity without the same level of paperwork and regulatory hurdles.
Bull Case
In 2024, Shein is estimated to have a gross revenue of $40 billion USD, with the USA alone representing over 35% of its sales. The company has gone global - with 7% of sales in Germany, 6% in France, and 5% in Japan.
One of Shein’s clear advantages is the company’s employment of a large number of computer science graduates as AI Engineers. These engineers build sophisticated AI sentiment models that scrape the internet for fashion trends, collecting thousands of unique data points every second.
This allows Shein to develop new SKUs almost instantaneously to meet consumer tastes. While Lululemon has 120-day lead times from design to manufacturing, Nike has 60 days, Zara has 21 days, and Shein averages 4 days or less. Despite many Western shoppers publicly stating their concern for sustainability and the environment on social media, Shein's large revenue percentage in these markets suggests that, for many, this could be a form of virtue signaling.
Bear Case
Shein's U.S. market, which contributes 35% of its projected $40 billion in 2024 revenue (~$14 billion), is under threat. Donald Trump's proposed round of tariffs, as high as 60%, could raise costs by billions annually. Additionally, closing the de minimis exemption, which allows imports valued under $800 to avoid duties, could add $100 million or more in taxes annually. These changes would severely erode Shein's price advantage, potentially jeopardizing its dominant position in the U.S. market.
Additionally, despite Shein's high revenue, its future profitability could suffer under a new round of tariffs. In 2023, Shein reportedly had gross margins of 35%, which is significantly lower than Zara's 57%. While the company reported $2 billion in net profit in 2023, the proposed 60% tariffs and the closure of the de minimis exemption could cause much of this profit to evaporate.
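To see how a 60% tariff could translate into "billions annually", here's a very rough sketch using only the figures quoted in this section. It leans on several loud assumptions that are mine, not Shein's: that the 2023 gross margin applies to 2024 revenue, that every item sold in the US is imported from China, that the dutiable value roughly equals cost of goods sold, and that none of the tariff is passed on to shoppers. Treat the output as an order-of-magnitude illustration, not a forecast.

```python
# Figures quoted in the text (USD billions unless noted).
REVENUE_2024_B = 40.0        # projected 2024 revenue
US_REVENUE_SHARE = 0.35      # US share of sales
GROSS_MARGIN_2023 = 0.35     # reported 2023 gross margin
TARIFF_RATE = 0.60           # top end of the proposed tariffs
NET_PROFIT_2023_B = 2.0      # reported 2023 net profit

us_revenue_b = REVENUE_2024_B * US_REVENUE_SHARE

# ASSUMPTION: dutiable import value is approximated by cost of goods sold,
# i.e. revenue minus gross profit. The real customs value could be quite different.
us_cogs_b = us_revenue_b * (1 - GROSS_MARGIN_2023)

# ASSUMPTION: the full tariff is absorbed by Shein rather than passed on in prices.
tariff_cost_b = us_cogs_b * TARIFF_RATE

print(f"US revenue:            ${us_revenue_b:.1f}B")
print(f"Assumed dutiable cost: ${us_cogs_b:.1f}B")
print(f"Tariff at {TARIFF_RATE:.0%}:         ${tariff_cost_b:.1f}B")
print(f"2023 net profit:       ${NET_PROFIT_2023_B:.1f}B")
```

Under these assumptions the tariff bill comes out larger than the 2023 net profit, which is why the choice between absorbing the cost and raising prices matters so much for the bear case.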
It’s worth noting that although Shein's HQ is in Singapore, the majority of its supply chain is in China and would therefore face pressure from Trump administration tariffs, threatening the company's entire business model.