What We Are Consuming丨2024 Vol.01

We’re excited to launch "What We Are Consuming", a new monthly column in our newsletter! Each edition will bring you curated insights from top VC investors, founders, and industry thought leaders.

Get ready to dive into our first edition, which will be featured in today’s news digest and next week’s issue. Stay tuned and enjoy!

00. TL;DR

1/ Thoughts on Next Major Breakthrough in GenAI

2/ Thoughts on AI Moat

01. Venture News

1/ Thoughts on Next Major Breakthrough in GenAI

Agents as the next major breakthrough. Large Language Models (LLMs) are approaching diminishing returns, raising questions about whether their progress can justify the substantial training costs. In response, leading LLM providers are increasingly focusing on agentic applications. Sam Altman, CEO of OpenAI, shared his perspective on Reddit, stating: “We will have better and better models, but I think the thing that will feel like the next giant breakthrough will be agents.”

Reflecting this shift, OpenAI is reportedly set to debut an AI agent in January that can autonomously control computers and execute tasks. Anthropic has already introduced computer-use capabilities through its upgraded Claude 3.5 Sonnet model, while Google is also gearing up for a similar product launch.

Sarah Tavel, General Partner at Benchmark, advises AI applications built on LLM APIs to prepare for increased competition from AI agents developed by the foundation model providers themselves. She notes that as the underlying LLMs grow more powerful, they will reach a point where they can power “drop-in” async AI workers that act like super-intelligent remote employees with code-creating superpowers. Enterprise customers may then face a pivotal choice: adopt specialized AI applications, or the AI agents offered by foundation model providers. The latter is likely to prevail, since those agents may integrate seamlessly with customers’ existing systems and deliver more automated functionality.

2/ Thoughts on AI Moat

Building an AI moat: go vertical, circumvent legacy software, and leverage proprietary data and evals. In the race to build defensible AI applications, several investors and thought leaders have shared insights on creating a strong moat. With competition intensifying due to the rapid iteration of technology and the dominance of foundation models, these strategies are more relevant than ever. From proprietary datasets to vertical specialization, here are some key takeaways:

  • Vertical AI opportunities

    Prominent VC firms like a16z, Triangle Tweener Fund, and Benchmark have highlighted the potential of vertical AI.

    As Sarah Tavel of Benchmark aptly noted: “The foundation model companies will inevitably focus on the big markets. It’s hard to imagine them developing a GTM strategy and packaged offerings for smaller (but still sizable) verticals that require more care and customization for less sophisticated customers.”

    Vertical AI startups can thrive by targeting these underserved markets, carving out niches where foundation models are unlikely to compete directly.

  • “Drink legacy software’s milkshake”

    Joe Schmidt, partner at a16z, coined this provocative phrase to emphasize a key strategy for AI startups: integrating deeply into customers’ existing workflows to capture and create proprietary datasets. The idea is simple: the more data customers generate and leave on your platform, the more loyal they become to your product.

    For example, AI voice sales and customer support tools can bypass legacy CRM providers by automating CRM data input from customer interactions. This eliminates manual input processes and enables startups to own the newly created data. A16z-backed startup 11x, which builds AI sales agents, exemplifies this strategy. By tackling the entire sales pipeline—from sourcing leads to managing CRM—11x circumvents legacy sales software’s data ownership and creates its own valuable database of potential clients.

  • Leveraging proprietary datasets through RAG and evals

    Another approach to leveraging proprietary datasets is using retrieval-augmented generation (RAG) or implementing a test-driven development framework, such as evals, during product development.

    Casetext, a startup acquired by Thomson Reuters for $650 million, demonstrates the power of evals in developing an AI legal assistant. Casetext begins by identifying user goals, such as legal research or document review. They then break these tasks into discrete steps, each supported by tailored prompts. For each prompt, they develop "gold standard" test cases representing ideal outputs for given inputs. These prompts are iteratively refined and tested until they achieve high accuracy, ensuring reliability and trustworthiness in the final product.

    Similarly, Giga ML reduced the error rate of its AI customer-support agent from 70% to 5% by applying OpenAI’s o1-preview model alongside evals. This rigorous testing framework allowed the team to cover 85% of edge cases in customer support, significantly improving usability and adoption.
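The gold-standard evals loop described above can be sketched in a few lines of Python. This is a minimal illustration of the general pattern, not Casetext's or Giga ML's actual code; the test cases and the stubbed model call are hypothetical stand-ins for a real LLM and a real gold-standard dataset.

```python
# Sketch of an evals loop: each prompt has "gold standard" test cases
# (ideal outputs for given inputs), and the prompt is scored against them.
# In practice, run_model would call a real LLM API; here it is a stub.

GOLD_CASES = [
    # Hypothetical gold-standard cases for a legal-research prompt.
    {"input": "Find cases on fair use of APIs", "expected": "oracle v. google"},
    {"input": "Cite the landmark contraception privacy case", "expected": "griswold"},
]

def run_model(prompt: str, user_input: str) -> str:
    """Stand-in for a real LLM call; returns canned answers for the demo."""
    canned = {
        "Find cases on fair use of APIs": "Oracle v. Google (2021)",
        "Cite the landmark contraception privacy case": "Griswold v. Connecticut (1965)",
    }
    return canned.get(user_input, "")

def score(prompt: str, cases: list) -> float:
    """Fraction of gold cases whose expected text appears in the model output."""
    hits = sum(
        case["expected"].lower() in run_model(prompt, case["input"]).lower()
        for case in cases
    )
    return hits / len(cases)

# Iterate on the prompt until accuracy clears a target threshold.
accuracy = score("You are a legal research assistant.", GOLD_CASES)
print(f"accuracy: {accuracy:.0%}")
```

The core design choice is that the gold cases, not ad-hoc spot checks, define "correct": any prompt revision is re-scored against the same fixed cases, so improvements on one step cannot silently regress another.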