Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louie

This week, we watched developments in the next generation of AI supercomputers at the Nvidia GTC and Broadcom AI in Infrastructure events. It was also another eventful week for leadership drama in the AI startup ecosystem: two of Inflection AI's co-founders and much of its team left to join Microsoft, and Emad Mostaque stepped down as Stability AI CEO.

Anticipation is heavy for the next generation of LLMs, such as GPT-4.5/5 or Gemini Ultra 2.0. To some extent, how far they progress comes down to the amount of AI compute available for a single training run. GPT-4 is widely believed to have been trained on 25,000 Nvidia A100s. GPT-5 will likely be trained on H100s, which offer roughly 3x more compute per GPU. However, whether OpenAI/Microsoft Azure has the capacity for a single training cluster of 50,000 or 150,000 GPUs remains unclear. Similarly, Google accelerated the deployment of TPUv5 chips (designed in partnership with Broadcom) in late 2023, and its Gemini 2.0 model is likely to take advantage of this. While it will not begin production until late this year, Nvidia's newly announced B100 series of GPUs will take capabilities even further: potentially 4x the training capacity of the H100, and over 20x for inference in some situations.

In this context, Broadcom's presentation of its custom AI chip and AI supercomputer infrastructure capabilities was also interesting: "Now, we know we can do 10,000, 20,000, 30,000, 60,000, 100,000 today, but there was also this consortium founded by Broadcom and a couple of others about two years ago.
And the idea is, let's actually take this to a million plus nodes." We think both custom AI chip and GPU training clusters are likely to scale to 1 million chips in the coming years; indeed, SemiAnalysis revealed that Microsoft and Google already have plans for "larger than Gigawatt class" training clusters.

Separate from scaling compute, it is also important for the next generation of models to continue scaling data and to deliver algorithmic improvements and breakthroughs. On this topic, Jensen Huang's comments at GTC on the next generation of models were also fascinating: he said they would include "insanely large context windows, state space vectors, synthetic data generation, essentially models talking to themselves, reinforcement learning, essentially AlphaGo of large language models, Tree Search." Debate remains strong in the AI community over exactly how intelligent LLMs really are and what the limitations of the transformer architecture will be. In the next year or two, we should see whether scaling compute hits a dead end and what reasoning capabilities can be introduced via the new methods above!

Why should you care?

The amount now being invested in LLMs is truly staggering, and may be a factor in the drama at AI startups. The choice of many at Inflection AI to leave for Microsoft, after raising over $1bn for the startup and only weeks after launching their new model, seems a strange one on paper. However, with Inflection 2.5 coming up short relative to GPT-4, and a new generation of AI chips soon needed to remain competitive, perhaps the prospect of competing with big tech budgets in foundation model development looked too daunting.
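As a back-of-envelope illustration, the relative training compute of the clusters discussed above can be sketched in Python. All multipliers are the approximate figures quoted in this newsletter (25,000 A100s for GPT-4, an H100 at roughly 3x an A100, a B100 at roughly 4x an H100), not measured benchmarks:

```python
# Rough comparison of training-cluster compute, expressed in "A100 equivalents".
# Assumptions (approximate figures quoted above, not measured benchmarks):
#   - GPT-4: ~25,000 A100s
#   - H100 ~= 3x an A100 for training
#   - B100 ~= 4x an H100 (so ~12x an A100) for training

A100_UNITS = 1.0
H100_UNITS = 3.0                # ~3x A100
B100_UNITS = 4.0 * H100_UNITS   # ~4x H100 -> ~12x A100

def a100_equivalents(num_gpus: float, units_per_gpu: float) -> float:
    """Total cluster compute in A100 equivalents."""
    return num_gpus * units_per_gpu

gpt4_cluster = a100_equivalents(25_000, A100_UNITS)

# Hypothetical cluster sizes mentioned in the text:
for name, n, units in [
    ("50k H100s", 50_000, H100_UNITS),
    ("150k H100s", 150_000, H100_UNITS),
    ("1M B100-class chips", 1_000_000, B100_UNITS),
]:
    ratio = a100_equivalents(n, units) / gpt4_cluster
    print(f"{name}: ~{ratio:.0f}x GPT-4's training compute")  # 6x, 18x, 480x
```

Even under these crude per-chip multipliers, a million-chip B100-class cluster would represent hundreds of times the compute behind GPT-4, which is why interconnect scale and gigawatt-class power planning dominate the discussion.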
In open-source AI, Emad Mostaque's departure as Stability AI CEO followed Mistral's surprise release of a closed model last month. Perhaps it is also a sign of AI venture capital investors putting more pressure on monetization, and of fears over the ability to compete with the growing AI budgets at big tech. In our view, this increasing centralization and these rising barriers to entry would be a concerning development. However, there is still plenty of room for startups and individuals to build vertical products on top of the latest foundation models and to work with smaller open-source models for specific applications.

– Louie Peters, Towards AI Co-founder and CEO

Building AI for Production (E-book): Available for Pre-orders!

https://www.amazon.com/dp/B0CYQKKJGP

Our book, 'Building AI for Production: Enhancing LLM Abilities and Reliability with Fine-Tuning and RAG,' is now available on Amazon for pre-orders. It is a roadmap to the future tech stack, offering advanced techniques in Prompt Engineering, Fine-Tuning, and RAG, curated by experts from Towards AI, LlamaIndex, Activeloop, Mila, and more. The e-book focuses on adapting large language models (LLMs) to specific use cases by leveraging Prompt Engineering, Fine-Tuning, and Retrieval Augmented Generation (RAG), and is tailored for readers with an intermediate knowledge of Python. It is an end-to-end resource for anyone looking to enhance their skills or dive into the world of AI for the first time as a programmer or software student, with over 500 pages, several Colab notebooks, hands-on projects, community access, and our AI Tutor. The e-book is a journey through creating production-ready LLM products, leveraging the potential of AI across various industries. Pre-order your copy now and take the first step in your AI journey!
Hottest News

1. OpenAI Is Expected To Release a 'Materially Better' GPT-5 for Its Chatbot Mid-Year, Sources Say

OpenAI is preparing to release GPT-5 around mid-year, offering significant improvements over GPT-4, particularly enhanced performance for business applications. Although the launch date is not fixed due to continued training and safety evaluations, preliminary demonstrations to enterprise clients suggest new features and capabilities, raising anticipation for GPT-5's impact on the generative AI landscape.

2. 'We Created a Processor for the Generative AI Era,' NVIDIA CEO Says

At the GTC conference, NVIDIA CEO Jensen Huang announced the NVIDIA Blackwell computing platform. The platform aims to advance generative AI with superior training and inference capabilities, and includes enhanced interconnects for better performance and scalability. NVIDIA also launched NIM microservices for tailored AI deployment and Omniverse Cloud APIs for sophisticated simulation, signaling a transformative impact on sectors like healthcare and robotics.

3. Stability AI CEO Resigns To "Pursue Decentralized AI"

Emad Mostaque has stepped down as […]