Author(s): Towards AI Editorial Team
Originally published on Towards AI.

What happened this week in AI by Louie

We are glad to say this was a week for open-source AI and small LLMs, with Meta's release of Llama 3 and Microsoft's announcement of Phi-3.

Llama 3 is a big win for open source and for cheap, fast smaller models, but it has some limitations: the company chose to focus the model on text only, the English language, and a shorter (8k) context window. Architecturally, Llama 3 is very similar to Llama 2; the key differences are a smarter and more aggressive training-data filter (including the use of Llama 2 as a data-quality classifier), 7x more data (now a massive 15 trillion tokens), and improved, scaled-up use of human feedback in fine-tuning. The breakthrough is a huge jump in capabilities and benchmark scores at small model sizes (8B and 70B parameters), and with it a huge jump in the capabilities of the best open-source models. The speed advantage of these smaller models will be particularly important for agent workflows, where latency per call can stack up.

Llama 3 8B and 70B can be run at home or fine-tuned to specific use cases (a minimal inference sketch follows below). They can also be accessed in the cloud, for example on Together.ai, at $0.20 and $0.90 per million tokens respectively, versus blended averages (assuming a 3:1 ratio of input to output tokens) of $0.75 for GPT-3.5-Turbo and $15 for GPT-4-Turbo; the first sketch below works through this arithmetic. Groq also serves Llama 3, with the 70B model at a blended $0.64 per million tokens and faster inference speed.

With Llama 3, we think the biggest gains relative to existing models likely come from better training-data filtering. Meta also chose to push hard on training-data quantity relative to model parameter count. This is a sub-optimal choice for training cost versus intelligence: it is very far from Chinchilla optimal, and more intelligence per unit of training compute would have come from extra parameters rather than extra training tokens (the second sketch below puts numbers on this). However, the choice is geared towards improved inference costs, creating a smarter small model that is cheaper to run.
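To make the blended pricing above concrete, here is a quick back-of-envelope sketch. The list prices are the per-million-token rates current as of publication and may have changed since; the 3:1 input-to-output token ratio is the assumption used for the averages above.

```python
# Blended price per million tokens, weighting input and output list prices
# by an assumed 3:1 ratio of input tokens to output tokens.
def blended_price(input_price: float, output_price: float, ratio: float = 3.0) -> float:
    return (ratio * input_price + output_price) / (ratio + 1)

# List prices in USD per million tokens at the time of writing.
print(blended_price(10.0, 30.0))  # GPT-4-Turbo ($10 in / $30 out) -> 15.0
print(blended_price(0.5, 1.5))    # GPT-3.5-Turbo ($0.50 in / $1.50 out) -> 0.75
```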
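The Chinchilla point is also easy to quantify. Hoffmann et al. (2022) found roughly 20 training tokens per parameter to be compute-optimal; a minimal sketch of where Llama 3 lands under that heuristic:

```python
# Tokens-per-parameter ratio for Llama 3 versus the ~20:1
# compute-optimal heuristic from the Chinchilla paper (Hoffmann et al., 2022).
CHINCHILLA_TOKENS_PER_PARAM = 20
training_tokens = 15e12  # 15T tokens, per Meta's announcement

for name, params in [("Llama 3 8B", 8e9), ("Llama 3 70B", 70e9)]:
    ratio = training_tokens / params
    print(f"{name}: {ratio:,.0f} tokens/param, "
          f"~{ratio / CHINCHILLA_TOKENS_PER_PARAM:.0f}x past compute-optimal")
# Llama 3 8B: 1,875 tokens/param, ~94x past compute-optimal
# Llama 3 70B: 214 tokens/param, ~11x past compute-optimal
```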
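And for readers who want to try the "run it at home" claim, a minimal inference sketch with Hugging Face transformers is below. It assumes you have been granted access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint on the Hub and have a GPU with enough memory (roughly 16GB for the 8B weights in bf16).

```python
# Minimal local-inference sketch for Llama 3 8B Instruct.
# Assumes access to the gated checkpoint has been granted on the Hub.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)

# Format a single chat turn with the model's own chat template.
prompt = pipe.tokenizer.apply_chat_template(
    [{"role": "user", "content": "In two sentences, what changed between Llama 2 and Llama 3?"}],
    tokenize=False,
    add_generation_prompt=True,
)
print(pipe(prompt, max_new_tokens=128, do_sample=False)[0]["generated_text"])
```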
Microsoft's release of Phi-3 in 3.8B, 7B, and 14B sizes posts even more impressive benchmark scores relative to model size. The models were trained on highly filtered web data and synthetic data (3.3T to 4.8T tokens), going even further down the path of prioritizing data quality. We await more details on the model release, real-world testing, and confirmation of whether it is fully open source.

[Chart: Current costs and key KPIs of leading LLMs. Source: Towards AI, company websites.]

Why should you care?

When choosing the best LLM for your application, there are many trade-offs and priorities to weigh. Superior affordability and response speed generally come with smaller models, while intelligence, coding skill, multi-modality, and longer context windows are usually things you pay more for with larger models. We think Llama 3 and Phi-3 will change the game for smaller, faster, cheaper models and will be a great choice for many LLM use cases, particularly since Llama 3 is open source and flexible: it can be fine-tuned and tailored to specific use cases.

It is incredible how far we have come with LLMs in less than two years! In August 2022, the best model available was OpenAI's davinci-002 at $60 per million tokens, scoring 60% on the MMLU test (roughly 16k questions across 57 tasks, with human experts at 89.8%). Now, Llama 3 8B costs an average of $0.20, 300x cheaper, while scoring 68.4% on MMLU. Meanwhile, the most capable models (GPT-4 and Claude 3 Opus) are at 86.8% on MMLU, are multimodal, and have 50–100x larger context lengths. There is now a large number of models that are competitive for particular use cases, and we expect this to accelerate innovation and adoption of LLMs even further.

– Louie Peters, Towards AI Co-founder and CEO

Hottest News

1. FineWeb: 15 Trillion Tokens of High-Quality Web Data
The FineWeb dataset consists of over 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl snapshots between 2013 and 2024. Models trained on FineWeb outperform those trained on RefinedWeb, C4, Dolma v1.6, The Pile, and SlimPajama. It is accessible on Hugging Face (a minimal loading sketch appears at the end of this issue).

2. Meta Introduced Meta Llama 3
Meta has launched Llama 3, the newest addition to its Llama series, accessible on Hugging Face. It is available in 8B and 70B versions, each with base and instruction-tuned variants featuring enhanced multilingual tokenization. Llama 3 is designed for easy deployment on platforms like Google Cloud and Amazon SageMaker.

3. Mistral AI Launched Mixtral 8x22B
Mistral unveiled Mixtral 8x22B, an efficient sparse Mixture-of-Experts model with 39B active parameters out of 141B total. It specializes in multilingual communication, coding, and mathematics and excels in reasoning and knowledge tasks. The model has a 64K-token context window, is compatible with multiple platforms, and is available under the open-source Apache 2.0 license.

4. Adobe To Add AI Video Generators Sora, Runway, and Pika to Premiere Pro
Adobe announced that it aims to update Premiere Pro with plug-ins for emerging third-party AI video-generation models, including OpenAI's Sora, Runway ML's Gen-2, and Pika 1.0. With this addition, Premiere Pro users would be able to edit live-action video captured on traditional cameras alongside and intermixed with AI footage.

5. Google's New Chips Look To Challenge Nvidia, Microsoft, and Amazon
Google has unveiled the Cloud TPU v5p, an AI chip that delivers nearly triple the training speed of its predecessor, the TPU v4, reinforcing its position in AI services and hardware. Additionally, Google introduced the Axion CPU, an Arm-based processor that competes with similar offerings from Microsoft and Amazon, boasting a 30% performance improvement and better energy efficiency.

Five 5-minute reads/videos to keep you learning

1. OpenAI or DIY? Unveiling the True Cost of Self-Hosting LLMs
The article examines the financial considerations of using OpenAI's API versus self-hosting LLMs. It highlights the trade-off between the greater control over data achieved through self-hosting, which comes with higher costs for fine-tuning and maintenance, and the […]
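For those who want to inspect FineWeb (Hottest News, item 1) without downloading the full multi-terabyte dataset, a minimal streaming sketch is below. The repo id and the "sample-10BT" config name are taken from the dataset card at the time of writing; check the card on Hugging Face if they have since changed.

```python
# Stream a few documents from the FineWeb sample config on the Hub.
from datasets import load_dataset

fw = load_dataset(
    "HuggingFaceFW/fineweb",  # repo id per the dataset card
    name="sample-10BT",       # 10B-token sample; full snapshots are also available
    split="train",
    streaming=True,           # iterate lazily, no full download
)
for i, doc in enumerate(fw):
    print(doc["text"][:200].replace("\n", " "))
    if i >= 2:
        break
```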