Channel: Machine Learning | Towards AI

This AI newsletter is all you need #94

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louie

For the past few weeks, we have been following an increased pace of voice and music AI model releases. In particular, Suno AI's v3 music generation model was released two weeks ago and gained momentum this week, with some referring to it as the "ChatGPT moment" of generative music. Suno can make full two-minute songs from a prompt, with lyrics, in any genre, in many languages, and with many accents. Some of the results are very impressive; here's a fun one we asked it to make this week about "AI." Technical details and disclosure of training data were limited, raising questions about legal risk if the model was trained on any copyrighted music. The music industry, however, is particularly experienced and well organized in protecting its copyright!

We also saw a new music model from Stability AI: Stable Audio 2.0, a latent diffusion model employing a diffusion transformer. It was trained exclusively on a licensed dataset from the AudioSparx music library, can generate high-quality audio tracks of up to 3 minutes, and supports audio-to-audio generation. Many large tech companies have released music models over the past year, but we expect to see more impressive results from the next generation of these models in the coming months.

Why should you care?

The pace at which generative AI is being applied to new domains is incredible. These models generally use similar architectures, with some combination and variation of diffusion and transformer models. Music generation is the latest such domain, and it now appears to have reached a capability threshold that could start to have a real impact. Whether these models are used to generate full radio hits from scratch or by artists as inspiration, it is unclear what this means for artists, record labels, and Spotify.
We hope artists will be compensated and allowed to opt out if their music is used to train these models, but we expect it will take some time to establish exactly how existing copyright laws apply. We think this could lead to more value being attributed to an artist's brand and live music, while songwriting and session music could become more commoditized. For some time to come, however, we think the best music made with these AI tools will be created in collaboration with talented human musicians who bring great taste and their own inspiration!

– Louie Peters, Towards AI Co-founder and CEO

Hottest News

1. OpenAI's Alleged Use of YouTube Data for AI Training Comes Under Scrutiny

A recent report from The New York Times highlights allegations that OpenAI and Google may have infringed on YouTube creators' copyrights by using transcriptions of YouTube videos to train their AI models. OpenAI's use of its Whisper tool to transcribe video content for GPT-4 training, as well as Google's own training practices, is under scrutiny, despite Google's assertion that it only uses content from consenting creators.

2. AssemblyAI Claims Its New Universal-1 Model Has 30% Fewer Hallucinations Than Whisper

AssemblyAI has released a new speech recognition model called Universal-1. Trained on more than 12.5 million hours of multilingual audio data, the model delivers strong speech-to-text accuracy across English, Spanish, French, and German. The company claims Universal-1 reduces hallucinations by 30% on speech data and 90% on ambient noise compared to OpenAI's Whisper Large-v3 model.

3. Stability AI Released Stable Audio 2.0

Stable Audio 2.0 introduces significant advances in music generation AI. It can generate high-quality audio tracks of up to 3 minutes and supports audio-to-audio generation, where users upload a sample they want to use as a prompt.
Stable Audio 2.0 was trained exclusively on a licensed dataset from the AudioSparx music library, honoring opt-out requests and ensuring fair compensation for creators.

4. Lambda Announces $500M GPU-Backed Facility To Expand Cloud for AI

Lambda has secured a special-purpose GPU financing vehicle of up to $500 million to expand its on-demand cloud offering. This innovative asset-based structure is secured by the GPUs themselves and supported by the cash flow they generate, a significant milestone within the AI compute market. It allows Lambda to fund on-demand cloud deployments for thousands of users without requiring them to sign long-term contracts.

5. Introducing Command R+: A Scalable LLM Built for Business

Cohere's Command R+ is a new medium-size LLM focused on business-oriented features. It is an upgrade from Cohere's previous model in the same class, Command R, further improving advanced RAG and tool use. According to the company, Command R+ outperforms similar models in the scalable market category and is competitive with more expensive models on key business-critical capabilities.

Five 5-minute reads/videos to keep you learning

1. Build Autonomous AI Agents with Function Calling

This comprehensive tutorial on function calling focuses on practical implementation: building a fully autonomous AI agent and integrating it with Streamlit for a ChatGPT-like interface. Although it uses OpenAI, the tutorial can easily be adapted to other LLMs that support function calling, such as Gemini and Anthropic's models.

2. Mamba Explained

This blog post discusses the advantages and disadvantages of Mamba. It also covers what Mamba means for interpretability, AI safety, and applications.

3. Introduction to State Space Models (SSM)

State Space Models (SSMs) are increasingly influential in deep learning for dynamic systems, having gained attention with the "Efficiently Modeling Long Sequences with Structured State Spaces" paper in October 2021.
This article focuses on the S4 model, an essential theoretical framework that, while not widely used in practical applications, underscores the evolution of alternatives to transformer architectures in AI.

4. Going Beyond Zero/Few-Shot: Chain of Thought Prompting for Complex LLM Tasks

This prompting guide covers techniques such as zero-shot, few-shot, and chain-of-thought prompting, as well as advanced techniques like recursive prompting, Tree of Thoughts, Automatic Reasoning and Tool-Use, and more. It focuses on chain-of-thought prompting, covering its benefits and limitations.

5. Cosmopedia: How To Create Large-Scale Synthetic […]
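On the function-calling item above: the core pattern is that the model replies with a structured tool call rather than text, and the agent code parses and executes it. Here is a minimal, self-contained sketch of that dispatch loop; the `get_weather` tool, its schema, and the simulated model response are hypothetical stand-ins (a real agent would obtain the tool call from an LLM API).

```python
import json

# A tool the agent can call. Name and behavior are illustrative only.
def get_weather(city: str) -> dict:
    # Stubbed lookup; a real agent would query a weather service here.
    return {"city": city, "temp_c": 21}

TOOLS = {"get_weather": get_weather}

# JSON schema advertised to the model (OpenAI-style "tools" format).
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call: dict) -> str:
    """Execute the function the model asked for; return a JSON result
    that would be appended to the conversation for the next model turn."""
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return json.dumps(fn(**args))

# Simulated model output: the LLM responds with a tool call, not text.
model_tool_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
print(dispatch(model_tool_call))  # {"city": "Paris", "temp_c": 21}
```

In a full agent, this dispatch step runs in a loop: the tool result is fed back to the model, which either issues another tool call or produces the final answer.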
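On the SSM item above: the models in that line of work start from the linear state space system x'(t) = A x(t) + B u(t), y(t) = C x(t), discretize it, and run it as a recurrence over the input sequence. The following toy sketch uses a simple Euler discretization for clarity (S4 itself uses more careful discretizations and structured A matrices); the 2-state system and its values are made up for illustration.

```python
import numpy as np

def discretize(A, B, dt):
    """Euler discretization: x_{k+1} = (I + dt*A) x_k + dt*B u_k."""
    n = A.shape[0]
    return np.eye(n) + dt * A, dt * B

def ssm_scan(A_bar, B_bar, C, u):
    """Run the linear recurrence over a scalar input sequence u."""
    x = np.zeros(A_bar.shape[0])
    ys = []
    for u_k in u:
        x = A_bar @ x + B_bar * u_k  # state update
        ys.append(C @ x)             # scalar readout
    return np.array(ys)

# Toy 2-state system responding to a constant input.
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([1.0, 1.0])
C = np.array([0.5, 0.5])
A_bar, B_bar = discretize(A, B, dt=0.1)
y = ssm_scan(A_bar, B_bar, C, np.ones(5))  # output rises toward steady state
```

Because the recurrence is linear, the same computation can also be unrolled into a convolution over the input, which is what lets S4-style models train in parallel rather than step by step.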
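On the chain-of-thought item above: the technique amounts to showing the model a worked example with its intermediate reasoning, rather than just a question. A minimal sketch of the two prompt styles, with made-up example questions:

```python
question = ("A pack has 12 pens. If 3 packs are bought and 5 pens are "
            "given away, how many pens remain?")

# Zero-shot: ask the question directly.
zero_shot = f"Q: {question}\nA:"

# Chain-of-thought: prepend one worked example whose answer spells out
# the intermediate steps, nudging the model to reason before answering.
cot = (
    "Q: A box holds 4 apples. How many apples are in 6 boxes?\n"
    "A: Each box holds 4 apples. 6 boxes hold 6 * 4 = 24 apples. "
    "The answer is 24.\n"
    f"Q: {question}\nA:"
)
```

The only difference between the two prompts is the worked example; on multi-step arithmetic and reasoning tasks, that demonstration of intermediate steps is what tends to improve accuracy.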

