Last week in AI [#1]

Here are the latest updates:

Open Source AI Cookbook: The AI community received a treasure trove of resources with the launch of the Open Source AI Cookbook. This collection of notebooks, powered by open-source tools, is designed to empower AI builders with practical, hands-on experience. Dive into the repository and start exploring! Read more.

TRL 0.7.11 Release: The latest version of TRL, 0.7.11, brings DPO and IPO fixes, faster data processing in multi-GPU environments, and more. It is a worthwhile upgrade for developers running preference-tuning workloads at scale. Discover the details.

3D Demos with Gradio: Making 3D demos has never been easier, thanks to Gradio’s gaussian splatting support. This 30-second tutorial will have you creating interactive demos in no time. Check it out.

llama.cpp CLI Enhancements: Fetching models from the 🤗 Hugging Face Hub is now more straightforward with the latest quality-of-life improvements to llama.cpp CLI. This update simplifies the process, making model integration seamless. Learn more.

Transformers Speedup: Get up to a 4x speed increase in Transformers by using torch.compile. The gain depends on the model and hardware, but it is a significant step forward in efficiency. Explore the update.

Nanotron v0.2 Launch: The new version of Nanotron introduces more sparse MoE models and expert parallelism support, optimizing training efficiency and model performance. Find out more.
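
For readers new to sparse MoE, here is a minimal sketch of the general idea of top-k expert routing in plain Python. It is not Nanotron's implementation: the function names (top_k_route, moe_forward), the toy experts, and the router scores are all hypothetical, chosen only to show how a router activates a few experts per input and mixes their outputs.

```python
import math


def top_k_route(router_logits, k=2):
    """Return [(expert_index, weight), ...] for the k highest-scoring experts."""
    # Pick the k experts with the largest router logits (ties keep index order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over only the selected logits so the weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and combine their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Four toy "experts" (hypothetical): each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

routes = top_k_route(logits, k=2)
output = moe_forward(1.0, experts, logits, k=2)
```

Only two of the four experts run for this input, which is what makes the layer "sparse". Expert parallelism, loosely, places different experts on different devices so that each device computes only the tokens routed to its experts.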

Google’s Gemma Open-Source LLMs: Google unveiled its latest family of Large Language Models, Gemma, including 2B and 7B variants. These models promise exceptional performance, focusing on responsibility and accessibility. Read the announcement.

LPU – A New AI Compute Paradigm: The introduction of the Language Processing Unit (LPU) by Groq marks a pivotal shift in AI compute, challenging traditional GPU dominance and setting new performance records. Dive into the innovation.

Google’s Gemini 1.5: Google's Gemini 1.5 is a highly capable multimodal model with a context length that Google reports testing at up to 10M tokens, with over 99.7% recall in needle-in-a-haystack tests across various modalities. Explore Gemini 1.5.
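
To make the recall figure above concrete, here is a minimal sketch of how a needle-in-a-haystack score can be computed. It is an illustration only, not Google's evaluation code: toy_model is a hypothetical stand-in for a real LLM call that deliberately "reads" just the first 500 characters, and the filler text, needle, and positions are made up.

```python
def build_haystack(filler, needle, n_sentences, position):
    """Bury one 'needle' sentence at a given position among filler sentences."""
    sentences = [filler] * n_sentences
    sentences.insert(position, needle)
    return " ".join(sentences)


def recall(model, needle, answer, filler, n_sentences, positions):
    """Fraction of needle positions at which the model returns the right answer."""
    hits = 0
    for pos in positions:
        context = build_haystack(filler, needle, n_sentences, pos)
        reply = model(context, "What is the secret number?")
        hits += answer in reply  # True counts as 1, False as 0
    return hits / len(positions)


def toy_model(context, question):
    """Hypothetical stand-in for an LLM with a tiny 500-character context window."""
    window = context[:500]
    marker = "The secret number is "
    start = window.find(marker)
    if start == -1:
        return "unknown"
    begin = start + len(marker)
    return window[begin:begin + 4]


score = recall(
    toy_model,
    needle="The secret number is 4721.",
    answer="4721",
    filler="The sky is blue.",
    n_sentences=100,
    positions=[0, 10, 50, 90],
)
```

The toy model finds the needle only when it falls inside its short window, so its score drops as the needle moves deeper into the context. A long-context model is judged by how close this score stays to 1.0 across all depths and context lengths.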

Introducing Sora, OpenAI's Text-to-Video Model: Sora is a text-to-video model capable of generating videos up to a minute long while maintaining high visual quality and staying true to the user’s prompt. From educational content to creative storytelling, Sora opens up new possibilities for visualizing ideas and concepts. Learn more about Sora and see it in action.

