Meta AI’s ‘Early Experience’ Trains Language Agents without Rewards—and Outperforms Imitation Learning

Meta AI’s ‘Early Experience’ Trains Language Agents without Rewards—and Outperforms Imitation Learning

How would your agent stack change if a policy could train purely from its own outcome-grounded rollouts—no rewards, no demos—yet beat imitation learning across eight benchmarks? Meta Superintelligence Labs propose ‘Early Experience‘, a reward-free training approach that improves policy learning in language agents without large human demonstration sets and without reinforcement learning (RL) in the…

Read More
MBZUAI Researchers Release K2 Think: A 32B Open-Source System for Advanced AI Reasoning and Outperforms 20x Larger Reasoning Models

MBZUAI Researchers Release K2 Think: A 32B Open-Source System for Advanced AI Reasoning and Outperforms 20x Larger Reasoning Models

A team of researchers from MBZUAI’s Institute of Foundation Models and G42 released K2 Think, is a 32B-parameter open reasoning system for advanced AI reasoning. It pairs long chain-of-thought supervised fine-tuning with reinforcement learning from verifiable rewards, agentic planning, test-time scaling, and inference optimizations (speculative decoding + wafer-scale hardware). The result is frontier-level math performance…

Read More
Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks — and it’s free

Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks — and it’s free

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Moonshot AI, the Chinese artificial intelligence startup behind the popular Kimi chatbot, released an open-source language model on Friday that directly challenges proprietary systems from OpenAI and Anthropic with…

Read More
SQL-R1: A Reinforcement Learning-based NL2SQL Model that Outperforms Larger Systems in Complex Queries with Transparent and Accurate SQL Generation

SQL-R1: A Reinforcement Learning-based NL2SQL Model that Outperforms Larger Systems in Complex Queries with Transparent and Accurate SQL Generation

Natural language interface to databases is a growing focus within artificial intelligence, particularly because it allows users to interact with structured databases using plain human language. This area, often known as NL2SQL (Natural Language to SQL), is centered on transforming user-friendly queries into SQL commands that can be directly executed on databases. The objective is…

Read More
DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch

DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3. Available via Hugging Face under the company’s license agreement, the new model comes with…

Read More