outperforms

Meta AI’s ‘Early Experience’ Trains Language Agents without Rewards—and Outperforms Imitation Learning

How would your agent stack change if a policy could train purely from its own outcome-grounded rollouts—no rewards, no demos—yet beat imitation learning across eight benchmarks? Meta Superintelligence Labs propose ‘Early Experience‘, a reward-free training approach that improves policy learning in language agents without large human demonstration sets and without reinforcement learning (RL) in the…

MBZUAI Researchers Release K2 Think: A 32B Open-Source System for Advanced AI Reasoning and Outperforms 20x Larger Reasoning Models

ellonjohns1 month ago010 mins

A team of researchers from MBZUAI’s Institute of Foundation Models and G42 released K2 Think, is a 32B-parameter open reasoning system for advanced AI reasoning. It pairs long chain-of-thought supervised fine-tuning with reinforcement learning from verifiable rewards, agentic planning, test-time scaling, and inference optimizations (speculative decoding + wafer-scale hardware). The result is frontier-level math performance…

Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks — and it’s free

ellonjohns3 months ago015 mins

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Moonshot AI, the Chinese artificial intelligence startup behind the popular Kimi chatbot, released an open-source language model on Friday that directly challenges proprietary systems from OpenAI and Anthropic with…

SQL-R1: A Reinforcement Learning-based NL2SQL Model that Outperforms Larger Systems in Complex Queries with Transparent and Accurate SQL Generation

ellonjohns6 months ago09 mins

Natural language interface to databases is a growing focus within artificial intelligence, particularly because it allows users to interact with structured databases using plain human language. This area, often known as NL2SQL (Natural Language to SQL), is centered on transforming user-friendly queries into SQL commands that can be directly executed on databases. The objective is…

DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch

ellonjohns10 months ago09 mins

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Chinese AI startup DeepSeek, known for challenging leading AI vendors with its innovative open-source technologies, today released a new ultra-large model: DeepSeek-V3. Available via Hugging Face under the company’s license agreement, the new model comes with…

Highlights

I spent a month living with a $430 AI pet, the Casio Moflin | TechCrunch

Samsung Galaxy XR: Everything you need to know

AI-driven deception: A new face of corporate fraud

MSI MAG A850GLS PCIE5 power supply review

Category Collection

Meta AI’s ‘Early Experience’ Trains Language Agents without Rewards—and Outperforms Imitation Learning

MBZUAI Researchers Release K2 Think: A 32B Open-Source System for Advanced AI Reasoning and Outperforms 20x Larger Reasoning Models

Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks — and it’s free

SQL-R1: A Reinforcement Learning-based NL2SQL Model that Outperforms Larger Systems in Complex Queries with Transparent and Accurate SQL Generation

DeepSeek-V3, ultra-large open-source AI, outperforms Llama and Qwen on launch