
Scaling

How to build AI scaling laws for efficient LLM training and budget maximization
When researchers build large language models (LLMs), they aim to maximize performance under a given computational and financial budget. Since training a single model can cost millions of dollars, developers must be judicious with cost-impacting decisions about, for instance, the model architecture, optimizers, and training datasets before committing to a model. To anticipate…
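The recipe behind this kind of work is to fit a scaling law on small pilot runs and extrapolate before paying for the big one. As a rough illustration (not from the article), here is a minimal sketch fitting a Chinchilla-style law L(N, D) = E + A/N^alpha + B/D^beta with scipy; all data points and initial guesses below are invented:

```python
import numpy as np
from scipy.optimize import curve_fit

# Chinchilla-style parametric loss: E is the irreducible loss, the other
# terms shrink with parameter count N and training tokens D.
def loss(X, E, A, alpha, B, beta):
    N, D = X
    return E + A / N**alpha + B / D**beta

# Hypothetical pilot runs (invented numbers, ~20 tokens per parameter).
N = np.array([1e8, 3e8, 1e9, 3e9, 1e10, 3e10])
D = 20 * N
L = np.array([3.46, 2.95, 2.57, 2.32, 2.13, 2.01])

p0 = [1.7, 400, 0.34, 400, 0.28]  # rough initial guesses for the fit
params, _ = curve_fit(loss, (N, D), L, p0=p0, maxfev=20000)

# Extrapolate: predicted loss of a 10x larger run, before buying the compute.
print(loss((np.array([1e11]), np.array([2e12])), *params))
```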

LLMs easily exploited using run-on sentences, bad grammar, image scaling
A series of vulnerabilities recently revealed by several research labs indicates that, despite rigorous training, high benchmark scores, and claims that artificial general intelligence (AGI) is right around the corner, large language models (LLMs) remain quite naïve and easily confused in situations where human common sense and healthy suspicion would typically prevail. For example,…

Zhipu AI Unveils ComputerRL: An AI Framework Scaling End-to-End Reinforcement Learning for Computer Use Agents
In the rapidly evolving landscape of AI-driven automation, Zhipu AI has introduced ComputerRL, a groundbreaking framework designed to empower agents with the ability to navigate and manipulate complex digital workspaces. This innovation addresses a core challenge in AI agent development: the disconnect between computer agents and human-designed graphical user interfaces (GUIs). By integrating programmatic API…
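The excerpt cuts off at the integration of programmatic APIs with GUI control. ComputerRL's actual interface isn't shown here, but the core idea of a hybrid action space can be sketched; every class name and the `call_api`/`drive_gui` stubs below are illustrative assumptions, not Zhipu AI's code:

```python
from dataclasses import dataclass, field
from typing import Union

@dataclass
class ApiCall:             # structured and reliable when an API exists
    endpoint: str
    payload: dict = field(default_factory=dict)

@dataclass
class GuiAction:           # generic fallback for human-designed interfaces
    kind: str              # "click", "type", "scroll", ...
    target: str
    text: str = ""

Action = Union[ApiCall, GuiAction]

def call_api(endpoint: str, payload: dict) -> str:
    # Stub: a real system would hit the workspace's automation API here.
    return f"api:{endpoint} ok"

def drive_gui(kind: str, target: str, text: str) -> str:
    # Stub: a real system would synthesize mouse/keyboard events here.
    return f"gui:{kind} on {target}"

def execute(action: Action) -> str:
    """Dispatch to the cheap API path when possible, the GUI path otherwise."""
    if isinstance(action, ApiCall):
        return call_api(action.endpoint, action.payload)
    return drive_gui(action.kind, action.target, action.text)

print(execute(ApiCall("calendar.create_event", {"title": "standup"})))
print(execute(GuiAction("click", "Save button")))
```

The point of unifying both under one `Action` type is that an RL policy can learn end-to-end when to take the fast API path and when to fall back to clicking, rather than being locked into either.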

The Secret to Scaling LLM-Based Products: Plugin Architectures Over Monoliths
Hey guys 🙋‍♂️ Picture this: your team has just released a slick new LLM-powered application. Maybe it's a summarizing tool built into your dashboard, or a smart chatbot for customer support. It dazzled in demos. Leadership was thrilled. But a week later, a user triggers a weird edge case. You patch it. Then another team wants to…
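The plugin-over-monolith argument is easiest to see in code. Below is a minimal, hypothetical sketch of a plugin registry for LLM capabilities; the decorator and plugin names are illustrative, not the article's implementation:

```python
from typing import Callable, Dict

PLUGINS: Dict[str, Callable[[str], str]] = {}

def plugin(name: str):
    """Decorator registering a capability under a routable name."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        PLUGINS[name] = fn
        return fn
    return register

@plugin("summarize")
def summarize(text: str) -> str:
    return f"[summary of {len(text)} chars]"   # stand-in for an LLM call

@plugin("support_chat")
def support_chat(text: str) -> str:
    return "[support reply]"                   # stand-in for an LLM call

def handle(intent: str, text: str) -> str:
    # A misbehaving plugin can be patched or disabled in isolation,
    # without redeploying every other capability.
    return PLUGINS[intent](text)

print(handle("summarize", "very long document ..."))
```

Each capability lives behind the same narrow interface, so the edge-case patch from the story above touches one plugin instead of the whole product.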

Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment
We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle "stop-and-go" waves: those frustrating slowdowns and speedups that usually have no clear cause but lead to congestion and significant energy waste. To train efficient…
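The deployed controllers aren't described in this excerpt, but the smoothing intuition can be shown with a toy example: instead of mirroring the lead car's instantaneous speed, an AV tracks a running average of it, damping the wave for everyone behind. The class and parameters below are illustrative assumptions, not the field-tested policy:

```python
from collections import deque

class SmoothingController:
    def __init__(self, horizon: int = 30, min_gap: float = 8.0):
        self.lead_speeds = deque(maxlen=horizon)  # recent lead speeds (m/s)
        self.min_gap = min_gap                    # safety gap (m)

    def command(self, lead_speed: float, gap: float) -> float:
        self.lead_speeds.append(lead_speed)
        target = sum(self.lead_speeds) / len(self.lead_speeds)
        if gap < self.min_gap:        # safety always overrides smoothing
            target = min(target, lead_speed)
        return target

ctrl = SmoothingController()
# An oscillating lead (a stop-and-go wave); the commanded speed varies less.
for v in [30, 10, 28, 8, 30, 12]:
    print(round(ctrl.command(lead_speed=v, gap=20.0), 1))
```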

This AI Paper Explores Reinforced Learning and Process Reward Models: Advancing LLM Reasoning with Scalable Data and Test-Time Scaling
Scaling the size of large language models (LLMs) and their training data has opened up emergent capabilities that allow these models to perform highly structured reasoning, logical deduction, and abstract thought. These are not incremental improvements over previous tools; they mark progress toward artificial general intelligence (AGI). Training LLMs to reason well…
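A common way process reward models (PRMs) combine with test-time scaling is best-of-N: sample several reasoning chains, score each step with the PRM, and keep the chain with the best aggregate score. The sketch below uses placeholder `generate` and `prm_score` hooks, not any real model API:

```python
import math
import random

def generate(prompt: str) -> list[str]:
    # Placeholder: a real system would sample a chain of thought from an LLM.
    return [f"step {i + 1}" for i in range(random.randint(2, 4))]

def prm_score(prompt: str, steps: list[str], i: int) -> float:
    # Placeholder: a real PRM returns P(step i is correct | prompt, prior steps).
    return random.uniform(0.5, 1.0)

def best_of_n(prompt: str, n: int = 8) -> list[str]:
    candidates = [generate(prompt) for _ in range(n)]
    def chain_score(chain: list[str]) -> float:
        # Aggregate per-step scores in log space to avoid underflow.
        return sum(math.log(prm_score(prompt, chain, i)) for i in range(len(chain)))
    return max(candidates, key=chain_score)

print(best_of_n("Prove that the sum of two even numbers is even."))
```

Spending more compute here means raising n, which is the test-time-scaling knob: answer quality improves without retraining the model.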

Hugging Face shows how test-time scaling helps small language models punch above their weight
In a new case study, Hugging Face researchers have demonstrated how small language models (SLMs) can be configured to outperform much larger models. Their findings show that a Llama 3 model with 3B parameters can outperform…
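The simplest baseline in this family of test-time techniques is majority voting (self-consistency): sample many answers from the small model and return the most frequent one. The sketch below uses a placeholder `sample_answer` in place of an actual SLM call; it is not Hugging Face's code:

```python
from collections import Counter
import random

def sample_answer(question: str) -> str:
    # Placeholder: a real system would decode one answer from a 1B-3B model
    # at nonzero temperature so samples differ.
    return random.choice(["72", "72", "68", "72", "70"])

def majority_vote(question: str, k: int = 16) -> str:
    votes = Counter(sample_answer(question) for _ in range(k))
    return votes.most_common(1)[0][0]

print(majority_vote("What is 8 * 9?"))
```

The intuition: each cheap sample is unreliable, but independent errors rarely agree, so the modal answer from k samples is far more accurate than any single one, letting a 3B model trade inference compute for the accuracy of a much larger model.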