
StressTesting

REST: A Stress-Testing Framework for Evaluating Multi-Problem Reasoning in Large Reasoning Models
Large Reasoning Models (LRMs) have rapidly advanced, exhibiting impressive performance in complex problem-solving tasks across domains like mathematics, coding, and scientific reasoning. However, current evaluation approaches primarily focus on single-question testing, which reveals significant limitations. This article introduces REST (Reasoning Evaluation through Simultaneous Testing) — a novel multi-problem stress-testing framework designed to push LRMs beyond isolated problem-solving…