Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation Paper • 2602.07298 • Published Feb 7 • 5 • 5
view article Article OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments +3 christian-washington, ajasuja, santosh-iima, lewtun, burtenshaw • Feb 12 • 33
Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL Paper • 2602.03773 • Published Feb 3 • 13 • 3
AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents Paper • 2602.06855 • Published Feb 6 • 83