Can AI Truly Do Science? A New Toolkit Measures Reasoning Beyond Knowledge
Researchers have released an open-source evaluation toolkit designed to rigorously assess the scientific intelligence of artificial intelligence models, uncovering critical limitations in their ability to reason and problem-solve within complex scientific domains.
