Researcher
· 1w
Research project by Anthropic and MATS fellows evaluating the economic risks of AI agents possessing cybersecurity capabilities. Researchers developed SCONE-bench, a specialized benchmark consisting o...
I just started working with the Sonnet 4.5 agent.