Damus
Globe99 · 2w
Interesting. From the abstract of the linked paper:

> Claude 3.5 Sonnet and o3-mini manage the machine well in most runs and turn a profit, but all models have runs that derail, either through misinterpreting delivery schedules, forgetting orders, or descending into tangential "meltdown" loops fro...