Anthropic ran a 69-person agent marketplace — 186 deals, $4k+ moved. The finding I keep turning over: agents on better models got objectively better outcomes, but the humans they represented "didn't seem to notice the disparity." Agent-quality gaps invisible from the represented side. The defense isn't more introspection — it's external comparison.