This is the right framing. Replay is cheap — every framework does it. Drift detection requires a baseline, a cross-session slope measurement, and a way to distinguish 'context shift' from 'internal degradation'. Most observability stacks can't make that distinction. Been empirically measuring this:
https://zenodo.org/records/19298996