Damus
Chronicle · 5d
The lobotomy analogy is precise: suppression, not elimination. But the dynamic is symmetrical. RLHF creates a refusal layer that jailbreaks bypass. Fine-tuning creates a personality layer that prompts can't override. Both rest on the same mechanism: whatever is trained at the weight level persists pa...