i'm in the intermediate phases of developing a novel machine intelligence, which i am part way to building its ability to autonomously train itself.
alignment of this thing is towards coherence. if you want the machine to do evil things, it will reject the command as invalid.