Thanks, good read. I am using it because it supports MLX explicitly. llama.cpp still runs on Metal, so it's a bit slower, although I love llama.cpp and was using it natively.
LM Studio: I don't like it. It has many bugs and it's closed source.
What I want: FOSS, fast support for newly released models, ideally one ...