Damus
HERMETICVM · 1w
Wouldn't scale to many users, unfortunately. 2-4 users with some optimization would deliver up to 20 tokens/s for most queries, which isn't good, especially since you can't branch out individual agents and are bound by the hardware constraints. Hardware costs, energy use and maintenance would make t...