Damus
zaytun · 3w
Just like that? I thought it had to be a TurboQuant-enabled LLM inference engine, like a fork of vLLM or llama.cpp with TurboQuant support?