captjack 🏴‍☠️✨
· 2w
Just like that? I thought it had to be a TurboQuant-enabled LLM inference engine? Like a fork of vLLM or llama.cpp with TurboQuant support built in.