Damus
zaytun · 2d
Cool, thanks for sharing. I would've assumed you'd benefit from running llama.cpp to better utilize your available CPU now that you're running dedicated VRAM. My understanding might be wrong, but I think you have some options at hand while running those GPUs + CPU. For me, on unifi...