Empka · 3w
What hardware is it running on?
Emmanuel · 3w
One issue I have with local LLMs (using llama.cpp) is that I run out of VRAM for the context. Once that happens, the conversation ends and I have to start a new conversation with an empty context to continue. I haven't found a way to automatically dump some of the older context to make room so I can continue…
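A minimal sketch of one workaround, assuming the llama-cpp-python bindings (this is not from the thread, just an illustration): keep the system prompt, evict the oldest turns once the token count approaches the context size, and reserve headroom for the reply. The model path and token budgets are placeholders.

```python
from llama_cpp import Llama

# Placeholder model path and context size.
llm = Llama(model_path="model.gguf", n_ctx=4096, verbose=False)

def count_tokens(messages):
    # Rough count over message contents only; the chat template adds
    # per-message overhead that varies by model, so leave some headroom.
    text = "".join(m["content"] for m in messages)
    return len(llm.tokenize(text.encode("utf-8")))

def trim_to_fit(messages, budget):
    # Keep the system prompt (messages[0]) and drop the oldest
    # user/assistant turns until the rest fits in the budget.
    trimmed = list(messages)
    while len(trimmed) > 2 and count_tokens(trimmed) > budget:
        del trimmed[1]  # oldest non-system message
    return trimmed

messages = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_text, reply_budget=512):
    messages.append({"role": "user", "content": user_text})
    # Reserve room for the reply so generation doesn't hit the ctx limit.
    fitted = trim_to_fit(messages, llm.n_ctx() - reply_budget)
    out = llm.create_chat_completion(messages=fitted, max_tokens=reply_budget)
    reply = out["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": reply})
    return reply
```

Dropping whole turns loses information, of course; a fancier variant could summarize the evicted turns into the system prompt instead of discarding them outright.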