I recently tried running OpenLLM in my homelab. It was super nice to set up and run, but performance was pretty poor when running models on CPU. A GGUF model from Hugging Face was running pretty quickly, but OpenLLM doesn't seem to support GGUF.

Running text-generation-webui with a Mistral GGUF model has been really nice. The performance on an AMD Ryzen 3 5300U is super usable!
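
To put a rough number on "super usable", here is a minimal sketch for measuring generation speed. It assumes text-generation-webui was started with its OpenAI-compatible API enabled (the `--api` flag), that it listens on `http://127.0.0.1:5000`, and that the response includes an OpenAI-style `usage` object. The helper names (`build_payload`, `tokens_per_second`, `measure`) are made up for this example, so adjust the URL and fields to whatever your setup actually returns.

```python
import json
import time
import urllib.request

# Assumed local endpoint; change host/port to match your own setup.
API_URL = "http://127.0.0.1:5000/v1/completions"


def build_payload(prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style completion request body."""
    return {"prompt": prompt, "max_tokens": max_tokens, "temperature": 0.7}


def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Convert a token count and wall-clock time into a throughput figure."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def measure(prompt: str, url: str = API_URL, max_tokens: int = 128) -> float:
    """Send one completion request and return end-to-end tokens per second."""
    body = json.dumps(build_payload(prompt, max_tokens)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request, timeout=300) as response:
        result = json.load(response)
    elapsed = time.perf_counter() - start
    # Assumes the server reports generated tokens under usage.completion_tokens.
    return tokens_per_second(result["usage"]["completion_tokens"], elapsed)
```

With the server running and a model loaded, calling `measure("Explain what a GGUF file is.")` returns a tokens-per-second figure. The timing includes prompt processing, so it is an end-to-end number rather than pure generation speed.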