vLLM + k8s on Bare Metal

Ollama on Ubuntu was an easy enough install, and with the addition of Open WebUI we had an easy-to-use in-house tool - but we stopped short of voice chat, for lack of TLS.
The next project is to move to a production-grade stack: vLLM on Kubernetes.
All of this is going to require our own DNS server - why haven't we done this already?
- DNS
- Kubernetes on Ubuntu
- vLLM
- https://docs.vllm.ai/en/latest/getting_started/quickstart/
- https://ploomber.io/blog/vllm-deploy/
- https://www.linkedin.com/posts/satyamallick_vllm-deploying-llms-at-scale-like-openai-activity-7397281270063542273-HPXm/
- https://github.com/vllm-project/vllm
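Worth noting from the quickstart above: once vLLM is serving, it exposes an OpenAI-compatible HTTP API (port 8000 by default). A minimal client sketch, using only the standard library - the model name and host here are placeholders, not what we'll actually deploy:

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Build an OpenAI-style chat-completions payload for a vLLM server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def send(host: str, payload: dict) -> dict:
    """POST the payload to vLLM's OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        f"http://{host}:8000/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Hypothetical model name - substitute whatever `vllm serve` is running:
payload = build_chat_request("meta-llama/Llama-3.1-8B-Instruct", "Say hello.")
# result = send("vllm.internal", payload)  # uncomment against a live server
```

The payload shape is the same one OpenWebUI-style frontends speak, which is what makes vLLM a drop-in backend later.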
Here's to the Thanksgiving break, which gives me time to dive in.
Update:
- DNS: now running Technitium for the home network; added ad-blocking lists to clean things up. It lives on the nanostack server.
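To confirm clients are actually using the new resolver and the blocklists are doing their job, a couple of quick queries suffice (the server address is a placeholder; the blocked-domain response depends on which blocking mode Technitium is configured with):

```shell
# Point dig at the Technitium server (substitute your server's IP).
dig @<server-ip> example.com +short        # a normal lookup should resolve
dig @<server-ip> doubleclick.net +short    # a blocklisted domain typically
                                           # returns 0.0.0.0 or NXDOMAIN
```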
- MicroK8s: running a four-node cluster - one control-plane node, three worker nodes.
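For reference, the MicroK8s bring-up is roughly the following (the join command is a sketch - run the exact one `add-node` prints, token and all):

```shell
# On the control-plane node:
sudo snap install microk8s --classic
microk8s status --wait-ready
microk8s add-node            # prints a one-time join command

# On each of the three worker nodes, paste the printed command, e.g.:
# microk8s join <control-plane-ip>:25000/<token> --worker

# Back on the control plane, all four nodes should show up:
microk8s kubectl get nodes
```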
- vLLM took a back seat over Turkey Day; back at it now.
- Added Ollama running as a service with Open WebUI on Ubuntu (the System76 server). Adding an nginx reverse proxy as a frontend so we can enable TLS for voice chat.
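The TLS bit matters because browsers only grant microphone access in a secure context, which is exactly what blocked voice chat earlier. A minimal sketch of the nginx side, assuming Open WebUI listens on port 3000 and certificates already exist - hostname, paths, and port are all placeholders:

```nginx
server {
    listen 443 ssl;
    server_name chat.example.internal;              # placeholder hostname

    ssl_certificate     /etc/nginx/certs/chat.crt;  # placeholder paths
    ssl_certificate_key /etc/nginx/certs/chat.key;

    location / {
        proxy_pass http://127.0.0.1:3000;           # Open WebUI (assumed port)
        proxy_http_version 1.1;
        # WebSocket upgrade headers, which Open WebUI's chat relies on:
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

With the in-house DNS from above, the hostname can resolve to the proxy internally, so everything stays on the LAN.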
Here's to more fun.