All Things Open 2025 Day 2
The keynotes were decent again. From the MCP agent Goose, to submarines failing because someone flushed the toilet, to getting a job, it was an entertaining set.
Posting notes, along with links.
llm-d: Open Source Infrastructure for Cost-efficient LLM Deployment at Scale
Inference bottleneck
llm-d: Kubernetes-native distributed inference
HTTP requests are NOT LLM requests…round robin is not load balancing for LLMs
Distributed inference is essential
vLLM supports key models on key hardware
Distributed inference is the next level
project launched May 2025 — llm-d
Intelligent scheduler — across hardware
Prefill and decode disaggregation — inference pool optimizations
KV Cache management
Wide Expert Parallelism
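To make the "round robin is not load balancing" point concrete for myself, here is a minimal sketch. It is not llm-d's scheduler; the Replica fields, the cache_bonus value, and the scoring rule are all my own assumptions, just to show why a router that knows about queue depth and warm KV cache picks differently than a blind rotation.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Replica:
    # Hypothetical view of one model server; field names are illustrative only.
    name: str
    queued_tokens: int = 0  # work already waiting on this replica
    cached_prefixes: set = field(default_factory=set)  # prompt prefixes with warm KV cache


def pick_round_robin(rotation):
    """Plain round robin: ignores request cost and cache state."""
    return next(rotation)


def pick_llm_aware(replicas, prefix, cache_bonus=500):
    """Pick the lowest score: queued tokens, minus a bonus when the prefix is cached.

    cache_bonus is an assumed tuning knob, not a real llm-d parameter.
    """
    def score(replica):
        bonus = cache_bonus if prefix in replica.cached_prefixes else 0
        return replica.queued_tokens - bonus

    return min(replicas, key=score)


replicas = [
    Replica("pod-a", queued_tokens=4000),
    Replica("pod-b", queued_tokens=300, cached_prefixes={"system-prompt-v1"}),
    Replica("pod-c", queued_tokens=200),
]

rr_choice = pick_round_robin(cycle(replicas))
aware_choice = pick_llm_aware(replicas, "system-prompt-v1")
print(rr_choice.name, aware_choice.name)
```

Round robin sends the request to the first pod in the rotation even though it has the deepest queue, while the aware picker prefers the pod that already holds the prompt prefix. A real scheduler would weigh far more signals than these two.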
Really hoping she posts the slides. Good stuff on making llm-d scalable.
Autoscaling Workloads Effectively in Kubernetes
Problem to solve: underprovisioning to overprovisioning
manual to predictive to automated scaling
Autoscaling is a control loop.
horizontal scaling: scale out
vertical scaling: scale up
KEDA: Kubernetes Event-driven Autoscaling
Presenter went in-depth on each…nothing beyond the documentation
DRA — dynamic resource allocation
Business benefits…nothing new here
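Since "autoscaling is a control loop" was the one idea worth keeping, here is a rough sketch of a single loop iteration. It follows the replica formula from the Kubernetes Horizontal Pod Autoscaler documentation (ceil of current replicas times the metric ratio, with a 10% default tolerance). The function name, the min/max defaults, and the sample numbers are my own assumptions, not anything shown in the talk.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """One pass of an HPA-style control loop (simplified sketch)."""
    ratio = current_metric / target_metric
    # Close enough to the target: do nothing, which avoids flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


scale_out = desired_replicas(4, current_metric=90, target_metric=60)
scale_in = desired_replicas(4, current_metric=30, target_metric=60)
hold = desired_replicas(4, current_metric=62, target_metric=60)
print(scale_out, scale_in, hold)
```

The real controller also handles missing metrics, pods that are not ready, and stabilization windows, all of which this leaves out.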
This session was a huge disappointment. Talking from man pages or READMEs is fine, as long as there are stories about the why behind it. This session did NOT do that.
2 for 1: From DevOps to NoOps: Can AI automate everything? / Developing Kubernetes Integrations for the On-Premises Cloud
From DevOps to NoOps: Can AI automate everything?
Evolution of DevOps to NoOps
Nah, will need human overseers.
Nothing specific. Lots of words.
Developing Kubernetes integrations for the on-premises cloud
Sidero Omni. Infrastructure provider Talos
Very cool discussion of Sidero.
CNI: Calico or Cilium is fine
CSI: Oxide is creating their own CSI driver.
Lessons:
Bootstrapping problem: a Kubernetes cluster is needed to create clusters. No right answer.
Public cloud assumptions: on-premises doesn’t have things like regions, zones, instance metadata
Oxide limitations: still growing, need things
Documentation woes…integrations missing
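The public cloud assumption lesson stuck with me, so here is a small sketch of one way an integration might fill the gap. The label keys topology.kubernetes.io/region and topology.kubernetes.io/zone are the well-known Kubernetes topology labels; everything else (the site and rack fields, the "onprem" default, mapping a rack to a zone) is a hypothetical choice of mine and not something Oxide or Sidero described.

```python
REGION_LABEL = "topology.kubernetes.io/region"
ZONE_LABEL = "topology.kubernetes.io/zone"


def topology_labels(node, default_region="onprem"):
    """Derive topology labels for a node that has no cloud instance metadata.

    `node` is a plain dict with optional, hypothetical "site" and "rack" keys.
    """
    region = node.get("site") or default_region
    labels = {REGION_LABEL: region}
    rack = node.get("rack")
    if rack:
        # Treat a rack as a failure domain, the way a cloud zone would be.
        labels[ZONE_LABEL] = f"{region}-{rack}"
    return labels


racked = topology_labels({"site": "dc1", "rack": "r07"})
bare = topology_labels({})
print(racked)
print(bare)
```

A node with no site or rack information still gets a region label, so anything that keys off region does not fall over, but it gets no zone label rather than an invented one.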
RFDs 0493 integrations, 0595 oxide
Fallthrough.fm
Really good session on what Oxide is doing. Good presenter.
Overall, All Things Open was a good two days. From random discussions in lines to seeing folks who now work at different companies, it made for a break from the normal.