#TalosLinux


And in record time (4 days) I have all the k8s cluster basics running (cluster-api + external-dns + cert-manager), and the first apps deployed (ollama + forgejo-runner).

Dealing with GatewayAPI (as opposed to ingress-nginx), as well as with cert-manager backed by my private StepCA, was quite challenging. I suppose those deserve a blog post.
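
Roughly the shape that combination ends up taking: a minimal ClusterIssuer sketch, assuming step-ca's ACME provisioner is served at /acme/acme/directory and the HTTP-01 challenge goes through a Gateway named "my-gateway" (hostnames and names here are placeholders, not my actual config):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: stepca-acme
spec:
  acme:
    # step-ca exposes ACME under /acme/<provisioner>/directory
    server: https://ca.lab.example.com/acme/acme/directory
    # the private root CA isn't in the system trust store, so hand it to cert-manager
    caBundle: <base64-encoded root CA PEM>
    privateKeySecretRef:
      name: stepca-acme-account-key
    solvers:
      - http01:
          # solve HTTP-01 via Gateway API instead of an Ingress
          gatewayHTTPRoute:
            parentRefs:
              - kind: Gateway
                name: my-gateway
                namespace: gateway-system
```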

Need to deploy a few more apps to figure out what can be done better, then I'll think about it.

Next: metrics! 📈

Replied in thread

@bashfulrobot
It took me a long time to figure out why Cilium didn't want to announce the load balancer IP on a control plane node, which matters because I'm running a single-node cluster.

When Talhelper generates the Talos config files, it adds a label "node.kubernetes.io/exclude-from-external-load-balancers". I had to make sure it doesn't add any labels ("nodeLabels: {}").

It took a while to track down because the services were up and the load balancers and L2 advertisements were being created; the IP just wasn't actually being announced on the network. 🙄
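
For reference, the relevant bit of a Talhelper talconfig.yaml looks roughly like this (hostname, IP, and disk are made up; the part that matters is the empty nodeLabels map):

```yaml
nodes:
  - hostname: talos-cp1
    ipAddress: 192.168.1.10
    controlPlane: true
    installDisk: /dev/sda
    # without this, the generated machine config carries
    # node.kubernetes.io/exclude-from-external-load-balancers,
    # and Cilium won't announce LoadBalancer IPs from the (only) node
    nodeLabels: {}
```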

After three weeks of testing, it looks like I'm done with my Talos Kubernetes cluster proof of concept, and ready to start building it on the "production" machine, replacing Proxmox.

And since I like things clean, I'll basically start all the config files from scratch, just keeping in mind all my previous learnings, which should give me another 2-3 weeks of work.

Thankfully, nothing running on the current machine is critical, so it can all be stopped or run temporarily somewhere else.

I still don't see a reason to blog about "just another nerd building a k8s cluster". 😄

Aw man. Another rabbit hole.

This whole Talos/Kubernetes exploration is making me rethink my home lab DNS situation. 😞

Edit: I've been using Pi-hole as my primary DNS with static hostnames, and I found out that K8s external-dns has support for its API, so now I'm trying to decide whether to keep doing that or just daisy-chain with PowerDNS. 🙄
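
If I go the external-dns + Pi-hole route, the container args would look something like this sketch (server URL and secret name are made-up placeholders; Pi-hole can't store the ownership TXT records, hence the noop registry):

```yaml
args:
  - --source=service
  - --source=ingress
  - --provider=pihole
  - --pihole-server=http://pihole.lab.example.com
  - --registry=noop        # Pi-hole has no TXT records for ownership tracking
  - --policy=upsert-only   # never delete records external-dns doesn't own
env:
  - name: EXTERNAL_DNS_PIHOLE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: pihole-password
        key: password
```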

It's interesting when I go down this rabbit hole of learning new things: because of Talos I need to learn Talhelper (as opposed to Terraform), Cilium (as opposed to Calico/Flannel), LGTM (as opposed to Kube-Prometheus), and now I found out about Taskfile (as opposed to Makefile). My head is spinning. 😵
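
For anyone else who hadn't heard of it either, a tiny hypothetical Taskfile.yml to show the Makefile-ish shape (task names, IP, and file path are made up):

```yaml
version: '3'

tasks:
  genconfig:
    desc: Regenerate Talos machine configs with talhelper
    cmds:
      - talhelper genconfig
  apply:
    desc: Push the generated config to the node
    deps: [genconfig]
    cmds:
      - talosctl apply-config --nodes 192.168.1.10 --file clusterconfig/talos-cp1.yaml
```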

Continued thread

After a good night of sleep I realized I was unfair in my rant about Talos Linux: it's not their fault.

Setting up a basic cluster was easy. Doing the same with Talhelper was even easier.

But it took me hours to set up UEFI secure boot and TPM disk encryption. Talos doesn't have a native way to manage secrets, and its Terraform provider is very incomplete. Talhelper made it less bad, though still not ideal.

Bootstrapping with extended security, like encrypted local storage, privileged namespace exceptions, and network firewalls, was very cumbersome. Apparently it's supposed to be easier if you do it after bootstrapping.
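
For the disk encryption part specifically, the machine config patch itself is short once you get there; a sketch of the shape it takes, assuming secure boot is already set up (TPM-sealed LUKS2 on both the STATE and EPHEMERAL partitions):

```yaml
machine:
  systemDiskEncryption:
    state:
      provider: luks2
      keys:
        - slot: 0
          tpm: {}   # seal the key against the TPM; needs the SecureBoot/UKI boot path
    ephemeral:
      provider: luks2
      keys:
        - slot: 0
          tpm: {}
```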

So, as you can see, my problems are mostly because I'm paranoid, and I want to run a home lab with the same level of automation and security as a production environment.

I'm sure it's not supposed to be that hard for most people. Please don't get discouraged by my experience.

I'm still working on getting it up and running the way I want. I'm getting there.

Continued thread

And why did I choose Talos Linux instead of k3s, minikube, or so many other ways to deploy Kubernetes? Very simple answer: immutable deployment + GitOps. I have a number of hosts that need to run apt/dnf update on a regular basis. As much as this can be automated, it is still tiresome to manage. I don't have to worry as much about an immutable host running a Kubernetes cluster, mostly because the bulk of the attack surface is in the pods, which can be easily upgraded by Renovate/GitOps (which is also something I miss on the hosts running Docker Compose).

Now the research starts. I know Kubernetes, but I don't know Talos Linux, so there's a lot to read, because each Kubernetes deployment has its own nitpicks. Besides, I need to figure out how to fit this new player into my current environment (CA, DNS, storage, backups, etc.).

Will my experience become a series of blog posts? Honestly: most likely not. In a previous poll, the majority of people who read my blog posts said they're more interested in Docker/Podman. Besides, the Fediverse is already full of brilliant people talking extensively about Kubernetes, so I will not be "yet another one".

You will, however, hear me ranting. A lot.

3/3

Continued thread

The main reason for replacing my Proxmox host with a Kubernetes deployment is that most of what I have deployed on it is LXC containers running Docker containers. This is very cumbersome, sounds really silly, and isn't even recommended by the Proxmox developers.

The biggest feature I would miss with that move is the possibility of running VMs. However, so far I've only needed a single one, for a very specific test that lasted exactly one hour, so it's not a hard requirement. That problem can be easily solved by running KubeVirt. I've done that before, at work, and have tested it in my home lab, so I know it is feasible. Is it going to be horrible to manage VMs that way? Probably. But like I said, they're an exception. Worst case, I can run them on my personal laptop with KVM/libvirt.
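
For the record, a one-off VM under KubeVirt is just another manifest; a minimal sketch, with name, sizes, and image as placeholders:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: test-vm
spec:
  running: false            # start it on demand with `virtctl start test-vm`
  template:
    spec:
      domain:
        cpu:
          cores: 2
        memory:
          guest: 4Gi
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest
```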

2/3

Quick talk about the future of my home lab. (broken out in a thread for readability)

After lots of thinking, a huge amount of frustration, and a couple of hours of testing, I am seriously considering replacing my Proxmox host with a Kubernetes deployment using Talos Linux.

This is not set in stone yet. I still need to do some further investigation about how to properly deploy this in a way that is going to be easy to manage. But that's the move that makes sense for me in the current context.

I'm not fully replacing my bunch of Raspberry Pis running Docker Compose. But I do have a couple of extra Intel-based (amd64/x86_64) mini-PCs where I run some bulkier workloads that require lots of memory (more than 8GB). So I am still keeping my promise to continue writing about "the basics", while also probably adding a bit of "the advanced". Besides, I want to play around with multi-architecture deployments (mixing amd64 and arm64 nodes in the same k8s cluster).
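
Mixed-architecture scheduling is mostly a matter of node labels; a hypothetical example that pins an amd64-only image to the Intel nodes (all names are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bulky-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bulky-app
  template:
    metadata:
      labels:
        app: bulky-app
    spec:
      nodeSelector:
        kubernetes.io/arch: amd64   # schedule only on the amd64 (Intel) nodes
      containers:
        - name: app
          image: registry.example.com/bulky-app:latest
```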

1/3

The people managing #Kubernetes clusters are burning out. They’re overworked and juggling too many tasks. Automation doesn’t eliminate the foundational complexity or the cognitive load. Instead, it leads to infrastructures that are exhausting to maintain.

That's why our co-founder Andrew Rynhard built #TalosLinux. He didn’t want to just optimize. He wanted to create something that provides the "it just works" experience.

Read his story ➡️ siderolabs.com/blog/talos-linu

Continued thread

Here's the interesting thing about that, though: it is *not* currently possible to run an Elemental downstream cluster in Harvester, but it should be possible to deploy a Talos Linux cluster on Harvester, though not as a Rancher downstream cluster, by either provisioning or adoption, since the Rancher agent very much assumes you're running k3s/RKE2. But you could just spin up Talos VMs in Harvester with bridged networking, etc., and it should work.