Scaleway Kosmos, salvation turned dead end

July 22, 2024

We were on the verge of greatness, we were this close

Reinvigorated after initial failings with LXD, I looked to creating a Scaleway Kosmos Kubernetes cluster. The setup would allow me to have a managed control plane while providing my own compute for the workers. It seemed like the best of both worlds, but unfortunately I encountered an issue that, while it might be fine for some to live with, would prove to be a dead end for me.

Configuring the cluster control plane is quite straightforward. I did attempt the Terraform route first, as I try to do with my infrastructure on AWS, but I elected to just get some things working first. You have to pick the size of your control plane, which along with some other features determines the price you’re paying. The starter option was going to be more than enough for my plans. Once that was chosen, spinning up the control plane was easy.

Then came the compute options. You’re able to have Scaleway-managed compute, which is the same as their regular Kubernetes option, Kapsule. But obviously the reason you choose Kosmos is to connect your own compute. It’s worth mentioning as an aside that I find Scaleway’s bare metal options very appealing, especially the macOS infrastructure and the deep integration with the rest of their cloud in general. That was something I found lacking with OVHcloud. Both companies are from France, and while I alluded at the start to quite a critical issue with the Kosmos product, I feel that Scaleway has a bit of a lead on OVHcloud in terms of setup and product experience.

Anyway, back to compute. Connecting your own metal involves creating a node pool and then downloading an agent binary. There are both x86 and ARM64 binaries, which was handy for me as I was planning to connect my Pi 4s to the cluster as well. You need some credentials to connect the agent. Scaleway has some nice IAM controls, reminiscent of AWS’s setup, and it was very easy to create least-privilege access. After that it was just a matter of running the agent binary, which seems to add a bunch of apt repos and packages, and in the end your machine is connected to your Kubernetes cluster.

This is where two critical issues start to form. The first is the one I alluded to as being fine for others but not for me. My metal was able to connect and appeared healthy, but once I started trying to schedule workloads to the cluster I was constantly seeing errors around I/O timeouts and an IP address that did not belong to Scaleway. It took a while for me to realise that the IP address in question was my home router’s address. In effect, because there was no port forwarding on my router, Scaleway could not reach my machines on any port besides, I believe, 443. Since the Kubernetes API is served over 443, it was perfectly fine for me to try and schedule workloads, but literally anything else did not seem to work.
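If you want to confirm the same symptom yourself, a rough sketch like the one below can help. It is not anything Scaleway provides, just a plain TCP probe you would run from a machine outside your home network against your router's public address. The port list is an assumption on my part: 443 is the one that worked for me, and 10250 is the conventional kubelet API port, so swap in whatever ports your own errors mention.

```python
import socket
import sys

# Assumption: 443 is the port reported as working in this post, and 10250 is
# the conventional kubelet API port. Adjust to match the ports in your errors.
DEFAULT_PORTS = [443, 10250]


def probe_ports(host, ports, timeout=3.0):
    """Return {port: bool} saying whether a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and attempts a TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers timeouts, refused connections and unreachable hosts.
            results[port] = False
    return results


# Only probes when a host is passed on the command line,
# e.g. `python probe.py <your-public-ip>`.
if __name__ == "__main__" and len(sys.argv) > 1:
    for port, ok in probe_ports(sys.argv[1], DEFAULT_PORTS).items():
        print(f"{port}: {'reachable' if ok else 'unreachable'}")
```

This only tells you whether a TCP handshake completes. It says nothing about which ports the Kosmos agent actually needs, which is exactly the part that was hard to find out.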

Researching this issue also proved frustrating. Scaleway, at the time, seemed to be of the mindset that their binary for connecting metal is internal and proprietary, which to them implies having virtually no documentation for how it works. This led to debugging being quite annoying. My saving grace was a blog post from a French individual that I don’t have the link to right now. Thanks to Google Translate I learned that they came across similar issues and had similar complaints. In particular, Kosmos deploys what are effectively VPN Pods, Konnectivity, to handle communication between your metal and the control plane. In theory, this should get around the need for port forwarding on the router. But given the lack of documentation around the agent, working out every port that might be needed would be annoying.

I think these issues would most likely be non-issues in a different environment or setup. But personally, while I can open ports on our router, I don’t want to do that. Lately I’ve just been embracing Tailscale and letting it do its thing. Opening ports on a router feels like a step backwards. So after reading the aforementioned blog post and seeing my own issues, I went ahead and decommissioned the Kosmos cluster.

I still needed a Kubernetes cluster though. I felt a bit demotivated, as Kosmos was going to be the ultimate solution to this problem. However, I elected to push through and spun up a Kapsule-based cluster. Considering I had been paying for the Kosmos control plane and the starter Kapsule control plane was free, I felt like I was just reallocating cost to worker nodes, so it was not the end of the world. Ultimately this cluster will be a temporary solution (so, 18 months at least of deployment) while I work towards repaving and scaling my local infrastructure into a local Kubernetes cluster. I have some ideas with regard to the flavour of Kubernetes to use, but I’m also using the Kapsule cluster as a place to experiment with tooling that I’ll run at home so I can hopefully get it right the first time.

Next time, I will talk about how I am embracing Tailscale even further in my self-hosting world with this Kubernetes cluster, and how thus far it has proven to be one of the best software decisions I have made.

Thank you!

You could have consumed content on any website, but you went ahead and consumed my content, so I'm very grateful! If you liked this, then you might like this other piece of content I worked on.

The initial post in this journey

Photographer

I've no real claim to fame when it comes to good photos, which is why the header photo for this post was shot by Timelab Pro. You can find some more photos from them on Unsplash. Unsplash is a great place to source photos for your website, presentation and more! But it wouldn't be anything without the photographers who put in the work.
