The OpenVPN setup
We used to run OpenVPN on an off-cluster VM with a public IP. Every time someone joined the company, a new client certificate was manually provisioned and shared over a password manager, which they then imported into the OpenVPN Connect app. And when they left? We just hoped we'd remember to revoke that cert, of course!
Oh, but that's not all. To get internal-only services, i.e. services reachable only from within the VPN network, we whitelisted the OpenVPN server's IP at the NGINX ingress:
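(A minimal sketch, assuming the ingress-nginx controller; the hostname and IP below are placeholders, not our real ones.)

```yaml
# Internal-only Ingress sketch: only requests originating from the
# OpenVPN server's public IP (placeholder 203.0.113.10) are let through.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: internal-service
  annotations:
    nginx.ingress.kubernetes.io/whitelist-source-range: "203.0.113.10/32"
spec:
  rules:
    - host: internal.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: internal-service
                port:
                  number: 80
```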
Requests coming in over the VPN get through; everything else gets a 403. This is an ugly hack, and it's effectively security through obscurity. OpenVPN also lacks any form of easy-to-configure ACLs or RBAC. Overall, our security posture on this front wasn't great.
So what do we need, then?
VPN goals
- Single Sign-On
- Using cluster DNS to resolve internal names — no more ingress whitelist hacks
- Access Control Lists
- Seamless for the end-user
Enter Tailscale
Tailscale is a mesh VPN built on top of WireGuard. I'll skip the details of how Tailscale works because the excellent people at Tailscale have already done that. Hint: a lot of NAT-traversal sorcery. But in short, it JustWorks™ — and guess what, that's exactly what we need.
Running Tailscale in Kubernetes
There are two ways to go about doing this:
- As a sidecar on each pod, with each pod acting as a Tailscale client
- As a subnet router, advertising the pod CIDR to the Tailnet
Option 1 was a no-go: attaching a sidecar container to every pod would require a massive refactor of all our Helm charts, not to mention forking and maintaining every upstream chart we use. Not feasible.
Let's talk about option 2, then.
What's a subnet router? It's your usual Tailscale node, i.e. a client on the Tailnet — except it advertises a specified CIDR to all other nodes on the network. Effectively, the command looks like this:
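(The CIDR here is a placeholder for your cluster's pod range.)

```sh
# Advertise the cluster's pod CIDR (placeholder range) to the Tailnet
tailscale up --advertise-routes=10.4.0.0/14
```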
But wait, if anyone can advertise routes, isn't that a potential security risk? Thankfully for us, each route being advertised has to be manually approved by the Tailscale network admin.
While we wrote our own Helm chart to run Tailscale as a subnet router, public ones are available.
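For the curious, such a chart boils down to a single Deployment running the upstream tailscale/tailscale image. This is a sketch rather than our actual chart; the env vars are the ones that image documents, and the Secret name, ServiceAccount, and CIDRs are placeholders.

```yaml
# Minimal subnet-router sketch. Assumes a Secret named "tailscale-auth"
# holding a reusable auth key, and a ServiceAccount with RBAC to
# read/write the state Secret (both omitted here).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tailscale-subnet-router
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tailscale-subnet-router
  template:
    metadata:
      labels:
        app: tailscale-subnet-router
    spec:
      serviceAccountName: tailscale
      containers:
        - name: tailscale
          image: tailscale/tailscale:latest
          env:
            - name: TS_AUTHKEY            # auth key from the pre-created Secret
              valueFrom:
                secretKeyRef:
                  name: tailscale-auth
                  key: TS_AUTHKEY
            - name: TS_KUBE_SECRET        # where tailscaled keeps its state
              value: tailscale-state
            - name: TS_USERSPACE          # userspace networking, no NET_ADMIN needed
              value: "true"
            - name: TS_ROUTES             # the CIDRs to advertise (placeholders)
              value: "10.4.0.0/14,10.8.0.0/20"
```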
Similarly, we deployed another subnet router on our production cluster. I can hear your gasps of horror; don't you worry, we'll talk about why this isn't a problem in a bit.
In-cluster DNS
Now that our cluster network is reachable, punching in raw IP addresses is no fun. Lucky for us, Tailscale supports resolving against a custom DNS nameserver. However, we'd like two nameservers to resolve against: one on the dev cluster and one on prod. Lucky (x2!) for us, Tailscale supports Split DNS.
This means any *.deepsource.def domain will resolve against 1.2.3.4 (dev DNS), and *.deepsource.abc will resolve against 5.6.7.8 (prod DNS). Internet traffic is unaffected as your local DNS settings are untouched.
Now, you're probably wondering — how do these names resolve? We can thank CoreDNS's powerful rewrite rules for that. In our case, they look like this:
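(The regexes are simplified a touch for illustration; both blocks sit inside the main server block of the Corefile.)

```
# service.namespace.deepsource.def -> service.namespace.svc.cluster.local
rewrite stop {
    name regex ([a-zA-Z0-9-]+)\.([a-zA-Z0-9-]+)\.deepsource\.def {1}.{2}.svc.cluster.local
    answer name ([a-zA-Z0-9-]+)\.([a-zA-Z0-9-]+)\.svc\.cluster\.local {1}.{2}.deepsource.def
}
# service.deepsource.def -> service.default.svc.cluster.local
rewrite stop {
    name regex ([a-zA-Z0-9-]+)\.deepsource\.def {1}.default.svc.cluster.local
    answer name ([a-zA-Z0-9-]+)\.default\.svc\.cluster\.local {1}.deepsource.def
}
```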
In essence, any *.deepsource.def domain resolves to a Kubernetes service in the default namespace. Services in other namespaces are addressed as service.namespace.deepsource.def.
One catch here is that not all services listen on port 80, and not all services have the prettiest names. A quick workaround is to create a new service with the desired name and exposed port.
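For example, say a chart installs RabbitMQ's management UI on port 15672 under a long release name. An alias Service gives it a friendlier name on port 80; every name, label, and port below is made up, and it assumes the upstream pods run in the default namespace.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: rabbitmq                         # the pretty name: rabbitmq.deepsource.def
  namespace: default
spec:
  selector:
    app.kubernetes.io/name: rabbitmq     # assumed label on the upstream pods
  ports:
    - name: http
      port: 80                           # the pretty port
      targetPort: 15672                  # the port the pods actually listen on
```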
External Access Node
As it happens, not all machines we need access to can have Tailscale installed. A good example is Google Kubernetes Engine's control plane endpoint. In Kubernetes, the control plane is the administrative component that manages the worker nodes and pods in the cluster. Control plane access can be restricted by origin IP, and since Tailscale is a mesh network, there is no single exit IP to whitelist.
As a workaround, we installed Tailscale on a separate compute VM instance that advertised routes to such endpoints. Quite similar to how we advertised the pod CIDR in-cluster. How nifty!
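On that VM, it's the same advertise-routes dance, just pointed at the control plane endpoint instead of a pod CIDR, and GKE's authorized networks can then be locked down to that VM's public IP. A sketch, with a placeholder endpoint:

```sh
# On the external access node: advertise a route to the GKE control
# plane endpoint (placeholder IP) so Tailnet clients reach it via this VM.
tailscale up --advertise-routes=203.0.113.25/32
```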
ACLs
Finally, we tie everything down with Tailscale's ACL policy. Remember when I said having the production cluster on the VPN network wasn't necessarily a bad idea? Tailscale's ACLs make it so. The neat thing about them is they deny by default — so unless you've defined an "Action": "accept" rule, the resource won't be reachable. Here's an excerpt from our ACL policy:
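(The hosts and ports below are placeholders rather than the real entries; Tailscale's policy file is HuJSON, so the comments are legal.)

```
{
  "ACLs": [
    // Infrastructure folks can reach everything.
    {
      "Action": "accept",
      "Users": ["group:infrastructure"],
      "Ports": ["*:*"]
    },
    // The language team only gets a handful of hosts and ports.
    {
      "Action": "accept",
      "Users": ["group:language"],
      "Ports": [
        "dev-cluster:80,443",
        "rabbitmq-dev:5672"
      ]
    }
  ]
}
```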
I've skipped the bit where we define Groups and Hosts for brevity. All users in the group:infrastructure group have access to everything (all hosts:ports), while those in the group:language group can reach only a few select hosts:ports.
Defining ACLs was probably the most annoying part, and there's no easy way to go about it: yank all-access cold turkey, grant the bare minimum, wait for the flood of direct messages about hosts people can't reach, and grant access to those. The flood slows down eventually, which means everyone can reach whatever they actually need.
Closing notes
Making the switch to Tailscale has greatly improved our security posture while making life easier for users and developers at the same time. With Tailscale running in Kubernetes, we have instant access to all cluster services. Local development has never been easier. Want to point your app to Redis or RabbitMQ? Just use {redis,rabbitmq}.deepsource.def.
There is scope for improvement, though: down the line, we'd like to build a small service to quickly jump to internal URLs (think: GoLinks). There's also a lot of work to be done to simplify our local development environment, and Tailscale can come in really handy there.
All things considered, Tailscale sets a good example for how security software should be designed, and we're very thankful for it.