Architecture Suggestions #127
Replies: 2 comments
-
It's quite funny you write this, as separating the control plane is something I've been thinking about doing with wag for quite a while now. I am a tad ill right at this second, so my thoughts might be a tad unorganised.

First things first: while this is definitely where I want to move to, now that we've got a solid way of synchronising account data and other sensitive information across cluster members (etcd), I would say that wag needs some TLC in a couple of places first.

Notably, we need to move away from using eBPF. That's a bit of a shame because I think it's a fun technology to play with. However, if you go have a brief read of Tailscale's implementation of a userland WireGuard firewall, it is much better from a portability and codebase management perspective. Less notably, the admin UI is a bit clunky in terms of how it interfaces with the rest of the project, so I'd like to redo it.

So, in short: this idea is good as it offers horizontal scaling for basically no cost, reduces the edge endpoint complexity (which is always a win for security), and is just where wag was going to begin with.

Thanks for the fun discussion tho! I'll be able to engage with it more fully when my brain isn't completely mucked up.
-
Hope you're feeling better! I had a bit of a look at Tailscale (and Headscale). I wonder if this project would benefit from portability of the WireGuard implementation in the same way as Tailscale and Headscale do. That is to say, those projects set out to provide mesh networks using WireGuard interconnects to form the mesh. To achieve the mesh, it makes sense that they would want to be able to get their agent onto arbitrary device types, and they achieve that at the expense of requiring a userspace implementation of WireGuard. I wonder if this project requires the same degree of portability, or if it can continue to just require a modern Linux with kernel support for WireGuard, which is arguably more performant and better supported upstream. Then it can also keep the cool eBPF policy enforcement bit :)
-
This is just a discussion starter around some ideas to adjust the architecture to make it more cloud friendly and to improve separation of concerns.
Foreword
This is just a big consolidated suggestion based on a cursory understanding of how things are working under the hood, so feel free to disregard it. I see that eBPF is being used to manage policy application (awesome!). I'm hoping that these suggestions can use all that logic almost verbatim. The big difference here is to separate out the core policy management engine from an arbitrary number of Access Gateways. Among other things, this allows the privileged part of the system to be separated from the hosts out at the edge of the network.
Design
The basic premise is that the Access Gateways are completely stateless. They subscribe to device status updates (device add/remove, key updates, and authentication, i.e. user level auth, updates). The only thing that would need thought is how a gateway pulls a complete picture of all device statuses when it first comes online. Maybe a message is published in response to a new subscription...
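To make the "complete picture on first subscription" idea a bit more concrete, here is a minimal in-memory sketch. It is not how wag works today and it does not use a real message bus; `ControlPlane`, `Gateway`, `DeviceStatus` and the `"upsert"`/`"remove"` event names are all made up for illustration. The assumption is simply that the control side replays its current state to a new subscriber before sending live updates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DeviceStatus:
    # Hypothetical shape of a device status update.
    device_id: str
    public_key: str
    authenticated: bool


class ControlPlane:
    """Holds the authoritative device state and fans out updates."""

    def __init__(self) -> None:
        self._devices: Dict[str, DeviceStatus] = {}
        self._subscribers: List[Callable[[str, DeviceStatus], None]] = []

    def subscribe(self, callback: Callable[[str, DeviceStatus], None]) -> None:
        # Replay the current state first, so a gateway that has just come
        # online starts from a complete picture.
        for status in self._devices.values():
            callback("upsert", status)
        self._subscribers.append(callback)

    def upsert(self, status: DeviceStatus) -> None:
        # Covers device add, key update and auth update.
        self._devices[status.device_id] = status
        self._publish("upsert", status)

    def remove(self, device_id: str) -> None:
        status = self._devices.pop(device_id, None)
        if status is not None:
            self._publish("remove", status)

    def _publish(self, kind: str, status: DeviceStatus) -> None:
        for callback in self._subscribers:
            callback(kind, status)


class Gateway:
    """Keeps only an in-memory view rebuilt entirely from messages."""

    def __init__(self) -> None:
        self.devices: Dict[str, DeviceStatus] = {}

    def handle(self, kind: str, status: DeviceStatus) -> None:
        if kind == "upsert":
            self.devices[status.device_id] = status
        elif kind == "remove":
            self.devices.pop(status.device_id, None)
```

A gateway would be wired up with `control.subscribe(gateway.handle)`. Because the gateway keeps nothing on disk, restarting it and subscribing again is enough to rebuild its view.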
The WAG Control is a separate process that can be run elsewhere, probably more internal to the network, and provides the Admin UI, APIs, and also the Auth UI. Notably, the Auth UI would not be restricted to a client-facing interface. This would allow it to be easily fronted by, e.g., an AWS ALB, which can host a valid TLS cert (practically essential for OAuth flows). The Access Gateways would be configured with the IP address of the Auth UI (or its load balancer) so that traffic to it is allowed without user level auth.
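As a rough illustration of that last rule, the decision a gateway makes per destination could look something like the sketch below. The function name and parameters are hypothetical, and in practice this check would live in the firewall policy rather than in Python. The addresses are documentation-range examples.

```python
import ipaddress


def is_allowed(dst_ip: str, device_authenticated: bool, auth_ui_ip: str) -> bool:
    """Decide whether a device may reach dst_ip.

    Traffic to the Auth UI (or its load balancer) is always allowed, so an
    unauthenticated device can still reach the login page. Everything else
    requires completed user level auth.
    """
    dst = ipaddress.ip_address(dst_ip)
    if dst == ipaddress.ip_address(auth_ui_ip):
        return True
    return device_authenticated
```

For example, `is_allowed("192.0.2.10", False, "192.0.2.10")` lets an unauthenticated device through to the Auth UI, while the same device is refused for any other destination.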
User Level Auth
I think currently there is the concept of MFA, which is another means of authenticating the device on top of the WireGuard keys. This concept of User Level Auth is a more traditional concept of a user, i.e. a human. The model of the user would be something like:
The values for username and password would be very dependent on the authentication mechanism in use. Local authentication would mean that the username is an email address or similar, and the password is the password hash. The user when accessing the Auth UI would enter the username and password, which would be validated against the username and password in the database.
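For the local authentication case, a minimal sketch could look like this. The `User` fields, the `salt$hash` storage format and the PBKDF2 parameters are assumptions for illustration only and are not taken from wag. A blank stored password never validates, so a user without a local password cannot log in locally.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

ITERATIONS = 100_000  # illustrative value, tune for real deployments


@dataclass
class User:
    # Hypothetical user record.
    username: str      # e.g. an email address for local auth
    display_name: str
    password: str      # "salt_hex$hash_hex", or "" when local auth is disabled


def hash_password(plaintext: str) -> str:
    """Return a salted PBKDF2-SHA256 hash in "salt_hex$hash_hex" form."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", plaintext.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()


def verify_local_login(user: User, plaintext: str) -> bool:
    """Validate an entered password against the stored hash."""
    salt_hex, sep, hash_hex = user.password.partition("$")
    if not sep:
        # Blank or malformed stored value: local auth is not available.
        return False
    try:
        salt = bytes.fromhex(salt_hex)
    except ValueError:
        return False
    candidate = hashlib.pbkdf2_hmac(
        "sha256", plaintext.encode("utf-8"), salt, ITERATIONS
    ).hex()
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(candidate, hash_hex)
```

The Auth UI would call something like `verify_local_login(user, entered_password)` after looking the user up by username.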
If the authentication mechanism is OAuth, then the username would be some unique identifier provided by the OAuth provider and the display name would be populated by the appropriate claim. The password would be blank (which also has the effect of disabling local auth for that user if multiple auth mechanisms were available). The challenge with OAuth would be the initial creation of the user. It could be that:
When user level auth is completed successfully, a message is published to ZMQ and propagated to all subscribed Access Gateways so that they can update their firewalls.
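Here is a small sketch of what such a message and its handling might look like. It deliberately leaves out the actual ZMQ transport and just passes bytes around. The topic name, the JSON fields and the `GatewayFirewall` class are all hypothetical, and the "firewall" is only a set of authorised device IDs standing in for the real policy update.

```python
import json

AUTH_TOPIC = "device.auth"  # hypothetical topic name


def encode_auth_event(device_id: str, username: str, authenticated: bool) -> bytes:
    """Build the message the control side would publish after user level auth."""
    return json.dumps(
        {
            "topic": AUTH_TOPIC,
            "device_id": device_id,
            "username": username,
            "authenticated": authenticated,
        }
    ).encode("utf-8")


class GatewayFirewall:
    """Stand-in for a gateway's firewall state, driven purely by messages."""

    def __init__(self) -> None:
        self.authorised = set()

    def apply(self, raw: bytes) -> None:
        event = json.loads(raw)
        if event.get("topic") != AUTH_TOPIC:
            return  # not an auth event, ignore
        if event["authenticated"]:
            self.authorised.add(event["device_id"])
        else:
            self.authorised.discard(event["device_id"])


def publish(raw: bytes, gateways) -> None:
    # In a real deployment the message bus does this fan-out.
    for gateway in gateways:
        gateway.apply(raw)
```

Applying the same event twice leaves the state unchanged, which is handy if a gateway sees a message again after a reconnect.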
WAG Access Gateways
These are stateless processes that rely on messages from ZMQ topics to maintain their current config. Being stateless in terms of config, combined with WireGuard itself being stateless, should mean that they can be load balanced for HA and load sharing (I imagine this would require the Access Gateways to NAT client traffic into the network).