The problem/use-case that the feature addresses

When the nodes in a cluster are configured with ports for both TLS and plaintext simultaneously, the CLUSTER SLOTS response only includes the TLS ports of the nodes in the cluster.

Clients which are still connecting to the cluster without TLS would need to get the plaintext port of each node somehow.

Description of the feature

Let CLUSTER SLOTS return the TLS ports of all the Redis Cluster nodes to clients connected over TLS and the plaintext ports to clients connected over plaintext.

Alternatives you've considered

  • Extend the response of CLUSTER SLOTS to include the plaintext port as the 4th element (after IP, TLS-port and node ID) when both ports are used. The following warning from the CLUSTER SLOTS documentation opens for this kind of extension:

    Warning: Newer versions of Redis Cluster will output, for each Redis instance, not just the IP and port, but also the node ID as third element of the array. In future versions there could be more elements describing the node better. In general a client implementation should just rely on the fact that certain parameters are at fixed positions as specified, but more parameters may follow and should be ignored. Similarly a client library should try if possible to cope with the fact that older versions may just have the IP and port parameter. * Workarounds (not very good): * Tweak the Redis Cluster clients to figure out the non-TLS port given the TLS port (for example hard-coded or using the convention TLS-port = plaintext-port + 1) * Let some middleware intercept the Redis traffic and change the ports in the CLUSTER SLOTS responses, while forwarding other traffic as-is. * Terminate TLS in the middleware. * Avoid the problem by switching over the whole system to TLS at once. This is hard in a deployment with rolling upgrades of multiple components in different languages.

Additional information

I'm guessing the node-to-node protocol (Redis Cluster Bus?) doesn't include both ports of each node, so a node doesn't necessarily know both ports of the other nodes in the cluster. In that case, the node-to-node protocol would need to be extended.

Comment From: yossigo

Hey @zuiderkwast thanks for raising this, I agree with the problem and I think your suggestion makes sense.

We need to extend CLUSTER SLOTS anyway in order to also supply hostnames if we have them, so clients can handle cert SAN validation if they wish. We should probably just add an arbitrary number of endpoints per node so we can advertise everything we know - i.e. multiple ports with/without TLS, multiple addresses, etc.

@madolson I think you raised the hostname issue in the past, any other thoughts here?

Comment From: madolson

Yeah, one things we are supposed to eventually submit a PR for is that AWS has a backwards compatible cluster bus extension to send optional messages. I wanted to use this to add the hostname so you can do hostname validation for TLS. Including both ports also seems trivial todo. I'll assign this to myself and see if I can quickly get the framework into a PR, and then we can add this.

Comment From: zuiderkwast

Nice to see these quick responses!

I think we should also consider ASK and MOVED redirects, which contain IP and port of the target node. The IP should probably be replaced by a host name for the same reason (TLS SAN validation) and the port could be the TLS port if TLS is used for the current client connection and the plaintext port otherwise.

Host names also solves another problem: It can resolve to multiple IP addresses, both IPv4 and IPv6.

If IP is replaced by hostname also in CLUSTER SLOTS, there isn't really any need to extend CLUSTER SLOTS with multiple IP addresses (other than possibly saving the client of doing DNS lookups). It would probably have to be configurable though, for backward compatibility.

Comment From: yossigo

@zuiderkwast Historically Redis uses IPs only (or at least aims to do that), which has some stability advantages in certain environments where DNS is not available or not reliable (e.g. hangs for a long time on resolution).

I am in favor of more hostname adoption but I think it should generally be optional and we should not make hard assumptions about it. So, being able to explicitly expose IPs (or even different hostnames) does still make sense to me.

Comment From: madolson

Starting on this now.

Comment From: zuiderkwast

@madolson any progress on this?

Comment From: madolson

Sadly, I got sucked into a fire at my real job and didn't get to work on this for a bit. It's not going to be ready for 6.2 sadly, but a PR for TLS ports + announced hostnames should be up in a week or so.

Comment From: zuiderkwast

OK. If you can show how to extend the cluster bus with optional fields, if that's small enough, maybe I can do the other parts of this issue?

Comment From: madolson

It's not really trivial, I'm working on it again though. I should have a PR up this week still though, 🤞

Comment From: zuiderkwast

@madolson There are enough unused bytes in the cluster bus protocol to add the plaintext ports in the header and gossip section, without a generic extension for optional fields. My draft PR is using some of those unused bytes. Please tell me if you disagree with this approach.

Comment From: madolson

@zuiderkwast The only thing I don't like is we only have 32 bits in the gossip field, so we are committing to half of them for a feature I think is on the edge case. I don't have any other concerns though.

Comment From: yossigo

@madolson What's your strategy for extending the bus protocol with hostnames? Should we aim to synchronize these two efforts?