Redis [NEW] Simplify authentication, identification, and configuration during connect

Request

Authentication to redis, when declaring a protocol, authentication, and identification factors requires multiple commands.

Background

As of Redis 7.2 Redis supports standalone calls to AUTH, HELLO, CLIENT SETNAME, and CLIENT SETINFO for authentication, protocol configuration and identification. This creates the need, based on certain identification scenarios to call multiple commands on connection, and in some cases (CLIENT SETINFO) calling the same command multiple times. The problem is that this makes client authentication less efficient then it could be.

HELLO currently supports all of the prior elements with the exception of CLIENT SETINFO. However, there's an additional wrinkle in HELLO in that unlike AUTH it requires both a username, and password - both according to the documentation, and testing:

```(error) ERR Syntax error in HELLO option 'human' 127.0.0.1:6379> hello 3 AUTH human (error) ERR Syntax error in HELLO option 'AUTH'

127.0.0.1:6379> hello 3 AUTH human being (error) WRONGPASS invalid username-password pair or user is disabled. ```

Given how pipelines (MULTI works, authentication is required to start the block. This means that as a client, I cannot in one directive, configure identification in the case where authentication is required.

Request

Ideally HELLO would be extended to support CLIENT SETLIB - but combining the entirety of the SETLIB block - so that we could call one command, rather than a minimum of 3. This in turn would make for a faster client experience, during the connection phase.

Comment From: vladvildanov

Must have 👍 Any additional command on connection have a huge impact on high-loaded clients. Reducing commands count on connection leads to much better user experience

Comment From: madolson

However, there's an additional wrinkle in HELLO in that unlike AUTH it requires both a username, and password - both according to the documentation, and testing:

This is trivial to solve though, as an omitted user is just the default user. That can be handled by the client.

Ideally HELLO would be extended to support CLIENT SETLIB - but combining the entirety of the SETLIB block - so that we could call one command, rather than a minimum of 3. This in turn would make for a faster client experience, during the connection phase.

Is there specific data to back this up? The expectation is that sending the commands in a pipeline shouldn't meaningfully impact the performance of client startup, especially given how impactful authentication is. I do now realize that CLIENT SETINFO should probably allow sending both lib and version together instead of sending two separate commands.

Extending the hello command was the original proposal I forwarded, but we decided it would be easier for clients to send multiple commands with the expectation that some of them might fail (and that was OK).

Comment From: michael-grunder

Is there specific data to back this up?

I may be able to get some graphs around this from users of PhpRedis. That said, from my experience it is pretty impactful especially when it's a high volume workload of ephemeral clients that connect, perform a tiny number of operations, and then disconnect.

I suppose the only way to be sure is to measure. As an aside, I'd also like to be able to send CLIENT TRACKING ON as part of HELLO but his may be a bridge too far :smile:

Comment From: madolson

I may be able to get some graphs around this from users of PhpRedis. That said, from my experience it is pretty impactful especially when it's a high volume workload of ephemeral clients that connect, perform a tiny number of operations, and then disconnect.

Pooling and multiplexing would be an alternative answer here.

Comment From: michael-grunder

Pooling and multiplexing would be an alternative answer here.

No doubt. I was just commenting that it can have a surprisingly noticeable impact.

Comment From: soloestoy

I remember that when discussing this issue, other client developers objected to extending the HELLO command https://github.com/redis/redis/pull/11758#issuecomment-1421019332.

Because, during the initial handshake, the client is unaware of the server version. If the client uses a command like HELLO 3 AUTH user pwd LIBRARY lib-name VERSION foo lib-ver 1 and the Redis server is of a lower version, the handshake would fail. Therefore, we introduced the separate command CLIENT SETINFO to address this issue.

However, now that we have the CLIENT SETINFO command, I believe we can extend the HELLO command too. Clients can choose whether to complete the entire handshake process using just HELLO or to use HELLO along with CLIENT SETINFO.

Undoubtedly, the extended HELLO command can optimize the performance of ephemeral connections. Although we do not recommend using it in this way, it is true that many clients support this mode.

@redis/core-team please comment.

Comment From: oranagra

@soloestoy i don't understand what exactly you want us to comment on? any extension of HELLO would mean that clients that use this extension and still wanna be compatible with older versions of redis that don't support it, have to implement a fallback, which means another round trip.

Comment From: soloestoy

Is it necessary to extend HELLO to provide clients with the opportunity to set library information using HELLO, while also accepting the risk of incompatibility, leaving it up to the clients to decide rather than imposing restrictions from our end.

Comment From: oranagra

i disagree that it's "necessary", or at least i don't see any reason not to pipeline the two commands right after the other. that was the plan when we created that command, so i don't understand what changed.

Comment From: madolson

Agree with Oran, we discussed the options here and I still think we picked the right way to solve this problem. It may not be ideal, but pipelining additional arguments seems like the most compatible and future proof approach.

We've also seen clients begin to adopt the usage of client setname api (redis-rs uses it), so I don't think we want multiple different approaches to solve the same problem. We can consider rethinking this in the future, but for now I don't see the point of changing.

Comment From: soloestoy

Regarding pipelines, I have a different opinion. Not only for the specific issue mentioned in this discussion, but in general, while pipeline can improve execution efficiency, it is not a one-size-fits-all solution, it is not a silver bullet.

On one hand, if pipeline could solve all these problems, we would only need to implement the basic primitives such as SET/GET/EXPIRE. There would be no need to implement additional commands like SETEX/GETEX. But that is not the main point.

The real problem is that pipeline also have limitations. Unlike the ping-pong mode where we need to parse and handle the response of each command before sending the next one, pipeline operates in a streaming fashion. They send and receive commands without waiting for the reply of the current command. This can make it difficult to handle certain special replies, especially when dealing with errors or unexpected situations like receiving "-MOVED" in cluster mode, where the client needs to redirect to the correct node and retry the current command.

For example: Suppose a client is using pipeline mode and sends three commands in sequence to the Redis node where key "a" is located: LPUSH {a}list xxx; SET {a}string 1; SET {a}string 2. Normally all commands should be executed correctly. However, if the slot is migrated to another node while executing the first and second command (LPUSH {a}list xxx and SET {a}string 1), and then migrated back while executing the third command (SET {a}string 2), the client will receive three replies: MOVED, MOVED and OK. Pipeline does not provide atomicity guarantees, so this kind of situation is indeed possible.

There are usually two methods to handle these replies: 1. Not redirecting or handling MOVED, and directly returning it to the user. In this case, the user needs to judge and handle the results themselves, which can make the logic more complex (LPUSH {a}list xxx can be retried, but SET {a}string 1 should not be retried since SET {a}string 2 is OK). Furthermore, during switchover and slot migration, a large number of errors can occur. AFAIK JedisCluster adopts this approach. 2. The client can redirect and retry when encountering MOVED. Although this approach shields users from internal redirect errors in the cluster, it may lead to data inconsistencies. For example, retrying the SET {a}string 1 command may result in the final value of {a}string being 1, whereas the correct result should be 2. AFAIK Lettuce does like this.

Of course, we can say that it is the client's responsibility. However, as server-side developers, we also have an obligation to provide proper usage guidelines to guide client implementations correctly.

Comment From: oranagra

how are all these issues with pipeline relate to connection setup? this is not something handled by the user app, but rather by the client library, it shouldn't be difficult for it to send two requests and then read two replies. besides, the idea behind CLIENT SETINFO is that you don't care if it succeeded or failed, it's just some optional voluntary metadata, don't see how your MOVED arguments apply here.

Comment From: soloestoy

I meant that I don't like recommending pipeline. I will open a separate issue to discuss the problems related to pipeline.

Comment From: madolson

Yeah, there are many problems with pipelines, but I think in this context they are not as material. Clients should be able to execute either in a streaming mode or as a block of commands for connection establishment.

Comment From: madolson

I think it's reasonable to say we aren't going to address this, so I'm going to mark this as closed. Use the pipelining method to pass in multiple arguments, and we can re-evaluate it again in the future if bottlenecks us.