The SORT and SORT_RO commands support an option to sort by an external keys using a pattern. This functionality is disabled in cluster mode because there is no way to know the keys ahead of time, and such the corresponding function to getKeys from a command does not include them.
We should add an explicit check when accessing the keys in SORT and SORT_RO to make sure we have access to read from them before returning the request.
Comment From: ranshid
@madolson - I wondering your opinion regarding the expected behavior of this?
for example:
given the following selector: (+sort ~w[0-9] ~v1[0-9]* ~mylist)
mset w1 1 w2 2 w3 3 w4 4 w5 5 w6 6 w7 7 w8 8 w9 9 w10 10 w11 11
mset v1 1 v2 2 v3 3 v4 4 v5 5 v6 6 v7 7 v8 8 v9 9 v10 10 v11 11
lpush mylist 1 2 3 4 5 6 7 8 9 10 11
what should the output of
sort mylist by w* get v* ?
IMO there are 2 options: 1. fail the command and report some error when we identify there is at least 1 restricted read key (either "get" or "by") - this will probably make the dryrun implementation more complicated as we will have to access the data structure for each dryrun. - will prevent users from gaining output for the keyspace they ARE authorized for. 2. treat restricted keys as "no-exist" - in which case the null value is used internally. Example output for the example provided:
sort mylist by w* get v*
1) "10"
2) "11"
3) (nil)
4) (nil)
5) (nil)
6) (nil)
7) (nil)
8) (nil)
9) (nil)
10) (nil)
11) (nil)
Comment From: madolson
By dryrun are you referring to the ACL DRYRUN command? If so, I don't think we need to worry about correctness there.
I think the second proposed option makes the most sense to me. If we encounter a key that we can't access, we just keep processing. Otherwise we have to check all the keys first for access.
I also see a third option as well, which is that we add special logic just for SORT which is similar to the way we handle PSUBSCRIBE. We check to see if the user has access to exactly the patterns described in BY/GET. So in your example they would need access to either allkeys or w* and v*.
Comment From: madolson
Sort of related, we do something similar here I guess: https://github.com/redis/redis/issues/8152. For SCAN, we don't require the keys being matched to belong to any pattern you have permission on.
Comment From: ranshid
@madolson (liked the "Sort of related" :) ) I think for the keys and scan are somewhat different as they only provide keyspace data and no values. In the case of sort users can explicitly use the 'get' in order to access any key value so I still thin there should be a specific handling in the sort (do you agree?)
Comment From: madolson
@ranshid (I thought about bolding sort, but that seemed excessive :D)
I agree, it just brought up a potential alternative solution is that we could restrict SORT to only allow execution on the patterns (like we could with scan and keys, then flushall/flushdb/randomkey require * permission to access). It was just an interesting thought to me.
Comment From: ranshid
@madolson - following our private chat:
the #2 solution (treat restricted keys as "no-exist" - in which case the null value is used internally) is problematic since during evaluation of the weights vector and the get of the result values we must restrict ourselves to a specific selector.
So for example:
given this acl:
USER FOO (+sort ~v[0-9]*) (+sort ~v1[0-9]* ~mylist)
mset w1 1 w2 2 w3 3 w4 4 w5 5 w6 6 w7 7 w8 8 w9 9 w10 10 w11 11
mset v1 1 v2 2 v3 3 v4 4 v5 5 v6 6 v7 7 v8 8 v9 9 v10 10 v11 11
lpush mylist 1 2 3 4 5 6 7 8 9 10 11
running the following command:
sort mylist by w* get v*
will potentially access bot selectors which misses the idea of a selector being a complete set of acl rules.
I think I can offer some alternative: 1. in case a sort command is matched on a specific acl rule for the parameters and the command itself. we will continue to use this selector to verify the keys and maintain the proposed behavior(treat restricted keys as "no-exist"). so for the example given before the second selector (+sort ~v1[0-9] ~mylist) will be chosen since it contains the list key ~mylist. in which case the out put would be: sort mylist get v 1) (nil) 2) (nil) 3) (nil) 4) (nil) 5) (nil) 6) (nil) 7) (nil) 8) (nil) 9) (nil) 10) "10" 11) "11"
The down side for using this logic is that there can be a potentially "better" fit selector which is not verified the by/get keys against.
-
Similar to the previous suggestion but only match the "by"/"get" keys to the root permissions for the user (not inside '(..)') The downside of using this proposal is that it might force administrators to grant keys permissions in the root permissions which is not only in the scope of the current command (sort)
-
we can fallback to the original option /1 in which we will validate the entire key space accessed by the sort command before we reply with the response. the main downside of this option is that verifying un-authorized access will always require computing the sort command and verify authentication at the end.
Personally I am more in favor of the first suggestion as it seems more intuitive to me. but I would like to get your thoughts of how this will meet the user experience.
Comment From: madolson
I think I originally thought this might complicate the code because we would have to handle clients whose output is partially written to the CoB. Now that I've looked at the code more, it looks like we only lookup keys in the middle phase where we are sorting the data. I think it should be okay to throw an error at that point, so I suppose I think option 1 is good enough.