Functions on Redis Cluster

Redis Function PR introduces a new scripting approach called Functions. Unlike eval (see new terminology as explained on this PR), functions are considered part of the data. As such, functions are replicated and persisted. This issue will discuss how to handle functions on Redis cluster, how to make sure functions are located on all the nodes and how to handle cluster topology changes.

Uploading Functions to Redis Cluster

Functions are considered part of the data. It's the developer's responsibility to send their functions to Redis using the FUNCTION CREATE command. On a cluster, the FUNCTION CREATE command needs to somehow arrive at all the nodes. We suggest 2 ways to handle it:

  • The node, upon getting the FUNCTION CREATE command, will broadcast it to all the other nodes.
  • FUNCTION CREATE command will be broadcasted from outside by the client or admin to all the nodes.

The first approach is complicated and can be tackled as part of a bigger story where nodes have global data they need to share with each other (like ACL or configuration). It will not be tackled in this issue; we believe option two is easier and gives the functionality needed to make functions usable on a cluster.

To make the user's life easier we can extend the broadcast option of redis-cli to read the last parameter from stdin (the -x option). This way a user can upload a function to a cluster like this:

redis-cli -x --cluster call <host>:<port> FUNCTION CREATE LUA test < my_function.lua
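The client-side broadcast (option two) can be sketched roughly as follows. This is only an illustration, not how redis-cli is implemented: `broadcast_function_create` and the `send` callback are hypothetical names, and the node list is assumed to be known already. The command is encoded as a RESP array of bulk strings with the function body as the last argument, mirroring what -x does with stdin.

```python
from typing import Callable, Iterable, List, Tuple

Node = Tuple[str, int]  # (host, port)


def encode_resp(args: List[bytes]) -> bytes:
    """Encode a command as a RESP array of length-prefixed bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        out.append(b"$%d\r\n%s\r\n" % (len(arg), arg))
    return b"".join(out)


def broadcast_function_create(
    nodes: Iterable[Node],
    name: str,
    code: bytes,
    send: Callable[[Node, bytes], None],
) -> int:
    """Send the same FUNCTION CREATE command to every node.

    `send` is a hypothetical transport callback (for example a socket write);
    it is an assumption of this sketch, not a redis-cli API.
    """
    # The function body goes last, like the argument redis-cli -x reads from stdin.
    payload = encode_resp([b"FUNCTION", b"CREATE", b"LUA", name.encode(), code])
    count = 0
    for node in nodes:
        send(node, payload)
        count += 1
    return count
```

Because every argument is length-prefixed, the function body can contain newlines or arbitrary bytes without any escaping.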

Handling Cluster topology changes

The second issue with cluster is that the topology can change: a new node can join the cluster at any time. Joining a node to the cluster is an administrative operation. It requires joining the new node to the cluster and then making it responsible for some slot range. We suggest adding another administrative step that sends the functions to the new node before moving any data into it.

To make this operation easier, we suggest adding a new API that allows exporting functions from a Redis server. Exporting functions will be done using a new FUNCTION sub-command: FUNCTION EXPORT. The new command will return a list of FUNCTION CREATE commands. Invoking this list on a new Redis server will result in having the same set of functions on the new Redis server.

An admin that adds a new node to the cluster will be able to call FUNCTION EXPORT on any of the already existing nodes, get the list of FUNCTION CREATE commands and then execute it on the new node before moving any slots into it.
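The ordering described above (export, replay on the new node, and only then move slots) could look something like this. It is a sketch under assumptions: FUNCTION EXPORT is the command proposed in this issue, and the three callbacks are hypothetical stand-ins for whatever the admin tool uses to talk to the nodes.

```python
from typing import Callable, List, Sequence

Command = Sequence[bytes]  # one command as a list of arguments


def add_node_with_functions(
    export_from_existing: Callable[[], List[Command]],
    run_on_new_node: Callable[[Command], None],
    migrate_slots: Callable[[], None],
) -> int:
    """Replay exported FUNCTION CREATE commands, then start moving slots.

    All three callbacks are hypothetical; `export_from_existing` stands in
    for calling the proposed FUNCTION EXPORT on any existing node.
    """
    commands = export_from_existing()
    for cmd in commands:
        # Refuse to replay anything other than FUNCTION CREATE.
        if tuple(cmd[:2]) != (b"FUNCTION", b"CREATE"):
            raise ValueError("unexpected command in export: %r" % (cmd[:2],))
        run_on_new_node(cmd)
    # Slots move only after every function is present on the new node.
    migrate_slots()
    return len(commands)
```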

Loading functions on start time

Though not directly related, we would like to use this issue to open the discussion about loading functions on start time. The use case is a pure cache with neither persistence nor replication. In such use cases functions will not survive restarts and users will have to reload them. We see 2 ways to handle it:

1. Startup file: allow configure a startup file that contains a list of Redis commands. Redis will execute all the commands on the startup file only if it doesn't load any persistence file. It will be possible to put FUNCTION CREATE commands on the startup file to load functions on startup.
2. Allow saving RDB with only functions (and maybe more meta data in the future). In this approach users will still need to configure persistence but it will be possible to persist only the functions without the keyspace.

After discussion with @oranagra @yossigo and @yoav-steinberg there is no consensus on which approach is better.

Personally, because functions are considered part of the data, I prefer approach 2. I believe that approach 1 turns functions into configuration and makes loading them the admin's responsibility instead of the developer's. Having an RDB with functions only keeps the developer responsible for functions (loading/updating/deleting) and still provides a solution for cache use-cases.

Because the solution is not yet clear, I am not writing a full design and would like to hear others' opinions on this topic.

Comment From: yoav-steinberg

Allow saving RDB with only functions (and maybe more meta data in the future). In this approach users will still need to configure persistence but it will be possible to persist only the functions without the keyspace.

Another way to look at this is to configure redis not to persist volatile keys. Key "volatility" can be defined by the current eviction policy. So any key that can potentially be evicted would be excluded from the rdb. This makes it possible for the user to define a small subset of keys they do want to persist, and functions will be persisted by default because they are never volatile.
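To make the "volatility follows the eviction policy" idea concrete, here is a minimal sketch. The policy names are the standard maxmemory-policy values; the two helper functions are hypothetical and only illustrate the classification, not any existing Redis option.

```python
from typing import Dict, List


def may_be_evicted(policy: str, has_ttl: bool) -> bool:
    """Return True if a key could be evicted under the given maxmemory-policy."""
    if policy == "noeviction":
        return False            # nothing is ever evicted
    if policy.startswith("allkeys-"):
        return True             # every key is an eviction candidate
    if policy.startswith("volatile-"):
        return has_ttl          # only keys with an expire set are candidates
    raise ValueError("unknown maxmemory-policy: %r" % policy)


def keys_to_persist(keys: Dict[str, bool], policy: str) -> List[str]:
    """Pick the keys that would be written to the RDB under this proposal.

    `keys` maps key name -> whether the key has a TTL (hypothetical input).
    """
    return [name for name, has_ttl in keys.items() if not may_be_evicted(policy, has_ttl)]
```

Under a volatile-* policy, for instance, a user could keep a handful of keys without a TTL and have only those (plus functions) persisted.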

Comment From: zuiderkwast

My 2¢

Uploading Functions to Redis Cluster

  • The node, upon getting the FUNCTION CREATE command, will broadcast it to all the other nodes.
  • FUNCTION CREATE command will be broadcasted from outside by the client or admin to all the nodes.

If it's the responsibility of the cluster-client library, it would need to include fallback logic to call FUNCTION CREATE when a script is missing (similar to the EVALSHA -> EVAL fallback logic). How can this be done? I can see two ways:

1. The user needs to have the function code available and supply it to the client library for every cluster-client instance. Then, what did we gain compared to EVALSHA/EVAL?
2. If a function is missing, the cluster-client library can try to ask the other master nodes for a function and load it to the node where it's needed.
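For reference, the EVALSHA -> EVAL fallback mentioned above usually looks something like the sketch below. The `client` object and `NoScriptError` are hypothetical stand-ins (real client libraries have their own equivalents); the point is that the caller must always have the full script at hand, which is the cost option 1 would carry over to functions.

```python
import hashlib
from typing import Any, Sequence


class NoScriptError(Exception):
    """Stand-in for the server's NOSCRIPT error reply (hypothetical name)."""


def eval_with_fallback(client: Any, script: str, keys: Sequence[str], args: Sequence[str]) -> Any:
    """Try EVALSHA first; if the script is not cached on the node, send EVAL."""
    sha = hashlib.sha1(script.encode()).hexdigest()
    try:
        return client.evalsha(sha, keys, args)
    except NoScriptError:
        # The full script source is needed here, on every client instance.
        return client.eval(script, keys, args)
```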

Now, what if the user wants to call a function on a replica where it doesn't exist (because its master doesn't have it yet)? And how does the client keep track of updates to functions and deleted functions?

If we want Functions to be adopted, I think they have to be easy to use. Maybe it means extending the cluster bus in this case. (It's not extremely hard, is it? Unknown cluster bus messages are ignored by old Redis instances anyway, right?)

Handling Cluster topology changes

Not only redis-cli is used for this. Any other cluster admin tool would need to be updated to support migration of functions.

If the cluster bus could handle it, it would be much better. I wouldn't want to end up in a situation where users don't want to change the topology of the cluster because it might break functions.

Loading functions on start time

Please make them pure data or pure configuration. Not something in between.

  • Startup file: allow configure a startup file that contains a list of Redis commands. (...)

No, please don't.

  • Allow saving RDB with only functions (and maybe more meta data in the future). In this approach users will still need to configure persistence but it will be possible to persist only the functions without the keyspace.

Good idea, but I don't think it's strictly necessary. If it's a pure cache, then the users might know that the functions are also volatile and they need to create the functions on demand.

A way to persist only non-volatile data would be a nice feature anyway.

Comment From: oranagra

If it's the responsibility of the cluster-client library, it would need to include fallback logic to call FUNCTION CREATE when a script is missing (similar to the EVALSHA -> EVAL fallback logic). How can this be done? I can see two ways:

I don't think we want to make it the responsibility of the cluster client / library. i think the whole point is saying the code is the responsibility of the developer, and deploying it is the responsibility of the admin (not the client app). Think of a module (i.e. MODULE LOAD), if the module is missing, the app can't work, and it's not the app's responsibility to load it. In fact we should explicitly state that it is not the client library responsibility. we don't want client libraries developing special code for that, this would slow the adoption rate of function and also break later or limit our flexibility.

I think the above statement answers a few of your other observations.

Cluster bus propagation is still a nice thing (i've been told it's too complex for now), but can arguably be added on a next version if we lay this task on the admin for now (again like modules).

The startup script suggestion is not strictly related to functions, so it doesn't violate the idea that functions are part of the data. it just comes to solve a specific case to make it easier for whoever's facing that case. i.e. they have other means to achieve the same thing today already (start redis un-bound, init it, and only then bind it).

I personally don't like the idea of persisting just part of the data, but i guess it's a valid one.

Comment From: zuiderkwast

Think of a module (i.e. MODULE LOAD), if the module is missing, the app can't work, and it's not the app's responsibility to load it.

Modules haven't been widely adopted though, so modelling Functions after Modules doesn't seem very promising to me. :-) Maybe this is because the admin and the developer often are different persons, teams or even different companies.

In fact we should explicitly state that it is not the client library responsibility.

OK, good to know. So it's the application dev's responsibility to send FUNCTION CREATE to all masters? Cluster clients (some of them) have send-to-all APIs which can be used for this.

If we compare functions with SQL stored procedures, they'll likely be created by some upgrade script when deploying a new version of an application. This is almost like an admin or config operation (but without admin privileges).

The startup script suggestion is not strictly related to functions

Right, it can be used for config and admin commands as well. Not only data.

Comment From: MeirShpilraien

I believe @oranagra already covered most of the points but I wanted to comment about this:

Good idea, but I don't think it's strictly necessary. If it's a pure cache, then the users might know that the functions are also volatile and they need to create the functions on demand.

Assuming we agree that clients should not upload functions if they are missing (which is one of the things that differentiate functions from eval), I believe it is mandatory. I have no doubt that functions are data. But I think that by introducing functions we are basically introducing a new level of data which you want to persist and have available on restarts, while the key space is something that you can accept losing (your application cannot work without the functions but it might be able to work without the keys in the case of a cache). I also think that you might want to persist functions each time you change them.

Comment From: oranagra

@MeirShpilraien i think you made a good argument.

@zuiderkwast

Modules haven't been widely adopted though, so modelling Functions after Modules doesn't seem very promising to me. :-) Maybe this is because the admin and the developer often are different persons, teams or even different companies.

you're right, but the important difference is that modules are native code and put the host at risk, and functions are sandboxed. also, for modules you need to upload a file to the file system in order to load it, and functions can be just loaded by the client connection. i don't think they'll face the same adoption problems modules have.

Comment From: madolson

I'm in favor of doing the function export + RDB format for restoring the functions.

Cluster bus propagation is still a nice thing (i've been told it's too complex for now), but can arguably be added on a next version if we lay this task on the admin for now (again like modules).

I think this should be a Redis 8 thing. Yossi and I have talked about driving everything through consensus (something like raft) so the cluster eventually agrees on the resulting state. I think this would help resolve a lot of esoteric issues within cluster mode, while still leaving the main data flow being best effort + eventually consistent to retain performance. It would probably also help a lot of proxy maintainers since they don't have to deal with it.

Allow saving RDB with only functions (and maybe more meta data in the future). In this approach users will still need to configure persistence but it will be possible to persist only the functions without the keyspace.

I still think we should treat functions more like config than data, but I think I've lost the battle on that one. If we are going to treat functions like data, I think this makes a lot of sense. The ephemeral workload basically then requires a seed RDB which can contain all of the functions. Perhaps FUNCTION EXPORT can be extended to include the format, so you could export it as a self contained RDB file.

Comment From: oranagra

We discussed this in a core-team meeting, the decisions are:

  • Functions are part of the data, but not part of the application, and we don't want client libraries or client apps to load or re-load them. (this needs to be documented properly somewhere)
  • in persisted and replicated cases, functions are part of the RDB file and that solves the problem.
  • in cluster, the user / admin is responsible to load them using redis-cli to all nodes.
  • in cluster resharding / joining new nodes, redis-cli as the cluster management tool should use FUNCTION EXPORT to get the functions from one node and implicitly add them to the new one (no special user redis-cli command is needed).
  • the above means that it could silently fail if user uses an old redis-cli, but only when functions are used (new apps), so it's not a backwards compatibility issue.

for cache or ephemeral deployments, we concluded that we can't really decide between a bootstrap script or an RDB without keyspace, but actually both of these are already possible without us writing a single line of code in redis. user can do FUNCTION EXPORT and use some bootstrap bash script for one approach, or FLUSHDB, and SAVE to generate that ephemeral RDB for the other approach. so the conclusion is:

  • We will not (yet) do any change in redis to support these, but instead solve it with some tooling, i.e. add some script or tool that will let the user convert a bunch of functions (possibly generated with FUNCTION EXPORT) into an RDB file. (this needs to be documented properly somewhere)

Comment From: oranagra

I wanna take this opportunity to make something clear about the ephemeral / cache use case (not cluster), and the basic idea behind functions and what they come to solve (we'll make sure to state that in the documentation when we write it).

Functions were born to answer two common requests:

1. In some applications where multiple modules use the same scripts, it was hard to update them and make sure all pieces of the project use the new script (imagine part of the app written in go, running on one server, and another in python running on another), i.e. the script is not really part of the client that calls it, and it was hard to work with the SHA design. The solution to that problem was to name the scripts and persist them.
2. Users were asking for all sorts of things related to having a schema (e.g. have two keys of different types always updated together and remaining consistent with one another, or when one is deleted or evicted, another one must be updated). we came up with two possible solutions for that, one was #key annotations in which the user works with redis native API and the magic happens behind the scenes, and the other is Functions (in which the user should work only with a high level interface he's injecting into redis, which takes care of everything).

That's why functions are part of the data: they're what manages the schema, and when the schema changes, both the functions and the data should change. That's also why i think it would be wrong to save functions separately from the data (like ACL SAVE). Imagine a case where functions are updated and persisted ASAP, but the data isn't yet, and then a crash happens: the data is rolled back but the functions are not. That creates an inconsistency between the functions (which should ensure, and which expect, that key A is always consistent with key B) and the data (keys A and B).

From that point of view, i think that for an ephemeral use case, where the functions are somehow loaded on startup (e.g. by some bootstrap script or a static RDB file that was put there when the pod was created), it is logical that when updating the schema, the old pod will be destroyed, and a new one will be created with the new functions there from the get go (i.e. not to update the functions on the server at runtime).

Comment From: MeirShpilraien

After discussion with @oranagra, summarizing the action items and points to consider to achieve the decisions above:

1. We want to give an easy way to export and import functions. The FUNCTION EXPORT suggested here is not good enough because the code might be binary data. We suggest replacing FUNCTION EXPORT with FUNCTION DUMP and FUNCTION RESTORE, which will follow in the footsteps of the DUMP and RESTORE commands. FUNCTION DUMP will return a binary blob that can be given to FUNCTION RESTORE to load the same set of functions into another Redis instance.
2. FUNCTION DUMP and FUNCTION RESTORE should be used by redis-cli in cluster management when adding a node to the cluster.
3. We want to give the ability to generate an RDB with functions only, so it can be used in ephemeral / cache use cases. Today redis-cli can generate an RDB from an existing Redis instance using the --rdb option. We will add another option, --functions-rdb, that will generate an RDB with functions only. To achieve that we will allow another REPLCONF option, function-only, that will return an RDB with only functions. redis-cli will be able to use this REPLCONF option to generate an RDB with only functions and save it locally. The implementation will rely on a fast fork and diskless replication, which is a lot better than writing RDB generation code in redis-cli.
4. We still want to allow ephemeral / cache use cases to create a readable text file with all the FUNCTION CREATE commands (in case the user does not want to use RDB and wants to create a startup script that they manage themselves). For this we will make the following changes to the API:
   • Allow specifying a WITHCODE option on the FUNCTION LIST command to dump all function information in a single command. The user will be able to generate a startup script with FUNCTION CREATE commands from the given output.
   • After the above change FUNCTION LIST and FUNCTION INFO will be almost the same (FUNCTION INFO will just give information on a single function), so we suggest adding a NAME <name> option to FUNCTION LIST to be able to get information about a single function. The final API: FUNCTION INFO [WITHCODE] [NAME <name>]
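The FUNCTION DUMP / FUNCTION RESTORE round trip from items 1 and 2 can be sketched as below. This is an assumption-laden illustration: `source` and `target` are any objects exposing an `execute_command(*args)` method (the `CommandClient` protocol is a name made up for this sketch), and no error handling or RESTORE policy is shown.

```python
from typing import Any, Protocol


class CommandClient(Protocol):
    """Hypothetical minimal client interface assumed by this sketch."""

    def execute_command(self, *args: Any) -> Any: ...


def copy_functions(source: CommandClient, target: CommandClient) -> int:
    """Copy all functions from one node to another as an opaque blob."""
    # The blob is binary and is passed through untouched, as with DUMP / RESTORE.
    blob = source.execute_command("FUNCTION", "DUMP")
    target.execute_command("FUNCTION", "RESTORE", blob)
    return len(blob)
```

A cluster management tool would call this for the new node before assigning it any slots.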

Comment From: yoav-steinberg

Some thoughts:

  • Re NAME <name> option, I'd suggest making it a pattern: FUNCTION LIST PATTERN lib_do_stuff*.
  • Re WITHCODE, IIUC then we're assuming code might be binary so just make sure you have some decent binary escaping implemented here.
  • Re REPLCONF functions-only: I feel this is too specific, there might be other things we'd like to include in such RDBs like global keys, aux data, etc. What I suggest is either:
      • Some filter syntax like: REPLCONF filter +functions,+key-pattern:aa*,-aux-data (this might be overly complex)
      • Simpler solution which might be useful for other needs: REPLCONF non-volatile which puts all non volatile data in the rdb (including functions, aux data, etc).
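To give a feel for how much machinery the filter syntax would need, here is a rough parser for a string such as +functions,+key-pattern:aa*,-aux-data. The syntax itself is only the suggestion from this comment, and `FilterRule` / `parse_filter` are hypothetical names; nothing like this exists in Redis.

```python
from typing import List, NamedTuple, Optional


class FilterRule(NamedTuple):
    include: bool            # True for "+", False for "-"
    kind: str                # e.g. "functions", "key-pattern", "aux-data"
    pattern: Optional[str]   # text after ":" if present, else None


def parse_filter(spec: str) -> List[FilterRule]:
    """Parse a comma-separated list of +kind[:pattern] / -kind[:pattern] items."""
    rules = []
    for item in spec.split(","):
        item = item.strip()
        if len(item) < 2 or item[0] not in "+-":
            raise ValueError("bad filter item: %r" % item)
        kind, _, pattern = item[1:].partition(":")
        rules.append(FilterRule(item[0] == "+", kind, pattern or None))
    return rules
```

Even this small version leaves open questions (rule precedence, commas inside patterns), which supports the "might be overly complex" remark.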

Comment From: oranagra

binary escaping

no need, we respond with nested RESP array, each code block is a "binary safe string", and if it'll be fed back to us, the tool that does so, needs to send them as bulk string arguments.. so all of the escaping is only needed by:

1. redis-cli (does that anyway when it prints things to tty)
2. some tool that will be creating a bootstrap script (out of the scope at the moment)
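For the second case, a tool that writes a text bootstrap script would need something like the quoting helper below. It is a sketch assuming the double-quoted argument form that redis-cli accepts (backslash escapes such as \n, \t, \" and \xHH); `quote_for_redis_cli` is a hypothetical name and not part of any existing tool.

```python
def quote_for_redis_cli(data: bytes) -> str:
    """Render arbitrary bytes as one double-quoted redis-cli argument."""
    escapes = {
        ord('"'): '\\"',
        ord("\\"): "\\\\",
        ord("\n"): "\\n",
        ord("\r"): "\\r",
        ord("\t"): "\\t",
    }
    parts = ['"']
    for byte in data:
        if byte in escapes:
            parts.append(escapes[byte])
        elif 0x20 <= byte < 0x7F:
            parts.append(chr(byte))          # printable ASCII stays as-is
        else:
            parts.append("\\x%02x" % byte)   # everything else as \xHH
    parts.append('"')
    return "".join(parts)
```

Over the wire none of this is needed, since bulk strings are length-prefixed.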

Regarding the REPLCONF, we can always extend it further and add additional filters. for now, we only need to create the redis-cli --functions-rdb tool so that we can serve ephemeral cases. we can decide to skip that completely, and create another tool that clones redis, calls FLUSHALL, and then SAVE, but considering we have redis-cli --rdb already, it seems faster and better to do the REPLCONF thing.

i don't mind switching REPLCONF functions-only to REPLCONF exclude-keys, so we can later (or now), also add REPLCONF exclude-volatile-keys, or other features like excluding a certain DB index, or keyname pattern, but i don't like to create some DSL now for that, and over design that feature.

Comment From: oranagra

since there are now outstanding PRs (waiting for merge) for each of the tasks outlined here, i'm closing this one. arguably it should have been linked to the FUNCTION DUMP / RESTORE PR (#9938 already merged)

the two others are:

  • redis-cli -x with --cluster call #9980
  • redis-cli --functions-rdb #9968

Comment From: leonchen83

Hi, about REPLCONF functions-only and REPLCONF rdb-only:

Actually, all of this can be done on the client side. For example, to save a remote RDB locally we can send the SYNC command and parse the stream on the client side; once we have parsed RDB_OPCODE_EOF and the following 8-byte CRC64, the client can disconnect from the remote and save that stream to a file. There is no need to add REPLCONF rdb-only. We can also filter keys on the client side, including a functions filter.
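A very reduced sketch of the client-side approach is shown below. It only covers the simpler, length-prefixed transfer, where the reply to SYNC is a `$<length>` header followed by the raw RDB bytes; the EOF-opcode scanning described above (needed when the length is not known up front) is not implemented. `read_sync_payload` is a hypothetical helper, and the stream is assumed to be a buffered binary file object such as one obtained from `socket.makefile("rb")`.

```python
import io
from typing import BinaryIO


def read_sync_payload(stream: BinaryIO) -> bytes:
    """Read one length-prefixed RDB payload from a replication stream."""
    # The master may send bare newlines as keepalives before the payload.
    first = stream.read(1)
    while first == b"\n":
        first = stream.read(1)
    if first != b"$":
        raise ValueError("expected '$<length>' header, got %r" % first)
    length = int(stream.readline().strip())   # rest of the header line
    payload = stream.read(length)
    if len(payload) != length:
        raise ValueError("stream ended before the full payload arrived")
    if not payload.startswith(b"REDIS"):
        raise ValueError("payload does not start with the RDB magic")
    # Anything after the payload is the command stream; the caller can
    # simply disconnect at this point and write `payload` to a file.
    return payload
```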

redis-rdb-cli can do the above:

rdt --backup redis://127.0.0.1:6379 --out ./dump.rdb --key user.* --db 0

so I think redis-cli can do the same thing without REPLCONF functions-only and REPLCONF rdb-only

Comment From: yoav-steinberg

@leonchen83 both REPLCONF rdb-only and rdb-filter-only are used so the server can save memory and assume less work while creating the snapshot to be sent to the client. It's true that these things can be done only on the client side but they might take much longer and use up more memory on the server side.

Comment From: leonchen83

Hi @yoav-steinberg, REPLCONF rdb-only and rdb-filter-only are designed for external tools like redis-cli, right? Maybe in the future redis-cli will add a new feature to filter an RDB file directly, like the following:

redis-cli --file /path/to/dump.rdb --keys some_key.* --output /path/to/filtered-dump.rdb

If we implement the above feature we will need to parse the RDB file too, because we can't use the replication protocol (REPLCONF rdb-only and rdb-filter-only) to do this. So I don't think adding something to the replication protocol for external tools is a good idea.

Comment From: oranagra

@leonchen83 we do have future plans for a librdb.so so that redis-cli or other tools will be able to more easily parse rdb files, and in that sense we can add such feature to redis-cli. but that doesn't come to replace REPLCONF rdb-only and such. these are meant to reduce the load on the master redis, when you want to extract an RDB from a running instance, with minimal overhead on the master (like collecting replica output buffers for the command stream, and COW)

Comment From: leonchen83

That makes sense.

BTW. function description can't save to rdb correctly

127.0.0.1:6379> function load LUA lib2 desc "description function" "local function test1() return 5 end redis.register_function('test1', test1)"
OK
127.0.0.1:6379> function list
1) 1) "library_name"
   2) "lib2"
   3) "engine"
   4) "LUA"
   5) "description"
   6) (nil)
   7) "functions"
   8) 1) 1) "name"
         2) "test1"
         3) "description"
         4) (nil)
         5) "flags"
         6) (empty array)
127.0.0.1:6379>

git revision number 4db4b434175b519e2e5a78f2d33a7627c483c367

Comment From: yoav-steinberg

BTW. function description can't save to rdb correctly

Can you open a new issue for this and mention @MeirShpilraien?

Comment From: MeirShpilraien

@leonchen83 it was changed to description:

127.0.0.1:6379> function load LUA lib2 description "description function" "local function test1() return 5 end redis.register_function('test1', test1)"
OK
127.0.0.1:6379> FUNCTION list
1) 1) "library_name"
   2) "lib2"
   3) "engine"
   4) "LUA"
   5) "description"
   6) "description function"
   7) "functions"
   8) 1) 1) "name"
         2) "test1"
         3) "description"
         4) (nil)
         5) "flags"
         6) (empty array)

The actual bug is that there is no error on unknown argument :)

Comment From: leonchen83

@MeirShpilraien another question: could we use a single REPLCONF command to represent both REPLCONF rdb-only and REPLCONF rdb-filter-only? For example:

  • use REPLCONF rdb-include-filter rdb to represent REPLCONF rdb-only 1
  • use REPLCONF rdb-include-filter function to represent REPLCONF rdb-filter-only functions

REPLCONF rdb-filter-only functions returns an RDB containing only functions, which is also a kind of RDB, so it is easy to confuse with REPLCONF rdb-only 1.

With this change, redis-cli could also unify --rdb <filename> and --functions-rdb into one option:

--rdb <filename> [filter]