Edit: implemented in: #9780, #9936, #9938, #10004, #10066, #9899, #9968

The following is a proposal for a new mechanism to program Redis.

The new mechanism extends the existing Lua scripting approach that assumes scripts are part of the application and are only handed to the server for execution.

With the new approach, scripts are a part of the server. As such they are persistent, replicated, and named. These new scripts are called Redis Functions.

We also propose a modular implementation based on engines (languages), to make it easy to support several languages for Redis Functions.

Abstract

Today, Lua scripts are considered an extension of the client; they are:

  • Not managed by Redis
  • By contract, may disappear and need to be re-loaded by the client

This approach greatly simplifies the server side and is proven to be useful in many cases, but it also has several major drawbacks:

  • Scripts cannot be considered a way to extend Redis, like stored procedures
  • Clients can either use EVAL only (inefficient), or implement the more complex logic of trying EVALSHA and falling back to EVAL
  • Using EVALSHA inside MULTI is inherently risky
  • Meaningless SHAs are harder to identify and debug, e.g. in a MONITOR session
  • EVAL inadvertently promotes the wrong pattern of rendering scripts on the client side instead of using KEYS and ARGV.

These drawbacks led to the following definition of Redis Functions.

The proposed Redis Functions are an evolution of Redis' scripts. They provide the same core functionality, but they are guaranteed to be persistent and replicated, so the user does not need to worry about them being missing.

Conceptually, if current scripts are treated as client code that runs on the server, then functions are extensions to the server logic that can be implemented by the user.

This new design also attempts to decouple the language in which functions are written. Lua is simple and easy to learn, but for many users it’s still an obstacle.

The design makes no assumptions about the programming language in which functions are implemented. Instead, we define an engine component that is responsible for executing functions. Engines can execute functions in any language as long as they respect certain rules like memory limits and the ability to kill a function's execution (the full list of engine capabilities will be covered in detail).

Naturally, the first engine that will be implemented is the Lua 5.1 engine, but we propose additional engines to support other languages such as JavaScript.

Important: As with the current scripting approach, functions are atomic. During function execution, Redis is blocked and doesn't accept any commands. This implies that functions are intended for short execution times, and not long running operations, just like Lua scripts today.

Function Life Cycle

A function needs to be created and named in Redis before it can be used.

To do this, the FUNCTION CREATE command is used. The function code is loaded into the specified engine, which compiles and stores it.

After the function is created, it can be invoked using the FUNCTION CALL command, which executes the named function.

Created functions are also propagated to replicas and AOF, and are saved as part of the RDB file.
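For illustration, a minimal hypothetical session using the proposed commands might look as follows (the exact syntax is an assumption based on this proposal, and the function body is assumed to follow the same KEYS/ARGV convention as today's EVAL scripts):

redis> FUNCTION CREATE LUA myincr "return redis.call('INCRBY', KEYS[1], ARGV[1])"
OK
redis> FUNCTION CALL myincr 1 counter 5
(integer) 5

After a restart, or on a replica, myincr would still be callable without re-loading it, unlike an EVALSHA-based workflow.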

Function Engines

As mentioned above, the engine is the component that is responsible for executing functions.

We propose to support multiple engines in the same Redis process, in order to support multiple function programming languages.

Function Commands

INFO functions

A new proposed INFO functions section will be added to the INFO command. The section will show general information about the engines and functions.

Information about the engines will include the following:

  • Engine name
  • Used memory
  • Max memory
  • Function count

Information about the functions will include the following:

  • Total number of functions
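As a rough sketch, such a section might look like the following (the field names here are illustrative assumptions, not a final format):

# Functions
engine_LUA:used_memory=3072,max_memory=52428800,functions_count=2
functions_total=2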

FUNCTION CREATE

Usage: FUNCTION CREATE ENGINE NAME [REPLACE] [ARGS_DESCRIPTOR <ARGS DESCRIPTOR>] [DESC <DESCRIPTION>] <BLOB>

  • ENGINE - The name of the engine to use to create the function.
  • NAME - The name of the function, which can later be used to invoke it with the FUNCTION CALL command.
  • REPLACE - If given, replace an existing function with the same name (if one exists).
  • ARGS DESCRIPTOR - An optional argument used for argument validation at run time; see Arguments Descriptor.
  • DESCRIPTION - An optional argument describing the function and what it does.
  • BLOB - A blob of data representing the function; usually this will be text (e.g., Lua code).

The command will return OK if the function was created successfully, or an error in the following cases:

  • The given engine name does not exist.
  • The function name is already taken and REPLACE was not used.
  • The engine failed to create the function. In this case the engine should return a more precise error, which will be returned to the caller as well.
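As a sketch, creating a function with the optional arguments might look like this (the function name, the description, and the position of the code blob as the last argument are illustrative assumptions; cjson is available in today's Lua scripting environment):

redis> FUNCTION CREATE LUA hgetall_json REPLACE DESC "Return a hash as a JSON string" "return cjson.encode(redis.call('HGETALL', KEYS[1]))"
OK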

FUNCTION CALL

Usage: FUNCTION CALL NAME NUM_KEYS [key1 key2 ...] [arg1 arg2 ...]

Call and execute the function specified by NAME. The function will receive all arguments given after NUM_KEYS. The return value from the function will be returned to the user as a result.

  • NAME - Name of the function to run.
  • The rest of the arguments behave as they do today with the EVALSHA command.

The command will return an error in the following cases:

  • NAME does not exist.
  • The function was loaded with a custom ARGS DESCRIPTOR and the given arguments (keys or args) do not match the descriptor.
  • The function itself returned an error.
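For example, calling the hypothetical function created above with one key (assuming the hash user:1 exists), and then calling a missing function, might look like this (the error text is illustrative only):

redis> FUNCTION CALL hgetall_json 1 user:1
"{\"name\":\"alice\",\"age\":\"30\"}"
redis> FUNCTION CALL no_such_function 0
(error) ERR Function does not exist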

FUNCTION DELETE

Usage: FUNCTION DELETE NAME

Delete a function identified by NAME. Returns OK on success, or an error in the following case:

  • The given function does not exist.

FUNCTION INFO

Usage: FUNCTION INFO NAME

List information about the functions.

If the NAME argument is not specified, all the functions will be listed with some shallow and general information about each function.

If the NAME argument was specified then detailed information about this function will be provided. An error is returned if the function does not exist.

Information about the function will include the following:

  • Function name
  • Engine name
  • Run count (how many times it has run)
  • Total runtime
  • Description (if given on the FUNCTION CREATE command)
  • Args descriptor (if given on the FUNCTION CREATE command)
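A possible reply shape for a single function, assuming a flat field/value array like other Redis introspection commands use (the field names and the runtime unit are illustrative assumptions):

redis> FUNCTION INFO myincr
 1) "name"
 2) "myincr"
 3) "engine"
 4) "LUA"
 5) "run_count"
 6) (integer) 42
 7) "total_runtime_ms"
 8) (integer) 3
 9) "description"
10) (nil)
11) "args_descriptor"
12) (nil)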

FUNCTION KILL

Usage: FUNCTION KILL

Kill the currently executing function. The command will fail if the function already initiated a write command or if the engine does not support stopping a function at runtime.
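This mirrors how SCRIPT KILL behaves for Lua scripts today; a function variant might look like this (the exact error text is an assumption):

redis> FUNCTION KILL
(error) UNKILLABLE The function already executed a write command against the dataset; you can only wait for it to terminate or shut down the server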

Other Issues

Nested Function Calls

Redis Functions cannot be nested, i.e. it is not possible to perform a FUNCTION CALL command before the previous FUNCTION CALL has completed.

Debugging

Redis Functions will not provide server-side debugging capabilities.

Development and debugging should be done outside the server, and function engines should provide a way to do that seamlessly.

Loading Function On Start

Created functions are persisted in RDB and AOF and will therefore be available after a restart. We propose an additional mechanism to preload functions from files at start time, to support two more scenarios:

  • Servers configured with no persistence
  • Making sure certain functions are immediately available when bootstrapping a new instance

Open Issues

Arguments Descriptor

Considering Redis Functions are a type of API, being able to describe the expected arguments serves several purposes:

  • Document the interface, to make it more accessible
  • Allow a built-in validation that is independent of the function implementation.

Sharing Functions

We would like to allow some level of code reuse between functions, or the ability to create "libraries" that define code that is accessible from different functions.

Comment From: oranagra

Regarding FUNCTION CALL:

  • I think it should not be a sub-command (i.e. FCALL), so that we can assign different flags to it and it's easier to handle with ACL.
  • I think we also wanna add a read-only variant, i.e. FCALL_RO, see #8537
  • I think we want the args and keys to support named arguments, see https://github.com/redis/redis/issues/8223#issuecomment-805731762

Regarding Loading Function On Start:

  • I propose that the config file can include a reference to a startup script (similar to acl.conf), which is in some ways similar to an AOF file, containing a list of commands to be executed before Redis is available to clients; these can be FUNCTION CREATE calls.
  • Additionally, I think we need a FUNCTIONS REWRITE command (similar to CONFIG REWRITE and ACL SAVE), which will update / override this startup script with the current set of functions registered.

Comment From: yoav-steinberg

Some thoughts:

  • I think there's a contradiction between storing functions in the RDB/AOF and between storing them in a config file. Having the scripts in the RDB file means they are part of the data, we can assume the version of a piece of code executed is what's in the dataset. On the other hand having them in some external config means we're basically extending the server's functionality and it's the operator's (not the application's) responsibility to make sure the server is configured correctly with the desired version. I think we need to decide which of the two is what we want and avoid mixing them up.
  • It might simply be worth adding the ability to do an EVAL command that takes as an input a key in the dataset instead of the script content. This way we make sure the scripts and dataset are one and the same and we have a naming mechanism for the scripts. Something like EVALKEY KEY ENGINE numkeys [key ...] [arg ...]
  • Maybe we should call this "Redis Scripts" and not "Redis Functions". We're talking about scripts and script engines, so it's probably better to name them this way. There already exists a SCRIPT command which can be extended with this functionality.

Comment From: MeirShpilraien

  • It might simply be worth adding the ability to do an EVAL command that takes as an input a key in the dataset instead of the script content. This way we make sure the scripts and dataset are one and the same and we have a naming mechanism for the scripts. Something like EVALKEY KEY ENGINE numkeys [key ...] [arg ...]

How do you do it on the cluster? You need a different key that matches the shard slot range, and the client needs to know which shard it sends the command to and match the key accordingly? I think putting it outside of the keyspace makes it much easier.

Comment From: madolson

My thoughts:

  • I like the name "functions" more than scripts, I think that is more canonical for what it is doing.
  • I think we should have two commands: FCALL and FUNCTION. Leaving Function as an admin command and then letting FCALL be the one that actually calls the script should be sufficient for ACLs.
  • I'm not sure we need FCALL_RO. It seems like the responsibility of the function creator to define that it is a RO script, then we can scope down what functions a caller has access to. This is more powerful than what ACLs can do today, but it seems more usable.
  • I agree with Oran that we should model the functions like ACLs, in that they have their own config file that is loaded independently of the data set. I don't view functions as "data". We've had an ongoing discussion that it should be easier to maintain configs across a cluster, but I don't think we've gotten anywhere on that.
  • I also agree that I don't think we should be storing scripts in the keyspace. This data needs to exist on all nodes, which doesn't really fit the model.

Comment From: bionicles

Cool idea. Functions are definitely data because they have to be stored somewhere; best to keep them with the other data for auditing, versioning, logging, etc. Otherwise it's gonna force re-inventing the wheel and tracking state in multiple places. Hidden data is bad, especially if it's a function that can mutate other data. Also, if functions are data, then we can meta-program Redis, which would be fascinating.

Read Only would be nice for security purposes.

Also, functions would need timeouts.

Also, the functions would need to execute with the ACL permissions of the invoker, and not with system-level permissions (or not, as desired)

Further, events might trigger functions, and that might impact performance dramatically

and the command to add a new function ought to be disabled by default otherwise folks could upgrade their redis server and find users adding new unknown functions.

Python (latest stable version) would be a more popular language than Lua... Rust is also a nice choice due to performance and, critically, memory safety for your in-memory database, but fewer people code in it.

This is a huge change to Redis, and I suggest making it a module or a series of modules for different programming languages, because it's definitely more complicated than we can imagine right now.

Comment From: yossigo

@bionicles Thanks for the feedback, you've highlighted some good points (ping @MeirShpilraien).

Regarding timeouts, do you mean a default timeout setting per function? We should also revisit our past discussions (in the context of scripts) about write atomicity vs. ability to abort execution.

Regarding ACLs, the fact that function load is a different process than execution makes it more compelling to make that discretionary (i.e. functions can either run with user ACLs, or without ACLs like modules do today).

Comment From: manast

My two cents: it would be awesome if these functions also supported blocking operations. Currently I have implemented a Redis module (https://github.com/taskforcesh/bullmq-redis) mostly because of the lack of support for blocking operations in Lua scripts. As we know, Redis modules are not as easily adopted by Redis users as a library using Lua scripts...

Comment From: manast

Another thing that maybe is implicit in the proposal but just to make sure it is supported: the ability to call functions from other functions. This is also a very big shortcoming in current .lua scripts, where I often end up copy-pasting code between Lua scripts since there is no other way to modularize the code.

Comment From: kn30

Hello, I am very interested in contributing to the functionality proposed here. Are there any issues open for this?

Comment From: bionicles

@yossigo timeouts would just enable some way to bail out of accidental infinite loops, complex queries etc -- if a function keeps going past some number of seconds, stop it. It would be incredible if timeouts could rollback some atomic transaction, but IDK how hard that is to pull off and this feature already sounds complicated

The main motivating use case (for me) for this feature would be to eliminate round-trips to the DB. For example I recently made code to rewrite GraphQL trees into SQL joins:

child_id_field, parent_id_field = f"{child.table}_id", f"{parent.table}_id"
parent_to_child = (
    f"{parent.table}.{child_id_field} = {child.table}.id"
    if child_id_field in columns[parent.table]
    else None
)
child_to_parent = (
    f"{child.table}.{parent_id_field} = {parent.table}.id"
    if parent_id_field in columns[child.table]
    else None
)

Since Redis doesn't have joins, a 3-layer tree requires a minimum of 3 round-trips to the database. The round-trips can easily cause so much latency that the speed of Redis can't really shine through. Python or Rust functions in Redis could implement GraphQL over Redis in 1 round-trip.

Comment From: tuxArg

Hi, maybe I'm missing something, but an intermediate step could be just adding a store-to-key option to SCRIPT LOAD, like:

SCRIPT LOAD script INTO key

Then:

EVALKEY key numkeys [keys] [args]

Also, the client could do:

SCRIPT GETSHA1 key

And then:

EVALSHA ...

Comment From: MeirShpilraien

@manast regarding:

Another thing that maybe is implicit in the proposal but just to make sure it is supported: the ability to call functions from other functions. This is also a very big shortcoming in current .lua scripts, where I often end up copy-pasting code between Lua scripts since there is no other way to modularize the code.

Functions that call other functions are not trivial and might come with many issues (like recursive calls, engines that call another engine and then return to the original engine, modules that might be involved in this loop). So in the current proposal, we decided not to allow it. But as mentioned we do want to come up with a way for functions to share code, at least on the same engine. We still need to do some POCs to come up with the best way of doing it, but I believe it will probably be engine-specific.

@bionicles regarding timeouts, it is problematic in case the function already performed a write command, as it will break the function's atomicity. A rollback would be amazing but would be very hard to implement and would probably come with a big overhead. My opinion is that if we assume functions are for short-running operations (as mentioned in the proposal), then reaching a timeout is only because of a bug in the function, which will probably be caught in the development stage; and if not, it would probably be safer to kill Redis (like with Lua today) rather than abort and potentially end up with corrupted data (from the POV of the application).

@tuxArg As mentioned before in this conversation, putting functions in the keyspace is not such a good idea because it will make things much harder on the cluster. You will have to generate a new name that matches the shard you want to run the function on.

Comment From: manast

Functions that call other functions are not trivial and might come with many issues (like recursive calls, engines that call another engine and then return to the original engine, modules that might be involved in this loop). So in the current proposal, we decided not to allow it. But as mentioned we do want to come up with a way for functions to share code, at least on the same engine. We still need to do some POCs to come up with the best way of doing it, but I believe it will probably be engine-specific.

But I should be able to do something like: redis.call("FUNCTION", "CALL", ..., ); from inside my functions, no?

Comment From: MeirShpilraien

@manast at least not from the current suggestion, no. But we do want to allow code sharing between functions, maybe by defining some kind of "libraries". But this part is not yet finalized.

Comment From: manast

@MeirShpilraien I see. Seems a bit inconsistent to me, since from lua I can call all redis commands with the exception of blocking commands. This is just another command so I do not see why it should not work.

Comment From: MeirShpilraien

@manast from Lua you cannot execute another EVAL command to call another Lua script; Redis will not allow you to do it. There are more commands that cannot be called from Lua, which are marked with the no-script flag.

Comment From: yoav-steinberg

@tuxArg As mentioned before in this conversation, putting functions in the keyspace is not such a good idea because it will make things much harder on the cluster. You will have to generate a new name that matches the shard you want to run the function on.

If this is the main reason blocking putting functions in the keyspace then I'd consider adding a mechanism for creating global keys which are replicated across all shards (or fetched on demand). This is super useful even before considering functions, and might resolve issues related to functions as well. In addition there can be intermediate solutions like making the client library responsible to write the function to all shards. And even a limited support which is less practical in redis-cluster context can be initially considered.

I also feel that given the existence of both native redis modules and Lua/EVAL, it's probably in the interest of this project to extend the existing rather than inventing a third scripting/extension mechanism.

Comment From: MeirShpilraien

If this is the main reason blocking putting functions in the keyspace then I'd consider adding a mechanism for creating global keys which are replicated across all shards (or fetched on demand). This is super useful even before considering functions, and might resolve issues related to functions as well.

This is what we are doing with functions, they will be on all shards, I agree this requirement can be generalized to more use-cases (like ACL for example) but this is out of the scope of this issue and I think it should be discussed as a separate issue

In addition there can be intermediate solutions like making the client library responsible to write the function to all shards.

This is the idea: client libraries will have to run the FUNCTION CREATE command on all the shards, but it will be the same command on all the shards and there will be no need to find a key that matches the shard's hash slot.

And even a limited support which is less practical in redis-cluster context can be initially considered.

If you suggest putting it in a key as an initial solution and not supporting it in a cluster, I disagree. I think that we should not do something which we will drastically change in the future. Once something is done, it needs to be maintained, and we will end up with more scripting APIs.

I also feel that given the existence of both native redis modules and Lua/EVAL, it's probably in the interest of this project to extend the existing rather than inventing a third scripting/extension mechanism.

I think this is exactly what it does: you will still be able to do all the things you are doing today with Lua (in most cases, maybe even in all cases, without even changing the scripts), but you will gain all the advantages that come with using functions (as described in the proposal).

Comment From: zuiderkwast

I saw @yossigo's presentation of this on RedisConf yesterday, including demo! (Has he already implemented it! ... or was it fake? Nobody knows......) Pretty nice presentation anyway. I hope it will be available online.

Some questions:

  • Will there be a separate API for engines or will it be possible to implement engines using the Modules API?
  • Will some engines be shipped with Redis?
  • Any chance we can support multiple versions of Lua, e.g. Lua 5.2, 5.3, 5.4 and LuaJIT, in this way?
  • Any configuration per engine?

Comment From: yossigo

Hey @zuiderkwast It wasn't fake at all! But it's not my work - an early version prototyped by @MeirShpilraien. I hope he'll have something stable enough to PR sooner than later.

These are all good questions; I think we'll need to come up with the answers here as this is all still work in progress. These are my thoughts for now:

  • Bundling engines with Redis - the main benefit is to have them available as a core capability and easily accessible to users. But at the same time we'd really want to avoid bloat, so I think we'll need to come up with some middle way.
  • Theoretically it could be a way to support multiple Lua versions; we did some experiments and saw we can get around some issues and concurrently link different Lua versions.

Comment From: madolson

@yossigo I was going to bring up the discussion we had a long time ago about how "Redis core" modules might be a good way to add additional functionality. The default implementation just gives you a Lua scripting engine, but we could also provide Python/JavaScript modules that can be easily loaded. Sounds like a middle way.

Comment From: yossigo

@madolson Yes this is definitely something to consider. We briefly mentioned 1st party modules in the past but never had a proper discussion nor a decision, I'll open an issue about that.

Comment From: bionicles

Hey it'd be wonderful to have the option to do the same thing as the Lua stuff but in other languages, that could be a game changer for us, any way to do this with Rust or Python functions?

Comment From: brenncat

Hi all, here are a few recommendations from my team:

1) FUNCTION RUNNING command that returns the name of the currently executing function. The client would be able to check what function is running before sending a FUNCTION KILL.

2) Background function execution. This will prevent the server from being held hostage by long-running functions. Other clients won’t be starved and we won’t need to resort to killing the Redis process if it performed write operations. An added benefit could be improved CPU usage on multi-core CPUs.

Implementation ideas:

  • (Preferred) Lock the keys that will be accessed by the function before running it in a background thread.
  • Run a non-atomic client on the server node, treating the function as another client. It would then be up to the author to handle atomicity with MULTI/EXEC.

We prefer the first idea because it maintains the atomicity guarantees.

3) Store the source code and make it accessible to users. Before replacing the source code of a function, it would be good to be able to validate the current function’s code is what we think it is. Storing the source code internally also means we can re-compile and ensure that the functions are forward-compatible if the Lua engines are updated.

Comment From: oranagra

@brenncat

1. FUNCTION RUNNING - I think we discussed that. It is not enough to know the name, since the function can complete before you do the KILL (there will be a race). Long running functions should be invalid (these scripting features are intended for short, simple tasks), so I don't think we want to invest a lot of effort around them. The only thing I think this may be useful for is debugging / detecting which is the offending function. We already have a log print today for Eval script timeouts; @MeirShpilraien please make sure we have one for functions too.

2. Background script execution may be nice to have some day with key locking (when we implement such a concept), but that's not today. The non-atomic suggestion is invalid IMHO; the whole idea behind functions and scripts is that they're short and atomic.

3. We do store the code in order to be able to persist it and replicate it, but I don't understand the suggestion of "validate the current function's code is what we think it is".

Comment From: madolson

@brenncat and I talked about how it's common in declarative infrastructure configurations to first validate a value is what you think it is before updating it. That way, if it's some other random value, you can throw an error and not overwrite it, since something is wrong. If it's already the desired value, you can skip the update. All Redis needs to do to support this paradigm is be able to describe the contents of a script.

Comment From: MeirShpilraien

Loading Functions on cluster and/or startup time is discussed here: #9899

Comment From: MeirShpilraien

Code sharing on functions is discussed here: #9906