Redis Redis Functions Flags - Nineya|java/go/python

https://github.com/redis/redis/pull/10004 introduce Redis function libraries which allows code sharing between functions by uploading a batch of functions together. We open this issue to discuss functions flags that was left open on https://github.com/redis/redis/pull/10004.

Library Functions Flags

The orginal suggestion on https://github.com/redis/redis/issues/9906 was to support flags that can be given to Redis when the library creates a function. Those flags are used to tell Redis when it is OK to invoke the function. For example, one flag might indicate to disallow running the function if Redis reached its memory limit. Here is the list of flags we plan to support first (other flags might be added in the future):

DENY_OOM - do not allow the function to be invoked if the memory limit is reached.
BREAK_ATOMICITY - indicate that it's OK to kill the function even if it already performed a write command or disallow write * commands if OOM reached.
NO_REPLICATE - indicate that all the data created by the function is temporary and should not be replicated to replicas/aof. Equivalent to redis.set_repl.

Note: DENY_OOM can solve this issue: #8478

A discussion with @oranagra raised concerns and open question about this topic and we would like to community input to decide how to proceed.

How to handle unknown flags

Assume on Redis 7.2 we will add a new flag that was not exists on Redis 7.0, how should Redis 7.0 handles this flag. One approach is to decide from day one that flags are just a hints and if Redis do not recognise them then it just ignores. The advantage is that script could then run on Redis 7.2 and Redis 7.0 without any change. The disadvantage is that user might expect some functionality but he will not get it on Redis 7.0. Also we will get different behaviour between Redis 7.0 and Redis 7.2 on the same script. To illustrate it, take for example the DENY_OOM flag describe above, assuming we will only support it on Redis 7.2. If user will use it he will not be able to execute the function if OOM reached but he will be able to execute it on Redis 7.0 (even if OOM reached).

The alternative is to raise an error if we see a flag which is not recognised. The advantage is that everything is explicit, if something is not supported the user will get an error and can decide how to handle it. The disadvantage is that we pass more responsibility and complexity to the user. Important thing to notice is that such complexity exists even today, if a user uses redis.set_resp API he needs to consider the case in which this API do not exists in order to be compatible with old Redis versions.

Personally I prefer option 2 but would like to hear others opinions.

Lua API

How do we expose the ability to set flags with the Lua API. https://github.com/redis/redis/pull/10004 introduce the following API to register a function: redis.register_function(<function_name>, <callback>, [<description>]). It is possible to add another argument to pass the flags, but then every time the user want to pass a flag he must also pass a description because both are positional arguments.

Another option is to allow a version of redis.register_function that gets a named arguments:

redis.register_function {
    name='f1',
    callback = function() return 1 end,
    description = 'some desc',
    flags = ...
}

If we decide to add this named argument version we should do it before Redis 7.0 to prevent another backward compatibility issue for functions (a script that will use the named argument API will not work on Redis 7.0)

Comment From: oranagra

@redis/core-team please review and comment.

Comment From: oranagra

Other flags we could add are write and read-only. maybe a function that wants to be executed only on masters can declare the write flag so it'll immediately get blocked on replicas? i.e. in the alternative, imagine a function that does a big read operation, processes it, and then attempts to write it. (doing the read and processing will be wasteful sine the write is bound to fail)

Comment From: zuiderkwast

We could have two kinds of flags. Let's call them Hint flags (unrecognised flag is ignored) and Critical flags (unrecognised flag raises error). This gives us the best of both worlds.

Comment From: oranagra

I gave these flags some additional thought, and i think i concluded that redis should indeed fail when an unknown flag is provided. i'm viewing such libraries as modules, and thus, a library that wants to be compatible with older versions of redis (e.g. some 3rd party library that can be included in many redis applications, and wants to support multiple versions), should issue an API call before registering the functions, and use flags appropriate for that versions (Meir's suggestion). So it could also have other behavior changes in the code to compensate for that missing feature (so redis.get_version() should be supported both at registration and at runtime).

I also realized that there's no need to provide flags that can be just as well, provided when the function runs, e.g. the NO_REPLICATE which matches the redis.set_repl, same goes for BREAK_ATOMICITY.

The ones that are needed at registration, are ones that can be used by redis before executing the function, e.g. similar checks like the ones in processCommand. Arguably, the functions can do that on their own too (i.e. check if they're running in a read-only replica, or if redis is over the memory limit), but that would be very complicated to do from the function's side (unlike just declaring redis.set_repl).

in theory, the list of flags is similar to the list of possible flags redis has for it's own command (per any given redis version): * WRITE / READONLY (or none) * DENYOOM * LOADING ? (by default FCALL is rejected, so maybe this should be an opt-out? or maybe opt-in?) * NO_ASYNC_LOADING * STALE

Other checks done in processCommand, can maybe also added. Most importantly, ACL checks, but that's more complicated. It may not be required for the function to declare key-specs, since the caller has to explicitly specify which arguments are keys. But considering #9974, it's not enough to just specify key names, you have to specify what you're gonna do with them. Also a function can call multiple redis commands, some of which could be blocked (regardless of keys, even in existing / old ACL mechanics). So it looks to me that the only sensible way to handle it would be for the function to run some validations at startup so that it can detect issues and avoid a failure later after half of it is processed. i.e. something in the spirit of #9309. p.s. the difference in that case between modules and functions is that module that doesn't check ACL, will still successfully execute, while a function will execute half way (possibly doing half of the modification), and then get an error and possibly leave a messed up state.

Comment From: yossigo

I support function flags, but I think we should be strict about them and not try to come up with some form of forward compatibility.

I can see the value in some of the command flags as well as others. For example, flag to indicate a function is privileged and executes without implicit ACL checks (yet another step in the direction of functions as modules).

But every flag is a commitment and a potential future complication, so I think beyond the principal idea of flags we should progress in baby steps and not rush to push all possible flags right now. The first three are probably OK, but I propose that we still carefully consider the arguments for each one.

Comment From: oranagra

@yossigo which are the first three? READ / WRITE / DENY_OOM?

Comment From: MeirShpilraien

I also realized that there's no need to provide flags that can be just as well, provided when the function runs, e.g. the NO_REPLICATE which matches the redis.set_repl, same goes for BREAK_ATOMICITY.

@oranagra I disagree about it, for performance reason, calling a Lua function each time can be costly. If the user know, for example, that his function only creates temporary keys he can use NO_REPLICATE flag and avoid the function call in each script execution.

I also suggest to consider RESP3 flag that can avoid using redis.set_resp.

Comment From: oranagra

@MeirShpilraien i'm not sure that this overhead is significant. i agree there's a value in setting a declarative flag rather than calling a method. but it's certainly not mandatory, unlike other flags.

Comment From: oranagra

we discussed this in a core-team meeting. we concluded we wanna make baby steps and be conservative for now. so we wanna introduce the flags mechanism, and only add flags that are for sure relevant. we went over the flags that can be declared by modules, and here are the ones we think we wanna take: * readonly * write * deny-oom * allow-stale * no-cluster there may be value by adding allow-loading, but we decided to leave it out for now.

other module flags that we don't wanna support include: admin, pubsub, no-monitor, no-slowlog, fast, no-auth, no-multi, may-replicate

another thing we would need to do is expose an api like RM_ACLCheckCommandPermissions

Comment From: MeirShpilraien

@oranagra what about the API, do we agree with the named arguments approach? If so we should add it before Redis 7.

Comment From: oranagra

IIRC you said that we can support both. let's see how it looks in a PR