Describe the bug
The Redis config has a maxmemory limit set and the noeviction policy. If memory usage is above maxmemory, running a Lua script that calls redis.call('del', 'anything') before redis.call('set', 'fo', 'bar') does not result in an OOM error, and the key is created.
To reproduce
Set the following config in redis.conf:
maxmemory 1mb
maxmemory-policy noeviction
Run a program that calls the following Lua script repeatedly, using a new key each time, until the error returned is OOM. Lua script:
redis.call('set',{some key}, '1')
Now run the following:
redis.call('del', 'anything')
redis.call('set',{some key}, '1')
Expected behavior
When the second Lua script (the one that begins by deleting an arbitrary key) is run, it should still return an OOM error.
Additional information
Tried on Redis server versions:
5.0.6 and 6.0.0
Comment From: oranagra
it's not ideal, but at the moment, it's the intended behavior.
as soon as one write command has succeeded in the script, the script can't be aborted due to non-deterministic errors; it must be run to completion.
the reasons are:
1. aborting it would be a violation of the script atomicity guarantee.
2. after the script performed a modification, it must be replicated to the AOF and replicas. if the script is propagated as a script (i.e. lua-replicate-commands is disabled), failing some commands based on non-deterministic errors would mean it can perform different modifications on the master vs the replica and cause consistency issues.
we are considering deprecating lua-replicate-commands in redis 7.0, in which case we might be allowed to perform evictions during scripts, but i think we'll still not be allowed to fail commands with OOM.
Comment From: willdot
That makes a lot of sense, thank you.
I guess I could move the delete operation outside of the lua script and run the delete first. It’s not essential to run in the script, I was just using it there to keep it all together.
Now I know the behaviour is expected, I’m a lot more confident I’m not doing something wrong. Thank you!
Comment From: WilliamShoww
if maxmemory-policy is set to allkeys-lru, will a script command still throw an OOM exception?
Comment From: oranagra
@WilliamShoww i'm not certain what you meant to ask, so i'll try to describe everything that's relevant.
The EVAL and EVALSHA commands are never rejected due to OOM (regardless of the eviction policy), since they may contain read-only scripts. Regardless, redis does attempt to perform eviction (if the eviction policy allows for it) before executing EVAL / EVALSHA. But inside the script (when it does redis.call or redis.pcall), no eviction is performed; doing so could lead to a different outcome when the script is propagated to replicas / AOF (resulting in inconsistencies). The only thing left to mention is that before the script performs its first modification on the database, if a command is a write command that allows being rejected by OOM (most write commands, excluding DEL), the command can fail and possibly terminate the script. But it does that based on the memory utilization that existed before the script started (not based on the memory utilization that exists when executing that command inside the script).
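The rules described above can be condensed into a toy model (plain Python, illustrative only; the flag names mirror the ones used in the Redis source, everything else is made up):

```python
# Toy model of the in-script OOM rules described above. Not Redis code.

CMD_WRITE = 1 << 0    # command modifies the keyspace (e.g. SET, DEL)
CMD_DENYOOM = 1 << 1  # command may be rejected when over maxmemory (e.g. SET, but not DEL)

class ScriptRun:
    def __init__(self, oom_before_script):
        # OOM state is snapshotted once, before EVAL starts;
        # EVAL itself is never rejected due to OOM.
        self.lua_oom = oom_before_script
        self.write_dirty = False  # set after the first successful write command

    def call(self, flags):
        """Return 'OOM' if this in-script command must be rejected, else 'OK'."""
        if (not self.write_dirty and      # no write has succeeded yet
                self.lua_oom and          # server was OOM when the script started
                flags & CMD_DENYOOM):     # only DENYOOM commands are rejected
            return 'OOM'
        if flags & CMD_WRITE:
            self.write_dirty = True       # from here on, the script must run to completion
        return 'OK'

# The reported behavior: DEL (write, but not DENYOOM) slips through and
# "dirties" the script, so the following SET is no longer rejected.
run = ScriptRun(oom_before_script=True)
assert run.call(CMD_WRITE) == 'OK'                 # redis.call('del', ...)
assert run.call(CMD_WRITE | CMD_DENYOOM) == 'OK'   # redis.call('set', ...) passes!

# Without the leading DEL, the SET is rejected as expected.
assert ScriptRun(True).call(CMD_WRITE | CMD_DENYOOM) == 'OOM'
```

This is only a model of the behavior explained in the comment; the real logic lives in the Redis scripting code.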
Comment From: WilliamShoww
When the machine's memory usage reaches maxmemory and I EVAL the script below (used as a cluster lock), redis returns an OOM error. How can I handle this so that there is no OOM error? How should I configure redis?
the script is:
local lockClientId = redis.call('GET', KEYS[1])
if lockClientId == ARGV[1] then
    redis.call('PEXPIRE', KEYS[1], ARGV[2])
    return true
elseif not lockClientId then
    redis.call('SET', KEYS[1], ARGV[1], 'PX', ARGV[2])
    return true
end
return false
Comment From: oranagra
@WilliamShoww which version of redis do you use? maybe you have a version before this fix: 38f6207f884f514e928513acb6560fdb375daa2e
Comment From: WilliamShoww
my redis version: 5.0.3
Comment From: oranagra
@WilliamShoww then i suggest you upgrade to the latest 5.0.12 (or 6.0.12), this bug is fixed there
Comment From: JimB123
@madolson Asking to reopen this report.
I just spent time diagnosing this same issue. The machine was configured for "noeviction" and the only traffic was a script which started with LREM. The host continued WAY past maxmemory and into swap. This is not acceptable behavior.
@willdot your analysis is spot on. One thing you implied, but didn't explicitly state, is that the script will continue even if the delete is for a key which doesn't exist (the database isn't "dirtied" by the command).
The problem stems from having 2 classifications of commands: 1) commands that use memory, and 2) commands that are "write" commands. In this case, the script is allowed to continue because the first command does not use memory (and doesn't trigger the OOM check) - however it is a "write" command which makes the script (intentionally) unkillable.
I suggest that a better behavior would be to kill the script for ANY write command (even if it "doesn't use memory") when OOM. (This applies ONLY to the first write command in the script. Once the script has performed a write, it is an unkillable transaction.)
Note - given the new incremental eviction logic, OOM is defined as: Over maxmemory AND nothing is evictable. So this change will not impact systems which have a functional eviction policy.
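That definition can be sketched as a small predicate (plain Python with hypothetical names; the real check lives inside Redis's eviction code, not in a standalone function like this):

```python
# Sketch of the OOM definition above: over maxmemory AND nothing evictable.
# Hypothetical helper; not actual Redis source.

def is_oom(used_memory, maxmemory, policy, volatile_keys):
    if maxmemory == 0 or used_memory <= maxmemory:
        return False                # under the limit (or no limit): never OOM
    if policy == 'noeviction':
        return True                 # over the limit and nothing may be evicted
    if policy.startswith('volatile-') and volatile_keys == 0:
        return True                 # volatile policy, but no keys carry a TTL
    return False                    # eviction can still free memory

# noeviction, or a volatile policy with no volatile data, is OOM when over the limit.
assert is_oom(2_000_000, 1_000_000, 'noeviction', volatile_keys=0)
assert is_oom(2_000_000, 1_000_000, 'volatile-lru', volatile_keys=0)

# A functional eviction policy is never "OOM" in this sense.
assert not is_oom(2_000_000, 1_000_000, 'allkeys-lru', volatile_keys=0)
assert not is_oom(500_000, 1_000_000, 'noeviction', volatile_keys=0)
```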
Comment From: madolson
Going to re-open since it doesn't look like we seriously considered attempting to evict memory during the first write command. I'm not sure the approach is valid (imagine you intend to clean up all the data with a Lua script), but I think we should discuss it.
Comment From: oranagra
responding with OOM on the first write command in the script would mean that a script that solely contains deletions will fail. imagine a use case with no-eviction and a script that's responsible for keeping the database size in check, or a deployment with no eviction that only uses scripts (any command, even DEL, is wrapped in a script that does nothing more).
The whole thing with scripts and memory stinks for many reasons. in many of the problematic cases it's a malicious or badly written script, or an app that generates a new script for each call by embedding its arguments in the code. for many of these i would argue that after a certain threshold redis can just exit(1) (since we can't kill the script and let redis keep running).
So putting aside bad use cases, i suppose the current implementation is only problematic with no-eviction: assuming scripts are always short (no infinite loops), if redis is allowed to evict before the script runs, it's ok for the script to go over the memory limit and get back in check before the next run.
so the problem in the above reported cases is that the script is a valid one (not abusive), and the application actually expects it to fail with OOM! i.e. there's no cleanup task running in parallel to the script, one consuming memory and one releasing it (maybe that task would be triggered by the OOM error). The workload (at some point in time, at least in one example) was composed only of the one script which does LREM and then increases memory usage.
I don't like the suggestion Jim made since it'll mean that a script that only performs cleanup fails too. That means that we have to be able to distinguish between scripts that consume memory and scripts that don't.
So the other way to "solve" this is for the script to somehow declare (before execution) that it's gonna consume memory. Or have an API with which the script can check the memory state before it performs any modifications (or even after), and abort itself. This way, the script can remain atomic (it can decide when it's safe to exit leaving the database in sane / consistent state).
Note that having a script be able to read the server state is very bad if the script will be replicated as a script, but with today's default of lua-replicate-commands (which soon should be the only possibility), this is safe.
Comment From: QuChen88
For now, I opened a Redis documentation PR to better document this Lua behavior under low memory conditions https://github.com/redis/redis-doc/pull/1533
Comment From: JimB123
@oranagra I understand and concur with your points. A few clarifications:
1. This is an issue when the policy is noeviction OR the eviction policy is non-functional (a volatile policy with no volatile data)
1. A use case where there is a specific LUA script designed to perform cleanup on a node with noeviction seems like a very unusual case (though theoretically possible). Is there any evidence that users are doing this?
So these might be reasonable options:
Original proposal
- If we are over maxmemory
- AND nothing is evictable
- THEN disallow LUA scripts which execute ANY write command
Add some small/arbitrary safety margin
- If we are over maxmemory
- AND nothing is evictable
- AND we have exceeded an arbitrary safety margin
- THEN disallow LUA scripts which execute ANY write command
Consider historical behavior
Since scripts are cached in memory and usually reused, we could add a tag on the stored script which would record its use of memory. This could be a tag indicating that any command which uses memory was executed. Alternatively, we could try to determine if the net effect on memory was positive (but I think this is much more complex).
My feeling is that this option is unnecessarily complex. But if both of the options above are unacceptable, this may be the only option to "fix" the problem.
- If we are over maxmemory
- AND nothing is evictable
- AND in the past, this script has called commands which use memory
- THEN disallow the script
(Side note: If we consider tagging scripts, it might be useful to tag scripts loaded by SCRIPT LOAD rather than EVAL. This would allow us to purge scripts loaded by EVAL without risk of hindering EVALSHA.)
Re: documentation
The current behavior is counter-intuitive. We expect that Redis will disallow further memory growth after maxmemory is exceeded. No user would expect that we can exceed maxmemory and continue adding keys through a loophole, command after command, until the process gets killed. Furthermore, I don't think any user would be surprised if this "functionality" was removed.
@QuChen88 I don't think documenting the bad behavior is sufficient. https://github.com/redis/redis-doc/pull/1533
Comment From: QuChen88
@JimB123 I agree that this is a problem in Redis today: users can unintentionally go above maxmemory with poorly programmed Lua scripts, and we need to fix this in the long run one way or another. In the interim, I have updated the documentation for better visibility into the current Lua script behavior, to help people make more informed decisions.
Comment From: yossigo
We need to work under the assumption that there's no way to determine if a write command is memory net positive or negative, considering all possible side effects (keyspace notifications, client side caching invalidations, etc.).
Because of this, I think that even if scripts were consistent and predictable, we'd still not be able to tag them reliably and completely avoid the problem.
For the common case, I lean towards accepting the idea of disallowing ANY write command if we're OOM and beyond some safety margin.
To address @oranagra's use case, we could allow Lua scripts to override this behavior, and fall back to what we have now. This won't be the first dangerous power tool available to those who're willing to dig in the docs (see redis.set_repl() for example).
So a script that is expected to be memory net negative could invoke redis.set_mode(redis.IGNORE_OOM) and fall back to the current behavior.
We can also take another step forward and provide a redis.set_mode(redis.NON_ATOMIC) to indicate that violating atomicity is allowed in certain conditions. While it's not ideal, it's certainly possible after script replication is no longer supported.
Comment From: QuChen88
Just playing the devil's advocate here a bit: if it is impossible to know whether the rest of the Lua script is adding memory by looking at the first write command, why not just treat all mutating Lua scripts the same? i.e. simply reject the script as soon as a mutating command is encountered when we are over maxmemory. This change is as simple as this one-liner in scripting.c:
@@ -668,8 +668,7 @@ int luaRedisGenericCommand(lua_State *lua, int raise_error) {
     !server.loading &&             /* Don't care about mem if loading. */
     !server.masterhost &&          /* Slave must execute the script. */
     server.lua_write_dirty == 0 && /* Script had no side effects so far. */
     server.lua_oom &&              /* Detected OOM when script start. */
-    (cmd->flags & CMD_DENYOOM))
+    (cmd->flags & CMD_WRITE))
 {
     luaPushError(lua, shared.oomerr->ptr);
     goto cleanup;
The CMD_DENYOOM flag works fine when the client executes an individual command. So in the context of such command being executed outside of a Lua script, we can safely allow it to happen when over maxmemory. We can probably recommend users to do that to free up memory.
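The behavioral difference of that one-liner can be modeled in a few lines (plain Python, illustrative only, not the scripting.c code; only the two flag names come from the diff):

```python
# Toy model contrasting the current gate (CMD_DENYOOM) with the proposed
# one (CMD_WRITE) for a script's first write command under OOM.

CMD_WRITE, CMD_DENYOOM = 1, 2

def first_write_rejected(flags, lua_oom, gate):
    # Mirrors the condition in the hunk above: OOM was detected when the
    # script started, and the command carries the gating flag.
    return lua_oom and bool(flags & gate)

DEL = CMD_WRITE                 # a write that frees memory: no DENYOOM flag
SET = CMD_WRITE | CMD_DENYOOM   # a write that consumes memory

# Current behavior: DEL passes, only DENYOOM commands are rejected.
assert not first_write_rejected(DEL, True, gate=CMD_DENYOOM)
assert first_write_rejected(SET, True, gate=CMD_DENYOOM)

# Proposed behavior: any write, including DEL, is rejected when OOM,
# which is exactly the backward-compatibility concern with cleanup scripts.
assert first_write_rejected(DEL, True, gate=CMD_WRITE)
assert first_write_rejected(SET, True, gate=CMD_WRITE)
```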
Comment From: madolson
@QuChen88 The reason we don't is that we allow Lua scripts that perform deletions today, so that would be a backwards-breaking change, which is getting at the core of the discussion.
I like Yossi's idea of being able to set the script mode. The way it naturally makes sense to me is to have three modes: READ_ONLY, DENY_OOM, and ALLOW_OOM. Setting the mode gives you different guarantees. READ_ONLY will prevent writes, making sure the script is abortable; DENY_OOM will deny memory-consuming commands, making sure the script always succeeds in low-memory situations; and ALLOW_OOM does what Jim suggested. It might also make sense to add these as new commands instead, so that people can restrict them based on ACLs (like EVALRO, EVALDENYOOM, EVAL).
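One possible reading of those three modes, sketched as a toy model (hypothetical names and semantics; none of this is a real Redis API):

```python
# Hypothetical model of the three proposed script modes. The mode names are
# taken from the comment above; the gating logic is one possible interpretation.

READ_ONLY, DENY_OOM, ALLOW_OOM = 'READ_ONLY', 'DENY_OOM', 'ALLOW_OOM'
CMD_WRITE, CMD_DENYOOM = 1, 2

def script_call(mode, flags, lua_oom):
    if mode == READ_ONLY and flags & CMD_WRITE:
        return 'ERR write not allowed'   # no writes at all: script stays abortable
    if mode == DENY_OOM and lua_oom and flags & CMD_DENYOOM:
        return 'OOM'                     # memory-consuming commands denied under OOM
    return 'OK'                          # ALLOW_OOM: current permissive behavior

SET = CMD_WRITE | CMD_DENYOOM
DEL = CMD_WRITE

assert script_call(READ_ONLY, SET, lua_oom=False) == 'ERR write not allowed'
assert script_call(DENY_OOM, SET, lua_oom=True) == 'OOM'
assert script_call(DENY_OOM, DEL, lua_oom=True) == 'OK'   # cleanup still works
assert script_call(ALLOW_OOM, SET, lua_oom=True) == 'OK'
```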
Comment From: oranagra
maybe i'm missing something, but what's the difference between @QuChen88's last post and what @JimB123 suggested? isn't that the same thing?
Yossi's suggestion is somewhat similar to what i suggested, just with a different default behavior. i.e. the script is able to tell you, before any modifications, whether it is gonna release memory or consume it, or in other words, whether it wants to get denied by OOM on deletions or not. I suppose we can add such a feature even before we deprecate script replication; we just need to decide what default to use. which is worse: to break scripts that do cleanup, or to fail to stop naive scripts from bloating the memory?
Comment From: QuChen88
I agree with Yossi's idea that the default behavior in Lua scripts can be that all writes are rejected if redis is over maxmemory (which is in line with what I proposed), and we can allow the user to override this behavior in Lua with some special flag like IGNORE_OOM.
However, this can make it difficult for the user to properly tag the Lua script. Say the script contains multiple different write commands; then it is very hard to know beforehand whether the net memory will increase or decrease, as it depends on the sizes of the accessed items.
Comment From: nmvk
Taking a look
Comment From: yoav-steinberg
Hi all,
I'm going to PR @yossigo's suggestion here https://github.com/redis/redis/issues/8478#issuecomment-799757436
What I'm wondering about is the idea of also adding a break-atomicity option. This means that, when set, the script will be allowed to run non-memory-growing commands but will return an OOM error as soon as it reaches a DENY_OOM command. This will break the script's atomicity. It might be useful if the user doesn't mind their script aborting in the middle after performing some modifications to the dataset. In terms of replication, this isn't such a big deal since we're going to chuck script replication anyway.
So WDYT?
Comment From: madolson
@yoav-steinberg The only question I would ask is if @nmvk is still looking at this.
Comment From: nmvk
@madolson Apologies for delay, I am still looking at it, will post an update soon
Comment From: oranagra
Note, when we handle that issue, let's also fix the code that sets WRITE_DIRTY on any CMD_WRITE, and instead actually look at server.dirty.
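The suggested fix can be sketched as follows (a toy Python model, not Redis source; it mirrors the observation earlier in the thread that a DEL of a nonexistent key leaves the dataset unchanged):

```python
# Sketch of the suggested fix: mark the script unabortable only when a
# command actually changed the dataset (the dirty counter moved), not
# whenever a CMD_WRITE-flagged command ran. Hypothetical names throughout.

class Server:
    def __init__(self):
        self.dirty = 0                # bumped only by commands that really modify data
        self.lua_write_dirty = False  # once True, the script must run to completion

def run_in_script(server, command):
    before = server.dirty
    command(server)                   # execute the command
    if server.dirty != before:        # compare the dirty counter, not the WRITE flag
        server.lua_write_dirty = True

def del_missing_key(server):
    pass                              # DEL of a nonexistent key changes nothing

def set_key(server):
    server.dirty += 1                 # SET always modifies the dataset

s = Server()
run_in_script(s, del_missing_key)
assert not s.lua_write_dirty          # script is still abortable
run_in_script(s, set_key)
assert s.lua_write_dirty              # now it must run to completion
```

Under the WRITE_DIRTY-on-any-CMD_WRITE behavior being criticized, the first DEL would have already made the script unabortable.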
Comment From: yoav-steinberg
@nmvk ping?