There is currently no mechanism that limits query buffers (like the equivalent client output buffer limit). While not very common, this can still cause Redis's memory usage to bloat even in seemingly innocent use cases. For example, the following Node.js script simply tries to BLPOP an entry and then issue some SETs to the same key, but because Node.js is asynchronous and BLPOP blocks the connection, the SETs pile up unprocessed and the query buffer grows huge.

var redis_lib = require("redis");
var host = '127.0.0.1';
var port = 6379;
var pass = null;
var str1M = Array(1024*1024).join('x'); // ~1MB payload
var redis = redis_lib.createClient(port, host, {auth_pass: pass});

redis.on("ready", function() {
        console.log("Connection established");
        // BLPOP on an empty list blocks this connection indefinitely.
        redis.blpop("list", 0, redis.print);
        // These SETs are pipelined onto the same blocked connection; the
        // server reads them into the query buffer but cannot process them,
        // so roughly 1GB accumulates server-side.
        for (var i = 0; i <= 1000; i++) {
                redis.set("my-key", str1M, redis.print);
        }
});

Comment From: antirez

Hello @yossigo, yep, that's a problem. I remember we used to have some arbitrary limit, but it was removed for some reason. All this may be just in my dreams... I'll check. Blocking operations make this especially troublesome, since even small payloads can accumulate into very big query buffers.

To start, I would like to understand whether there was a limit and why it was removed. Then there are different fixes. One involves what happens during blocking operations, and whether we should keep reading client data while the client is blocked, but I have the feeling this would be too big a semantic change: it makes client writes blocking when there are blocking ops pending, which may not be ideal in pipelining scenarios or other edge cases. Not sure.

Another fix would be to have configurable limits. In such a case the limit could be enforced only when there is a pending blocking operation. However, there would still be the strange case of a single huge argument, or an incredibly long command composed of many arguments of this size.

Comment From: yossigo

Not reading from client sockets is always a problem, because then there's no way to tell that a connection was dropped when it happens.

I think a limit that works a lot like the client output buffer is a reasonable solution. It's true that in extreme cases it can get in the way, but so is the current client output buffer mechanism, and it's always possible to configure it for these special cases.
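For reference, the output buffer limits being used as a model here are configured per client class with a hard limit plus a soft limit and time window. The stock redis.conf defaults were along these lines:

    # redis.conf: client-output-buffer-limit <class> <hard> <soft> <soft seconds>
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60

A hard limit disconnects the client immediately; a soft limit disconnects it only if exceeded continuously for the given number of seconds.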

Comment From: antirez

Hey @yossigo, just found that we actually have a fixed limit here. Check networking.c:

    if (sdslen(c->querybuf) > server.client_max_querybuf_len) {

This is good, we can just make it configurable. However, isn't it strange that you did not trigger it in your tests?

Comment From: yossigo

The default is 1GB, that's why my test did not trigger it. I think it's good to have it configurable, but what do you consider a reasonable default value? I think 1GB might really be too big.

Comment From: antirez

Yep... 1GB is a pretty huge limit. It was put there mainly to avoid segfaulting on the SDS string overflow at 2GB (now 4GB, but it used to be half that in the past). Perhaps 256MB is more appropriate as a default value?

Comment From: yossigo

256MB sounds like a good limit for queries queuing up while the connection is blocked. It might be a bit low for very rare requests (e.g. very long strings) though. Ideally I think this calls for either a time-sensitive limit (like output buffer) or different limits depending on state (i.e. reading chunks of a big request vs. plain buffering because the client is blocked).

Comment From: toddnestor

This is a very old issue, but in case someone ever follows up on it, it seems this issue was resolved here: https://github.com/redis/redis/commit/b509a14c3e8a76aadd6bd48296cafb4616be25e2

It appears the max-querybuf-len configuration option was introduced there. The default is still 1GB, but I assume that is intentional.
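Assuming the option name introduced by the linked commit, the setting exposed in redis.conf is client-query-buffer-limit, e.g.:

    # redis.conf: cap the per-client query buffer (0 disables the limit)
    client-query-buffer-limit 1gb

In recent Redis versions this can also be adjusted at runtime, e.g. CONFIG SET client-query-buffer-limit 256mb.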

Comment From: yossigo

Thanks for noticing, @toddnestor!