Redis Add MBRPOPLPUSH

MBRPOPLPUSH source1 source2 source3 ... destination timeout

I'd like a variant of BRPOPLPUSH which accepts multiple sources and pops from the first source that contains an element. With multiple reliable queues, I have to poll each queue with RPOPLPUSH so I don't block on an empty queue, which puts a heavy load on Redis. Think of this like select() for queues.

It should return [sourceX, element].
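
For readers who want to see the polling workaround described above, here is a minimal sketch in Python. The `InMemoryLists` class is a hypothetical stand-in for a Redis client so the example runs without a server; its `lpush` and `rpoplpush` methods are meant to mirror the redis-py calls of the same names. The function name `poll_first_nonempty` and the key names are illustrative assumptions, not part of the proposal.

```python
from collections import deque


class InMemoryLists:
    """Hypothetical stand-in for a Redis client, for demonstration only."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, *values):
        # Push each value onto the head (left) of the list, like LPUSH.
        q = self.lists.setdefault(key, deque())
        for v in values:
            q.appendleft(v)
        return len(q)

    def rpoplpush(self, src, dst):
        # Pop from the tail of src and push onto the head of dst, like RPOPLPUSH.
        q = self.lists.get(src)
        if not q:
            return None
        value = q.pop()
        self.lists.setdefault(dst, deque()).appendleft(value)
        return value


def poll_first_nonempty(client, sources, destination):
    """Try RPOPLPUSH on each source in order; return [source, element] or None.

    This costs one round trip per source on every pass, which is the load
    problem the proposed command is meant to avoid.
    """
    for source in sources:
        element = client.rpoplpush(source, destination)
        if element is not None:
            return [source, element]
    return None


client = InMemoryLists()
client.lpush("queue:low", "job-a")
result = poll_first_nonempty(client, ["queue:high", "queue:low"], "processing")
print(result)
```

Unlike the requested command, this sketch never blocks: when every source is empty it returns None and the caller has to sleep and retry.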

Comment From: girak

Comment From: mperham

Ping @antirez ?

Comment From: antirez

Sorry, this is a complex matter (the Redis mailing list archives have threads where the downsides are also discussed), so I'm not able to reply right away, but hopefully I will in the next few days, since I want to settle once and for all whether we may have this in the future or not.

Note that this behavior is already available with BLPOP and BRPOP, which support multiple lists, but there the semantics are far more obvious. Feedback ASAP!

Comment From: mperham

I can't use BXPOP because they are destructive, not reliable.

I'll probably have to make my own Lua command even if you can do it since I need to support 2.6, 2.8, etc.

Comment From: badboy

I don't think he meant to suggest using BxPOP instead. It was more like a note that similar behavior already exists, so adding MBRPOPLPUSH wouldn't be so much of a special case anyway.

Comment From: polachok

@mperham how is that possible? you can't block in lua afaik

Comment From: antirez

It is not possible to block in Lua, but it is planned.

Comment From: mperham

No, but I can poll all the queues.

On Jun 5, 2014, at 3:12, Alexander Polakov notifications@github.com wrote:

@mperham how is that possible? you can't block in lua afaik


Comment From: antirez

@mperham instead of polling all the queues in this way, you can more easily block on multiple keys with BRPOP, which accepts multiple keys, and use it as a possibly unsafe synchronization primitive (more info on the unsafe part later).

Basically you do something like this:

results = BRPOP wakeup_queue_1 wakeup_queue_2 wakeup_queue_3 ... 30
IF results != NULL
    FOREACH key,val in results DO
        AS LONG AS THERE ARE ELEMENTS, DO => BRPOPLPUSH key ...
    DONE
ELSE
    # Timeout without results! Happens after 30 seconds without activity.
    ... code to check all the queues like it does today ...
END

Of course, when you add a task, you also push a char onto the corresponding wakeup key. Note that I said unsafe because, thanks to the 30-second timeout I put on the BRPOP part, you can use this in a best-effort way even if the sequence of commands is not designed to guarantee a one-to-one correspondence between wakeups and the queues actually containing elements. Even if there is a race condition, within 30 seconds you'll check your queues again.

Or you may want to use MULTI/EXEC and other means to build a race-free algorithm. However, I think the 30-second trick (which you may want to tune depending on your needs) makes things more robust in the corner cases that may arise.
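
The wakeup-key pattern above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: `InMemoryLists` is a hypothetical, non-blocking stand-in for a Redis client (a real BRPOP would block for up to the timeout), the `wakeup:` key prefix and the function names are invented for the example, and the drain step uses non-blocking RPOPLPUSH instead of BRPOPLPUSH. With a real redis-py client it assumes keys come back as strings (for example with `decode_responses=True`).

```python
from collections import deque


class InMemoryLists:
    """Hypothetical non-blocking stand-in for a Redis client (demo only)."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, *values):
        q = self.lists.setdefault(key, deque())
        for v in values:
            q.appendleft(v)
        return len(q)

    def brpop(self, keys, timeout=0):
        # A real BRPOP blocks up to `timeout` seconds; this stub returns at once.
        for key in keys:
            q = self.lists.get(key)
            if q:
                return (key, q.pop())
        return None

    def rpoplpush(self, src, dst):
        q = self.lists.get(src)
        if not q:
            return None
        value = q.pop()
        self.lists.setdefault(dst, deque()).appendleft(value)
        return value


PREFIX = "wakeup:"  # assumed naming scheme for the wakeup keys


def enqueue(client, queue, task):
    # Push the task, then push one token onto the matching wakeup key.
    client.lpush(queue, task)
    client.lpush(PREFIX + queue, "1")


def consume_once(client, queues, processing, timeout=30):
    """Wait for a wakeup token, then move tasks into the processing list."""
    woken = client.brpop([PREFIX + q for q in queues], timeout=timeout)
    if woken is not None:
        candidates = [woken[0][len(PREFIX):]]  # only the queue that was signalled
    else:
        candidates = list(queues)  # timeout without a wakeup: sweep every queue
    moved = []
    for queue in candidates:
        while True:
            task = client.rpoplpush(queue, processing)
            if task is None:
                break
            moved.append((queue, task))
    return moved


client = InMemoryLists()
enqueue(client, "emails", "welcome")
enqueue(client, "emails", "receipt")
first = consume_once(client, ["emails", "reports"], "processing")
second = consume_once(client, ["emails", "reports"], "processing")
print(first, second)
```

In this run the second call consumes a leftover wakeup token and finds the queue already drained, so the consumer has to tolerate wakeups that yield nothing.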

Comment From: vsespb

Good workaround. Except that if there are 1000000 tasks in wakeup_queue_3, and one task is lost in wakeup_queue_1, you won't find that out until you have processed all 1000000 from wakeup_queue_3 (and that is a problem, because wakeup_queue_1 should have priority over wakeup_queue_3).

That is a very common case: there are lots of tasks in a low-priority queue, and there is a high-priority queue which should be served with really high priority. For example, when you send a huge marketing email batch overnight, but at the same time you occasionally have to send emails like password recovery or payment confirmations.

So, to work around this, you'll need to poll all queues from time to time, even if the 30-second timeout has not fired.

Another problem is that while you poll the queues (after a timeout, etc.), new items can arrive in the queues at the same time, so you can end up with extra items in the wakeup queues. That means BRPOPLPUSH will block (and possibly return null after its timeout), but that is not a big problem: you should just set a timeout on BRPOPLPUSH too and be ready to start from the beginning if NULL is returned.

Comment From: vsespb

workaround this you'll need to poll all queues time-to-time, even if there was no 30 seconds timeout.

... and if you have many worker processes, this periodic check should not happen too often, so you should probably also have a counter (INCR with EXPIRE) to control how often the queues are checked.
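
One way to read the "INCR with expire" idea is as a shared gate: the first worker to increment the counter in a given window does the full sweep, and the others skip it until the key expires. The sketch below is a guess at that approach, not the commenter's actual code. `InMemoryCounters` is a hypothetical stand-in for a Redis client with an injectable clock so the example runs without a server, and the key name `sweep:gate` and the function `should_sweep` are invented for illustration.

```python
import time


class InMemoryCounters:
    """Hypothetical stand-in for a Redis client with INCR/EXPIRE (demo only)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.values = {}
        self.deadlines = {}

    def _purge(self, key):
        # Drop the key once its deadline has passed, emulating key expiry.
        deadline = self.deadlines.get(key)
        if deadline is not None and self.clock() >= deadline:
            self.values.pop(key, None)
            self.deadlines.pop(key, None)

    def incr(self, key):
        self._purge(key)
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        if key not in self.values:
            return False
        self.deadlines[key] = self.clock() + seconds
        return True


def should_sweep(client, key, interval_seconds):
    """Return True for the first caller in each interval, False for the rest."""
    count = client.incr(key)
    if count == 1:
        # First increment in this window: start the window and do the sweep.
        client.expire(key, interval_seconds)
        return True
    return False


now = [0.0]  # fake clock so the demo does not need to sleep
client = InMemoryCounters(clock=lambda: now[0])
decisions = [should_sweep(client, "sweep:gate", 10) for _ in range(3)]
now[0] = 11.0
after_expiry = should_sweep(client, "sweep:gate", 10)
print(decisions, after_expiry)
```

Against a real server, INCR and EXPIRE are two separate commands, so a worker that dies between them would leave the counter without a TTL; wrapping both in MULTI/EXEC is one way to close that gap.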

Comment From: etanol

This is only for curious readers coming from search engines.

Even though both B[RL]POP and BRPOPLPUSH end up calling blockForKeys (which has a target parameter to store the result), it turns out that the resolution of a blocked client follows a completely different code path in each case. You can see that in serveClientBlockedOnList.

What I fail to understand is why each case gets such different treatment. I'm pretty convinced that most users would be happy with almost identical semantics between B[RL]POP and a potential BRPOPLPUSH with multiple sources.

Comment From: mperham

This is still my number 1 issue with Redis; it means that reliable queuing with many queues in Sidekiq remains inefficient. Has anything changed which might make this easier to implement?

Comment From: alxbog

+1 to importance of the feature.

Comment From: tkram01

+1. This is our #1 issue with Redis.

Comment From: leonerd

+1. This would be nice to have. Lack of this feature is mentioned in the documentation for Job::Async::Redis, with this issue being linked.

https://metacpan.org/pod/Job::Async::Redis#reliable-mode

Comment From: tmedford

@antirez Any thoughts if this would ever come?

Comment From: rusterholz

Coming up on 8 years since this feature was first requested and almost 3 years since the last comment. Are there any updates? @antirez did you ever determine whether this will be feasible or not?

Comment From: quinn

This would be a nice feature to have