Hi, I tried to do a mass insert into Redis in cluster mode following this tutorial https://redis.io/topics/mass-insert but the methods do not work properly.

The first option, redis-cli --pipe, does not work in cluster mode. I get errors like "MOVED 15045 127.0.0.1:7003"; --pipe does not follow the redirect, even when redis-cli is run with the -c option.

With the second option, nc, I don't get any reply about the inserts. I may lose data or hit errors, and with this method I can't detect or handle that.

What is the correct and safe way to mass-load data into Redis in cluster mode?

Comment From: K-Jean

Hi.

In our Redis cluster, we ran --pipe with the same file against every node to propagate the data, because --pipe does not follow redirects.

Comment From: charlenezheng

Hi, I am facing the same error: "Last reply received from server. errors: 100000, replies: 100000". Is there any solution for this?

Comment From: yossigo

This is indeed a limitation of redis-cli, which does not support cluster mode with --pipe.

Comment From: hwware

I took a quick look at the code. To fix this issue we might need to change the approach used in the pipe mode implementation. It is currently non-blocking and only checks and counts replies, but to handle a cluster redirect it needs to know which command failed and resend it. Does anyone have comments on this? I'd be glad to hear and discuss any further thoughts. Thanks.

Comment From: yossigo

@hwware the problem is that pipelining is required when attempting bulk operations, because if we block the next command until we receive a reply, we'll end up binding our throughput to latency. I think the problem with redirects is not so much about knowing which command a redirect applies to, but about maintaining a backlog so we can rewind and re-transmit on demand. It can be done, of course, but it's going to take a lot of work.

A possible compromise is to run CLUSTER SLOTS and set up all connections in advance, without supporting redirects, so a redirect response is treated as an error that terminates the session. In a way, this is aligned with how redis-cli already doesn't handle errors or try to re-connect/re-transmit.
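
To make the compromise concrete, here is a rough Python sketch of the routing part only, not redis-cli code. It assumes the CLUSTER SLOTS reply has already been parsed into nested lists of the form [start, end, [host, port, ...], ...], with the first node of each entry being the master. The function names and the sample topology are made up for illustration.

```python
from bisect import bisect_right


def build_slot_table(cluster_slots_reply):
    """Turn a parsed CLUSTER SLOTS reply into sorted (start, end, (host, port)) rows."""
    table = []
    for entry in cluster_slots_reply:
        start, end, master = entry[0], entry[1], entry[2]
        table.append((start, end, (master[0], master[1])))
    table.sort()
    return table


def node_for_slot(table, slot):
    """Return the (host, port) of the master that owned the slot at setup time."""
    starts = [row[0] for row in table]
    i = bisect_right(starts, slot) - 1
    if i < 0 or slot > table[i][1]:
        raise KeyError(f"slot {slot} is not covered")
    return table[i][2]


def is_redirect(error_text):
    """True for MOVED/ASK errors, which this scheme treats as fatal."""
    return error_text.startswith(("MOVED ", "ASK "))


# Made-up three-master topology, shaped like a parsed CLUSTER SLOTS reply.
sample_reply = [
    [0, 5460, ["127.0.0.1", 7001, "id-a"]],
    [5461, 10922, ["127.0.0.1", 7002, "id-b"]],
    [10923, 16383, ["127.0.0.1", 7003, "id-c"]],
]
table = build_slot_table(sample_reply)
```

With a table like this, each command would be written to the connection returned by node_for_slot, and any reply for which is_redirect is true would end the session instead of being followed.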

Comment From: hwware

Hello @yossigo, thank you for your reply. Maybe my last comment was a little confusing. What I meant by needing to know which command failed is that we need to identify the previously sent commands that failed, since we have to follow the redirect and send them again. I think we are saying the same thing.

For option 1, using a backlog: since we send in a non-blocking way, when we get a MOVED or ASK error reply we cannot guarantee that we can find the original command in the backlog, unless we wait for all the commands in the backlog to execute. That is, we block until every reply for the current batch has arrived and only then send the next batch. If we wait for each batch to finish, we can think of the backlog as a buffer.
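
A minimal Python sketch of this batch-as-buffer idea, just to illustrate the control flow. send_batch is a hypothetical callable, not a redis-cli or client-library function: it is assumed to pipeline a list of commands, block until every reply has arrived, and return the replies in order, with errors represented as plain strings. Routing, such as refreshing the slot map before a retry, is assumed to happen inside it.

```python
def send_in_batches(commands, send_batch, batch_size=1000, max_rounds=5):
    """Pipeline commands batch by batch, resending any that were redirected.

    Returns the commands that were still being redirected after max_rounds.
    """
    failed = []
    for i in range(0, len(commands), batch_size):
        pending = commands[i:i + batch_size]
        for _ in range(max_rounds):
            replies = send_batch(pending)
            # Replies arrive in order, so the position identifies the command.
            pending = [
                cmd
                for cmd, reply in zip(pending, replies)
                if isinstance(reply, str) and reply.startswith(("MOVED ", "ASK "))
            ]
            if not pending:
                break
        failed.extend(pending)
    return failed
```

Within a batch the commands are still pipelined, so throughput is tied to latency once per batch rather than once per command. The cost is that the whole batch has to be kept in memory until its replies are in.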

For option 2, I think it may cause a problem in this case: if we set up the connections in advance and Redis migrates slots during the transmission, some data may not be transmitted to the correct node. IMHO this is not a rare case, since --pipe mode is normally used for mass insertion, which may take a long time. Please let me know if I am missing anything here. Thank you!

Comment From: yossigo

@hwware I agree with you that it would be better to be able to handle migrations during redis-cli --pipe.

Comment From: DaveLanday

Any update on this? I was working in single mode, and having to work in cluster mode has broken many of my simple but extremely important scripts using --pipe.

Comment From: Ilan-StartIO

Same for me

Comment From: Miguelme

Is there a workaround for using the Redis mass insert functionality in cluster mode?

Comment From: ssndhu01

As a workaround, we calculated the cluster slot of each key manually and split the commands by slot into separate files, creating one file per master.

The slot can be calculated with the formula below. For example, for the command set xyz 123123123, the key is xyz and slot = crc16(key) % 16384.
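
For anyone who wants to try this, a small Python sketch of the slot calculation and the per-master split. It assumes the CRC16 variant is XMODEM (which is what binascii.crc_hqx with an initial value of 0 computes) and it also applies the hash tag rule, where only the part between the first { and the next } is hashed if that part is non-empty. The split_by_master helper and the slot ranges passed to it are made up for illustration; real ranges would come from the cluster itself.

```python
import binascii
from collections import defaultdict

NUM_SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Return the cluster hash slot for a key: CRC16 (XMODEM) mod 16384."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


def split_by_master(commands, slot_ranges):
    """Group (key, command_line) pairs by the master that owns the key's slot.

    slot_ranges is a list of (start, end, master_name) tuples (hypothetical
    input, e.g. derived from the cluster's slot assignment).
    """
    groups = defaultdict(list)
    for key, line in commands:
        slot = key_slot(key)
        for start, end, master in slot_ranges:
            if start <= slot <= end:
                groups[master].append(line)
                break
        else:
            raise ValueError(f"no master covers slot {slot}")
    return dict(groups)
```

Each group can then be written to its own file and piped to the matching master. The computed slots can be cross-checked against the server with CLUSTER KEYSLOT.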

Comment From: sambhavk

  1. We also had a use case for bulk-inserting entries into a Redis cluster. As that is not possible right now, we instead divided our cluster into multiple single-master-node clusters and did client-level sharding.
  2. This gave us the advantages of a cluster together with the speed of pipe operations.
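
A rough Python sketch of what client-level sharding can look like, under my own assumptions rather than the setup described above: the shard is picked as CRC32 of the key modulo the number of shards, and the shard names are hypothetical "host:port" strings.

```python
import zlib


def shard_for_key(key: bytes, shards):
    """Pick a shard deterministically: CRC32 of the key modulo the shard count."""
    return shards[zlib.crc32(key) % len(shards)]


def partition(pairs, shards):
    """Split (key, value) pairs into one bucket per shard."""
    buckets = {shard: [] for shard in shards}
    for key, value in pairs:
        buckets[shard_for_key(key, shards)].append((key, value))
    return buckets
```

Each bucket can then be loaded into its own node with --pipe, since every node owns all slots and never redirects. Note that with a plain modulo, changing the number of shards remaps most keys.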

Comment From: adibaulia

Is there any update on this issue?