Output of lsb_release -a on the GCP Compute Engine VM:

No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.5 LTS
Release:    20.04
Codename:   focal

Hi, this issue concerns a network problem between a Redis cluster deployed on our site's worker node (on-prem) and a Redis client on a GCP VM connected to our site via Site-to-Site VPN. Specifically, Redis clients (i.e., redis-py or redis-cli) become unresponsive when running the info stats / info memory commands against the Redis cluster deployed on our site's node.

[Screenshot]

As the screenshot above shows, when the Redis cluster is deployed on Google Compute Engine (GCP) VMs, i.e. in the same place as the Redis client, the info stats / info memory commands work without any issues. (Note that the info memory response is 1,491 bytes.)

[Screenshot: packet dump]

However, when the Redis cluster is deployed on the site's worker node, the Redis client hangs indefinitely on the same commands, and the packet dump shows the server's response as [TCP Previous segment not captured] Response: [fragment] [fragment] (see the screenshot above).

After reading https://cloud.google.com/vpc/docs/mtu#handling_of_packets_that_exceed_mtu (screenshot below), I first suspected the MTU on the Google VPC, because the page mentions that IP fragmentation is not supported for TCP.

[Screenshot: excerpt from the GCP MTU documentation]

My reasoning was roughly: traffic goes through more network layers because of the VPN, so each packet carries more bytes -> the response exceeds the MTU limit -> the client hangs. To be honest, I am not even sure whether I am on the right track. The valid MTUs (https://cloud.google.com/vpc/docs/mtu#valid_mtus) are already specified; however, just to see whether it changes anything, I also tried changing the VPC MTU to jumbo (8,896 bytes), but that did not help at all.
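
To make the arithmetic behind this reasoning concrete, here is a minimal sketch. It assumes IPv4 and TCP headers of 20 bytes each with no options, and it treats the VPN encapsulation overhead as an unknown parameter (tunnel_overhead is a hypothetical name; the real value depends on the tunnel and ciphers in use). The 1460 figure is the default MTU of a GCP VPC network, and 1,491 is the reply size noted above; whether that number is payload or on-wire size is not clear from the screenshot.

```python
import math

IP_HEADER = 20   # IPv4 header, assuming no options
TCP_HEADER = 20  # TCP header, assuming no options


def max_segment_payload(mtu: int, tunnel_overhead: int = 0) -> int:
    """Largest TCP payload that fits in one unfragmented packet.

    tunnel_overhead is a placeholder for the bytes the VPN adds per packet;
    it is an assumption here, not a measured value.
    """
    payload = mtu - tunnel_overhead - IP_HEADER - TCP_HEADER
    if payload <= 0:
        raise ValueError("MTU too small for the given overhead")
    return payload


def segments_needed(reply_bytes: int, mtu: int, tunnel_overhead: int = 0) -> int:
    """Number of TCP segments required to carry a reply of reply_bytes."""
    return math.ceil(reply_bytes / max_segment_payload(mtu, tunnel_overhead))


# A 1,491-byte reply cannot fit in one segment on a 1460-byte MTU path.
print(max_segment_payload(1460))
print(segments_needed(1491, 1460))
```

Under these assumptions, a reply of this size has to be split across at least two segments, with the first one filling the packet completely. A full-size packet is the one most likely to be dropped if some hop on the VPN path has a smaller effective MTU, whereas short replies from other commands stay well under the limit.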

It is worth noting that all other info commands and any other commands work just fine for the same Redis cluster.

Therefore, the issue seems to be network-related, but I am unable to troubleshoot it.

I would appreciate any help!

Comment From: yossigo

@imageschool The problem you describe is a generic networking issue that will affect any TCP service; it is not specific to Redis (which is the topic of this repository). You should try to solve that with your cloud provider or VPN vendor.

Comment From: imageschool

@yossigo You are right, I will do that. I was just wondering whether there was anything I was misunderstanding related to Redis commands.

Comment From: yossigo

@imageschool For the record - no, I imagine the only difference between the commands that worked and those that didn't is the size of the reply. If the reply is big enough to trigger your underlying MTU issue, it will "freeze".
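
One way to check the reply-size explanation is to compare how many bytes each INFO section announces with how many actually arrive before a timeout. The following is only a sketch using Python's standard socket module and the plain RESP2 wire format; it assumes the server needs no AUTH, and the function names (encode_command, probe_info_reply) as well as the host and port in the usage comment are hypothetical.

```python
import socket


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)


def probe_info_reply(host: str, port: int, section: str, timeout: float = 5.0):
    """Send INFO <section>; return (declared_length, body_bytes_received).

    declared_length is None if not even the reply header arrived in time.
    """
    declared = None
    buf = b""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command("INFO", section))
        try:
            while True:
                if declared is None and b"\r\n" in buf:
                    # Bulk string header looks like b"$<length>"
                    header, _, buf = buf.partition(b"\r\n")
                    if not header.startswith(b"$"):
                        raise ValueError(f"unexpected reply header: {header!r}")
                    declared = int(header[1:])
                if declared is not None and len(buf) >= declared:
                    break  # full body received
                chunk = sock.recv(4096)
                if not chunk:
                    break  # peer closed the connection
                buf += chunk
        except socket.timeout:
            pass  # reply stalled: report what arrived so far
    received = min(len(buf), declared) if declared is not None else 0
    return declared, received


# Example usage (placeholder address, replace with a real node):
# for section in ("server", "clients", "memory", "stats"):
#     print(section, probe_info_reply("10.0.0.5", 6379, section))
```

If the sections that hang are the ones with the largest declared length, and the received count stops short of it, that supports the view that the size of the reply, not the command itself, is what triggers the stall.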

Comment From: imageschool

Thanks for the clarification!