Currently, our production cluster has more than 128 nodes, which is more than CONFIG_FDSET_INCR (128). When the number of client connections is close to the server's maximum number of clients, cluster nodes have trouble connecting to each other, because the file descriptors reserved by CONFIG_FDSET_INCR on each node are not enough to cover the cluster bus connections. Is there a way to solve this issue? One potential solution is to make CONFIG_FDSET_INCR configurable in the config file. Is there any potential risk in doing that? Thank you!
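To make the arithmetic behind the report concrete, here is a rough sketch. It assumes each node keeps two cluster bus connections per peer (one outbound, one inbound), which matches the `(nodes - 1) * 2` count discussed later in this thread. The helper names `cluster_bus_fds` and `fd_shortfall` are hypothetical and do not exist in Redis; the estimate is also optimistic, since the reserve has to cover other descriptors as well.

```c
#include <stdio.h>

/* Value quoted in the report; treated here as a plain constant. */
#define CONFIG_FDSET_INCR 128

/* Hypothetical helper: descriptors needed for the cluster bus,
 * assuming one outbound and one inbound link per peer node. */
static long cluster_bus_fds(long nodes) {
    return nodes > 1 ? (nodes - 1) * 2 : 0;
}

/* Hypothetical helper: how many cluster bus descriptors do not fit
 * in the reserve once client connections have reached the limit. */
static long fd_shortfall(long nodes) {
    long need = cluster_bus_fds(nodes);
    return need > CONFIG_FDSET_INCR ? need - CONFIG_FDSET_INCR : 0;
}
```

Under these assumptions, a 129-node cluster needs 256 descriptors for the bus alone, twice the reserved amount.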
Comment From: antirez
Hello @daidaotong, yes, this is a real problem. Let me check if I can do something about it... Cluster connections may vary dynamically, but I think I may have a few ideas. On a side note, do you know the Italian music band called "CCCP"? https://www.youtube.com/watch?v=iiG8AFAc9GM
Comment From: antirez
What do you think about 4b8d8826a? It's a very simple approach, but I think it could be quite effective, especially given that clusters may get larger with time. So instead of trying to do guesswork in advance, we just add the two sources of "fd usage" there, and change the error to make it clearer to the user what is happening.
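As an illustration of the idea of adding the two sources of fd usage, here is a minimal sketch of such an admission check. It is not the code from the commit: `struct fd_usage` and `can_accept_client` are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for the server state the check needs. */
struct fd_usage {
    size_t clients;        /* currently connected clients */
    size_t cluster_conns;  /* cluster bus connections in use */
    size_t maxclients;     /* configured connection limit */
};

/* Returns 1 if one more client may be accepted, 0 if the combined
 * client + cluster bus usage has already reached the limit. */
static int can_accept_client(const struct fd_usage *u) {
    return u->clients + u->cluster_conns < u->maxclients;
}
```

With this shape, the limit shrinks for regular clients as the cluster grows, rather than relying on a fixed reserve chosen in advance.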
Comment From: daidaotong
Hello @antirez, thank you very much for your reply and the quick fix. It looks good to me, and I think it will effectively avoid the cluster connection issue. :) I only have a few small suggestions: 1. Maybe we should update the documentation for the max clients config to let users know that cluster connections are already counted. 2. Two trivial points: first, I think the cluster connection count should be (dictSize(server.cluster->nodes)-1)*2, is that correct? Second, #include "cluster.h" should be added to networking.c to avoid compiler warnings when running make, or the getClusterConnectionsCount function declaration could go somewhere else, such as server.h.
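The suggested count can be sketched as follows, assuming the nodes table includes the local node itself (hence the -1) and that each peer accounts for two connections. The function name `cluster_connections_count` and its parameters are hypothetical; this is not the actual getClusterConnectionsCount implementation.

```c
#include <stddef.h>

/* Hypothetical sketch: two connections per peer, excluding the local
 * node, which is assumed to be present in the known-nodes table. */
static size_t cluster_connections_count(int cluster_enabled, size_t known_nodes) {
    if (!cluster_enabled || known_nodes == 0) return 0;
    return (known_nodes - 1) * 2;
}
```

Compared with simply doubling the table size, this differs by exactly two connections.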
btw, I know the Italian music band CCCP, but I haven't listened to its music much. The reason I chose my profile picture is its original meaning, the USSR (which the band is also named after); I consider myself a socialist :)
Comment From: ShooterIT
Hi @antirez @daidaotong. Interesting, this problem has been around for a long time. I also shared my thoughts on it earlier: https://github.com/antirez/redis/pull/6945#issuecomment-633421367
I agree with @daidaotong's suggestions. What's more, I think we should also check maxclients when a cluster node accepts a connection in clusterAcceptHandler.
Additionally, I can modify my PR https://github.com/antirez/redis/pull/6945 if you think it makes sense, @antirez.
Comment From: daidaotong
Thanks @ShooterIT, I just saw your PR and it looks like a cool idea: the admin can manually change the max clients value once the limit has been reached, which may be needed in some use cases.
Comment From: antirez
Hi @daidaotong,
yes, the documentation should be updated indeed! Just a note about the fact that we also count the cluster connections. And yes, in theory it should be (-1), but I figured that having 2 extra connections is not a big deal, so I was tempted to leave it as it is. But after all, making the math more correct is probably clearer, as you noticed. About including cluster.h, sure, I'll do it. So we agree this solution is ok? In that case I can proceed with implementing it.
Comment From: daidaotong
Hello @antirez, yep, the solution is totally ok for me. Thank you very much for your quick response and help.
Comment From: antirez
Changes done :) Closing.
Comment From: antirez
@daidaotong oh, about the CCCP music: yep, their lyrics are very socialist oriented. I appreciated what they had to say in the 90s, both politically and musically.