Since the cluster work has not made much progress recently (https://github.com/redis/redis/pull/10875), I want to use this thread to identify the work streams we can start on incrementally.
I'm proposing we break development down into three milestones for an "MVP", with incremental work that can be built on top of it after each milestone.
- [ ] : Milestone 1: A basic Rust module that implements against the new cluster interface.
  - [ ] : Finalize the refactoring in cluster.c for pubsub.
  - [ ] : Implement a module API that lets an external module extend the clustering system.
  - [ ] : Add the functionality to build cluster v2 with a module, and add basic test harnesses to verify it works correctly. Nodes will be static.
- [ ] : Milestone 2: Basic cluster creation with Raft leader elections.
  - [ ] : Implement the internal storage format for cluster nodes and the structures needed to serve data from APIs.
  - [ ] : Investigate and integrate with a Rust Raft library. (Suggesting https://github.com/tikv/raft-rs)
  - [ ] : Add configuration options to bootstrap a cluster with a given number of shards.
- [ ] : Milestone 3: Support Redis cluster topology mutations, failovers, and log compaction.
  - [ ] : Implement Redis shard failover logic with heartbeats.
  - [ ] : Support compacting the cluster state on disk.
  - [ ] : Add APIs to mutate topology state.
- [ ] : Milestone 4: Quality-of-life features.
  - [ ] : Support operations that apply a function on all nodes in the cluster.
  - [ ] : Support additional metadata to store and fetch functions in a cluster. (Not strictly required for GA, but it should be unblocked.)
  - [ ] : Review operational readiness: testing and monitoring.
  - [ ] : redis-cli connectivity features.
At the end of milestone 3, we will be able to create a cluster, add and remove nodes, and assign them to a fixed number of shards. Milestone 4 is intended primarily to make the feature ready to launch as GA.
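To make the milestone 3 end state more concrete, here is a rough sketch of the topology state it implies: a shard count fixed at creation, with nodes added, removed, and assigned to shards. Every name here (`Topology`, `Shard`, `TopologyError`, `add_node`, `remove_node`) is hypothetical and made up for illustration. None of it comes from the PR or from any existing module API, and it leaves out Raft replication, persistence, and failover entirely.

```rust
use std::collections::BTreeMap;

/// Hypothetical node identifier, used only for this illustration.
type NodeId = String;

/// A shard is just a list of member nodes in this sketch.
#[derive(Debug, Default)]
struct Shard {
    nodes: Vec<NodeId>,
}

#[derive(Debug, PartialEq)]
enum TopologyError {
    NoSuchShard(usize),
    NodeAlreadyPresent(NodeId),
    NoSuchNode(NodeId),
}

/// In-memory topology with a shard count that is fixed at creation time.
#[derive(Debug)]
struct Topology {
    shards: Vec<Shard>,
    node_to_shard: BTreeMap<NodeId, usize>,
}

impl Topology {
    /// Bootstrap a topology with `shard_count` empty shards.
    fn new(shard_count: usize) -> Self {
        Topology {
            shards: (0..shard_count).map(|_| Shard::default()).collect(),
            node_to_shard: BTreeMap::new(),
        }
    }

    /// The shard count never changes after `new` in this sketch.
    fn shard_count(&self) -> usize {
        self.shards.len()
    }

    /// Add a node and assign it to an existing shard.
    fn add_node(&mut self, id: &str, shard: usize) -> Result<(), TopologyError> {
        if shard >= self.shards.len() {
            return Err(TopologyError::NoSuchShard(shard));
        }
        if self.node_to_shard.contains_key(id) {
            return Err(TopologyError::NodeAlreadyPresent(id.to_string()));
        }
        self.shards[shard].nodes.push(id.to_string());
        self.node_to_shard.insert(id.to_string(), shard);
        Ok(())
    }

    /// Remove a node, returning the index of the shard it belonged to.
    fn remove_node(&mut self, id: &str) -> Result<usize, TopologyError> {
        let shard = self
            .node_to_shard
            .remove(id)
            .ok_or_else(|| TopologyError::NoSuchNode(id.to_string()))?;
        self.shards[shard].nodes.retain(|n| n.as_str() != id);
        Ok(shard)
    }

    /// Members of a shard, or `None` if the shard index is out of range.
    fn nodes_in(&self, shard: usize) -> Option<&[NodeId]> {
        self.shards.get(shard).map(|s| s.nodes.as_slice())
    }
}
```

In a real implementation these mutations would presumably be proposed as replicated log entries rather than applied directly to local state, but the invariant is the same: nodes come and go while the number of shards stays fixed until online resharding is supported.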
Additional follow-up work streams (maybe for 8.2?):
- [ ] : Support dynamic shard changes through online resharding.
- [ ] : Support decoupled FC and TD for scaling beyond the throughput of a single node.
Comment From: zuiderkwast
The plan looks good to me, but I find it hard to believe it can be in 8.0 if none of this is done yet.
Is there any code at all in a branch anywhere?
Is there any team working on this or should we start from scratch in the community?
Is the module going to be added to the main repo, introducing a (conditional) dependency on a Rust toolchain, or will it be in a separate repo?