Hi,

I'm iterating over the keys in a Redis Cluster, inserting them into another storage system and deleting the written keys immediately afterwards. At some point I stop the script (gracefully), then start it again. What I noticed on the subsequent run is that the iterator method doesn't return any keys for a long time (although there is network activity in the background). My assumption is that the iterator actually traverses the key space from the beginning, simply skipping the deleted keys (which takes considerable time when the number of deleted keys is large).

Methods used: scan_iter (SCAN) and delete (DEL) from the Python client; deletions are batched in a pipeline to speed up the process. Redis version: 5.0.6.
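For reference, a minimal sketch of this pattern with redis-py (the connection details, batch size and the migrate_key helper are placeholders, not the original script; it targets a single instance for brevity):

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed connection details

def migrate_key(key):
    """Placeholder for writing the key to the other storage system."""
    pass

BATCH = 500
pipe = r.pipeline(transaction=False)
pending = 0

# SCAN-based iteration; each key is deleted right after being migrated
for key in r.scan_iter(count=BATCH):
    migrate_key(key)
    pipe.delete(key)          # queue the DEL in the pipeline
    pending += 1
    if pending >= BATCH:
        pipe.execute()        # flush the batched deletes
        pending = 0

if pending:
    pipe.execute()
```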

My questions are:

* Is my assumption correct (are the deleted keys actually iterated over on subsequent runs)?
* If so, when are the keys actually removed from the iterator's path (or how can this process be sped up)? Is there a workaround to skip the deleted keys?

Comment From: oranagra

@EugeniuZ yes, this is a known issue with this pattern. SCAN iterates over the keys by walking the hashtable, so if you delete them, it creates a contiguous span of empty hash buckets. Then, the next time you start a SCAN from 0, it has to iterate over all these empty buckets. In the past this could cause Redis to freeze on a long call to SCAN; now SCAN just comes back empty-handed with a new cursor so that it doesn't block Redis completely.
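To illustrate (a rough sketch, not the original script, assuming a single instance): with the raw SCAN command you can see that individual calls may come back with no keys but a non-zero cursor, meaning the server is still sweeping empty buckets and making progress:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed connection details

cursor = 0
while True:
    cursor, keys = r.scan(cursor=cursor, count=1000)
    if keys:
        print(f"got {len(keys)} keys")
    else:
        # empty batch: the server swept empty buckets but the cursor still advanced
        print(f"empty batch, cursor is now {cursor}")
    if cursor == 0:
        break  # full iteration completed
```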

What you can do is have your script print the last SCAN cursor it used when it terminates, and then pass that cursor as a command-line argument to the script on the next run.
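A minimal sketch of that idea for a single instance, assuming the raw scan() call is used (scan_iter hides the cursor) and the saved cursor is passed as the first command-line argument; the migration/deletion step is elided:

```python
import sys
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed connection details

# resume from the cursor passed on the command line, default 0 (fresh scan)
cursor = int(sys.argv[1]) if len(sys.argv) > 1 else 0

try:
    while True:
        cursor, keys = r.scan(cursor=cursor, count=1000)
        for key in keys:
            pass  # migrate and delete the key as before
        if cursor == 0:
            break
finally:
    # on shutdown (graceful or not), print the cursor to pass to the next run
    print(f"last cursor: {cursor}")
```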

Comment From: EugeniuZ

Thanks for the tip @oranagra. In the case of the single-instance API that definitely works; with Redis in cluster mode it's probably trickier to implement.
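One possible way to extend this to cluster mode (a rough sketch, assuming you can connect to each primary directly; the node addresses below are placeholders and would normally be discovered via CLUSTER NODES / CLUSTER SHARDS): keep a separate cursor per primary, since SCAN cursors are node-local.

```python
import redis

# assumed primary addresses; in practice discover them from the cluster
PRIMARIES = [("10.0.0.1", 6379), ("10.0.0.2", 6379), ("10.0.0.3", 6379)]

# one cursor per primary; persist this mapping between runs
cursors = {addr: 0 for addr in PRIMARIES}

for addr in PRIMARIES:
    host, port = addr
    node = redis.Redis(host=host, port=port)
    cursor = cursors[addr]
    while True:
        cursor, keys = node.scan(cursor=cursor, count=1000)
        cursors[addr] = cursor   # save/print this on shutdown
        for key in keys:
            pass  # migrate and delete as before
        if cursor == 0:
            break
```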

Do the empty buckets stay in Redis forever, or is the space compacted at some later point in time? Can this compaction be triggered by some configuration, command, or utility?

Comment From: oranagra

When the hashtable goes below 10% utilization, it'll start a gradual rehashing procedure to shrink it. At that point SCAN keeps working, but may start causing a bit more latency (it has to scan 8 buckets of the larger table for each sweep through a single bucket of the smaller one).
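If you want to check whether the main dictionary is still oversized relative to its contents, one option (assuming the DEBUG command isn't disabled on your deployment) is DEBUG HTSTATS, which prints hash table statistics such as table size and used entries for a given database:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed connection details

# DEBUG HTSTATS <dbid> returns a human-readable report of the hash table stats
stats = r.execute_command("DEBUG", "HTSTATS", 0)
print(stats if isinstance(stats, str) else stats.decode())
```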