I've written a small Java program to copy and transform keys and their values from one Redis cluster to another, and I have to process tens of millions of keys. To do the job properly I need an extra round-trip request to get the type of each key returned by SCAN, and with millions of keys that time and network traffic add up quickly.
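For context, the per-key round trip looks roughly like this. This is a minimal sketch against a stubbed in-memory client, not my actual code; `StubClient`, its `scan`/`type` methods, and the sample keys are all illustrative stand-ins for a real Redis client and data set:

```java
import java.util.*;

public class ScanTypeRoundTrip {
    // Stub standing in for a Redis client: each method call counts as one round trip.
    static class StubClient {
        private final Map<String, String> typeByKey;
        int networkCalls = 0;
        StubClient(Map<String, String> typeByKey) { this.typeByKey = typeByKey; }

        // SCAN returns a batch of keys of any type (one round trip per batch).
        List<String> scan() { networkCalls++; return new ArrayList<>(typeByKey.keySet()); }

        // TYPE must be asked per key (one round trip each) -- this is the overhead.
        String type(String key) { networkCalls++; return typeByKey.get(key); }
    }

    // Keep only the hash keys, paying one TYPE round trip for every scanned key.
    static List<String> hashKeysOnly(StubClient client) {
        List<String> hashes = new ArrayList<>();
        for (String key : client.scan()) {
            if ("hash".equals(client.type(key))) {
                hashes.add(key);
            }
        }
        return hashes;
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("user:1", "hash");
        data.put("queue:1", "list");
        data.put("tags:1", "set");
        StubClient client = new StubClient(data);
        List<String> hashes = hashKeysOnly(client);
        // 1 SCAN + 3 TYPE calls = 4 round trips just to find the single hash key.
        System.out.println(hashes + " in " + client.networkCalls + " calls");
    }
}
```

With real network latency instead of an in-memory stub, those per-key TYPE calls dominate the run time.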

I'd like to suggest a modification to the SCAN command that allows filtering on type (or other attributes), so I can batch the copy by type and save the round trip. Something like:

scan 0 match * count 100 type hash

or

scan 0 match * count 100 filter type=hash

which would filter the scan results to only keys of the hash type, avoiding the separate TYPE request. I could then parallelize the copy and process hash, set, zset, etc. as separate tasks.
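With server-side filtering, the copy could be split into one independent task per type, with no TYPE calls at all. A minimal sketch of that dispatch, where `scanByType` is a stand-in for the proposed filtered SCAN and the key data are made up:

```java
import java.util.*;
import java.util.concurrent.*;

public class CopyByType {
    // Stand-in for the proposed filtered SCAN: returns only keys of one type.
    static List<String> scanByType(Map<String, String> typeByKey, String type) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> e : typeByKey.entrySet()) {
            if (e.getValue().equals(type)) keys.add(e.getKey());
        }
        return keys;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> source = Map.of(
            "user:1", "hash", "user:2", "hash",
            "queue:1", "list", "tags:1", "set");

        ExecutorService pool = Executors.newFixedThreadPool(3);
        Map<String, Future<List<String>>> tasks = new HashMap<>();
        for (String type : List.of("hash", "list", "set")) {
            // One copy task per type; each would drive its own filtered SCAN cursor.
            tasks.put(type, pool.submit(() -> scanByType(source, type)));
        }
        for (Map.Entry<String, Future<List<String>>> e : tasks.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().get());
        }
        pool.shutdown();
    }
}
```

In the real program each task would loop its own SCAN cursor and copy the values, but the shape is the same: the type check happens server-side, once, instead of per key over the network.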

I can imagine many kinds of filters: keys that will expire within 1 hour, sets that contain more than nnn elements, lists with exactly one element, etc.

Perhaps filters could be combined: filter type=list AND ttl<3600

Comment From: itamarhaber

This has been partially addressed (support for TYPE) by #6116
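
For anyone landing here later: assuming the option that #6116 introduced (the TYPE option on SCAN, available in Redis 6.0 and later), usage matches the syntax proposed above:

scan 0 match * count 100 type hash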

Comment From: itamarhaber

Closing - feel free to reopen or create a new issue (for extended filtering)