The problem/use-case that the feature addresses
Many LBS (location-based service) applications need a "pick one" function for random recommendation: a selection of any element that satisfies the distance condition, not necessarily the closest one, that returns quickly.
Although the GEORADIUS and GEOSEARCH commands provide a COUNT <count> option, they internally search for all values that meet the conditions and only filter at the end. Therefore, when there are many points in the area, the command may be very time-consuming, as the documentation describes:
By default all the matching items are returned. It is possible to limit the results to the first N matching items by using the COUNT option. However note that internally the command needs to perform an effort proportional to the number of items matching the specified area, so to query very large areas with a very small COUNT option may be slow even if just a few results are returned. On the other hand COUNT can be a very effective way to reduce bandwidth usage if normally just the first results are used.
Description of the feature
Add a LIMIT limit option to GEORADIUS and GEOSEARCH: once the number of matched elements reaches the limit, stop searching and return.
GEORADIUS key longitude latitude radius m|km|ft|mi [WITHCOORD] [WITHDIST] [WITHHASH] [LIMIT limit] [COUNT count] [ASC|DESC] [STORE key] [STOREDIST key]
Compatibility with current COUNT count:
- Using COUNT alone searches all matching values and returns count elements (the current logic).
- Using LIMIT alone stops searching once at least limit elements have been found, and returns limit elements.
- When the two are used together, limit must be greater than or equal to count; the search stops once at least limit elements have been found, and count elements are returned.
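The three cases above can be modelled with a small sketch. This is not the Redis implementation: `geo_search_model` is a hypothetical name, `0` stands for "option not given", the input array stands for the matching points in traversal order, and the gathered matches are assumed to be sorted ascending by distance before COUNT truncates them. The model also stops at exactly limit elements, whereas the proposal allows "limit elements or more".

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Compare two doubles for qsort (ascending). */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical model of the proposed semantics.
 * dist    : distances of the matching points, in traversal order
 * limit   : stop scanning once this many matches are gathered (0 = no LIMIT)
 * count   : cap on the reply size after sorting (0 = no COUNT)
 * out     : caller-provided buffer with room for n elements
 * visited : receives how many points the scan actually looked at
 * Returns the number of elements in the reply. */
size_t geo_search_model(const double *dist, size_t n, size_t limit,
                        size_t count, double *out, size_t *visited) {
    /* When both options are given, the proposal requires limit >= count. */
    assert(limit == 0 || count == 0 || limit >= count);

    size_t used = 0, i = 0;
    for (; i < n; i++) {
        if (limit && used >= limit) break; /* fast break */
        out[used++] = dist[i];
    }
    *visited = i;

    /* Assumption: results are ordered by distance before truncation. */
    qsort(out, used, sizeof(double), cmp_double);
    if (count && used > count) used = count;
    return used;
}
```

With five matching points, COUNT 2 alone visits all five, while LIMIT 3 (with or without COUNT 2) visits only three, which is where the time saving would come from.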
Comment From: oranagra
sounds like a useful feature to me. @redis/core-team WDYT?
Comment From: itamarhaber
Agreed about potential usefulness.
- We should be very clear that this will not provide true randomness but rather only a fast break
- I'm not sure I understand the case for using the two together, but how about expanding COUNT to support negative values for limiting purposes? I.e. COUNT -1 will mean a limit of one.
Comment From: yangbodong22011
We should be very clear that this will not provide true randomness but rather only a fast break
I mainly want to provide a fast break, but if randomness is necessary, we can continue the discussion.
COUNT -1 will mean a limit of one.
This can avoid adding parameters, but LIMIT limit seems to be easier to understand, we listen to other people's opinions.
Comment From: oranagra
I actually think that having both COUNT and LIMIT arguments is confusing.
And i'm not sure about the usefulness of using them together.
If we want something else other than negative value, maybe we can create a ~ modifier like in XADD.
either way, i'm now more concerned about problems we'll run into while implementing it. we currently have code that first searches for everything inside the wider area and later does the filtering for the exact radius. now if we want to break the search early, we need to check that the points we found are indeed candidates for a reply (fit inside the exact shape).
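To illustrate the concern, here is a minimal sketch of an early break that only counts points which pass the exact shape check. It is a simplified planar model, not the Redis geohash code: `point`, `in_radius` and `collect_with_limit` are hypothetical names, and the candidate array stands for whatever the wider-area search yields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical planar point; Redis works on longitude/latitude instead. */
typedef struct { double x, y; } point;

/* Exact-shape check: is p inside the circle of radius r centred on c? */
static int in_radius(point p, point c, double r) {
    double dx = p.x - c.x, dy = p.y - c.y;
    return dx * dx + dy * dy <= r * r;
}

/* Scans the candidates from the wider area, keeps only those that pass
 * the exact check, and stops as soon as `limit` points were accepted
 * (limit == 0 means no limit). `out` must have room for n points.
 * Returns the number of accepted points. */
size_t collect_with_limit(const point *cand, size_t n, point c, double r,
                          size_t limit, point *out) {
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        /* In the wider area but outside the circle: not a reply candidate,
         * so it must not count towards the limit. */
        if (!in_radius(cand[i], c, r)) continue;
        out[used++] = cand[i];
        if (limit && used >= limit) break;
    }
    return used;
}
```

If the limit were compared against all candidates from the wider area rather than the accepted ones, the search could stop before finding enough points that really fit the shape.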
@yangbodong22011 i think we agree about the usefulness of the feature and you can go ahead and start implementing it, we can decide on the syntax later (real easy to change).
Comment From: yangbodong22011
I submitted draft code to express my thoughts: the limit is saved and compared with ga->used during the traversal of the zset:
if (ga->used && ga->used >= (unsigned long) shape->limit) break;
- It is unreasonable to put the limit variable in GeoShape. I would like to rename GeoShape to GeoSearchContext: GeoShape already includes some variables that do not really belong to a shape, such as searchType, conversion, etc., so I think GeoSearchContext is a more appropriate name.
- The draft code does not yet add COUNT and LIMIT processing.
Comment From: guybe7
didn't read the whole thread but just raising a question: is there a way to distinguish between when:
1. GEO* returned because there are no more elements to return; and
2. GEO* returned because it reached LIMIT (in which case we may need to call it again?)
Comment From: yangbodong22011
didn't read the whole thread but just raising a question: is there a way to distinguish between when:
- GEO* returned because there are no more elements to return; and
- GEO* returned because it reached LIMIT (in which case we may need to call it again?)
The limit is just for fast break. I am afraid I did not understand your question. Why do we need to distinguish between the two cases you mentioned and call it again in the second case?
Comment From: oranagra
I think Guy's question is more suitable for the COUNT argument than the (new) LIMIT argument, and AFAIK there's no way to tell if there were more results which got truncated.
Comment From: itamarhaber
About the syntax:
LIMIT limit seems to be easier to understand, we listen to other people's opinions.
It is indeed easier to understand, but also somehow misleading given how COUNT+LIMIT are supposed to work, so users will have to read the docs anyway once they run into that corner. If negative values are confusing, we can consider some special modifier such as COUNT ~10, but I find that even more confusing (personally).
BTW, the negative COUNT approach is already used by SRANDMEMBER, although for a different purpose.
Comment From: oranagra
I would vote for negative COUNT. LIMIT and COUNT sound like synonyms, if it were me I would always confuse them with one another.
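As a rough illustration of the negative COUNT idea, the argument could be parsed along these lines. This is only a sketch: `count_opts` and `parse_count_arg` are hypothetical names, not Redis functions, and treating zero as invalid is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical parsed form of the COUNT argument. */
typedef struct {
    unsigned long long count; /* cap applied after a full search, 0 = unset */
    unsigned long long limit; /* early-break threshold, 0 = unset */
} count_opts;

/* Parses the COUNT argument: a positive value keeps the existing meaning,
 * a negative value -N is read as "break early after N elements".
 * Returns 0 on success, -1 if the argument is not a valid non-zero integer. */
int parse_count_arg(const char *arg, count_opts *o) {
    char *end;
    errno = 0;
    long long v = strtoll(arg, &end, 10);
    if (errno != 0 || end == arg || *end != '\0') return -1;
    /* Assumption: zero is rejected; LLONG_MIN cannot be negated safely. */
    if (v == 0 || v == LLONG_MIN) return -1;

    o->count = 0;
    o->limit = 0;
    if (v > 0)
        o->count = (unsigned long long)v;
    else
        o->limit = (unsigned long long)(-v);
    return 0;
}
```

Under this reading, `COUNT 10` sets only `count`, while `COUNT -1` sets only `limit` to one, so no second option name is needed.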
Comment From: yangbodong22011
Updated syntax to support negative COUNT, and added performance test, see #8259 for details.