Currently whenever a dict iterator is created using dictGetIterator it performs a heap allocation using zmalloc, however as far as I checked all uses of iterators are limited to the same stack frame. As we are very dependent on jemalloc cache I am not sure how much performance gain could be achieved but I think this can easily be tested to understand. as a general rule of thumb I think memory allocations should be avoided, especially in cases where the scope is bounded to a specific stack frame. The suggestion is to use the same method as used for rax. we can offer the following alternatives: dictIterator *dictGetIterator(dict *d); will be replaced with void dictInitIterator(dict *d);

dictIterator *dictGetSafeIterator(dict *d); will be replaced with void dictInitSafeIterator(dict *d);

we can also support an initialization macro so that code can be:

dictIterator iter = DICT_ITERATOR_INITIALIZER(dict *d);
dictIterator iter = DICT_SAFE_ITERATOR_INITIALIZER(dict *d);

the same can be done for iterators encapsulating the dict iterator like the hash or zset iterators.

Comment From: oranagra

i agree it'll be better. the other advantage is that we won't get so many leaks by people forgetting to release the iterator in the error handling code (early exit). and also, that the error handling code will be shorter.

however, considering that there are some 109 usages of these, and in many of them the performance might not be a concern, i'm not sure i'd rush to replace all of them (blame log destruction and cherry pick / merge conflicts).

maybe we can start by creating an alternative, and using it in the few places that have performance concern and the few with complicated error handling (note that there are 169 calls to the release function, maybe there's one or two functions that account for the majority of the delta)

Comment From: ranshid

maybe we can start by creating an alternative, and using it in the few places that have performance concern and the few with complicated error handling (note that there are 169 calls to the release function, maybe there's one or two functions that account for the majority of the delta)

@oranagra I have no problem taking the route you suggested, I would only want to know what is the concern replacing all of the occurrences (I guess it is better to keep a single iterator use API) - is it the git blame obfuscation?

Comment From: uvletter

How about remaining the api but substituting the function implementation, thus the caller code wouldn't be broken. Additional problem is release interator will be a nop function, should we remain the calling to minimize the change or just clean up(cleanup wouldn't break git blame while it just removes code)

Comment From: ranshid

How about remaining the api but substituting the function implementation, thus the caller code wouldn't be broken.

@uvletter the problem with this is that the current API return a pointer to the newly allocated iterator. making the iterators local to the stack frame will require some changes in the calling code in my opinion. We can maintain a pool of iterators to make this work. but IMO this will only complicate the implementation and will probably be less performant.

Additional problem is release interator will be a nop function, should we remain the calling to minimize the change or just clean up(cleanup wouldn't break git blame while it just removes code)

I think that the release might still be needed in order to resume rehashing or perform fingerprint checks.

Comment From: oranagra

@ranshid i usually try to avoid refactoring, specifically if not strictly necessarily. for one, it causes a mess in the blame log, making it harder to find out the real reason some piece of code was changed or added. secondly, it can cause painful merge conflicts when backporting changes to older releases, or for people who maintain forks. thirdly, it can cause issues for old / pending PRs.

so it's always a compromise between the benefits and the potential damage... i agree it'll be ugly to have two interfaces, but maybe we can gradually move away from the old one when we do other changes until there's no much left that uses it and then we can eliminate it.

so i think the big question to answer is what are the real benefits of doing this change and i imagine that maybe there's one or two places in which this change can be a real performance improvement (not just theoretical), and there's maybe one or two places were this can make error handling code significantly simpler.

if i understand @uvletter suggestion correctly, i.e. changing the interface and adding macros to make it look like the old one so that we don't need to touch the code. i don't think i like it since it'll make the code confusing.

Comment From: ranshid

@ranshid i usually try to avoid refactoring, specifically if not strictly necessarily. for one, it causes a mess in the blame log, making it harder to find out the real reason some piece of code was changed or added. secondly, it can cause painful merge conflicts when backporting changes to older releases, or for people who maintain forks. thirdly, it can cause issues for old / pending PRs.

so it's always a compromise between the benefits and the potential damage... i agree it'll be ugly to have two interfaces, but maybe we can gradually move away from the old one when we do other changes until there's no much left that uses it and then we can eliminate it.

so i think the big question to answer is what are the real benefits of doing this change and i imagine that maybe there's one or two places in which this change can be a real performance improvement (not just theoretical), and there's maybe one or two places were this can make error handling code significantly simpler.

@oranagra, I understand. May I suggest we first make internal test to try and replace all occurrences and report back of the potential performance benefit (there might be none, but still as you correctly stated, it also has a maintenance benefit and we might consider migrating new code to use the new API) In case we will observe some performance benefit, we will start to scan for the most contributing places and narrow the change only to these places. Does this sounds like a plan?

Comment From: oranagra

i think you may be able to achieve the same with a smaller investment by reviewing the usages one by one without bothering to change them, and then think what should / could be benchmarked. but anyway, do what you think it best and report back what you found.

Comment From: uvletter

Yes I meant that leverage some macro trick to implement initializing a local variable and returning the pointer of it, but I find the macro is hard to think out... Just as @oranagra said, initializing a pointer to a local variable is strange, and a function(macro) returns a pointer to a stack variable is confusing, so introducing a new API seems more appropriate.

Comment From: ranshid

@oranagra - we have made some internal benchmark tests after apply the change over ALL use cases of dictGetIterator. as expected we were unable to identify a noticeable performance improvement, so we can continue with the option to expose the new API and preferably use it in future cases of dict iterator. I will update and link this issue to the PR once added.

Comment From: ranshid

Provided PR: #11245