The API for RedisModule_HashGet uses C varargs. For our use case, we would like to call it with a list of keys to retrieve that is not known at compile time (specifically from redismodule-rs, which would take a vector of keys to retrieve and then call the HashGet API with arguments based on this vector).

Unfortunately there’s no standard way of passing a list or array to a function that uses C varargs. The usual solution to this is having a function accepts a pointer to an array, and wrap that function with a higher level one that uses varargs. This is the approach used by the standard execl(3) vs execv(3) POSIX APIs, or by many printf implementations (vprintf).

A description of the problem that the feature will solve, or the use-case with which the feature will be used.

We suggest adding a couple of APIs to address this:

int RM_HashGetV(RedisModuleKey *key, int flags, StringRedisModule *const fields[], StringRedisModule *vals[]);
int RM_HashSetV(RedisModuleKey *key, int flags, StringRedisModule *const fields[], const StringRedisModule *vals[]);

(the V suffix was inspired by execv and similar functions, we could come up with a better name).

Comment From: gavrie

@yossigo @slavak this is the feature request we discussed.

Comment From: yossigo

@gavrie Did you consider the other Module API calls that use varargs and how applicable this is for them as well?

Comment From: oranagra

It would be nice to have an alternative to each one of the APIs that use varargs. But i think in this case it may be better to add a REDISMODULE_HASH_VECTOR flag. So you would call it like:

RedisModule_HashSet(mykey, REDISMODULE_HASH_VECTOR, count, name_array, value_array);

For other APIs that get a format string, we can add the v format arg like the one we have in RM_Call:

 *   v    The argument(s) is a vector of RedisModuleString.

I know that @ashtul was looking into that as well.

Comment From: gavrie

@oranagra I think that the existing implementation using REDISMODULE_HASH_CFIELDS is error prone enough as it is. I understand the motivation to keep it consistent, but I would actually prefer deprecating this API altogether and adding more typesafe and orthogonal alternatives, similar to the execl, execle, execlp, execv, execvp example.

Using printf-style format strings wrongly is one of the most common errors in C code (or at least used to be, until compilers became smart enough to check this at compile time). So while it would make sense for a function accepting many types of values, that's not relevant in this case.

So we could have one dimension specifying a vector or varargs, an another one specifying the type of those args, e.g.:

int RedisModule_HashSetL(RedisModuleKey *key, int flags, ...); /* varargs are of type `StringRedisModule*` */
int RedisModule_HashSetLC(RedisModuleKey *key, int flags, ...); /* varargs are of type `char*` */
int RedisModule_HashSetV(RedisModuleKey *key, int flags, StringRedisModule *const fields[], StringRedisModule *vals[]);
int RedisModule_HashSetVC(RedisModuleKey *key, int flags, char *const fields[], char *vals[]);

This would allow catching more errors at compile time.

Note regarding the L/V/C suffixes: These should be familiar to C programmers due to standard libc conventions, but they could be replaced by longer/friendlier ones.

Comment From: gavrie

@yossigo regarding the other APIs that take format strings -- I agree with @oranagra's suggestion.

Comment From: gavrie

@oranagra regarding a count argument vs relying on the arrays being NULL-terminated: Do you have a specific reason to add the count? I'm not opposed to it, just curious.

Comment From: oranagra

@gavrie no reason, just seem more explicit. also now that i'm thinking of it, if the array was already allocated by the caller, an adapter function is able to use the count variant by not the other way around.

regarding your previous post, which suggestion do you agree with? the v format string? considering your arguments, this suggestion is even worse than the REDISMODULE_HASH_VECTOR which to me sounds like an easy way out here (no type checking, at at least the usage is quite simple)

Comment From: gavrie

@gavrie no reason, just seem more explicit. also now that i'm thinking of it, if the array was already allocated by the caller, an adapter function is able to use the count variant by not the other way around.

Very well.

regarding your previous post, which suggestion do you agree with? the v format string?

Indeed.

considering your arguments, this suggestion is even worse than the REDISMODULE_HASH_VECTOR which to me sounds like an easy way out here (no type checking, at at least the usage is quite simple)

Ok, I won't argue here -- your suggestion doesn't make the API much worse than it already is 😄.

Would you care to explain why my suggestion is worse? I agree that it's quite verbose, but (a) it does add some type safety and (b) it's the de facto way to "overload" functions in C.

Comment From: oranagra

traditionally, redis is doing everything differently than any pre-existing standard.. so that's breaking tradition 8-) (just kidding)

anyway, seriously: i feel my suggestion is more in line with the current API (not just this HashSet function, but many others). and i kinda dislike adding 4 APIs and some duplicate boiler plate code, docs, etc, in place of one or two ifs inside one function.

Although my suggestion does pose s problem for detecting if this feature API is supported ( https://github.com/redis/redis/issues/7801) (considering that previous versions of redis didn't return an error when provided with unsupported flags.

Comment From: slavak

@oranagra The external API can always be a thin wrapper (possibly even using a macro) around a common internal implementation with some conditional ifs. Grouping several function variants into a single family for documentation purposes is pretty common practice, and as long as the naming style is consistent it should make the API quite readable and easy to navigate.

I'm not necessarily arguing in favor of that option; There's a lot to be said for maintaining consistency within the API, and I'm assuming you wouldn't want to redesign the entire modules API along with this change. But I do have to agree that the current API using variadic arguments with highly overloaded semantics isn't exactly the most user-friendly I've seen.

Comment From: oranagra

P.S one more consideration to break and deprecate the existing API for hash and zset is that they don't allow for an error return value. see https://github.com/redis/redis/pull/7865 this is not related to the varargs input, but if we do want to break the API it's an opportunity to fix it.

Comment From: gavrie

P.S one more consideration to break and deprecate the existing API for hash and zset is that they don't allow for an error return value. see #7865

HashGet does return an error value. HashSet doesn't, but how would it do so, if it needs to return the number of fields that it has set (I guess for feature parity with HSET)? This is C after all -- so you would need to add a pointer-based output parameter for either the error or the count.

@oranagra so what decision are you leaning towards? I already specified my preference for: 1. Not using varargs as the main API (but support it as a secondary API since it's useful) 2. Not using a flag to get around C's (already very minimal) type system but rather using named functions for different argument types.

How to return both an error value and the number of values remains an open issue. However, it seems like the "Redis way" is to avoid error values and make functions infallible. Looking at how it's done for HSET was intrigued to learn about the checkType hack, but that would not be applicable here. So a convention is needed for returning errors, which again would make the API inconsistent with how it's currently implemented. Having HashSet return the error instead of the field count seems best, and then the field count would become an output parameter.

Comment From: slavak

What's wrong with C's standard way of returning positive values for "field count" and negative values for errors?

I mean, other than the fact that it's horrifying and not type safe... But it's the way it's been done for decades.

Comment From: gavrie

@slavak nothing except the fact that Redis defines REDISMODULE_ERR as a positive number (1) and this is what is returned from all other functions... so it would be a big change.

Comment From: oranagra

the only thing that's consistent in redis is that it's inconsistent. 8-) i think my problem with returning negative number is that existing code didn't expect it, so it may break something.

my initial thought was that i rather add a new flag to the existing API and avoid adding any new ones. it looks to me more in the spirit of the existing approach (more options in one function, no need to deprecate anything).

but considering the current API is bad for several reasons, i'm considering to replace / deprecate it, but i'm still uncomfortable with adding 4 variants (maybe even more some day) for each API.

Would like to hear opinions from @yossigo @guybe7 @MeirShpilraien

Comment From: MeirShpilraien

I agree this API should be deprecated but because of the return value, which @gavrie, if you check closely, the RM_HashSet returns the number of fields changed (so new fields are not counted), so if you set an entirely new hash you will get zero and you have no way to know if it failed or succeeded. @gavrie regarding your suggestion, I also do not like a lot of functions with different arguments that do the same but I see its very common in C so probably I am the problem. The real issue I see with it (and maybe I just missed it somewhere in your suggestion) is that it does not cover the Cstr fields with RedisModuleString values which we have in the current API, so we basically reduce the capabilities?

Comment From: gavrie

I agree this API should be deprecated but because of the return value, which @gavrie, if you check closely, the RM_HashSet returns the number of fields changed (so new fields are not counted), so if you set an entirely new hash you will get zero and you have no way to know if it failed or succeeded.

@MeirShpilraien good point, thanks!

@gavrie regarding your suggestion, I also do not like a lot of functions with different arguments that do the same but I see its very common in C so probably I am the problem.

That's C, not you... no overloading. It's even older than me by 2 years 😂...

The real issue I see with it (and maybe I just missed it somewhere in your suggestion) is that it does not cover the Cstr fields with RedisModuleString values which we have in the current API, so we basically reduce the capabilities?

I suggested two dimensions: - Vector (V) vs varargs (L) - C strings (C) vs RedisModuleString (no suffix)

So the VC and LC variants would have C fields with RMS values, as opposed to RMS fields with RMS values (Redis doesn't support a variant of C fields with C values so that's not needed).

Comment From: gavrie

Quoting from a random C man page:

int printf(const char * restrict format, ...);
int fprintf(FILE * restrict stream, const char * restrict format, ...);
int sprintf(char * restrict str, const char * restrict format, ...);
int snprintf(char * restrict str, size_t size, const char * restrict format, ...);
int asprintf(char **ret, const char *format, ...);
int dprintf(int fd, const char * restrict format, ...);
int vprintf(const char * restrict format, va_list ap);
int vfprintf(FILE * restrict stream, const char * restrict format, va_list ap);
int vsprintf(char * restrict str, const char * restrict format, va_list ap);
int vsnprintf(char * restrict str, size_t size, const char * restrict format, va_list ap);
int vasprintf(char **ret, const char *format, va_list ap);
int vdprintf(int fd, const char * restrict format, va_list ap);

Comment From: guybe7

TBH i don't really see the point in: 1. having a Redis Module API take any string other than RMS (maybe it could have some advantages, but it's against the API spirit) 2. having a SET/GET API being vararg - i'm guessing most users have an RMS array of fields/values anyway...

i think that we should 1. deprecate the existing API (mostly because of the return value issue) 2. replace it with an API that takes an array of RMS (the V option) only

Comment From: gavrie

@guybe7 From an API implementor's standpoint that may be correct. But from a user's (i.e. module developer's) perspective, the combination of varargs with C strings for the field names is extremely simple to use in many cases.

We could do some research by looking at the usage of these APIs on GitHub and see how people built the arguments.

Comment From: MeirShpilraien

@guybe7 we actually using the option of passing C string and not RMS

Comment From: guybe7

so i may ask "why shouldn't all APIs take take RMS/array should have a variant of Cstring/vararg?"

my point is that, even if having an API for every user is useful, it's a nightmare to maintain...

Comment From: oranagra

related issue: https://github.com/redis/redis/issues/6914

Comment From: gavrie

BTW, I just saw that even hiredis uses this convention:

int redisvAsyncCommand(redisAsyncContext *ac, redisCallbackFn *fn, void *privdata, const char *format, va_list ap);
int redisAsyncCommand(redisAsyncContext *ac, redisCallbackFn *fn, void *privdata, const char *format, ...);

Comment From: slavak

Keep in mind that taking a va_list argument is only useful for passing through a call from another varargs function. As far as a I know there's no (sane?) way of filling a va_list dynamically at runtime, so usage which requires a dynamic number of arguments known only at runtime is not enabled by such an API.

Comment From: zuiderkwast

RM_HashGet does one hash lookup per field, so there isn't any overhead (almost) if you just call RM_HashGet with single field at a time, repeatedly for each field. The same is true for RM_HashSet.

Therefore, I don't think adding any more variants of this function is necessary, really. The docs could simply clarify this by including an example with a loop where one field is fetched at a time.

@oranagra, @yossigo or @guybe7 does this make sense?

Comment From: tessykt

any news on this topic?

Comment From: slavak

@zuiderkwast The extra function call overhead over enough iterations might well make a difference.

Comment From: yassinemoukeddar

Any updates regarding this one?