Describe the bug

SUBSTR returns an empty string when end is less than start. However, if start is 0, the first character is returned.

To reproduce

> set test cat
OK
> substr test 1 -500
""
> substr test 0 -500
"c"

Expected behavior

If end < start, SUBSTR should return an empty string.

Additional information

Redis server v=7.0.8 sha=00000000:0 malloc=jemalloc-5.2.1 bits=64 build=8b9bd5cdb53a6549

Comment From: enjoy-binbin

Not sure if that's expected, but it does behave a little odd.

127.0.0.1:6379> set key abc
OK
127.0.0.1:6379> getrange key -100 -100
"a"
127.0.0.1:6379> getrange key -100 -101
""
127.0.0.1:6379> getrange key -100 -99
"a"

the reason is that: - the "" is returned ASAP when (start < 0 && end < 0 && start > end) - the other cases the start and the end became 0 at the last, so it behave just like getrange key 0 0

    /* Convert negative indexes */
    if (start < 0 && end < 0 && start > end) {
        addReply(c,shared.emptybulk);
        return;
    }
    if (start < 0) start = strlen+start;
    if (end < 0) end = strlen+end;
    if (start < 0) start = 0;
    if (end < 0) end = 0;
    if ((unsigned long long)end >= strlen) end = strlen-1;

@oranagra please take a look and make a call

Comment From: oranagra

let's improve that, but since it'll be a potentially breaking change, we can handle that only in 8.0. p.s. many of these complications are probably because end is inclusive. if it wasn't then the other cases would return an empty string as well, so maybe it could be easier to convert the end to non-inclusive at the beginning of the command, and then the rest of the code would be easier with less surprises.

Comment From: jimsnab

Maybe?

    /* Convert negative indexes */
    if (start < 0) start = strlen+start;
    if (end < 0) end = strlen+end;

    /* Bounds enforcement */
    if (start < 0) start = 0;
    if ((unsigned long long)start >= strlen || end < start) {
        addReply(c,shared.emptybulk);
        return;
    }

    if ((unsigned long long)end >= strlen) end = strlen-1;

Comment From: yashLadha

One more thing to add here, as per my understanding if the end is a negative integer, it should work like a backward traversal.

127.0.0.1:6379> set test cat
OK
127.0.0.1:6379> substr test 0 -100
"c"
127.0.0.1:6379> substr test -1 -100
""
127.0.0.1:6379> substr test 0 0
"c"
127.0.0.1:6379> substr test 2 -1
"t"
127.0.0.1:6379> substr test 2 -2
""
127.0.0.1:6379> 

I think this case also should be added. Open to other views.

Comment From: slavak

Seems to me it should be sufficient to move the condition for start > end to before clamping both to the valid index range. So instead of:

if (start < 0) start = strlen+start;
if (end < 0) end = strlen+end;
if (start < 0) start = 0;
if (end < 0) end = 0;
if ((unsigned long long)end >= strlen) end = strlen-1;

/* Precondition: end >= 0 && end < strlen, so the only condition where
 * nothing can be returned is: start > end. */
 if (start > end || strlen == 0) {
     addReply(c,shared.emptybulk);
} else { ...

it would become:

if (start < 0) start = strlen+start;
if (end < 0) end = strlen+end;

if (start > end || strlen == 0) {
    addReply(c,shared.emptybulk);
}

if (start < 0) start = 0;
if (end < 0) end = 0;
if ((unsigned long long)end >= strlen) end = strlen-1;

This seems to work correctly, except it behaves a little strangely in an interactive redis-cli session:

127.0.0.1:6379> SET s REDIS
OK
127.0.0.1:6379> GETRANGE s 0 -500
""
127.0.0.1:6379> GETRANGE s 0 -500
"R"
127.0.0.1:6379> GETRANGE s 0 -500
""
127.0.0.1:6379> GETRANGE s 0 -500
"R"

I'm not familiar with redis-cli and hiredis, which it depends on, so I don't quite understand why this is happening yet.

Comment From: linzihao1999

Hi @enjoy-binbin

I am learning for Redis, and made a change( #12272 ) for this issue. Please give any ideas when you have time. Thanks.