Spring Cloud Netflix Ribbon: RoundRobinRule does not work in some cases

Describe the bug: when spring-retry is used together with Ribbon and the provider has only two instances, the consumer always calls the same instance if the response status code is 500.

Sample

provider:
  ribbon:
    NFLoadBalancerRuleClassName: com.netflix.loadbalancer.RoundRobinRule
    MaxAutoRetries: 1
    MaxAutoRetriesNextServer: 0
    OkToRetryOnAllOperations: true
    retryableStatusCodes: 500

This only happens when there are exactly two provider instances and every request to the provider returns status code 500.

The relevant code is in org.springframework.cloud.netflix.ribbon.RibbonLoadBalancedRetryPolicy:

    public boolean canRetrySameServer(LoadBalancedRetryContext context) {
        return this.sameServerCount < this.lbContext.getRetryHandler().getMaxRetriesOnSameServer() && this.canRetry(context);
    }

    public boolean canRetryNextServer(LoadBalancedRetryContext context) {
        return this.nextServerCount <= this.lbContext.getRetryHandler().getMaxRetriesOnNextServer() && this.canRetry(context);
    }

    public void registerThrowable(LoadBalancedRetryContext context, Throwable throwable) {
        if (this.lbContext.getRetryHandler().isCircuitTrippingException(throwable)) {
            this.updateServerInstanceStats(context);
        }

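        /*
        Note:
        After the last same-server retry, choose() is called here.
        If the current retry called instance1, this choose() selects instance2,
        but with MaxAutoRetriesNextServer = 0 the retry is exhausted right after, so instance2 is never called.
        The next request's initial choose() therefore selects instance1 again, and after its retry
        this choose() selects instance2 again.
        The cycle repeats, so every request goes to instance1.
        */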
        if (!this.canRetrySameServer(context) && this.canRetryNextServer(context)) {
            context.setServiceInstance(this.loadBalanceChooser.choose(this.serviceId));
        }

        if (this.sameServerCount >= this.lbContext.getRetryHandler().getMaxRetriesOnSameServer() && this.canRetry(context)) {
            this.sameServerCount = 0;
            ++this.nextServerCount;
            if (!this.canRetryNextServer(context)) {
                context.setExhaustedOnly();
            }
        } else {
            ++this.sameServerCount;
        }

    }

So why is the check in canRetryNextServer this.nextServerCount <= this.lbContext.getRetryHandler().getMaxRetriesOnNextServer() instead of this.nextServerCount < this.lbContext.getRetryHandler().getMaxRetriesOnNextServer()?
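
The loop described in the comment inside registerThrowable can be reproduced with a small stand-alone model. This is only a sketch under assumptions: the class RetryChooseSimulation, its choose() counter and the simulateRequest() helper are hypothetical and do not use any Spring Cloud or Ribbon classes. It mirrors the counter logic quoted above with MaxAutoRetries = 1 and MaxAutoRetriesNextServer = 0, and assumes every call returns 500.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model of the retry counters; not the real Spring Cloud classes.
public class RetryChooseSimulation {
    static final String[] INSTANCES = {"instance1", "instance2"};
    static final int MAX_SAME = 1; // MaxAutoRetries
    static final int MAX_NEXT = 0; // MaxAutoRetriesNextServer
    static int rrIndex = 0;        // shared round-robin position

    // Simplified round-robin choose(): every call consumes one slot.
    static String choose() {
        return INSTANCES[rrIndex++ % INSTANCES.length];
    }

    // Simulates one consumer request where every call fails with status 500.
    // Returns the instances that were actually called, in order.
    static List<String> simulateRequest() {
        List<String> called = new ArrayList<>();
        int sameServerCount = 0;
        int nextServerCount = 0;
        boolean exhausted = false;
        String current = choose(); // initial choose() before the first attempt

        while (!exhausted) {
            called.add(current); // the call happens and fails

            // Model of registerThrowable(...)
            boolean canRetrySame = sameServerCount < MAX_SAME;
            boolean canRetryNext = nextServerCount <= MAX_NEXT;
            if (!canRetrySame && canRetryNext) {
                current = choose(); // consumes a slot even if no call follows
            }
            if (sameServerCount >= MAX_SAME) {
                sameServerCount = 0;
                nextServerCount++;
                if (nextServerCount > MAX_NEXT) {
                    exhausted = true; // setExhaustedOnly()
                }
            } else {
                sameServerCount++;
            }
        }
        return called;
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " -> " + simulateRequest());
        }
    }
}
```

In this model each request consumes two round-robin slots (the initial choose() and the one inside registerThrowable) but only ever calls the instance returned by the first one, so with two instances every request lands on instance1.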

Then I found this in spring-cloud-commons, org.springframework.cloud.client.loadbalancer.InterceptorRetryPolicy:

    // after the first attempt, canRetry depends on canRetryNextServer
    public boolean canRetry(RetryContext context) {
        LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext)context;
        if (lbContext.getRetryCount() == 0 && lbContext.getServiceInstance() == null) {
            lbContext.setServiceInstance(this.serviceInstanceChooser.choose(this.serviceName));
            return true;
        } else {
            return this.policy.canRetryNextServer(lbContext);
        }
    }

Could it be changed like this?

    public boolean canRetry(RetryContext context) {
        LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext)context;
        if (lbContext.getRetryCount() == 0 && lbContext.getServiceInstance() == null) {
            lbContext.setServiceInstance(this.serviceInstanceChooser.choose(this.serviceName));
            return true;
        } else if (lbContext.getRetryCount() < lbContext.getRetryHandler().getMaxRetriesOnSameServer()) {
            return true;
        } else {
            // if MaxAutoRetriesNextServer is 0, return false
            return this.policy.canRetryNextServer(lbContext);
        }
    }

Comment From: OlgaMaciaszek

Please provide a minimal, complete, verifiable example that reproduces the issue (can be a separate project GH link or a test added on a branch).

Comment From: twogoods

@OlgaMaciaszek

Please provide a minimal, complete, verifiable example that reproduces the issue (can be a separate project GH link or a test added on a branch).

Sample project: https://github.com/twogoods/ribbon-retry-sample

Comment From: OlgaMaciaszek

@twogoods I am really sorry for not getting back to you earlier. We have decided to discontinue the Hoxton release train and Ribbon support, and before that we were doing critical-issue maintenance only (the relevant decisions and schedules were published on our blog), so the issue will not be addressed.