**Bug description**
https://x.com/alexalbert__/status/1812921642143900036

In the post above, an Anthropic developer says: "Just add the header "anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15" to your API calls." Doing so doubles the maximum output tokens from 4096 to 8192.
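For reference, here is a minimal sketch of what such a raw API call could look like outside of Spring AI, using only the JDK's `java.net.http` classes. The header name, beta flag, and model name come from this issue; the endpoint URL, the `x-api-key` header, and the `anthropic-version` value are my assumptions about Anthropic's public Messages API, and the class and method names are made up for illustration. The request is only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class BetaHeaderSketch {

    // Beta flag quoted in the linked post.
    static final String BETA_FLAG = "max-tokens-3-5-sonnet-2024-07-15";

    // Builds (but does not send) a request that opts in to the larger output limit.
    // Assumption: endpoint, x-api-key and anthropic-version follow Anthropic's public API docs.
    static HttpRequest buildRequest(String apiKey, int maxTokens) {
        String body = "{"
                + "\"model\":\"claude-3-5-sonnet-20240620\","
                + "\"max_tokens\":" + maxTokens + ","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"Write a long outline.\"}]"
                + "}";
        return HttpRequest.newBuilder(URI.create("https://api.anthropic.com/v1/messages"))
                .header("x-api-key", apiKey)
                .header("anthropic-version", "2023-06-01")
                .header("anthropic-beta", BETA_FLAG)
                .header("content-type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("YOUR_API_KEY", 8192);
        // Print the headers so the anthropic-beta value can be checked by eye.
        System.out.println(request.headers().map());
    }
}
```

The point of the sketch is only that the beta flag travels as an ordinary request header, which is the piece that could not be configured through Spring AI at the time of this report.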

Spring AI doesn't seem to offer an option to add extra headers like this one.

Without that header, setting `spring.ai.anthropic.chat.options.max-tokens: 8192` causes an error:

```
org.springframework.ai.retry.NonTransientAiException: 400 - {"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 8192 > 4096, which is the maximum allowed number of output tokens for claude-3-5-sonnet-20240620"}}
    at org.springframework.ai.autoconfigure.retry.SpringAiRetryAutoConfiguration$2.handleError(SpringAiRetryAutoConfiguration.java:95) ~[spring-ai-spring-boot-autoconfigure-1.0.0-20240717.140712-339.jar:1.0.0-SNAPSHOT]
    at org.springframework.web.client.ResponseErrorHandler.handleError(ResponseErrorHandler.java:63) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.StatusHandler.lambda$fromErrorHandler$1(StatusHandler.java:71) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.StatusHandler.handle(StatusHandler.java:146) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.applyStatusHandlers(DefaultRestClient.java:698) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.DefaultRestClient.readWithMessageConverters(DefaultRestClient.java:200) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.readBody(DefaultRestClient.java:685) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toEntityInternal(DefaultRestClient.java:655) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.web.client.DefaultRestClient$DefaultResponseSpec.toEntity(DefaultRestClient.java:644) ~[spring-web-6.1.10.jar:6.1.10]
    at org.springframework.ai.anthropic.api.AnthropicApi.chatCompletionEntity(AnthropicApi.java:860) ~[spring-ai-anthropic-1.0.0-20240717.140712-345.jar:1.0.0-SNAPSHOT]
    at org.springframework.ai.anthropic.AnthropicChatModel.lambda$call$0(AnthropicChatModel.java:153) ~[spring-ai-anthropic-1.0.0-20240717.140712-345.jar:1.0.0-SNAPSHOT]
    at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:344) ~[spring-retry-2.0.6.jar:na]
    at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:217) ~[spring-retry-2.0.6.jar:na]
    at org.springframework.ai.anthropic.AnthropicChatModel.call(AnthropicChatModel.java:152) ~[spring-ai-anthropic-1.0.0-20240717.140712-345.jar:1.0.0-SNAPSHOT]
    at com.hooswhere.blog_agent.manager.AnthropicApiMgrImpl.generateOutline(AnthropicApiMgrImpl.java:29) ~[classes/:na]
    at com.hooswhere.blog_agent.temporal.activities.BlogPostActivitiesImpl.generateOutline(BlogPostActivitiesImpl.java:22) ~[classes/:na]
    at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
    at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
    at io.temporal.internal.activity.RootActivityInboundCallsInterceptor$POJOActivityInboundCallsInterceptor.executeActivity(RootActivityInboundCallsInterceptor.java:64) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.activity.RootActivityInboundCallsInterceptor.execute(RootActivityInboundCallsInterceptor.java:43) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.activity.ActivityTaskExecutors$BaseActivityTaskExecutor.execute(ActivityTaskExecutors.java:107) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.activity.ActivityTaskHandlerImpl.handle(ActivityTaskHandlerImpl.java:124) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handleActivity(ActivityWorker.java:278) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:243) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.worker.ActivityWorker$TaskHandlerImpl.handle(ActivityWorker.java:216) ~[temporal-sdk-1.24.1.jar:na]
    at io.temporal.internal.worker.PollTaskExecutor.lambda$process$0(PollTaskExecutor.java:105) ~[temporal-sdk-1.24.1.jar:na]
```


**Expected behavior**
Add the ability to set extra headers, and remove the default 4096 limit on max-tokens.

**Minimal Complete Reproducible example**
Set `spring.ai.anthropic.chat.options.max-tokens: 8192` and call the chatModel.


**Comment From: smaldd14**

Upon deeper dive, I do see that the `anthropic-beta` header is added as one of the default headers [here](https://github.com/spring-projects/spring-ai/blob/main/models/spring-ai-anthropic/src/main/java/org/springframework/ai/anthropic/api/AnthropicApi.java#L105)

I think a viable solution would be to add a `spring.ai.anthropic.beta-headers` configuration property where users can specify the value they want for the `anthropic-beta` header. It currently defaults to `tools-2024-04-04`, but newer values are now available, such as `max-tokens-3-5-sonnet-2024-07-15`, which raises the max-tokens limit, and `messages-2023-12-15`.

You may also need to change the cap on max-tokens from 4096 to 8192 when `max-tokens-3-5-sonnet-2024-07-15` is set.
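To make the suggestion about the cap concrete, here is a rough sketch of how a cap could depend on the beta header value. Everything in it is hypothetical: the class and method names are made up, this is not how Spring AI or the Anthropic API is implemented, and I am assuming the header may carry several comma-separated flags. The 4096 and 8192 figures are the ones from the error message and the linked post, and they apply to `claude-3-5-sonnet-20240620`.

```java
import java.util.Arrays;

public class MaxTokensCapSketch {

    // Beta flag from the linked post; the two limits are taken from this issue.
    static final String EXTENDED_OUTPUT_FLAG = "max-tokens-3-5-sonnet-2024-07-15";
    static final int DEFAULT_CAP = 4096;
    static final int EXTENDED_CAP = 8192;

    // Returns the output-token cap implied by the anthropic-beta header value.
    // Assumption: multiple beta flags are separated by commas.
    static int outputTokenCap(String betaHeader) {
        if (betaHeader == null) {
            return DEFAULT_CAP;
        }
        boolean extended = Arrays.stream(betaHeader.split(","))
                .map(String::trim)
                .anyMatch(EXTENDED_OUTPUT_FLAG::equals);
        return extended ? EXTENDED_CAP : DEFAULT_CAP;
    }

    // True when the requested max-tokens fits under the cap for the given header.
    static boolean isAllowed(int maxTokens, String betaHeader) {
        return maxTokens > 0 && maxTokens <= outputTokenCap(betaHeader);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed(8192, "tools-2024-04-04"));
        System.out.println(isAllowed(8192, EXTENDED_OUTPUT_FLAG));
    }
}
```

Note that the 400 response in the stack trace above is returned by the API itself, so a client-side check like this would only be a convenience for failing earlier with a clearer message.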

**Comment From: tzolov**

Thanks for bringing this up, @smaldd14. Let me see what I can do.

**Comment From: smaldd14**

> Thanks for bringing this up @smaldd14 , let me see what i can do.

@tzolov Great! Thanks for the quick response! Let me know if I can help at all.

**Comment From: tzolov**

@smaldd14 I believe this should do the trick: https://github.com/spring-projects/spring-ai/pull/1076 ?
Check the `callWith8KResponseContext()` test for an example.

**Comment From: tzolov**

Merged. Documentation is updated as well
<img width="1142" alt="docs" src="https://github.com/user-attachments/assets/d742f773-6b52-4653-8766-f6f35ce2572c">



**Comment From: smaldd14**

@tzolov Thanks for the quick resolution, I am trying it out now. Do I need to update `spring.ai.bom.version` to pick up these changes?

Or does a new jar for `spring-ai-anthropic-1.0.0-20240718.183133-351.jar` need to be released?

**Comment From: smaldd14**

Following up on this:

The fix landed in the `spring-ai-anthropic-1.0.0-20240725.162755-379` build. Thanks for the quick fix and deploy, @tzolov!

I noticed that you have to include both settings:

```
spring.ai:
  anthropic:
    beta-version: max-tokens-3-5-sonnet-2024-07-15
    chat.options:
      max-tokens: 8192
```

For those wondering: if you do not set max-tokens explicitly, it seems to fall back to the default of 500.

Thanks again for the fix!