This PR defines a new, strongly-typed API in Spring AI for capturing AI metadata and metrics sent in an AI response to a Prompt from an AI provider's (REST) API.
This new API includes both AI model usage metrics, such as Prompt and Generation (completion) token counts, and AI provider access metrics, such as rate limits for both requests and tokens.
High-level feature additions in this PR include, but are not limited to:
- New AiMetadata, RateLimit and Usage interfaces making up the API. AiResponse now includes (optional) AiMetadata (AiMetadata.EMPTY by default).
- Implementation of the new API with OpenAI.
- Includes a new method of testing AI provider REST API endpoints using OkHttp3 MockWebServer, Spring MockMvc and a test-class-specific @RestController to mock the AI provider's API for testing purposes.
For example, you can now do something like the following:
AiResponse response = aiClient.generate(prompt);
// process the AI's response (such as chat completion)
AiMetadata metadata = response.getMetadata();
long totalTokenCount = metadata.getUsage().getTotalTokens();
// do something responsible with this information
To see a complete example, have a look at the test.
In this API, I preferred strongly-typed objects (for example, AiMetadata) over storing key-values in the Map<String, Object> objects present in the AiResponse and Generation classes, since this provides 1) type safety, 2) easier, more descriptive, programmatic access that allows for things like type conversion, encoding/decoding, etc., and 3) more immediately apparent metadata available from an AI provider that is uniformly accessible from Spring AI.
While this API may be more restrictive, or only capable of supporting the lowest-common denominator (LCD), we can always include support for free-form metadata, such as in the following example, which is not uncommon in Spring when you consider the PropertyResolver API, for instance:
aiMetadata.getPropertyAs("propertyName", SomeType.class);
Subclassing will also give users the ability to access AI provider-specific metadata.
In short, it really should not matter to the Spring AI developer whether metadata is stored internally in a Map<?, ?>, or by some other means.
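As a minimal sketch of that idea, the following shows a typed accessor and a free-form accessor sharing the same internal Map. The class name MapBackedMetadata, the "total_tokens" key, and the Optional return type are assumptions made for illustration; they are not taken from this PR. The free-form method is loosely modeled on Spring's PropertyResolver.getProperty(String, Class).

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not a class from this PR.
final class MapBackedMetadata {

    private final Map<String, Object> properties;

    MapBackedMetadata(Map<String, Object> properties) {
        // Defensive, immutable copy; the storage detail stays hidden from callers.
        this.properties = Map.copyOf(properties);
    }

    // Strongly-typed accessor for a well-known value ("total_tokens" is an assumed key).
    long getTotalTokens() {
        return getPropertyAs("total_tokens", Long.class).orElse(0L);
    }

    // Free-form accessor: returns the value only if it is present and of the requested type.
    <T> Optional<T> getPropertyAs(String name, Class<T> type) {
        Object value = properties.get(name);
        return type.isInstance(value) ? Optional.of(type.cast(value)) : Optional.empty();
    }
}
```

Callers that only use getTotalTokens() never see the Map, so the backing store could later be swapped without changing their code.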
TODO:
- Upon initial review and discussion with both @markpollack and @tzolov, I recommend this feature be integrated and conditionally enabled based on a Spring property (for example: spring.ai.openai.metadata.capture-enabled). Spring Boot's auto-configuration (by property using @ConditionalOnProperty) can help in this regard. - DONE
- Create other implementations of the AI metadata interfaces: Azure OpenAI, HuggingFace, etc.
- Further exploration and enhancements could include integration with and exposing this AI metadata in Spring Boot Actuator.
- In addition, there may be clearer integration points directly with Micrometer as well.
Comment From: jxblum
Note, I made the commits granular in this PR so that 1) the changes were easier to combine or remove as necessary (Spring Boot style) and 2) so that you could follow the progression of development (thinking, direction) in this new feature.
Comment From: jxblum
I also think there is additional room for improvement on this initial implementation. For example. These can be addressed iteratively.
Comment From: jxblum
The information (metadata) pulled from an AI response to an AI request (Prompt) using OpenAI's API comes from:
1) The Chat Completion object.
2) OpenAI's documentation on Rate Limits.
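To illustrate the rate-limit side, here is a rough sketch of reading the numeric x-ratelimit-* headers that OpenAI documents into a small value object. RateLimitSnapshot and its methods are hypothetical names for illustration and are not the RateLimit interface from this PR; the reset headers (duration strings) are left out to keep the sketch short.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical value type for illustration; not the RateLimit interface from this PR.
final class RateLimitSnapshot {

    private final long requestsLimit;
    private final long requestsRemaining;
    private final long tokensLimit;
    private final long tokensRemaining;

    private RateLimitSnapshot(long requestsLimit, long requestsRemaining,
                              long tokensLimit, long tokensRemaining) {
        this.requestsLimit = requestsLimit;
        this.requestsRemaining = requestsRemaining;
        this.tokensLimit = tokensLimit;
        this.tokensRemaining = tokensRemaining;
    }

    static RateLimitSnapshot fromHeaders(Map<String, String> headers) {
        // HTTP header names are case-insensitive, so copy into a case-insensitive map first.
        Map<String, String> byName = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        byName.putAll(headers);
        return new RateLimitSnapshot(
                parse(byName, "x-ratelimit-limit-requests"),
                parse(byName, "x-ratelimit-remaining-requests"),
                parse(byName, "x-ratelimit-limit-tokens"),
                parse(byName, "x-ratelimit-remaining-tokens"));
    }

    // Missing or malformed headers fall back to 0 rather than failing the whole response.
    private static long parse(Map<String, String> headers, String name) {
        String value = headers.get(name);
        if (value == null) {
            return 0L;
        }
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    long getRequestsLimit() { return requestsLimit; }
    long getRequestsRemaining() { return requestsRemaining; }
    long getTokensLimit() { return tokensLimit; }
    long getTokensRemaining() { return tokensRemaining; }
}
```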
Comment From: markpollack
Note, I made the commits granular in this PR so that ...
Raising the bar! My PRs are typically a mess.
Comment From: jxblum
Note, I made the commits granular in this PR so that ...
Raising the bar! My PRs are typically a mess.
Thank you @markpollack. I appreciate your feedback and review on this PR.
I addressed most of your concerns and feedback across a few new commits already. Specifically, I did the following:
- Renamed AiMetadata to GenerationMetadata.
- Repackaged the AI metadata under org.springframework.ai.metadata.
- Implemented the NULL Object pattern for RateLimit and Usage interfaces and returned the NULL value objects from GenerationMetadata instead of throwing an IllegalStateException.
- Edited the Javadoc and documentation to match the API changes.
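For readers unfamiliar with the NULL Object pattern mentioned above, a simplified sketch follows. The method names on Usage and the SimpleGenerationMetadata class are assumptions for illustration; the actual interfaces in this PR may differ.

```java
// Hypothetical, simplified shape; the Usage interface in this PR may differ.
interface Usage {

    // NULL object: a safe, do-nothing implementation returned when no usage data exists.
    Usage NULL = new Usage() {
        @Override
        public long getPromptTokens() {
            return 0L;
        }

        @Override
        public long getGenerationTokens() {
            return 0L;
        }
    };

    long getPromptTokens();

    long getGenerationTokens();

    default long getTotalTokens() {
        return getPromptTokens() + getGenerationTokens();
    }
}

// Hypothetical holder showing the fallback instead of throwing IllegalStateException.
final class SimpleGenerationMetadata {

    private final Usage usage;

    SimpleGenerationMetadata(Usage usage) {
        this.usage = usage;
    }

    Usage getUsage() {
        return usage != null ? usage : Usage.NULL;
    }
}
```

With this shape, callers can chain metadata.getUsage().getTotalTokens() without null checks or exception handling.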
I am going to continue by building an implementation of the AI metadata API for Azure OpenAI and possibly Hugging Face.
Comment From: jxblum
I completed an initial implementation of the AI metadata API for Microsoft Azure OpenAI Service.
Additionally, I rebased this PR on the latest changes from main so that this PR remains in a buildable and shippable state.
Comment From: markpollack
Cool.
The docs on https://learn.microsoft.com/en-us/java/api/com.azure.ai.openai.models.chatcompletions?view=azure-java-preview
has a List<PromptFilterResults> that pairs with the List<ChatChoice>. Maybe this goes into GenerationMetadata per Generation instance as a JSON string or something, since the object structure is quite involved.
Comment From: jxblum
Cool.
The docs on https://learn.microsoft.com/en-us/java/api/com.azure.ai.openai.models.chatcompletions?view=azure-java-preview
has a List<PromptFilterResults> that pairs with the List<ChatChoice>. Maybe this goes into GenerationMetadata per Generation instance as a JSON string or something, since the object structure is quite involved.
I have been exploring the Microsoft Azure OpenAI documentation and source further, and the List of PromptFilterResult is technically tied to the ChatCompletions (i.e. final result of the AI request), which also contains the List of ChatChoice.
I am guessing the PromptFilterResults are not 1-for-1 with ChatChoices. For instance, a single prompt could generate 1 or more choices from the AI, which is configurable using the ChatCompletionsOptions.n option (accessible from getN()), along with some other settings in ChatCompletionsOptions, I think (e.g. stop, perhaps, and likely even maxTokens, where the completion cannot terminate normally, either because the user has exhausted their token limit or reached the max).
I am starting to form some ideas around the model structure in Spring AI to capture and represent this data, again preferring strongly-typed Objects and leveraging the NULL Object pattern as you nicely suggested in this review.
Stay tuned. :)
Comment From: jxblum
While Microsoft Azure's OpenAI Service is much more robust in capturing metadata, it does not (thankfully) embed metadata in HTTP headers, unlike OpenAI.
I wrote an example application using Microsoft's OpenAI lib (and API) directly to peek at the HTTP headers coming back in the AI response (i.e. Completion) when issuing a Prompt (AI request). I wrote an Adapter for MS's HttpClient to intercept the HTTP request/response.
So, it seems most of the [meta]data is captured in ChatCompletions, and the lists of PromptFilterResult and ChatChoice objects, along with ContentFilterResult and so on.
OpenAI using Theo Kanning's Java lib (reference) is very limited by comparison.
Comment From: jxblum
Regarding my statement:
I am guessing the PromptFilterResults are not a 1-for-1 with ChatChoices.
I think this summarizes it nicely.
Specifically, in the Completions class, public setN(:Integer) method:
"Set the n property: The number of completions choices that should be generated per provided prompt as part of an overall completions response. Because this setting can generate many completions, it may quickly consume your token quota. Use carefully and ensure reasonable settings for max_tokens and stop."
Comment From: jxblum
I tweaked my example (MS Azure OpenAI) application, and given the following Prompt:
Give me 2 Java learning references.
Give me 2 Kotlin learning references.
I additionally set MAX TOKENS to 1000 and MAX COMPLETION CHOICES to 5.
The result was:
HTTP status code [200]
HTTP header [Cache-Control] is [no-cache, must-revalidate]
HTTP header [Content-Length] is [9499]
HTTP header [Content-Type] is [application/json]
HTTP header [access-control-allow-origin] is [*]
HTTP header [apim-request-id] is [4e6d1170-5bdc-4748-9ef4-5b3cfd8b4b68]
HTTP header [Strict-Transport-Security] is [max-age=31536000; includeSubDomains; preload]
HTTP header [x-accel-buffering] is [no]
HTTP header [x-request-id] is [ac8dc299-69e9-490f-9227-d10cec8c198d]
HTTP header [x-ms-client-request-id] is [e0dc04cb-fa95-4e75-9ebc-2e3a298cf5d7]
HTTP header [x-content-type-options] is [nosniff]
HTTP header [x-ms-region] is [East US]
HTTP header [Date] is [Mon, 20 Nov 2023 22:51:48 GMT]
Completions ID [cmpl-8N7PGf6TpNPSSUB24wBGsuWVqx5OS] completed at [2023-Nov-20 22:51:46]
JSON [{
"choices" : [ {
"finish_reason" : "stop",
"index" : 0,
"logprobs" : null,
"text" : "\n\n1. \"Java: A Beginner's Guide\" by Herbert Schildt - This book provides a comprehensive introduction to Java, covering everything from basic syntax to advanced topics such as multithreading and networking. It also includes practice exercises and quizzes to reinforce learning.\n\n2. Codeacademy's Java Course - This online course offers a hands-on approach to learning Java through interactive tutorials and practical projects. It covers topics such as variables, control flow, arrays, and object-oriented programming. It also has a community forum where learners can ask questions and get help from others.",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 1,
"logprobs" : null,
"text" : "\n1. Java Tutorials on Oracle: This is the official resource for learning Java developed by Oracle. It covers all the basics of Java programming, from beginner concepts to advanced topics such as Java Collections, Multithreading, and GUI development. The tutorials also include practical examples and exercises to reinforce your learning. You can access the Java Tutorials for free on the Oracle website.\n\n2. Head First Java by Kathy Sierra and Bert Bates: This book is a popular choice for beginners learning Java. It provides a fun and interactive approach to learning Java, using visuals, puzzles, and real-world examples. It covers all the core concepts of Java, including object-oriented programming, data types, control structures, and more. The book also includes hands-on exercises and projects to help you apply what you've learned. ",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 2,
"logprobs" : null,
"text" : "\n\n1. \"Head First Java\" by Kathy Sierra and Bert Bates - This book is highly recommended for beginners as it covers the fundamentals of Java in a fun and interactive way. It also includes exercises, puzzles, and real-world examples to reinforce learning.\n\n2. \"Java: A Beginner's Guide\" by Herbert Schildt - This comprehensive book covers all the basics of Java programming and also delves into advanced topics such as multithreading and network programming. It also includes helpful quizzes, self-assessment questions, and practice exercises.",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 3,
"logprobs" : null,
"text" : "\n\n1. Codeacademy's Java Course: Codeacademy is a popular online platform for learning coding languages. Their Java course is interactive and beginner-friendly, making it a great resource for beginners. It covers basic concepts, data types, control flow, and object-oriented programming. \nLink: https://www.codecademy.com/learn/learn-java\n\n2. Oracle's Java Tutorials: Oracle, the company that developed Java, offers a comprehensive set of tutorials for learning Java. It covers everything from basic syntax to advanced topics like multi-threading and networking. The tutorials are well-organized and include code examples to help you understand the concepts better. \nLink: https://docs.oracle.com/javase/tutorial/",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 4,
"logprobs" : null,
"text" : "\n\n1. \"Head First Java\" by Kathy Sierra and Bert Bates - This book is highly recommended for beginners as it uses a unique approach to teach Java through fun and engaging visuals, puzzles, and exercises.\n\n2. \"Java: A Beginner's Guide\" by Herbert Schildt - This comprehensive and well-structured guide covers all the basics of Java programming and includes exercises, quizzes, and real-world examples to help readers grasp the concepts effectively. ",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 5,
"logprobs" : null,
"text" : "\n\n1. \"Kotlin in Action\" by Dmitry Jemerov and Svetlana Isakova - This book is considered the definitive guide to learning Kotlin. It covers all the major features of the language, as well as best practices and real-world examples.\n\n2. \"Official Kotlin Documentation\" - This is the official website for Kotlin and includes comprehensive documentation, tutorials, and guides for learning the language. It also has a section for frequently asked questions and a forum for community support. ",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 6,
"logprobs" : null,
"text" : "\n\n1. \"Kotlin in Action\" book by Dmitry Jemerov and Svetlana Isakova - This book is a comprehensive guide to learning Kotlin, covering all the major aspects of the language in a clear and concise manner.\n\n2. \"Kotlin for Android Developers: Learn Kotlin the easy way while developing an Android App\" course on Udemy - This course is specifically designed for Android developers who want to learn Kotlin. It covers all the basics of the language and how to use it for Android app development. ",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 7,
"logprobs" : null,
"text" : "\n1. \"Kotlin in Action\" by Dmitry Jemerov and Svetlana Isakova - This book is the official guide to Kotlin, written by the creators of the language. It covers all aspects of the language, from basic syntax to advanced features, and includes hands-on exercises and examples.\n\n2. \"Kotlin Bootcamp for Programmers\" by Google - This free online course, created by Google in partnership with Udacity, is designed for programmers with some prior experience who want to learn Kotlin. It covers the basics of the language, as well as best practices and common use cases. The course also includes interactive coding exercises and quizzes.",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 8,
"logprobs" : null,
"text" : "\n\n1. \"Kotlin in Action\" by Dmitry Jemerov and Svetlana Isakova - This book is widely recognized as the go-to resource for learning Kotlin. It covers all the key features of the language and provides practical examples and exercises to help readers grasp concepts easily.\n\n2. \"Kotlin for Android Developers\" by Antonio Leiva - This book focuses on using Kotlin for Android app development. It covers the basics of the language and then dives into advanced topics such as coroutines, functional programming, and reactive programming. It also includes real-world examples and projects to help readers apply their knowledge. ",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"finish_reason" : "stop",
"index" : 9,
"logprobs" : null,
"text" : "\n\n1. Official Kotlin Documentation: This is the most comprehensive and up-to-date resource for learning Kotlin. It covers everything from basic syntax to advanced features, and also includes tutorials and code examples. You can access it at https://kotlinlang.org/docs/home.html.\n\n2. Kotlin for Android Developers: This book by Antonio Leiva is a great resource for learning Kotlin specifically for Android development. It covers the basics of the language and how to use it to build Android apps. It also includes real-world examples and best practices. You can find it on Amazon or other online bookstores.",
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
} ],
"created" : 1700520706.000000000,
"id" : "cmpl-8N7PGf6TpNPSSUB24wBGsuWVqx5OS",
"usage" : {
"completion_tokens" : 1202,
"prompt_tokens" : 16,
"total_tokens" : 1218
},
"prompt_annotations" : null,
"prompt_filter_results" : [ {
"prompt_index" : 0,
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
}, {
"prompt_index" : 1,
"content_filter_results" : {
"error" : null,
"hate" : {
"filtered" : false,
"severity" : "safe"
},
"self_harm" : {
"filtered" : false,
"severity" : "safe"
},
"sexual" : {
"filtered" : false,
"severity" : "safe"
},
"violence" : {
"filtered" : false,
"severity" : "safe"
}
}
} ]
}]
Prompt Tokens Used [16]
Completion Tokens Used [1202]
Total Tokens Used [1218]
Either my prompt was terrible or the AI does not know how to listen very well.
That was 10 choices with 1218 tokens.
I suppose 10 choices is in line with 2 prompts multiplied by 5 choices. I was under the impression that I was giving it 5 choices "total". I guess not.
But, the tokens, hmm???
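One possible reading of these numbers, sketched below, assumes that n applies to each prompt and that the max tokens setting caps each completion choice individually rather than the request as a whole. That matches the setN Javadoc quoted earlier and the general OpenAI description of max_tokens, but it has not been verified against Azure's behavior here. CompletionBudget is a hypothetical helper for illustration.

```java
// Hypothetical helper for illustration; the per-choice cap is an assumption.
final class CompletionBudget {

    // Number of choices returned when n applies to each prompt in the request.
    static int expectedChoices(int prompts, int n) {
        return prompts * n;
    }

    // Upper bound on completion tokens if maxTokens caps each choice individually.
    static long maxCompletionTokens(int prompts, int n, int maxTokensPerChoice) {
        return (long) prompts * n * maxTokensPerChoice;
    }

    public static void main(String[] args) {
        System.out.println(expectedChoices(2, 5));           // 10 choices, as observed
        System.out.println(maxCompletionTokens(2, 5, 1000)); // 10000 token upper bound
        System.out.println(16 + 1202);                       // 1218, the reported total
    }
}
```

Under that assumption, 1202 completion tokens across 10 choices is well within the limit, since each choice used far fewer than 1000 tokens.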
Comment From: jxblum
FTR, my CompletionsOptions was constructed as:
CompletionsOptions options = newCompletionsOptions(
"Give me 2 Java learning references.",
"Give me 2 Kotlin learning references."
);
// ...
CompletionsOptions newCompletionsOptions(List<String> prompt) {
return new CompletionsOptions(prompt)
.setMaxTokens(AI_MAX_TOKENS_PER_REQUEST)
.setModel(AZURE_OPENAI_MODEL)
.setN(AI_MAX_COMPLETION_CHOICES)
.setTemperature(AI_TEMPERATURE);
}
Where the constants are defined as:
private static final int AI_MAX_COMPLETION_CHOICES = 5;
private static final int AI_MAX_TOKENS_PER_REQUEST = 1000;
private static final double AI_TEMPERATURE = 0.75d;
protected static final String AZURE_OPENAI_MODEL = "gpt-35-turbo-instruct";
Comment From: markpollack
removed some of the older fields that were intended to capture metadata about requests.
merged as 37a4884c51eab94b328b8af870a94eb4f1e4fd03