Hey team,
Could you add support for cached_tokens in the Usage model for OpenAI responses?
The new response structure from OpenAI looks like this:
"usage": {
"prompt_tokens": 2006,
"completion_tokens": 300,
"total_tokens": 2306,
"prompt_tokens_details": {
"cached_tokens": 1920
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
For more details, you can refer to the official OpenAI documentation on prompt caching: OpenAI Prompt Caching Guide
This would be a valuable addition, since it makes token usage tracking accurate for requests that hit the prompt cache.
Thanks!