Bug description
The ChatClient
API makes it possible to pass system and user messages via the system()
and user()
clauses.
In that case, the final List<Message>
passed to the Prompt
contains the following:
SystemMessage
built fromsystem()
UserMessage
built fromuser()
The API also allows passing a list of messages directly via messages()
.
In that case, the final List<Message>
passed to the Prompt
contains the following:
List<Message>
built frommessages()
SystemMessage
built fromsystem()
UserMessage
built fromuser()
.
There is inconsistency in the two scenarios about where will the SystemMessage
built from system()
end up in the chat history.
Now, imagine using the few-shot prompting strategy and using the messages()
clause to pass the few-shot examples (list of UserMessage
and AssistantMessage
pairs). In that case, the SystemMessage
built from system()
ends up at the bottom of the list, which makes the few-shot prompting strategy not working in many cases due to the wrong position of the SystemMessage
.
A possible workaround is to pass the desired SystemMessage
via messages()
together with the few-shot examples, perhaps even failing the ChatClient call request if both messages()
and system()
+user()
are defined, but that would limit the convenience of the API.
Environment * Spring AI 1.0.0-SNAPSHOT * Java 22
Steps to reproduce
Single system message:
var content = chatClient.prompt()
.system("System text")
.messages(
new UserMessage("My question"),
new AssistantMessage("Your answer")
)
.call().content();
Multiple system messages:
var content = chatClient.prompt()
.system("System text")
.messages(
new SystemMessage("Historical system text"),
new UserMessage("My question"),
new AssistantMessage("Your answer")
)
.call().content();
Expected behavior
I would expect the use of system()
to result in a SystemMessage
always placed on the top of the message list, unless another one already exists passed via messages()
.
When messages()
is used and it does NOT include any SystemMessage
, I expect the following List<Message>
passed to the Prompt
:
SystemMessage
built fromsystem()
List<Message>
built frommessages()
UserMessage
built fromuser()
.
When messages()
is used and it DOES include any SystemMessage
, I expect the following List<Message>
passed to the Prompt
:
List<Message>
built frommessages()
(including one or moreSystemMessage
)SystemMessage
built fromsystem()
UserMessage
built fromuser()
.
There's room for introducing more structured support for few-shot prompting via the Advisor
API. I'm working on a few proposals in that direction, but this issue of the SystemMessage
might need fixing first.
I have a PR ready for implementing what described above, but I'm not 100% sure it's a good expected behaviour. It might be worth considering this issue in more general terms, including other common prompting strategies and the handling of the chat memory.
@tzolov what do you think? I'd be happy to discuss further about it and perhaps share a few experiments I've been working one.
Comment From: iAMSagar44
@ThomasVitale - Thanks for raising this issue. I have noticed inconsistencies in the LLM response when using the technique of 'few shot prompts' and massing the user and assistant messages in the chat client request (using the messages()
clause to pass the few-shot examples (list of UserMessage
and AssistantMessage
pairs). And this could be due to the placement of the defaultSystemMessage
in the list.
Due to this issue I had to remove the messages() list and not use few shot prompts. Instead I provided some examples in the system message prompt template - something similar to the example provided here.
Have you been able to successfully find a different workaround to resolve this issue?
Another question I had is when using the system message in the chat client builder, is the system message sent to the LLM during every user query or is it set once for the user session?
Comment From: markpollack
Regarding
When messages() is used and it DOES include any SystemMessage, I expect the following List
- List
built from messages() (including one or more SystemMessage) - SystemMessage built from system()
- UserMessage built from user().
What if we restrict any SystemMessage from being passed in via the method messages
, we throw an exception. I originally thought 'last one to specify wins' but that would still make understanding the code hard in terms of infering behavior - "principle of least surprise"
Comment From: ThomasVitale
Restricting the messages()
clause from supporting SystemMessage might block some use cases. I can think of three use cases for the messages()
clause.
- I don't use
system()
anduser()
at all. Instead, I pass any message explicitly viamessages()
. No problem here. - I use
system()
anduser()
, but I also want to adopt more sophisticated prompt design techniques, such as few-shots prompting. Therefore, I usemessages()
to provide the examples. I expect the only system message (passed viasystem()
) to be placed on the top of the list, then the examples inmessages()
, and finally the user message inuser()
. Today, this doesn't work because the system message is placed after the examples. - I use
system()
anduser()
, but I also manage the chat memory explicitly (no Advisor) and pass it viamessages()
. Here we might have one or more system messages present in the chat memory and passed viamessages()
. At this point, following the "principle of least surprise", I would apply the same rule as use case 2: if you specify a system message viasystem()
, that will be on the top of the list. If I want it placed differently, I can always leverage the list of messages I pass viamessages()
, so we wouldn't be restricting any edge case here. But the overall behaviour would be consistent.
Summary. My suggestion would be to make ChatClient
ALWAYS structure the Prompt
as follows:
- System message from
system()
(if exists) - Messages from
messages()
(if exists) - User message from
user()
(if exists)
What do you think @markpollack?