Spring-ai [ChatClient] Inconsistent handling of system messages

Bug description

The ChatClient API makes it possible to pass system and user messages via the system() and user() clauses. In that case, the final List<Message> passed to the Prompt contains the following:

SystemMessage built from system()
UserMessage built from user()

The API also allows passing a list of messages directly via messages(). In that case, the final List<Message> passed to the Prompt contains the following:

List<Message> built from messages()
SystemMessage built from system()
UserMessage built from user().

The two scenarios are inconsistent about where the SystemMessage built from system() ends up in the chat history.

Now, imagine using the few-shot prompting strategy, with the messages() clause used to pass the few-shot examples (a list of UserMessage and AssistantMessage pairs). In that case, the SystemMessage built from system() ends up after the examples, near the bottom of the list. In many cases this prevents the few-shot prompting strategy from working, because the SystemMessage is in the wrong position.
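To make the reported ordering concrete, here is a minimal sketch in plain Java. It does not use Spring AI: the Msg record and the assemble method are hypothetical names made up for illustration, and they only model the ordering described in this report, not the actual ChatClient implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the ordering reported in this issue; not Spring AI code.
class CurrentOrderingSketch {

    // Simplified message: a role ("system", "user", "assistant") and its text.
    record Msg(String role, String text) {}

    // Models the reported behaviour: messages() first, then system(), then user().
    // A null systemText or userText means the corresponding clause was not used.
    static List<Msg> assemble(String systemText, String userText, List<Msg> messages) {
        List<Msg> result = new ArrayList<>(messages);
        if (systemText != null) {
            result.add(new Msg("system", systemText));
        }
        if (userText != null) {
            result.add(new Msg("user", userText));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Msg> fewShot = List.of(
                new Msg("user", "Example question"),
                new Msg("assistant", "Example answer"));
        // Print the assembled list so the position of the system message is visible.
        assemble("System text", "Real question", fewShot)
                .forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

In this model the system message is appended after the few-shot examples, which is the position the report describes as problematic.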

A possible workaround is to pass the desired SystemMessage via messages() together with the few-shot examples. The ChatClient could perhaps even fail the call if both messages() and system()+user() are defined, but that would limit the convenience of the API.

Environment

* Spring AI 1.0.0-SNAPSHOT
* Java 22

Steps to reproduce

Single system message:

var content = chatClient.prompt()
                .system("System text")
                .messages(
                        new UserMessage("My question"),
                        new AssistantMessage("Your answer")
                )
                .call().content();

Multiple system messages:

var content = chatClient.prompt()
                .system("System text")
                .messages(
                        new SystemMessage("Historical system text"),
                        new UserMessage("My question"),
                        new AssistantMessage("Your answer")
                )
                .call().content();

Expected behavior

I would expect the use of system() to result in a SystemMessage that is always placed at the top of the message list, unless the list passed via messages() already contains one.

When messages() is used and it does NOT include any SystemMessage, I expect the following List<Message> passed to the Prompt:

SystemMessage built from system()
List<Message> built from messages()
UserMessage built from user().

When messages() is used and it DOES include any SystemMessage, I expect the following List<Message> passed to the Prompt:

List<Message> built from messages() (including one or more SystemMessage)
SystemMessage built from system()
UserMessage built from user().
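The two expected cases above could be sketched as follows, again in plain Java without Spring AI. The Msg record and assemble method are hypothetical and exist only to illustrate the rule; this is one possible reading of the expectation, not the code from the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the expected ordering described above; not Spring AI code.
class ExpectedOrderingSketch {

    // Simplified message: a role ("system", "user", "assistant") and its text.
    record Msg(String role, String text) {}

    // If messages() has no system message, the one from system() goes first.
    // If messages() already has one, the one from system() goes after the list.
    // A null systemText or userText means the corresponding clause was not used.
    static List<Msg> assemble(String systemText, String userText, List<Msg> messages) {
        boolean listHasSystem = messages.stream().anyMatch(m -> m.role().equals("system"));
        List<Msg> result = new ArrayList<>();
        if (systemText != null && !listHasSystem) {
            result.add(new Msg("system", systemText));
        }
        result.addAll(messages);
        if (systemText != null && listHasSystem) {
            result.add(new Msg("system", systemText));
        }
        if (userText != null) {
            result.add(new Msg("user", userText));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Msg> history = List.of(
                new Msg("system", "Historical system text"),
                new Msg("user", "My question"),
                new Msg("assistant", "Your answer"));
        assemble("System text", "Next question", history)
                .forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

Note that the position of the system() message then depends on the contents of messages(), which is the kind of conditional behaviour the rest of the discussion tries to avoid.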

There's room for introducing more structured support for few-shot prompting via the Advisor API. I'm working on a few proposals in that direction, but this issue of the SystemMessage might need fixing first.

I have a PR ready that implements what is described above, but I'm not 100% sure this is the right expected behaviour. It might be worth considering this issue in more general terms, including other common prompting strategies and the handling of the chat memory.

@tzolov what do you think? I'd be happy to discuss it further and perhaps share a few experiments I've been working on.

Comment From: iAMSagar44

@ThomasVitale - Thanks for raising this issue. I have noticed inconsistencies in the LLM response when using the 'few-shot prompts' technique and passing the user and assistant messages in the chat client request (using the messages() clause to pass the few-shot examples as a list of UserMessage and AssistantMessage pairs). This could be due to the placement of the defaultSystemMessage in the list. Because of this issue, I had to remove the messages() list and stop using few-shot prompts. Instead, I provided some examples in the system message prompt template, similar to the example provided here.

Have you been able to successfully find a different workaround to resolve this issue?

Another question: when a system message is set in the chat client builder, is it sent to the LLM with every user query, or is it set once for the user session?

Comment From: markpollack

Regarding

When messages() is used and it DOES include any SystemMessage, I expect the following List<Message> passed to the Prompt:

List<Message> built from messages() (including one or more SystemMessage)
SystemMessage built from system()
UserMessage built from user().

What if we disallow passing any SystemMessage via the messages() method and throw an exception instead? I originally thought 'last one to specify wins', but that would still make the code hard to understand in terms of inferring behavior (the "principle of least surprise").

Comment From: ThomasVitale

Restricting the messages() clause from supporting SystemMessage might block some use cases. I can think of three use cases for the messages() clause.

1. I don't use system() and user() at all. Instead, I pass every message explicitly via messages(). No problem here.
2. I use system() and user(), but I also want to adopt more sophisticated prompt design techniques, such as few-shot prompting. Therefore, I use messages() to provide the examples. I expect the only system message (passed via system()) to be placed at the top of the list, then the examples in messages(), and finally the user message in user(). Today, this doesn't work because the system message is placed after the examples.
3. I use system() and user(), but I also manage the chat memory explicitly (no Advisor) and pass it via messages(). Here we might have one or more system messages present in the chat memory and passed via messages(). At this point, following the "principle of least surprise", I would apply the same rule as use case 2: if you specify a system message via system(), it will be at the top of the list. If I want it placed differently, I can always use the list of messages I pass via messages(), so we wouldn't be restricting any edge case here. But the overall behaviour would be consistent.

Summary. My suggestion would be to make ChatClient ALWAYS structure the Prompt as follows:

System message from system() (if exists)
Messages from messages() (if exists)
User message from user() (if exists)
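As a rough sketch of this uniform rule, in plain Java without Spring AI (the Msg record and assemble method are hypothetical names used only for illustration, not a proposed implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the uniform ordering suggested above; not Spring AI code.
class UniformOrderingSketch {

    // Simplified message: a role ("system", "user", "assistant") and its text.
    record Msg(String role, String text) {}

    // Always: system() first, then messages() unchanged, then user().
    // A null systemText or userText means the corresponding clause was not used.
    static List<Msg> assemble(String systemText, String userText, List<Msg> messages) {
        List<Msg> result = new ArrayList<>();
        if (systemText != null) {
            result.add(new Msg("system", systemText));
        }
        result.addAll(messages);
        if (userText != null) {
            result.add(new Msg("user", userText));
        }
        return result;
    }

    public static void main(String[] args) {
        // Use case 3: chat memory that already contains a system message.
        List<Msg> memory = List.of(
                new Msg("system", "Historical system text"),
                new Msg("user", "My question"),
                new Msg("assistant", "Your answer"));
        assemble("System text", "Next question", memory)
                .forEach(m -> System.out.println(m.role() + ": " + m.text()));
    }
}
```

With this rule the position of the system() message never depends on what messages() contains, and a caller who skips system() and user() gets exactly the list passed to messages().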

What do you think @markpollack?