Affects: - Spring Boot v2.5.5 with spring-boot-starter-websocket, which uses - spring-messaging, 5.3.10 - spring-websocket, 5.3.10 - spring-boot-starter-web, 2.5.5


We are using spring boot with STOMP via SockJS and Websockets. We have app and website clients. However the Websocket Stats do not match the actual user count after a certain amount of time, here is an excerpt from our logs. You can already see at the third message, that there are more websocket sessions reporterd than there are sessions in general. If you look at the last message, this will go to extremes (573 actual sessions, 4791 reported websocket sessions).

I don't know if this issue also extends to Http Streaming or Polling, because we have very few clients using this method.

2022-10-18 21:08:04.381  INFO 1 --- [MessageBroker-1] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[177 current WS(175)-HttpStream(2)-HttpPoll(0), 182 total, 0 closed abnormally (0 connect failure, 0 send limit, 5 transport error)], stompSubProtocol[processed CONNECT(182)-CONNECTED(182)-DISCONNECT(0)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1107], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 369], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 174, completed tasks = 361]
2022-10-18 21:38:04.382  INFO 1 --- [MessageBroker-2] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[167 current WS(166)-HttpStream(1)-HttpPoll(0), 263 total, 0 closed abnormally (0 connect failure, 0 send limit, 44 transport error)], stompSubProtocol[processed CONNECT(263)-CONNECTED(263)-DISCONNECT(8)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1890], outboundChannel[pool size = 3, active threads = 0, queued tasks = 0, completed tasks = 1974], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 164, completed tasks = 12065]
...
2022-10-18 22:38:04.383  INFO 1 --- [MessageBroker-4] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[138 current WS(140)-HttpStream(1)-HttpPoll(0), 333 total, 3 closed abnormally (3 connect failure, 0 send limit, 97 transport error)], stompSubProtocol[processed CONNECT(328)-CONNECTED(328)-DISCONNECT(10)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 2571], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 4922], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 135, completed tasks = 32608]
...
2022-10-24 07:38:04.475  INFO 1 --- [MessageBroker-4] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[573 current WS(4791)-HttpStream(1)-HttpPoll(2), 56490 total, 153 closed abnormally (153 connect failure, 0 send limit, 26430 transport error)], stompSubProtocol[processed CONNECT(52917)-CONNECTED(52913)-DISCONNECT(5201)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 484905], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 937815], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 569, completed tasks = 6611570]

Comment From: rstoyanchev

573 actual sessions, 4791 reported websocket sessions

This is strange. The first is the size of sessions map in SubProtocolWebSocketHandler, and the second is a counter that gets incremented/decremented as sessions are added/removed. They are always updated together, in afterConnectionEstablished and in clearSession. I don't understand how the two can get out of sync.

I don't know if this issue also extends to Http Streaming or Polling, because we have very few clients using this method.

It does seem like there is a fair number of transport errors (connection dropped). Also those "closed abnormally" means that a WebSocket connection can be established but no messages came which could be a proxy issue. In cases a WebSocket connection can't even be established, it should fall back on HTTP transports, and I would have thought those polling/streaming session numbers should be higher.

Comment From: daniel-naegele

We use traefik as our proxies and have no problems connecting via websocket. The http status code stats do not indicate any abnormal behaviour regarding the proxying of requests. The "connect failure" stats are probably from users behind coporate proxies, which remove the Upgrade Header from all requests, which we know exist (a few). I guess the huge number of transport errors are rooted in the long uptime and that we have app clients, which can switch networks, be in a car, ...

Comment From: rstoyanchev

Okay, I finally see where the issue is. We keep stats by WebSocket session type (websocket, stream, poll), and therefore need to locate the right counter by session type. However, the session is decorated after it is established, and when it is closed it must be unwrapped. This works as expected when the session is closed normally since ConcurrentWebSocketSessionDecorator calls the delegate. However, in case of SessionLimitExceededException, when the session is not websocket, we fail to find the right counter.

Comment From: rstoyanchev

On closer look at the stats again , I don't think there are limit exceeded exceptions, or they would show as "send limit". In any case, I did find a problem, and will fix it. I'm not sure what else to do here without a way to reproduce it.

Comment From: rstoyanchev

Closing this for now. You can try with 5.3.26 once available.

Comment From: daniel-naegele

Hey, we are still experiencing this issue with Spring Boot 3.0.6, which uses spring-messaging and spring-websocket with version 6.0.8. Here is an excerpt from our logs:

023-05-11T14:50:29.682Z  INFO 1 --- [MessageBroker-4] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[928 current WS(2983)-HttpStream(3)-HttpPoll(6), 125916 total, 483 closed abnormally (483 connect failure, 0 send limit, 56579 transport error)], stompSubProtocol[processed CONNECT(120381)-CONNECTED(118726)-DISCONNECT(2413)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1089279], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1467036], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 907, completed tasks = 3746831]