Сursors in certain circumstances became useless, because hold all the data in the memory, causing OOM: - on large number of rows fetched via one cursor (depends on Xmx) - on any amount of data fetched via cursor, if that cursor is created and iterated sufficient number of times (e.g. fetch 1000 rows in a loop via different cursors N times (N depends on Xmx))
This happens because MyBatis: 1. registers all cursors created within session and hold a hard reference to them (so they could not be garbage collected) 2. cursor itself caches each row's data, when there is any nested result map, via hard reference
MyBatis version
3.5.11
Database vendor and version
PostgreSQL 13
Test case or example project
https://github.com/gallyamb/mybatis-3/blob/cursor-cache-oom/src/test/java/org/apache/ibatis/submitted/cursor_cache_oom/CursorOomTest.java
Steps to reproduce
Use result map with nested result map and fetch data via cursor
Expected result
The memory will be used at constant level
Actual result
The memory used as linear with fetched amount of data, i.e. more rows fetched via cursor - more memory is used
Comment From: gallyamb
FYI test case in PR https://github.com/mybatis/mybatis-3/pull/2813
Comment From: harawata
Hello @gallyamb ,
When using cursor involving nested results, you should specify resultOrdered="true"
in the <select>
.
The test still won't pass, but the DefaultResultSetHandler.nestedResultObjects
will hold nested objects of the 'current' item only.
https://mybatis.org/mybatis-3/sqlmap-xml.html#select
Please let me know if the problem persists.
Comment From: gallyamb
@harawata thank you for a quick reply!
I confirm, that adding resultOrdered="true"
result in keeping only current row's data in memory (pushed a commit with that change). I've added resultOrdered="true"
, but did not add ORDER BY
clause in SQL, and everything works like a charm
The Cusros' javadoc says, that resultOrdered="true"
have to be used, when I have a collection, but in my test case I haven't. So why each row's data is still persisted in memory? Isn't such behaviour is a bug?
https://github.com/mybatis/mybatis-3/blob/b175c22d0b8b13bf4745424179dc96d1975d4dd7/src/main/java/org/apache/ibatis/cursor/Cursor.java#L21-L23
Comment From: harawata
Hi @gallyamb ,
When there is a <collection>
and the rows are not properly ordered, the result could be incorrect.
For <association>
, the result may be correct regardless of the order, so it's not a 'must'.
That is what the Javadoc comment means, I assume.
Comment From: hazendaz
FWIW - The test case that was there showed possibly some small code coverage improvement, not sure its really OOM there since it of course is working but went ahead and merged that in as it had some references indicating sort of situation that could be problematic, probably from this issue. Figured ok another test, it was well written and built ok.
Comment From: gallyamb
Hi @harawata!
To be honest, I could not see any difference between "could be incorrect" and "may be correct". In both scenarios it either correct or not
Btw, for association I cannot imagine a situation, when result may be incorrect. Each row contains whole entity (properties of entity itself and (if association exists) properties of association)
Summing up, I think, that for scenario, when entity has only associations, but not collections, constant amount of memory must be used, despite resultOrdered="true"
is specified or not
Am I correct?
Comment From: harawata
Hi @gallyamb ,
I thought about it a long time ago and there is a corner case.
I'll try to explain, but first, you need to understand that resultOrdered="true"
is a general performance hint and is not directly related to Cursor
.
Let's say, a query returns the following result set.
id | name | friend_name |
---|---|---|
1 | User1 | John |
2 | User2 | Mary |
1 | User1 | Bob |
The mapper statement returns List<User>
.
List<User> getUsers();
The result maps are exactly the same as your demo.
<resultMap id="User" type="xxx.User">
<id property="id" column="id"/>
<result property="name" column="name"/>
<association property="friend" resultMap="Friend"/>
</resultMap>
<resultMap id="Friend" type="xxx.Friend">
<result property="name" column="friend_name"/>
</resultMap>
As the User.id
is specified as <id>
, the method should return two User
instances in the List
.
The actual result is...
- Without
resultOrdered="true"
, the returnedList
contains twoUser
instances. - With
resultOrdered="true"
, the returnedList
contains threeUser
instances.
You can argue that it is an incorrect usage and I do not necessarily disagree, but still, the result is 'correct' when resultOrdered
is not enabled and we should not break it.
Note that if the return type of the mapper method is Cursor<User>
, it will return three User
instances with or without resultOrdered="true"
.
If you think you can improve the documentation or the Javadoc comment, please send us a PR and we will review. And please let me know if you need further clarification.
Comment From: gallyamb
@harawata thanks for a detailed clarification! It's very helpful
In my personal opinion, the library should not make its' usage more complicated for users, that use it in correct way, in a try to not break the usage for users (I think, the minority ones), that use it incorrectly
As you said, cursor returns three instances with or without resultOrdered="true"
. It's an inconsistency between cursors based and basic collections based method, that, imho, more harmful for the users
And I think, that situation, when multiple rows contains same entity (by id) but with different attribute values (either plain attribute or association) should be treated as undefined behaviour (or better throw an exception)
But also I understand, that breaking changes are not expected by the users in minor releases. So maybe we should plan this change, if you also agree with me, ofc, to be included in some future major release?
What comes for a documentation improvement, first we need to clarify further actions on our issue to understand what to do with docs, I think
Comment From: harawata
Hi @gallyamb ,
I think the current behavior is fine.
- The default (
resultOrdered="false"
) works for all users. - Users who need the best performance can find
resultOrdered="true"
in the doc.
This is exactly what a good library should do.
Going back to your original report, I think we can add an explanation about the effect of resultOrdered="true"
on association.
Or, we can simply say that resultOrdered
should be enabled when using Cursor
.
Comment From: gallyamb
I think it's time to close the issue. I'm no longer use mybatis, so cannot provide any further information or clarification. Thank you guys for assistance! Really love what yall done :heart: