Сursors in certain circumstances became useless, because hold all the data in the memory, causing OOM: - on large number of rows fetched via one cursor (depends on Xmx) - on any amount of data fetched via cursor, if that cursor is created and iterated sufficient number of times (e.g. fetch 1000 rows in a loop via different cursors N times (N depends on Xmx))

This happens because MyBatis: 1. registers all cursors created within session and hold a hard reference to them (so they could not be garbage collected) 2. cursor itself caches each row's data, when there is any nested result map, via hard reference

MyBatis version

3.5.11

Database vendor and version

PostgreSQL 13

Test case or example project

https://github.com/gallyamb/mybatis-3/blob/cursor-cache-oom/src/test/java/org/apache/ibatis/submitted/cursor_cache_oom/CursorOomTest.java

Steps to reproduce

Use result map with nested result map and fetch data via cursor

Expected result

The memory will be used at constant level

Actual result

The memory used as linear with fetched amount of data, i.e. more rows fetched via cursor - more memory is used

Comment From: gallyamb

FYI test case in PR https://github.com/mybatis/mybatis-3/pull/2813

Comment From: harawata

Hello @gallyamb ,

When using cursor involving nested results, you should specify resultOrdered="true" in the <select>. The test still won't pass, but the DefaultResultSetHandler.nestedResultObjects will hold nested objects of the 'current' item only. https://mybatis.org/mybatis-3/sqlmap-xml.html#select

Please let me know if the problem persists.

Comment From: gallyamb

@harawata thank you for a quick reply!

I confirm, that adding resultOrdered="true" result in keeping only current row's data in memory (pushed a commit with that change). I've added resultOrdered="true", but did not add ORDER BY clause in SQL, and everything works like a charm

The Cusros' javadoc says, that resultOrdered="true" have to be used, when I have a collection, but in my test case I haven't. So why each row's data is still persisted in memory? Isn't such behaviour is a bug?

https://github.com/mybatis/mybatis-3/blob/b175c22d0b8b13bf4745424179dc96d1975d4dd7/src/main/java/org/apache/ibatis/cursor/Cursor.java#L21-L23

Comment From: harawata

Hi @gallyamb ,

When there is a <collection> and the rows are not properly ordered, the result could be incorrect. For <association>, the result may be correct regardless of the order, so it's not a 'must'. That is what the Javadoc comment means, I assume.

Comment From: hazendaz

FWIW - The test case that was there showed possibly some small code coverage improvement, not sure its really OOM there since it of course is working but went ahead and merged that in as it had some references indicating sort of situation that could be problematic, probably from this issue. Figured ok another test, it was well written and built ok.

Comment From: gallyamb

Hi @harawata!

To be honest, I could not see any difference between "could be incorrect" and "may be correct". In both scenarios it either correct or not

Btw, for association I cannot imagine a situation, when result may be incorrect. Each row contains whole entity (properties of entity itself and (if association exists) properties of association)

Summing up, I think, that for scenario, when entity has only associations, but not collections, constant amount of memory must be used, despite resultOrdered="true" is specified or not

Am I correct?

Comment From: harawata

Hi @gallyamb ,

I thought about it a long time ago and there is a corner case. I'll try to explain, but first, you need to understand that resultOrdered="true" is a general performance hint and is not directly related to Cursor.

Let's say, a query returns the following result set.

id name friend_name
1 User1 John
2 User2 Mary
1 User1 Bob

The mapper statement returns List<User>.

List<User> getUsers();

The result maps are exactly the same as your demo.

<resultMap id="User" type="xxx.User">
  <id property="id" column="id"/>
  <result property="name" column="name"/>
  <association property="friend" resultMap="Friend"/>
</resultMap>

<resultMap id="Friend" type="xxx.Friend">
  <result property="name" column="friend_name"/>
</resultMap>

As the User.id is specified as <id>, the method should return two User instances in the List. The actual result is...

  • Without resultOrdered="true", the returned List contains two User instances.
  • With resultOrdered="true", the returned List contains three User instances.

You can argue that it is an incorrect usage and I do not necessarily disagree, but still, the result is 'correct' when resultOrdered is not enabled and we should not break it.

Note that if the return type of the mapper method is Cursor<User>, it will return three User instances with or without resultOrdered="true".

If you think you can improve the documentation or the Javadoc comment, please send us a PR and we will review. And please let me know if you need further clarification.

Comment From: gallyamb

@harawata thanks for a detailed clarification! It's very helpful

In my personal opinion, the library should not make its' usage more complicated for users, that use it in correct way, in a try to not break the usage for users (I think, the minority ones), that use it incorrectly

As you said, cursor returns three instances with or without resultOrdered="true". It's an inconsistency between cursors based and basic collections based method, that, imho, more harmful for the users

And I think, that situation, when multiple rows contains same entity (by id) but with different attribute values (either plain attribute or association) should be treated as undefined behaviour (or better throw an exception)

But also I understand, that breaking changes are not expected by the users in minor releases. So maybe we should plan this change, if you also agree with me, ofc, to be included in some future major release?

What comes for a documentation improvement, first we need to clarify further actions on our issue to understand what to do with docs, I think

Comment From: harawata

Hi @gallyamb ,

I think the current behavior is fine.

  • The default (resultOrdered="false") works for all users.
  • Users who need the best performance can find resultOrdered="true" in the doc.

This is exactly what a good library should do.

Going back to your original report, I think we can add an explanation about the effect of resultOrdered="true" on association. Or, we can simply say that resultOrdered should be enabled when using Cursor.

Comment From: gallyamb

I think it's time to close the issue. I'm no longer use mybatis, so cannot provide any further information or clarification. Thank you guys for assistance! Really love what yall done :heart: