We've been investigating an issue where we had a 100x performance drop when switching from JdbcTemplate to NamedParameterJdbcTemplate in batch update operations.

We found out that the root cause is the handling of null values by StatementCreatorUtils:setNull, which in turn calls getParameterMetadata from the connection. It seems this issue a database call for every row to be inserted (call stack from profiler attached) which basically kills the performance gains of the batch insert.

We're using Postgres as back-end database. Spring JDBC 5.2.5.

Thanks.

namedparameterjdbctemplate_stack

Comment From: quaff

Have you tried -Dspring.jdbc.getParameterType.ignore=true?

Comment From: quaff

It seems spring.jdbc.getParameterType.ignore will not avoid ps.getConnection().getMetaData(), should spring need introducing a new spring.jdbc.getMetaData.ignore?

Comment From: quaff

@jhoeller Could you explain why not simply call ps.setObject(paramIndex, null) ?

Comment From: jhoeller

ps.getConnection().getMetaData() usually isn't expensive since it's cached in the connection pool. Just getParameterMetaData() may turn out to be expensive, and that's what spring.jdbc.getParameterType.ignore=true covers.

As for why not simply call setObject, it's about compatibility with certain drivers/databases. Quoting the setObject javadoc:

 *<b>Note:</b> Not all databases allow for a non-typed Null to be sent to
 * the backend. For maximum portability, the <code>setNull</code> or the
 * <code>setObject(int parameterIndex, Object x, int sqlType)</code>
 * method should be used
 * instead of <code>setObject(int parameterIndex, Object x)</code>.

Comment From: ricardoekm

spring.jdbc.getParameterType.ignore=true did the trick, thanks!

Comment From: quaff

@jhoeller I assume all modern databases allow non-typed Null, if it's true, spring.jdbc.getParameterType.ignore should be true by default.

Comment From: ricardoekm

Reasonable defaults, that's what life is about :)

Comment From: jhoeller

We're happy to refine our default assumptions there. However, a few years ago, there were only a few databases supporting untyped setObject(paramIndex, null) calls, e.g. Microsoft's then-new SQL Server driver. For all others, a setNull call with some SQL type argument is needed.

I'd particularly like to learn about recent driver behavior with MySQL and PostgreSQL there. We're effectively using setNull(paramIndex, Types.NULL) for the latter, like for other databases that we don't have explicit rules for. If this is known to work reliably, we can explictly opt into that for such well-known databases, not ever trying getParameterType for those then (independent from the spring.jdbc.getParameterType.ignore setting).

Comment From: jhoeller

A StackOverflow answer about support for setObject vs setNull in common JDBC drivers (https://stackoverflow.com/a/10068111, dating back to 2016) and other articles suggest that MySQL does indeed reliably support untyped nulls, whereas PostgreSQL requires the specific SQL type to be set in some cases. I'd appreciate feedback on the latter since according to the indications above, setNull(paramIndex, Types.NULL) does seem to work for your cases here (since that's what we're effectively using when you set spring.jdbc.getParameterType.ignore=true), possibly due to a more recent JDBC driver?

A more recent discussion thread about PostgreSQL null handling (https://www.postgresql-archive.org/quot-could-not-determine-data-type-of-parameter-quot-with-timestamp-td5995489.html) suggests that setNull(paramIndex, Types.NULL) does usually work but that ParameterMetaData.getParameterType should be used for better execution performance. That's pretty much the situation that we have right now with the choice that spring.jdbc.getParameterType.ignore is providing, following the PostgreSQL recommendation by default but allowing to opt out of it.

Another interesting thread on StackOverflow (https://stackoverflow.com/questions/37942063/slow-insert-on-postgresql-using-jdbc) suggests that the pgjdbc-ng driver caches parameter data and is therefore much faster with our default settings.

I'll consider revising our default algorithm to take recent driver updates into account, reopening this issue for it. There might not be much that we can do about PostgreSQL specifically but this is nevertheless worth a fresh look now with 2020-era drivers.

Comment From: ricardoekm

Great to hear that you're revisiting this implementation.

Setting null with Types.NULL indeed works for PostgresSQL (JDBC) 42.2.12 and C3P0 0.9.5.5.

Adding a bit more information, here is the result of our performance test

JdbcTemplate

Batch 1: 10925 records/sec. Batch 2: 11189 records/sec. Batch 3: 11170 records/sec. Batch 4: 11140 records/sec. Batch 5: 11179 records/sec.

Average: 11,120

NamedJdbcTemplate with default configuration

Batch 1: 312 records/sec. Batch 2: 344 records/sec. Batch 3: 348 records/sec. Batch 4: 350 records/sec. Batch 5: 357 records/sec.

Average: 342.2 records/sec.

NamedJdbcTemplate with spring.jdbc.getParameterType.ignore=true

Batch 1: 10384 records/sec. Batch 2: 10652 records/sec. Batch 3: 10937 records/sec. Batch 4: 10697 records/sec. Batch 5: 10961 records/sec.

Average: 10,726 records/sec.

So in general, JdbcTemplate is a bit faster than NamedJdbcTemplate with spring.jdbc.getParameterType.ignore=true (around 4%), but in the same order of magnitude.

However, there's a huge performance drop by using NamedJdbcTemplate with the default configuration.

Comment From: quaff

@jhoeller I'm wondering why JdbcTemplate not affected.

Comment From: quaff

It seems like JdbcTemplate always use TYPE_UNKNOWN if type is not explicit, so there is some inconsistency with NamedParameterJdbcTemplate. https://github.com/spring-projects/spring-framework/blob/3a0f309e2c9fdbbf7fb2d348be861528177f8555/spring-jdbc/src/main/java/org/springframework/jdbc/core/ArgumentPreparedStatementSetter.java#L69

Comment From: jhoeller

Hmm NamedParameterJdbcTemplate should effectively also end up with SqlTypeValue.UNKNOWN by default... wondering where the performance difference actually comes from. Maybe it's some subtle interaction that I'm missing here.

Comment From: ricardoekm

I've done some additional investigation to try to identify the difference between the NamedJdbcTemplate and JdbcTemplate when using ignoreParameterType.

It seems the issue is StatementCreatorUtils.setParameterInternal, but I was unable to track the cause (it seems something the method is doing itself, not something he's calling).

When using JdbcTemplate setting values take 6.9% of the time: jdbcTemplate

When using NamedJdbcTemplate it goes to 15.9% of the time: namedJdbcTemplate

For additional info here goes the jProfiler snapshot: jProfilerSnapshot.zip

Comment From: sabomichal

Curious why this is still not fixed, having huge performance impact on many systems not even knowing. Is there anything what needs to be done yet?

Comment From: jhoeller

The hard part here is identifying whether it is necessary to specify a concrete SQL type for a given JDBC driver. Most modern drivers take the given type into account and send corresponding prepared parameter information to the DBMS, so simply changing this to untyped by default might be accepted by the driver but influence the resulting execution plans. It is tough to make assumptions here as a caller without knowing the implementation specifics of the driver and the target database.

A quick re-check of several driver implementations leads me to the assumption that getParameterMetaData is actually a cheap enough operation in most drivers, including recent MySQL drivers and PGJDBC-NG... with a single exception: the classic PostgreSQL JDBC driver. Now we could hard-code a bypass there in StatementCreatorUtils.setNull based on DatabaseMetaData.getDriverName().startsWith("PostgreSQL"), never calling getParameterMetaData() by default then. However, even such a workaround is not ideal since we'd keep bypassing it even if a newer PostgreSQL driver version (identifying itself through the same driver name) chooses to implement getParameterMetaData in a cached manner like most other drivers do. With such a reasonable implementation, we would prefer to specify null with a concrete SQL type again.

That said, I do understand the pain here and that spring.jdbc.getParameterType.ignore=true is not so easy to find out about. So we might go with such a bypass eventually if the PostgreSQL JDBC driver maintainers are not likely to ever improve the driver itself there. Has anybody ever had direct contact with them about this?

Comment From: jhoeller

For the classic PostgresSQL JDBC driver, there is actually a GitHub issue for it: https://github.com/pgjdbc/pgjdbc/issues/621 Seems to have got stuck some time last year, despite a concrete caching approach having been tried. Can everyone interested in this from Spring's side here cast their vote over there please :-)

Comment From: davecramer

@jhoeller can you point me to the concrete caching approach you are referring to?

Comment From: jhoeller

@davecramer it looks like @vlsi was pretty much working in that direction with https://github.com/pgjdbc/pgjdbc/issues/621 already. I suppose a PreparedStatement-level cache would go a long way, a Connection-level cache would be even better.

All we really need is getParameterMetaData() turning into a cheap enough operation for repeated calls on the same PreparedStatement, with any necessary backend interaction only happening once and the resulting SQL type information then being immediately available for repeated parameter setting on the same statement.

On Spring's side, we just need that metadata for passing in a SQL type to a PreparedStatement.setNull call if we don't have more specific type information for a given parameter. If there is another way of sending typed nulls to the DBMS without introspecting getParameterMetaData(), we could support that on Spring's side as well. Generally speaking, we operate under the assumption that setNull(..., Types.NULL) is not good enough by default, and neither is setObject(..., null).

Comment From: jhoeller

@davecramer, reviewing other JDBC driver implementations in that respect, the key-based caching approach that @vlsi considered was probably more sophisticated than necessary. The common driver implementations seem to simply cache the ParameterMetaData object in the PreparedStatement instance after obtaining it. So no reactions to intermediate changes in the database schema or the like, and no reuse at the connection level either, just returning the existing metadata object for repeated calls on the same PreparedStatement instance.

Comment From: vlsi

@jhoeller , thank you for the clarification. Let me see if I can implement the MVP.

Comment From: viniciusxyz

I had performance problems in production due to the setNull using MSSQL with the official driver in the latest version available (12.4.1) and JDBCTemplate.update(), when adding the parameter -Dspring.jdbc.getParameterType.ignore=true, the time for preparing the query decreased drastically, giving a 70% lower average execution time and reducing the amount of GC during some tests from 196 executions to just 17 which greatly improved CPU performance.

I would like to understand when the use of this parameter can actually be something negative and if there is any relationship between drivers where it causes problems, currently I use the official mssql driver and the postgresql driver for some demands, is there any way that Do you recommend so that I can test these drivers? If possible, I would like to always set the value of spring.jdbc.getParameterType.ignore to true in some applications I have, but I'm unsure of what exactly might happen if I do that.

Thank you in advance if anyone can clarify these doubts, I don't understand in depth how the drivers work and so it's difficult for me to know exactly which scenarios are problematic with the mentioned parameter.

Comment From: jhoeller

spring.jdbc.getParameterType.ignore=true is safe against PostgreSQL and MS SQL Server, as far as we are able to infer, since the drivers (or rather the DBMS wire protocols) do not actually rely on parameter type information for null values there.

I wasn't aware of such a dramatic performance impact on SQL Server yet, this looks like a reason to prefer the bypass (without a getParameterType call) there as well. So for both PostgreSQL and SQL Server, we have a situation where type information for null values does not matter in common scenarios but getParameterType calls are known to be expensive.

For H2, HSQLDB, Derby and MySQL, type information does not matter for null values either but getParameterType is efficient enough there.

On the other side, to the best of my knowledge, type information for null values is important on Oracle, Sybase and DB2, with getParameterType generally being preferable there. Or of course, explicitly specifying SQL type information on the application side through providing SqlParameterValue objects instead of plain null values to begin with.

All in all, I'm inclined to implement a default bypass for PostgreSQL and MS SQL Server for the case where spring.jdbc.getParameterType.ignore is unspecified, similar to the database-specific setObject vs setNull checks that we got there in StatementCreatorUtils already. An explicit true or false value for the flag would still override the bypass.