Hit overflow error while training tf.keras.Sequential classifier on Windows
Specifically: OverflowError: Python int too large to convert to C long, raised by
tf.constant(0x80000000, dtype=tf.int32)
in signbit in Keras's TensorFlow backend (keras/src/backend/tensorflow/numpy.py, line 2068).
However, if I just import tensorflow and run that line in isolation, there is no OverflowError:
>>> import tensorflow as tf
>>> tf.constant(0x80000000, dtype=tf.int32)
<tf.Tensor: shape=(), dtype=int32, numpy=-2147483648>
More info on when it fails: although the failing line doesn't actually use x, the argument x at the time of failure is the following:
type(x)
<class 'tensorflow.python.framework.ops.SymbolicTensor'>
print(x)
Tensor("Cast_2:0", shape=(None, 2), dtype=float32)
Traceback (most recent call last):
...
File "C:\Users\chris\code\py\bugid\training\cli.py", line 70, in train
report = classifier.train(features, labels)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\code\py\bugid\training\model.py", line 85, in train
self.model.fit(
File "C:\Users\chris\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\keras\src\utils\traceback_utils.py", line 122, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\chris\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\keras\src\backend\tensorflow\numpy.py", line 2081, in signbit
tf.constant(0x80000000, dtype=tf.int32),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python int too large to convert to C long
Project where I'm seeing this: https://github.com/cm6n/bugid
Comment From: dhantule
Hi @cm6n, Thanks for reporting this. Can you provide some standalone code to reproduce?
Comment From: wilsbj
Hi @dhantule, I've just been hitting the same error while using an entirely different project, sktime. Like @cm6n, I am also running this on Windows.
The same line in signbit(x) is throwing this error when x is a SymbolicTensor. The following simplified code snippet should reproduce the error:
import keras
import numpy as np

np.random.seed(0)
X = np.random.random((10, 1, 20))
y = np.array([[1., 0.],
              [0., 1.],
              [0., 1.],
              [1., 0.],
              [0., 1.],
              [1., 0.],
              [0., 1.],
              [1., 0.],
              [0., 1.],
              [1., 0.]])
input_layer = keras.layers.Input((1, 20))
output_layer = keras.layers.Dense(units=2, activation='sigmoid')(input_layer)
optimizer = keras.optimizers.RMSprop()
model = keras.models.Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
model.fit(X, y)  # OverflowError is raised from signbit during fit
I have noted that the error does not appear if metrics=['accuracy'] is not provided to model.compile.
I have also noted that the error does not occur (at least on my machine) if I patch out this line in signbit as follows:
- tf.constant(0x80000000, dtype=tf.int32),
+ tf.constant(tf.int32.min, dtype=tf.int32),
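As a quick sanity check (a snippet of my own, not from the Keras sources), tf.int32.min yields the same value the hex literal was wrapping to, without relying on overflow:

import tensorflow as tf

# tf.int32.min is already the in-range Python int -2147483648, so no
# wrap-around is needed when the constant is materialized:
print(tf.int32.min)                               # -2147483648
print(tf.constant(tf.int32.min, dtype=tf.int32))  # tf.Tensor(-2147483648, shape=(), dtype=int32)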
Version info:
- python: 3.9
- keras: 3.9.0
- numpy: 1.26.4
- tensorflow: 2.16.2
Comment From: cm6n
Thanks @wilsbj! That does look like the same issue. When I run that example I hit the same OverflowError in the same spot I was hitting earlier in my project. I was running these versions:
- tensorflow: 2.17.1
- keras: 3.9.0
- Python 3.12.9
Comment From: badshroud
Really helpful, thanks @wilsbj! I tried lots of methods and still couldn't figure it out. I think it's some kind of bug on Windows.
Comment From: wilsbj
When executing in graph mode, tf.constant(0x80000000, dtype=tf.int32) calls tensor_util.make_tensor_proto, i.e.:
import tensorflow as tf
from tensorflow.python.framework.tensor_util import make_tensor_proto
make_tensor_proto(0x80000000, tf.int32)
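The graph-mode path can also be reproduced end to end by tracing the constant inside a tf.function (a minimal sketch of my own, assuming an affected Windows/numpy setup):

import tensorflow as tf

@tf.function
def make_const():
    # During tracing, the constant is built via make_tensor_proto,
    # which is where the OverflowError surfaces on affected setups.
    return tf.constant(0x80000000, dtype=tf.int32)

make_const()  # OverflowError here; the same line is fine in eager mode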
In tensorflow 2.16 & 2.17, make_tensor_proto in turn tries to create a numpy array from the value and dtype provided:
np.array(0x80000000, dtype=np.int32)
On Windows with numpy 1.26.4 (as well as other versions), this fails with OverflowError: Python int too large to convert to C long.
Note that conversion of out-of-bound Python integers was deprecated in numpy 1.24.0 and became an error in numpy 2.0 (see the numpy release notes). As such, relying on overflow should be avoided.
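To spell out the platform split as I understand it (my own sketch, not from the TF sources):

import numpy as np

# Windows, numpy 1.26 (C long is 32-bit): OverflowError at creation time.
# Linux/macOS, numpy 1.24-1.26 (C long is 64-bit): DeprecationWarning,
# and the value wraps to -2147483648.
# numpy >= 2.0: OverflowError on all platforms.
np.array(0x80000000, dtype=np.int32)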
However, it seems that tensorflow 2.18.0 has added a numpy compatibility layer, which essentially converts the above failing step into the following (see the tensorflow release notes):
np.array(0x80000000).astype(np.int32)
which succeeds on both numpy 1.26 and numpy 2.x.
This could all be fixed very simply by setting the value with tf.int32.min rather than using the hard-coded hexadecimal Python integer and relying on overflow. But given that this is also fixed by updating tensorflow to 2.18 or later, I'm not sure whether to proceed.