Hit overflow error while training tf.keras.Sequential classifier on Windows
Specifically: OverflowError: Python int too large to convert to C long, raised by
tf.constant(0x80000000, dtype=tf.int32)
in signbit in Keras's TensorFlow backend (keras/src/backend/tensorflow/numpy.py, line 2068).
However, if I just import tensorflow and run that line in isolation, there is no OverflowError:
>>> import tensorflow as tf
>>> tf.constant(0x80000000, dtype=tf.int32)
<tf.Tensor: shape=(), dtype=int32, numpy=-2147483648>
More info on when it fails: although the failing line doesn't actually use x, the argument x at the time of failure is the following:
type(x)
<class 'tensorflow.python.framework.ops.SymbolicTensor'>
print(x)
Tensor("Cast_2:0", shape=(None, 2), dtype=float32)
Traceback (most recent call last):
...
File "C:\Users\chris\code\py\bugid\training\cli.py", line 70, in train
report = classifier.train(features, labels)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\chris\code\py\bugid\training\model.py", line 85, in train
self.model.fit(
File "C:\Users\chris\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\keras\src\utils\traceback_utils.py", line 122, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\chris\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.12_qbz5n2kfra8p0\LocalCache\local-packages\Python312\site-packages\keras\src\backend\tensorflow\numpy.py", line 2081, in signbit
tf.constant(0x80000000, dtype=tf.int32),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
OverflowError: Python int too large to convert to C long
Project where I'm seeing this: https://github.com/cm6n/bugid
Comment From: dhantule
Hi @cm6n, Thanks for reporting this. Can you provide some standalone code to reproduce?
Comment From: wilsbj
Hi @dhantule, I've just been hitting the same error while using an entirely different project, sktime. Like @cm6n, I am also running this on Windows.
The same line in signbit(x) is throwing this error when x is a SymbolicTensor. The following simplified code snippet should reproduce the error:
import keras
import numpy as np

np.random.seed(0)
X = np.random.random((10, 1, 20))
y = np.array([[1., 0.],
              [0., 1.],
              [0., 1.],
              [1., 0.],
              [0., 1.],
              [1., 0.],
              [0., 1.],
              [1., 0.],
              [0., 1.],
              [1., 0.]])
input_layer = keras.layers.Input((1, 20))
output_layer = keras.layers.Dense(units=2, activation='sigmoid')(input_layer)
optimizer = keras.optimizers.RMSprop()
model = keras.models.Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
model.fit(X, y)  # OverflowError is raised from signbit during fit
I have noted that the error does not appear if metrics=['accuracy'] is not provided to model.compile.
I have also noted that the error does not occur (at least on my machine) if I patch out this line in signbit as follows:
- tf.constant(0x80000000, dtype=tf.int32),
+ tf.constant(tf.int32.min, dtype=tf.int32),
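As a quick sanity check (a snippet of my own, not from the Keras sources), tf.int32.min yields the same value the hex literal was wrapping to, without relying on overflow:

import tensorflow as tf

# tf.int32.min is already the in-range Python int -2147483648, so no
# wrap-around is needed when the constant is materialized:
print(tf.int32.min)                               # -2147483648
print(tf.constant(tf.int32.min, dtype=tf.int32))  # tf.Tensor(-2147483648, shape=(), dtype=int32)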
Version info:
- python: 3.9
- keras: 3.9.0
- numpy: 1.26.4
- tensorflow: 2.16.2
Comment From: cm6n
Thanks @wilsbj! That does look like the same issue. When I run that example I hit the same OverflowError in the same spot I was hitting earlier in my project. I was running these versions:
- tensorflow: 2.17.1
- keras: 3.9.0
- Python 3.12.9
Comment From: badshroud
Really helpful, thanks @wilsbj! I tried lots of methods and still couldn't figure it out. I think it's some kind of bug on Windows.
Comment From: wilsbj
When executing in graph mode, tf.constant(0x80000000, dtype=tf.int32) calls tensor_util.make_tensor_proto, i.e.:
import tensorflow as tf
from tensorflow.python.framework.tensor_util import make_tensor_proto
make_tensor_proto(0x80000000, tf.int32)
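The graph-mode path can also be reproduced end to end by tracing the constant inside a tf.function (a minimal sketch of my own, assuming an affected Windows/numpy setup):

import tensorflow as tf

@tf.function
def make_const():
    # During tracing, the constant is built via make_tensor_proto,
    # which is where the OverflowError surfaces on affected setups.
    return tf.constant(0x80000000, dtype=tf.int32)

make_const()  # OverflowError here; the same line is fine in eager mode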
In tensorflow 2.16 & 2.17, make_tensor_proto in turn tries to create a numpy array from the value and dtype provided:
np.array(0x80000000, dtype=np.int32)
On Windows with numpy 1.26.4 (as well as other versions), this fails with OverflowError: Python int too large to convert to C long.
Note that conversion of out-of-bound Python integers was deprecated in numpy 1.24.0 and became an error in numpy 2.0 (see the numpy release notes). As such, relying on overflow should be avoided.
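To spell out the platform split as I understand it (my own sketch, not from the TF sources):

import numpy as np

# Windows, numpy 1.26 (C long is 32-bit): OverflowError at creation time.
# Linux/macOS, numpy 1.24-1.26 (C long is 64-bit): DeprecationWarning,
# and the value wraps to -2147483648.
# numpy >= 2.0: OverflowError on all platforms.
np.array(0x80000000, dtype=np.int32)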
However, it seems that tensorflow 2.18.0 has added a numpy compatibility layer, which essentially converts the above failing step into the following (see the tensorflow release notes):
np.array(0x80000000).astype(np.int32)
which succeeds on both numpy 1.26 and numpy 2.x.
This could all be fixed very simply by setting the value with tf.int32.min rather than using the hard-coded hexadecimal Python integer and relying on overflow. But given that this is also fixed by updating tensorflow to 2.18 or later, I'm not sure whether to proceed.