I am trying to follow the mixed precision tutorial with a custom training loop, specifically:

@tf.function
def train_step(x, y):
  with tf.GradientTape() as tape:
    predictions = model(x)
    loss = loss_object(y, predictions)
    scaled_loss = optimizer.get_scaled_loss(loss)
  scaled_gradients = tape.gradient(scaled_loss, model.trainable_variables)
  gradients = optimizer.get_unscaled_gradients(scaled_gradients)
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))
  return loss

When trying to invoke get_scaled_loss and get_unscaled_gradients, an exception is thrown stating that those methods are undefined.

Maybe this is related to https://github.com/keras-team/keras/issues/19244

Comment From: mehtamansi29

Hi @CaptainDario -

Thanks for reporting the issue. For the mixed precision tutorial with a custom training loop, try using Keras with the TensorFlow backend. Here is a gist that shows the custom training loop running fine. For more details you can find the reference here.
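For context, selecting the backend looks like this (a minimal sketch; note that the KERAS_BACKEND environment variable must be set before Keras is first imported):

import os

# The backend must be chosen before Keras is imported.
os.environ["KERAS_BACKEND"] = "tensorflow"

import keras
print(keras.backend.backend())  # "tensorflow"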

Comment From: CaptainDario

Thank you for your reply. Maybe I am missing something, but I do not see how your answer is using mixed precision.

Comment From: mehtamansi29

Hi @CaptainDario -

In the Keras documentation you can find mixed precision here. In the gist you can find a model using mixed precision, where the model output will be float16.
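For illustration, a minimal sketch of that behavior (assuming the Keras 3 mixed precision API; the layer sizes and data are arbitrary placeholders):

import numpy as np
import keras

# With the mixed_float16 policy, layers compute in float16 but keep
# their variables in float32 for numeric stability.
keras.mixed_precision.set_global_policy("mixed_float16")

inputs = keras.Input(shape=(4,))
outputs = keras.layers.Dense(8)(inputs)
model = keras.Model(inputs, outputs)

x = np.random.rand(2, 4).astype("float32")
print(model(x).dtype)          # float16 (compute dtype)
print(model.weights[0].dtype)  # float32 (variable dtype)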

Comment From: CaptainDario

I still do not understand how this answers my question. Your gist does not even set a mixed precision policy. The documentation states that during mixed precision training, the loss needs to be scaled. However, the provided tutorial does not work. Therefore, I am not asking how to start training in mixed precision, but how to scale the loss correctly.

Comment From: mehtamansi29

Hi @CaptainDario -

In Keras 3, you can use the LossScaleOptimizer to train your model without a custom training loop. LossScaleOptimizer wraps a keras.optimizers.Optimizer instance; its dynamic_growth_steps argument controls how often the scale is updated upwards, and the initial scale can be set as well. Attached gist here for reference.
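For illustration, a minimal sketch of wrapping an optimizer this way (assuming Keras 3's keras.optimizers.LossScaleOptimizer; the tiny model and random data are placeholders):

import numpy as np
import keras

keras.mixed_precision.set_global_policy("mixed_float16")

# Wrap a regular optimizer. The loss scale starts at initial_scale and
# grows after every dynamic_growth_steps consecutive steps with finite
# gradients; steps with non-finite gradients shrink it and are skipped.
optimizer = keras.optimizers.LossScaleOptimizer(
    keras.optimizers.Adam(learning_rate=1e-3),
    initial_scale=2.0 ** 15,
    dynamic_growth_steps=2000,
)

inputs = keras.Input(shape=(4,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)

# With model.fit() the scaling and unscaling happen automatically.
model.compile(optimizer=optimizer, loss="mse")
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)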

Comment From: CaptainDario

Thank you for your reply. Can I just use it as a normal optimizer in a custom training loop?

Comment From: mehtamansi29

Yes @CaptainDario - you can use it as a normal optimizer in a custom training loop. Here is an example of a custom training loop with a normal optimizer.
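For completeness, here is the loop from the original question rewritten as a sketch against the Keras 3 API (the assumptions here: optimizer.scale_loss replaces the old get_scaled_loss, and LossScaleOptimizer unscales the gradients itself inside apply_gradients, so there is no get_unscaled_gradients step; the model and data are placeholders):

import numpy as np
import tensorflow as tf
import keras

keras.mixed_precision.set_global_policy("mixed_float16")

inputs = keras.Input(shape=(4,))
outputs = keras.layers.Dense(1)(inputs)
model = keras.Model(inputs, outputs)

loss_object = keras.losses.MeanSquaredError()
optimizer = keras.optimizers.LossScaleOptimizer(keras.optimizers.Adam())

@tf.function
def train_step(x, y):
  with tf.GradientTape() as tape:
    predictions = model(x, training=True)
    loss = loss_object(y, predictions)
    # scale_loss is a no-op on plain optimizers; LossScaleOptimizer
    # multiplies the loss by the current loss scale.
    scaled_loss = optimizer.scale_loss(loss)
  gradients = tape.gradient(scaled_loss, model.trainable_variables)
  # apply_gradients unscales the gradients (and skips non-finite steps)
  # internally, so no explicit get_unscaled_gradients call is needed.
  optimizer.apply_gradients(zip(gradients, model.trainable_variables))
  return loss

x = np.random.rand(8, 4).astype("float32")
y = np.random.rand(8, 1).astype("float32")
print(float(train_step(x, y)))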

Comment From: CaptainDario

Thank you!
