mindspore.ops.LARSUpdate
class mindspore.ops.LARSUpdate(*args, **kwargs)
Conducts LARS (layer-wise adaptive rate scaling) update on the sum of squares of gradient.
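- Parameters
epsilon (float) - Term added to the denominator to improve numerical stability. Default: 1e-05.
hyperpara (float) - Trust coefficient for calculating the local learning rate. Default: 0.001.
use_clip (bool) - Whether to use the clip operation when calculating the local learning rate. Default: False.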
- Inputs:
weight (Tensor) - A tensor, representing the weight.
gradient (Tensor) - The gradient of weight, which has the same shape and dtype as weight.
norm_weight (Tensor) - A scalar tensor, representing the sum of squares of weight.
norm_gradient (Tensor) - A scalar tensor, representing the sum of squares of gradient.
weight_decay (Union[Number, Tensor]) - Weight decay. It must be a scalar tensor or number.
learning_rate (Union[Number, Tensor]) - Learning rate. It must be a scalar tensor or number.
- Outputs:
Tensor, representing the new gradient.
- Raises
TypeError – If epsilon or hyperpara is not a float.
TypeError – If use_clip is not a bool.
TypeError – If weight, gradient, norm_weight or norm_gradient is not a Tensor.
TypeError – If weight_decay or learning_rate is neither a Number nor a Tensor.
TypeError – If the shape of gradient is not the same as the shape of weight.
- Supported Platforms:
Ascend
Examples
>>> from mindspore import Tensor
>>> from mindspore.ops import operations as ops
>>> import mindspore.nn as nn
>>> import numpy as np
>>> class Net(nn.Cell):
...     def __init__(self):
...         super(Net, self).__init__()
...         self.lars = ops.LARSUpdate()
...         self.reduce = ops.ReduceSum()
...         self.square = ops.Square()
...     def construct(self, weight, gradient):
...         w_square_sum = self.reduce(self.square(weight))
...         grad_square_sum = self.reduce(self.square(gradient))
...         grad_t = self.lars(weight, gradient, w_square_sum, grad_square_sum, 0.0, 1.0)
...         return grad_t
...
>>> np.random.seed(0)
>>> weight = np.random.random(size=(2, 3)).astype(np.float32)
>>> gradient = np.random.random(size=(2, 3)).astype(np.float32)
>>> net = Net()
>>> output = net(Tensor(weight), Tensor(gradient))
>>> print(output)
[[0.00036534 0.00074454 0.00080456]
 [0.00032014 0.00066101 0.00044157]]
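Because the operator runs only on Ascend, a plain NumPy reference can help when reasoning about its output on other machines. The sketch below is not the operator's implementation: `lars_update_reference` is a hypothetical helper, the defaults `epsilon=1e-5` and `hyperpara=0.001` are assumed values, and the formula is the usual LARS local learning rate applied to the gradient. With `weight_decay=0.0` it agrees with the values printed in the example above; the `weight_decay` and `use_clip` branches follow the common LARS formulation and have not been checked against the operator.

```python
import numpy as np


def lars_update_reference(weight, gradient, norm_weight, norm_gradient,
                          weight_decay, learning_rate,
                          epsilon=1e-5, hyperpara=0.001, use_clip=False):
    """NumPy sketch of a LARS gradient update (assumed formula, not the Ascend kernel).

    norm_weight and norm_gradient are sums of squares, as in the Inputs list,
    so the L2 norms are their square roots.
    """
    w_norm = np.sqrt(norm_weight)
    g_norm = np.sqrt(norm_gradient)
    # Local learning-rate coefficient: trust coefficient times ||w|| / ||g||,
    # with weight decay and epsilon in the denominator.
    coeff = hyperpara * w_norm / (g_norm + weight_decay * w_norm + epsilon)
    if use_clip:
        # Assumption: clipping bounds the ratio of local to global learning rate.
        coeff = np.clip(coeff / learning_rate, 0.0, 1.0)
    # Scale the weight-decayed gradient by the coefficient.
    return coeff * (gradient + weight_decay * weight)


if __name__ == "__main__":
    np.random.seed(0)
    weight = np.random.random(size=(2, 3)).astype(np.float32)
    gradient = np.random.random(size=(2, 3)).astype(np.float32)
    new_grad = lars_update_reference(
        weight, gradient,
        np.sum(np.square(weight)), np.sum(np.square(gradient)),
        weight_decay=0.0, learning_rate=1.0,
    )
    print(new_grad)
```

Note that with `use_clip=False` the `learning_rate` argument does not affect the result in this sketch, which matches the example passing `1.0` without any visible scaling.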