当前位置: 首页
AI
一文搞懂Paddle2.0中的学习率

一文搞懂Paddle2.0中的学习率

热心网友 时间:2025-07-18
转载
本项目围绕深度学习中重要的学习率超参数展开,旨在掌握Paddle2.0学习率API使用、分析不同学习率性能及尝试自定义学习率。使用蜜蜂黄蜂分类数据集,对Paddle2.0的13种自带学习率及自定义的CLR、Adjust_lr进行训练测试,结果显示自定义CLR效果最佳,LinearWarmup等也值得尝试。

一文搞懂paddle2.0中的学习率 - 游乐网

1. 项目简介

对于深度学习中的学习率是一个重要的超参数。在本项目中我们将从三个不同方面来理解学习率:

免费影视、动漫、音乐、游戏、小说资源长期稳定更新! 👉 点此立即查看 👈

学会Paddle2.0中自带的学习率API的使用方法。分析不同学习率的性能优劣。尝试通过自定义学习率来复现目前Paddle中没有的学习率算法。

那么什么是学习率呢?

我们知道,深度学习的目标是找到可以满足我们任务的函数,而要找到这个函数就需要确定这个函数的参数。为了达到这个目的我们需要首先设计 一个损失函数,然后让我们的数据输入到待确定参数的备选函数中时这个损失函数可以有最小值。所以,我们就要不断的调整备选函数的参数值,这个调整的过程就是所谓的梯度下降。而参数值调整的幅度大小可以由学习率来控制。由此可见,学习率是深度学习中非常重要的一个概念。值得庆幸的是,深度学习框架Paddle提供了很多可以拿来就用的学习率函数。本项目就来研究一下这些Paddle中的学习率函数。对于想要自力更生设计学习率函数的同学,本项目也提供了如何进行自定义学习率函数供参考交流。

2. 项目中用到的数据集

本项目使用的数据集是蜜蜂黄蜂分类数据集。包含4个类别:蜜蜂、黄蜂、其它昆虫和其它类别。 共7939张图片,其中蜜蜂3183张,黄蜂4943张,其它昆虫2439张,其它类别856张

2.1 解压缩数据集

In [ ]
!unzip -q data/data65386/beesAndwasps.zip -d work/dataset
登录后复制

2.2 查看图片

In [ ]
import osimport randomfrom matplotlib import pyplot as pltfrom PIL import Imageimgs = []paths = os.listdir('work/dataset')for path in paths:       img_path = os.path.join('work/dataset', path)    if os.path.isdir(img_path):        img_paths = os.listdir(img_path)        img = Image.open(os.path.join(img_path, random.choice(img_paths)))        imgs.append((img, path))f, ax = plt.subplots(2, 3, figsize=(12,12))for i, img in enumerate(imgs[:]):    ax[i//3, i%3].imshow(img[0])    ax[i//3, i%3].axis('off')    ax[i//3, i%3].set_title('label: %s' % img[1])plt.show()plt.show()
登录后复制
登录后复制

2.3 数据预处理

In [ ]
!python code/preprocess.py
登录后复制
finished data preprocessing
登录后复制

3. Paddle2.0中自带的学习率

Paddle2.0 中定义了如下13种不同的学习率算法。本项目在比较不同的学习率算法性能时将采用相同的优化器算法Momentum。

3.1 使用余弦退火学习率

使用方法为:paddle.optimizer.lr.CosineAnnealingDecay(learning_rate, T_max, eta_min=0, last_epoch=- 1, verbose=False)
该算法来自于论文SGDR: Stochastic Gradient Descent with Warm Restarts。

这差不多是最有名的学习率算法了。

更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [1]
## 开始训练!python code/train.py --lr 'CosineAnnealingDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图1 CosineAnnealingDecay训练验证图

In [13]
## 查看测试集上的效果!python code/test.py --lr 'CosineAnnealingDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 22:14:08.330015  4460 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 22:14:08.335014  4460 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9472 - 734ms/step        Eval samples: 1763
登录后复制

3.2 使用指数衰减学习率

使用方法为:paddle.optimizer.lr.ExponentialDecay(learning_rate, gamma, last_epoch=-1, verbose=False)

该接口提供一种学习率按指数函数衰减的策略。

更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [2]
## 开始训练!python code/train.py --lr 'ExponentialDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图2 ExponentialDecay训练验证图

In [12]
## 查看测试集上的效果!python code/test.py --lr 'ExponentialDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 22:11:58.334306  4184 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 22:11:58.339387  4184 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.7686 - 854ms/step        Eval samples: 1763
登录后复制

3.3 使用逆时间衰减学习率

使用方法为:paddle.optimizer.lr.InverseTimeDecay(learning_rate, gamma, last_epoch=- 1, verbose=False)

该接口提供逆时间衰减学习率的策略,即学习率与当前衰减次数成反比。

更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [3]
## 开始训练!python code/train.py --lr 'InverseTimeDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图3 InverseTimeDecay训练验证图

In [10]
## 查看测试集上的效果!python code/test.py --lr 'InverseTimeDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 22:07:21.973084  3618 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 22:07:21.977885  3618 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9183 - 729ms/step        Eval samples: 1763
登录后复制

3.4 使用Lambda衰减学习率

使用方法为:paddle.optimizer.lr.LambdaDecay(learning_rate, lr_lambda, last_epoch=-1, verbose=False)

该接口提供lambda函数设置学习率的策略,lr_lambda 为一个lambda函数,其通过epoch计算出一个因子,该因子会乘以初始学习率。

更新公式如下:

lr_lambda = lambda epoch: 0.95 ** epoch
登录后复制In [4]
## 开始训练!python code/train.py --lr 'LambdaDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图4 LambdaDecay训练验证图

In [9]
## 查看测试集上的效果!python code/test.py --lr 'LambdaDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 22:04:35.426102  3249 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 22:04:35.431219  3249 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.8667 - 731ms/step        Eval samples: 1763
登录后复制

3.5 使用线性热身学习率

使用方法为:paddle.optimizer.lr.LinearWarmup(learing_rate, warmup_steps, start_lr, end_lr, last_epoch=-1, verbose=False)
该算法来自于论文SGDR: Stochastic Gradient Descent with Warm Restarts。

该接口提供一种学习率优化策略-线性学习率热身(warm up)对学习率进行初步调整。在正常调整学习率之前,先逐步增大学习率。

热身时,学习率更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [5]
## 开始训练!python code/train.py --lr 'LinearWarmup'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图5 LinearWarmup训练验证图

In [8]
## 查看测试集上的效果!python code/test.py --lr 'LinearWarmup'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 22:02:36.454607  2972 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 22:02:36.459585  2972 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9563 - 760ms/step        Eval samples: 1763
登录后复制

3.6 使用多阶衰减学习率

使用方法为:paddle.optimizer.lr.MultiStepDecay(learning_rate, milestones, gamma=0.1, last_epoch=- 1, verbose=False)

该接口提供一种学习率按指定轮数进行衰减的策略。

学习率更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [6]
## 开始训练!python code/train.py --lr 'MultiStepDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图6 MultiStepDecay训练验证图

In [7]
## 查看测试集上的效果!python code/test.py --lr 'MultiStepDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 21:59:52.387367  2630 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 21:59:52.392359  2630 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.6982 - 736ms/step        Eval samples: 1763
登录后复制

3.7 使用自然指数衰减学习率

使用方法为:paddle.optimizer.lr.NaturalExpDecay(learning_rate, gamma, last_epoch=-1, verbose=False)

该接口提供按自然指数衰减学习率的策略。

学习率更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [7]
## 开始训练!python code/train.py --lr 'NaturalExpDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图7 NaturalExpDecay训练验证图

In [6]
## 查看测试集上的效果!python code/test.py --lr 'NaturalExpDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 21:57:41.858023  2325 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 21:57:41.863117  2325 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.7379 - 726ms/step        Eval samples: 1763
登录后复制

3.8 使用Noam衰减学习率

使用方法为:paddle.optimizer.lr.NoamDecay(d_model, warmup_steps, learning_rate=1.0, last_epoch=-1, verbose=False)

该接口提供Noam衰减学习率的策略。

学习率更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [8]
## 开始训练!python code/train.py --lr 'NoamDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图8 NoamDecay训练验证图

In [5]
## 查看测试集上的效果!python code/test.py --lr 'NoamDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 21:55:24.339512  1995 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 21:55:24.344694  1995 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9490 - 749ms/step        Eval samples: 1763
登录后复制

3.9 使用分段衰减学习率

使用方法为:paddle.optimizer.lr.PiecewiseDecay(boundaries, values, last_epoch=-1, verbose=False)

该接口提供分段设置学习率的策略。

In [9]
## 开始训练!python code/train.py --lr 'PiecewiseDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图9 PiecewiseDecay训练验证图

In [4]
## 查看测试集上的效果!python code/test.py --lr 'PiecewiseDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 21:52:38.228776  1695 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 21:52:38.233954  1695 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9416 - 749ms/step        Eval samples: 1763
登录后复制

3.10 使用多项式衰减学习率

使用方法为:paddle.optimizer.lr.PolynomialDecay(learning_rate, decay_steps, end_lr=0.0001, power=1.0, cycle=False, last_epoch=-1, verbose=False)

该接口提供学习率按多项式衰减的策略。通过多项式衰减函数,使得学习率值逐步从初始的learning_rate衰减到 end_lr。

In [10]
## 开始训练!python code/train.py --lr 'PolynomialDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图10 PolynomialDecay训练验证图

In [3]
## 查看测试集上的效果!python code/test.py --lr 'PolynomialDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 21:50:10.634958  1327 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 21:50:10.639853  1327 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9325 - 753ms/step        Eval samples: 1763
登录后复制

3.11 Loss自适应的学习率

使用方法为:paddle.optimizer.lr.ReduceOnPlateau(learning_rate, mode='min', factor=0.1, patience=10, threshold=1e-4, threshold_mode='rel', cooldown=0, min_lr=0, epsilon=1e-8, verbose=False)

该接口提供Loss自适应学习率的策略。如果 loss 停止下降超过patience个epoch,学习率将会衰减为 learning_rate * factor。每降低一次学习率后,将会进入一个时长为cooldown个epoch的冷静期,在冷静期内,将不会监控loss的变化情况,也不会衰减。在冷静期之后,会继续监控loss的上升或下降。

In [11]
## 开始训练!python code/train_reduceonplateau.py
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图11 ReduceOnPlateau训练验证图

In [22]
## 查看测试集上的效果!python code/test.py --lr 'ReduceOnPlateau'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 00:26:51.685511 20016 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1W0508 00:26:51.690647 20016 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9365 - 706ms/step        Eval samples: 1763
登录后复制

3.12 阶段衰减学习率

使用方法为:paddle.optimizer.lr.StepDecay(learning_rate, step_size, gamma=0.1, last_epoch=-1, verbose=False)

该接口提供一种学习率按指定间隔轮数衰减的策略。

In [12]
## 开始训练!python code/train.py --lr 'StepDecay'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图12 StepDecay训练验证图

In [ ]
## 查看测试集上的效果!python code/test.py --lr 'StepDecay'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0507 23:30:52.671571 14817 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1W0507 23:30:52.676681 14817 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.5315 - 733ms/step        Eval samples: 1763
登录后复制

4. 自定义学习率算法

自定义学习率除了可以使用Paddle2.0中提供的学习率算法基类paddle.optimizer.lr.LRScheduler以外,还可以完全不依赖基类自定义学习率,这里采用自定义的方法实现循环学习率算法CLR和采用基类paddle.optimizer.lr.LRScheduler实现每隔一定epoch衰减学习率的Adjust_lr。

4.1 自定义学习率算法CLR

CLR方法来自于论文Cyclical Learning Rates for Training Neural Networks

根据论文的描述,CLR方法可以在训练过程中周期性的增大和减小学习率,从而使学习率始终在最优点附近徘徊。

更新公式如下:

一文搞懂Paddle2.0中的学习率 - 游乐网In [13]
## 开始训练!python code/train_clr.py
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图13 CLR训练验证图

In [1]
## 查看测试集上的效果!python code/test.py --lr 'clr'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0508 21:37:39.301434   105 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1W0508 21:37:39.306394   105 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9575 - 731ms/step        Eval samples: 1763
登录后复制

4.2 自定义学习率算法Adjust_lr

这里实现的Adjust_lr方法的更新算法是:在整个10个epoch里,每隔2个epoch将学习率衰减为原来的0.95倍。

In [1]
## 开始训练!python code/train.py --lr 'Adjust_lr'
登录后复制

可视化结果

一文搞懂Paddle2.0中的学习率 - 游乐网

图14 Adjust_lr训练验证图

In [2]
## 查看测试集上的效果!python code/test.py --lr 'Adjust_lr'
登录后复制
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations  def convert_to_list(value, n, name, dtype=np.int):W0511 21:55:19.051586  6496 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.0, Runtime API Version: 10.1W0511 21:55:19.055864  6496 device_context.cc:372] device: 0, cuDNN Version: 7.6.Eval begin...The loss value printed in the log is the current batch, and the metric is the average value of previous step.step 28/28 [==============================] - acc: 0.9308 - 713ms/step        Eval samples: 1763
登录后复制

5. 结果比较

本项目中所有学习率的性能比较如下图所示,其中自定义的循环学习率CLR效果最好,建议使用。其次为LinearWarmup 、NoamDecay、CosineAnnealingDecay和PiecewiseDecay,值得尝试。

一文搞懂Paddle2.0中的学习率 - 游乐网

图14 学习率性能比较

来源:https://www.php.cn/faq/1414346.html

游乐网为非赢利性网站,所展示的游戏/软件/文章内容均来自于互联网或第三方用户上传分享,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系youleyoucom@outlook.com。

同类文章
更多
逼AI当山顶洞人!Claude防话痨插件爆火,网友:受够了AI废话

逼AI当山顶洞人!Claude防话痨插件爆火,网友:受够了AI废话

新智元报道编辑:元宇【新智元导读】一个让AI像原始人一样说话的插件,在HN上一夜爆火,冲破2w星。它的核心只是一条简单粗暴的prompt:删掉冠词、客套和一切废话,号称能省下75%的输出token。

时间:2026-04-07 14:55
季度利润翻 8 倍,最赚钱的「卖铲人」财报背后,内存涨价狂潮如何收场?

季度利润翻 8 倍,最赚钱的「卖铲人」财报背后,内存涨价狂潮如何收场?

AI 时代最赚钱的公司,可能从来不是做 AI 的那个。作者|张勇毅编辑|靖宇淘金热里最稳赚的人,从来不是淘金的,是卖铲子的。这句老话在 2026 年的科技行业又应验了一次。只不过这次卖铲子的不是英伟

时间:2026-04-07 14:49
Claude Code Harness+龙虾科研团来了!金字塔分层架构+多智能体

Claude Code Harness+龙虾科研团来了!金字塔分层架构+多智能体

Claw AI Lab团队量子位 | 公众号 QbitAI你还在一个人做科研吗?科研最难的,从来不是问题本身,而是一个想法从文献到实验再到写作,只能靠自己一点点往前推。一个人方向偏了没人提醒,遇到歧

时间:2026-04-07 14:43
让离线强化学习从「局部描摹」变「全局布局」丨ICLR'26

让离线强化学习从「局部描摹」变「全局布局」丨ICLR'26

面对复杂连续任务的长程规划,现有的生成式离线强化学习方法往往会暴露短板。它们生成的轨迹经常陷入局部合理但全局偏航的窘境。它们太关注眼前的每一步,却忘了最终的目的地。针对这一痛点,厦门大学和香港科技大

时间:2026-04-07 14:37
美国犹他州启动新试点项目:AI为患者开具精神类药物处方

美国犹他州启动新试点项目:AI为患者开具精神类药物处方

IT之家 4 月 5 日消息,据外媒 PC Mag 当地时间 4 月 4 日报道,美国医疗机构 Legion Health 在犹他州获得监管批准,启动一项试点项目,允许 AI 系统为患者开具精神类药

时间:2026-04-07 14:30
热门专题
更多
刀塔传奇破解版无限钻石下载大全 刀塔传奇破解版无限钻石下载大全
洛克王国正式正版手游下载安装大全 洛克王国正式正版手游下载安装大全
思美人手游下载专区 思美人手游下载专区
好玩的阿拉德之怒游戏下载合集 好玩的阿拉德之怒游戏下载合集
不思议迷宫手游下载合集 不思议迷宫手游下载合集
百宝袋汉化组游戏最新合集 百宝袋汉化组游戏最新合集
jsk游戏合集30款游戏大全 jsk游戏合集30款游戏大全
宾果消消消原版下载大全 宾果消消消原版下载大全
  • 日榜
  • 周榜
  • 月榜
热门教程
更多
  • 游戏攻略
  • 安卓教程
  • 苹果教程
  • 电脑教程