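Building YOLOX with the Paddle Framework
This article analyzes the structure of YOLOX and builds the network with Paddle. YOLOX is derived from YOLOv3 and adds improvements such as the Decoupled Head. The network consists of three parts: the CSPDarknet backbone, the PAN Head, and the YOLO Head. The article builds each component (ConvBlock and others), assembles the components into these parts, and tests each part to verify that the network structure is correct. Training and prediction are left for a later discussion.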

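YOLOX Architecture Analysis and Network Construction with Paddle
This Notebook analyzes the network structure of YOLOX and builds it with the PaddlePaddle framework.
Note: this Notebook covers only network construction. Training and prediction will be discussed in a later Notebook.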
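1. Introduction to YOLOX
YOLOX was developed by Megvii (旷视科技) as an improvement on YOLOv3. The main improvements are the Decoupled Head, Anchor Free detection, SimOTA, and data augmentation. In addition, to allow a comparison with YOLOv5, the network adopts YOLOv5's Focus, CSPNet, PAN Head, and SiLU activation.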
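1.1 Decoupled Head
A decoupled head is standard in academic one-stage detectors. Earlier YOLO versions, however, used a coupled prediction head in which classification and regression are produced by the same 1x1 convolution.
The authors found experimentally that the End2End YOLOX was consistently 4-5 points lower than the standard YOLOX. When they happened to replace the original YOLO Head with a Decoupled Head, the gap narrowed significantly, which led them to conclude that the expressive power of the YOLO Head may be insufficient.
In YOLOX, the YOLO Head implements classification and regression in separate branches and merges them only at prediction time. After weighing speed against accuracy, the final design first applies one 1x1 convolution to reduce the channel dimension and then uses two 3x3 convolutions in each of the classification and regression branches.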
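The speed and accuracy trade-off mentioned above can be made concrete by counting convolution weights. The following is a minimal sketch, not taken from the original Notebook: the function names are hypothetical, the coupled head is assumed to predict one box per cell, and bias and BatchNorm parameters are ignored.

```python
def conv_weights(c_in, c_out, k):
    """Number of weights in a k x k convolution (bias and BatchNorm ignored)."""
    return k * k * c_in * c_out


def coupled_head_weights(c_in, num_classes):
    # One 1x1 convolution predicts box (4), objectness (1) and classes together.
    return conv_weights(c_in, 4 + 1 + num_classes, 1)


def decoupled_head_weights(c_in, num_classes, hidden=256):
    # One 1x1 convolution reduces the channels first.
    stem = conv_weights(c_in, hidden, 1)
    # Each branch (classification, regression) stacks two 3x3 convolutions.
    branch = 2 * conv_weights(hidden, hidden, 3)
    # Three 1x1 prediction convolutions: classes, box, objectness.
    preds = (conv_weights(hidden, num_classes, 1)
             + conv_weights(hidden, 4, 1)
             + conv_weights(hidden, 1, 1))
    return stem + 2 * branch + preds


print(coupled_head_weights(256, 80))    # 21760
print(decoupled_head_weights(256, 80))  # 2446592
```

Under these assumptions the decoupled head is roughly two orders of magnitude heavier per feature level, which is why the 1x1 channel reduction in front of the two branches matters.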
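1.2 Anchor Free
Going anchor free has the following benefits:
- Lower time cost. To reach their best performance, anchor-based detectors need a clustering analysis of the anchor boxes, which adds time cost.
- A simpler detection head and fewer outputs. Anchor-based detectors increase the complexity of the detection head and the number of generated results. Moving a large number of detection results from the GPU to the CPU is intolerable on edge devices.
- Simpler, more readable code. The decoding logic of an anchor-free detector is simpler and easier to read.
Anchor-free techniques can now be applied to YOLO with improved rather than reduced performance, and this is closely tied to sample matching.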
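To illustrate why anchor-free decoding is short, here is a minimal sketch that follows the convention commonly used for YOLOX outputs (center offset added to the grid index, width and height passed through an exponential, both scaled by the stride). The Notebook does not show its decoding step, so the function names and this exact convention are assumptions.

```python
import math


def decode_cell(raw, grid_x, grid_y, stride):
    """Decode one anchor-free prediction (dx, dy, dw, dh) into a box in input pixels."""
    dx, dy, dw, dh = raw
    cx = (grid_x + dx) * stride   # box center x
    cy = (grid_y + dy) * stride   # box center y
    w = math.exp(dw) * stride     # box width
    h = math.exp(dh) * stride     # box height
    return cx, cy, w, h


def num_predictions(input_size=640, strides=(8, 16, 32), anchors_per_cell=1):
    """Total number of predicted boxes over all feature levels."""
    return sum((input_size // s) ** 2 * anchors_per_cell for s in strides)


print(decode_cell((0.5, 0.5, 0.0, 0.0), grid_x=3, grid_y=2, stride=8))  # (28.0, 20.0, 8.0, 8.0)
print(num_predictions())                    # 8400
print(num_predictions(anchors_per_cell=3))  # 25200
```

With one prediction per cell, a 640x640 input yields 8400 boxes; three anchors per cell would triple that number, which is the extra output volume the second benefit refers to.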
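1.3 Sample Matching: SimOTA
A good sample matching algorithm can naturally alleviate detection problems in crowded scenes, poor detection of objects with extreme aspect ratios, positive-sample imbalance for objects of extreme sizes, and poor detection of rotated objects.
The authors consider four factors important in sample matching:
- Loss/Quality/Prediction Aware. The matching between anchor boxes or anchor points and the Ground Truth is computed from the network's own predictions. This accounts for the different behavior of models with different structure or complexity, and is a dynamic form of sample matching. By contrast, matching based on an IoU threshold, In Grid (YOLOv1), or In Box or Center (FCOS) relies on hand-defined geometric priors and is a suboptimal choice.
- Center prior. In most scenes the centroid of an object is related to its geometric center, so restricting positive samples to a region around the object center solves the problem of unstable convergence well.
- Dynamic k. Objects of different sizes should get different numbers of positive samples. Using the same number for all sizes gives small objects many low-quality positives or leaves large objects with only a few. The key to Dynamic k is choosing k, which can be estimated in a prediction-aware way: the authors first find the 10 predictions closest to each object, then sum the IoUs of these 10 predictions with the Ground Truth to obtain k. The number 10 is not very sensitive; adjusting it between 5 and 15 has almost no effect.
- Global information. Some anchor boxes/points lie on the boundary between positive samples, or between positive and negative samples. Whether such an anchor is positive or negative, and which positive sample it belongs to, should be decided with global information.
In the end, to balance speed, the authors kept only the first three factors and dropped the step that solves for the optimal assignment, turning OTA into SimOTA.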
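The Dynamic k estimate described above can be sketched in a few lines. This is an illustrative version only: `dynamic_k` is a hypothetical name, "closest" is interpreted as "highest IoU", and the lower bound of one positive sample per object is an added assumption.

```python
def dynamic_k(ious, top_n=10):
    """Estimate the number of positive samples for one ground-truth box.

    ious: IoU between this ground-truth box and every candidate prediction.
    """
    # Keep the top_n candidates with the highest IoU.
    top = sorted(ious, reverse=True)[:top_n]
    # Sum their IoUs, truncate to an integer, and keep at least one positive.
    return max(int(sum(top)), 1)


# A large, well-covered object gets many positives.
print(dynamic_k([0.75] * 12))                 # 7
# A moderately covered object gets a few.
print(dynamic_k([0.875, 0.75, 0.5, 0.125]))   # 2
# A small or poorly covered object still gets one.
print(dynamic_k([0.0625, 0.03125]))           # 1
```

Because k is derived from the predictions themselves, it grows as the network's boxes overlap the object better, which is what makes the assignment dynamic.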
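1.4 Data Augmentation
For data augmentation, YOLOX continues to use Mosaic and Mixup. Mosaic stitches four images together, which enriches the backgrounds of the detected objects.
Mosaic was proposed in YOLOv4. The main idea is to randomly crop four images and stitch them into a single training image. This enriches image backgrounds, and because four images are combined it effectively increases the batch size: batch normalization computes its statistics over all four images, so training depends less on the batch size itself.
For details, see the paper: YOLOv4: Optimal Speed and Accuracy of Object Detection
Mixup produces new training data by simple linear interpolation.
For details, see the paper: mixup: Beyond Empirical Risk Minimization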
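The linear interpolation behind Mixup can be written directly. The sketch below uses nested Python lists as stand-ins for images so that it has no dependencies; the function name `mixup` and the fixed mixing weight are illustrative assumptions, and how the labels are combined for detection is not covered here.

```python
def mixup(image_a, image_b, lam):
    """Blend two equally sized images (nested lists of pixel values).

    lam is the weight of image_a; image_b gets weight 1 - lam.
    """
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]


image_a = [[0.0, 2.0], [4.0, 6.0]]
image_b = [[4.0, 6.0], [8.0, 10.0]]
print(mixup(image_a, image_b, 0.5))  # [[2.0, 4.0], [6.0, 8.0]]
```

In the original mixup formulation the mixing weight is sampled randomly per pair rather than fixed.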
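2. Network Structure Analysis
Following the network structure diagram drawn by Bilibili uploader Bubbliiiing, the network can be divided into three parts: the CSPDarknet backbone, the feature-enhancing PAN Head, and the YOLO Head detection head.
In [1]
# Import libraries
import paddle
from paddle import nn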
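2.1 Backbone: CSPDarknet
2.1.1 ConvBlock
The basic convolution block consists of a convolution, batch normalization, and an activation function. It uses same padding and comes in two types: ordinary convolution (BaseConv) and depthwise separable convolution (DWConv).
BaseConv structure diagram
Bottleneck is the residual convolution block. Its main path uses two basic convolution blocks with kernel sizes 1 and 3, the residual path keeps the original input, and the output is the sum of the main path and the residual path.
Bottleneck structure diagram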
In [2]
## Build the convolution block
class BaseConv(nn.Layer):
    def __init__(self, in_channels, out_channels, kernel_size, stride, groups=1, act='silu'):
        super().__init__()
        # Same padding for odd kernel sizes
        padding = (kernel_size - 1) // 2
        self.conv = nn.Conv2D(in_channels, out_channels, kernel_size, stride, padding, groups=groups)
        # Paddle weights the running statistic by momentum, so the PyTorch-style
        # value 0.03 corresponds to 0.97 here
        self.bn = nn.BatchNorm2D(out_channels, momentum=0.97, epsilon=0.001)
        if act == 'silu':
            self.act = nn.Silu()
        elif act == 'relu':
            self.act = nn.ReLU()
        elif act == 'lrelu':
            self.act = nn.LeakyReLU(0.1)
        else:
            raise ValueError(f"unsupported activation: {act}")

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

In [3]
## Build the depthwise separable convolution
class DWConv(nn.Layer):  # Some Problem
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, act='silu'):
        super().__init__()
        # Depthwise convolution: one filter per input channel
        self.dconv = BaseConv(in_channels, in_channels, kernel_size, stride, groups=in_channels, act=act)
        # Pointwise 1x1 convolution: mixes channels
        self.pconv = BaseConv(in_channels, out_channels, 1, 1, groups=1, act=act)

    def forward(self, x):
        x = self.dconv(x)
        return self.pconv(x)

In [4]
## Build the residual structure
class Bottleneck(nn.Layer):
    def __init__(self, in_channels, out_channels, shortcut=True, expansion=0.5, depthwise=False, act="silu"):
        super().__init__()
        hidden_channels = int(out_channels * expansion)
        Conv = DWConv if depthwise else BaseConv
        # 1x1 convolution reduces the channel count (by 50% by default)
        self.conv1 = BaseConv(in_channels, hidden_channels, 1, stride=1, act=act)
        # 3x3 convolution expands the channel count (feature extraction)
        self.conv2 = Conv(hidden_channels, out_channels, 3, stride=1, act=act)
        # The residual sum is only possible when input and output channels match
        self.use_add = shortcut and in_channels == out_channels

    def forward(self, x):
        y = self.conv2(self.conv1(x))
        if self.use_add:
            y = y + x
        return y

In [5]
## Test the convolution modules
x = paddle.ones([1, 3, 640, 640])
conv1 = BaseConv(3, 64, 3, 1)
conv2 = DWConv(3, 64, 3, 1)
block1 = Bottleneck(3, 64)
print(conv1(x).shape)
print(conv2(x).shape)
print(block1(x).shape)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:653: UserWarning: When training, we now always track global mean and variance.
  "When training, we now always track global mean and variance.")
[1, 64, 640, 640]
[1, 64, 640, 640]
[1, 64, 640, 640]
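The same-padding rule used by BaseConv, `padding = (kernel_size - 1) // 2`, can be checked with the standard convolution output-size formula. The helper below is a small sketch with a hypothetical name and assumes odd kernel sizes and no dilation.

```python
def conv_out_size(size, kernel_size, stride):
    """Spatial output size of a convolution that pads by (kernel_size - 1) // 2."""
    padding = (kernel_size - 1) // 2
    return (size + 2 * padding - kernel_size) // stride + 1


print(conv_out_size(640, 3, 1))  # 640: stride 1 keeps the resolution
print(conv_out_size(640, 1, 1))  # 640: a 1x1 convolution needs no padding
print(conv_out_size(640, 3, 2))  # 320: stride 2 halves the resolution
```

This matches the shapes printed by the test cell: with stride 1, the 640x640 input stays 640x640 regardless of whether the kernel size is 1 or 3.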
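2.1.2 Focus
Focus first appeared in YOLOv5 (which has no paper). It takes every other pixel of an image, similar to nearest-neighbor downsampling, which yields four complementary images. The W and H information is concentrated into the channel dimension C, so the number of input channels grows 4x: the stitched image has 12 channels instead of the original 3 RGB channels. A convolution is then applied to the new image, giving a 2x downsampled feature map without information loss.
Focus is intended to speed things up: the author notes that using the Focus layer reduces the amount of computation and the CUDA memory usage.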
In [6]
## Focus layer
class Focus(nn.Layer):
    def __init__(self, in_channels, out_channels, ksize=1, stride=1, act="silu"):
        super().__init__()
        self.conv = BaseConv(in_channels * 4, out_channels, ksize, stride, act=act)

    def forward(self, x):
        # Take the four 2x downsampled slices
        patch_1 = x[..., ::2, ::2]
        patch_2 = x[..., 1::2, ::2]
        patch_3 = x[..., ::2, 1::2]
        patch_4 = x[..., 1::2, 1::2]
        # Concatenate the four slices along the channel dimension
        x = paddle.concat((patch_1, patch_2, patch_3, patch_4), axis=1)
        # Apply the convolution to the concatenated result
        out = self.conv(x)
        return out

In [7]
## Test the Focus module
x = paddle.ones([1, 3, 640, 640])
layer = Focus(3, 64)
print(layer(x).shape)

[1, 64, 320, 320]
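To see that the slicing in Focus loses no information, the same four slices can be taken from a tiny 4x4 plane with plain Python lists. This is only an illustration with a hypothetical helper name; it mirrors the slice order used in the Focus layer above.

```python
def focus_slices(plane):
    """Split one H x W plane (nested lists, even H and W) into four half-resolution planes."""
    return [
        [row[0::2] for row in plane[0::2]],  # even rows, even columns (patch_1)
        [row[0::2] for row in plane[1::2]],  # odd rows, even columns (patch_2)
        [row[1::2] for row in plane[0::2]],  # even rows, odd columns (patch_3)
        [row[1::2] for row in plane[1::2]],  # odd rows, odd columns (patch_4)
    ]


plane = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
patches = focus_slices(plane)
print(patches[0])  # [[0, 2], [8, 10]]
print(patches[3])  # [[5, 7], [13, 15]]

# Every input value appears exactly once across the four patches.
flat = sorted(v for patch in patches for row in patch for v in row)
print(flat == list(range(16)))  # True
```

Each patch has half the height and width, and together they contain every original value once, so one input channel becomes four channels at half resolution.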
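2.1.3 CSPLayer
The main structure of CSPLayer is shown in the figure below: on top of the regular structure, it adds a branch similar to a residual connection.
The main branch extracts features with one basic convolution block followed by N stacked Bottleneck residual blocks. The residual branch uses one basic convolution block. The two branches are then concatenated and passed through one more basic convolution block.
CSPLayer structure diagram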
In [8]
## CSPLayer
class CSPLayer(nn.Layer):
    def __init__(self, in_channels, out_channels, n=1, shortcut=True, expansion=0.5, depthwise=False, act="silu"):
        super().__init__()
        hidden_channels = int(out_channels * expansion)
        # Basic convolution block of the main branch
        self.conv1 = BaseConv(in_channels, hidden_channels, 1, stride=1, act=act)
        # Basic convolution block of the residual branch
        self.conv2 = BaseConv(in_channels, hidden_channels, 1, stride=1, act=act)
        # Basic convolution block applied after concatenating both branches
        self.conv3 = BaseConv(2 * hidden_channels, out_channels, 1, stride=1, act=act)
        # Stack n Bottleneck residual blocks
        res_block = [Bottleneck(hidden_channels, hidden_channels, shortcut, 1.0, depthwise, act=act) for _ in range(n)]
        self.res_block = nn.Sequential(*res_block)

    def forward(self, x):
        # Main branch
        x_main = self.conv1(x)
        x_main = self.res_block(x_main)
        # Residual branch
        x_res = self.conv2(x)
        # Concatenate the main and residual branches along the channel dimension
        x = paddle.concat((x_main, x_res), axis=1)
        # Convolve the concatenated result
        out = self.conv3(x)
        return out

In [9]
## Test the CSPLayer module
x = paddle.ones([1, 3, 640, 640])
layer = CSPLayer(3, 64, 5)
print(layer(x).shape)

[1, 64, 640, 640]
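2.1.4 SPPBottleneck
The main structure of SPPBottleneck is shown in the figure below. Its overall structure is convolution block 1, 4 parallel paths, concatenation, and convolution block 2.
Convolution block 1 halves the number of channels. The 4 paths are the unchanged input and max pooling with window sizes 5, 9, and 13 (stride 1, so the resolution is preserved). The results are concatenated along the channel dimension, and convolution block 2 adjusts the number of output channels.
SPPBottleneck structure diagram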
In [10]
## SPPBottleneck
class SPPBottleneck(nn.Layer):
    def __init__(self, in_channels, out_channels, kernel_sizes=(5, 9, 13), activation="silu"):
        super().__init__()
        hidden_channels = in_channels // 2
        self.conv1 = BaseConv(in_channels, hidden_channels, 1, stride=1, act=activation)
        # The pooling layers run in parallel, so they are kept in a LayerList
        # rather than chained in a Sequential
        self.pool_block = nn.LayerList([nn.MaxPool2D(kernel_size=ks, stride=1, padding=ks // 2) for ks in kernel_sizes])
        conv2_channels = hidden_channels * (len(kernel_sizes) + 1)
        self.conv2 = BaseConv(conv2_channels, out_channels, 1, stride=1, act=activation)

    def forward(self, x):
        x = self.conv1(x)
        # Concatenate the input with each pooled version along the channel dimension
        x = paddle.concat([x] + [pool(x) for pool in self.pool_block], axis=1)
        x = self.conv2(x)
        return x

In [11]
## Test the SPPBottleneck module
x = paddle.ones([1, 3, 640, 640])
layer = SPPBottleneck(3, 64)
print(layer(x).shape)

[1, 64, 640, 640]
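The pooling paths in SPPBottleneck keep the spatial size because they use stride 1 and padding `ks // 2`. The one-dimensional sketch below shows the idea with plain lists; the helper names are hypothetical, and padding is modeled as negative infinity so that padded positions never win the maximum.

```python
def max_pool_1d(values, kernel_size):
    """Stride-1 max pooling with padding kernel_size // 2 (odd kernel sizes)."""
    pad = kernel_size // 2
    padded = [float("-inf")] * pad + list(values) + [float("-inf")] * pad
    return [max(padded[i:i + kernel_size]) for i in range(len(values))]


def spp_concat_channels(in_channels, kernel_sizes=(5, 9, 13)):
    """Channels after concatenating the reduced input with each pooled path."""
    hidden = in_channels // 2
    return hidden * (len(kernel_sizes) + 1)


print(max_pool_1d([1, 5, 2, 0, 3], 3))  # [5, 5, 5, 3, 3]
print(spp_concat_channels(1024))        # 2048
```

The output has the same length as the input, and larger windows simply spread each local maximum over a wider neighborhood, which is how the three window sizes capture context at different scales.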
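2.1.5 CSPDarknet
CSPDarknet is the backbone of YOLOX and performs feature extraction. It outputs three feature maps (for an input of [3, 640, 640], their sizes are [256, 80, 80], [512, 40, 40], and [1024, 20, 20]). Its main structure is shown in the figure below. The blocks involved (Focus, BaseConv, CSPLayer, and SPPBottleneck) are all implemented above, and the code below assembles them:
CSPDarknet structure diagram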
In [12]
## CSPDarknet
class CSPDarknet(nn.Layer):
    def __init__(self, dep_mul, wid_mul, out_features=("dark3", "dark4", "dark5"), depthwise=False, act="silu"):
        super().__init__()
        assert out_features, "please provide output features of Darknet"
        self.out_features = out_features
        Conv = DWConv if depthwise else BaseConv

        # Image size: [3, 640, 640]
        base_channels = int(wid_mul * 64)        # 64 when wid_mul = 1
        base_depth = max(round(dep_mul * 3), 1)  # 3 when dep_mul = 1

        # Stem: Focus for initial feature extraction
        # [-1, 3, 640, 640] -> [-1, 64, 320, 320]
        self.stem = Focus(3, base_channels, ksize=3, act=act)

        # Resblock1 [dark2]
        # [-1, 64, 320, 320] -> [-1, 128, 160, 160]
        self.dark2 = nn.Sequential(
            Conv(base_channels, base_channels * 2, 3, 2, act=act),
            CSPLayer(base_channels * 2, base_channels * 2, n=base_depth, depthwise=depthwise, act=act),
        )

        # Resblock2 [dark3]
        # [-1, 128, 160, 160] -> [-1, 256, 80, 80]
        self.dark3 = nn.Sequential(
            Conv(base_channels * 2, base_channels * 4, 3, 2, act=act),
            CSPLayer(base_channels * 4, base_channels * 4, n=base_depth * 3, depthwise=depthwise, act=act),
        )

        # Resblock3 [dark4]
        # [-1, 256, 80, 80] -> [-1, 512, 40, 40]
        self.dark4 = nn.Sequential(
            Conv(base_channels * 4, base_channels * 8, 3, 2, act=act),
            CSPLayer(base_channels * 8, base_channels * 8, n=base_depth * 3, depthwise=depthwise, act=act),
        )

        # Resblock4 [dark5]
        # [-1, 512, 40, 40] -> [-1, 1024, 20, 20]
        self.dark5 = nn.Sequential(
            Conv(base_channels * 8, base_channels * 16, 3, 2, act=act),
            SPPBottleneck(base_channels * 16, base_channels * 16, activation=act),
            CSPLayer(base_channels * 16, base_channels * 16, n=base_depth, shortcut=False, depthwise=depthwise, act=act),
        )

    def forward(self, x):
        outputs = {}
        x = self.stem(x)
        outputs["stem"] = x
        x = self.dark2(x)
        outputs["dark2"] = x
        # dark3 output feature map: [256, 80, 80]
        x = self.dark3(x)
        outputs["dark3"] = x
        # dark4 output feature map: [512, 40, 40]
        x = self.dark4(x)
        outputs["dark4"] = x
        # dark5 output feature map: [1024, 20, 20]
        x = self.dark5(x)
        outputs["dark5"] = x
        return {k: v for k, v in outputs.items() if k in self.out_features}

In [13]
## Test the CSPDarknet module
x = paddle.ones([1, 3, 640, 640])
net1 = CSPDarknet(1, 1)
outs = net1(x)
print(outs['dark3'].shape, outs['dark4'].shape, outs['dark5'].shape)

[1, 256, 80, 80] [1, 512, 40, 40] [1, 1024, 20, 20]
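The three output shapes follow directly from the width multiplier and the cumulative strides of 8, 16, and 32. The helper below is a sketch with a hypothetical name and assumes the input size is a multiple of 32.

```python
def backbone_feature_shapes(wid_mul=1.0, input_size=640):
    """Expected (channels, height, width) of the three CSPDarknet outputs."""
    base_channels = int(wid_mul * 64)
    return {
        "dark3": (base_channels * 4, input_size // 8, input_size // 8),
        "dark4": (base_channels * 8, input_size // 16, input_size // 16),
        "dark5": (base_channels * 16, input_size // 32, input_size // 32),
    }


print(backbone_feature_shapes(1.0, 640))
# {'dark3': (256, 80, 80), 'dark4': (512, 40, 40), 'dark5': (1024, 20, 20)}
print(backbone_feature_shapes(0.5, 416))
# {'dark3': (128, 52, 52), 'dark4': (256, 26, 26), 'dark5': (512, 13, 13)}
```

For `wid_mul = 1` and a 640x640 input this reproduces the shapes printed by the test cell above.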
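2.2 Feature-Enhancing Pyramid: YOLOPAFPN
YOLOPAFPN is the feature enhancement part of YOLOX and combines FPN and PANet. It fuses the three feature maps from the backbone through repeated upsampling and downsampling, combining feature information at different scales. The overall structure of YOLOPAFPN is as follows:
- The deepest feature map [1024, 20, 20] passes through one 1x1 convolution that adjusts the channels, giving the P5 feature [512, 20, 20]. P5 is upsampled and combined with the middle feature map [512, 40, 40], and a CSPLayer then extracts features to produce the P5_upsample feature [512, 40, 40].
- The P5_upsample feature [512, 40, 40] passes through one 1x1 convolution that adjusts the channels, giving the P4 feature [256, 40, 40]. P4 is upsampled and combined with the shallowest feature map [256, 80, 80], and a CSPLayer then extracts features to produce the P3_out feature [256, 80, 80].
- The P3_out feature [256, 80, 80] is downsampled by one 3x3 convolution and concatenated with P4, and a CSPLayer then extracts features to produce the P4_out feature [512, 40, 40].
- The P4_out feature [512, 40, 40] is downsampled by one 3x3 convolution and concatenated with P5, and a CSPLayer then extracts features to produce the P5_out feature [1024, 20, 20].
YOLOPAFPN structure diagram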
In [14]
## YOLOPAFPN
class YOLOPAFPN(nn.Layer):
    def __init__(self, depth=1.0, width=1.0, in_features=("dark3", "dark4", "dark5"), in_channels=(256, 512, 1024), depthwise=False, act="silu"):
        super().__init__()
        Conv = DWConv if depthwise else BaseConv
        self.backbone = CSPDarknet(depth, width, depthwise=depthwise, act=act)
        self.in_features = in_features
        self.upsample = nn.Upsample(scale_factor=2, mode='nearest')

        # [-1, 1024, 20, 20] -> [-1, 512, 20, 20]
        self.lateral_conv0 = BaseConv(int(in_channels[2] * width), int(in_channels[1] * width), 1, 1, act=act)
        # [-1, 1024, 40, 40] -> [-1, 512, 40, 40]
        self.C3_p4 = CSPLayer(int(2 * in_channels[1] * width), int(in_channels[1] * width), round(3 * depth), False, depthwise=depthwise, act=act)
        # [-1, 512, 40, 40] -> [-1, 256, 40, 40]
        self.reduce_conv1 = BaseConv(int(in_channels[1] * width), int(in_channels[0] * width), 1, 1, act=act)
        # [-1, 512, 80, 80] -> [-1, 256, 80, 80]
        self.C3_p3 = CSPLayer(int(2 * in_channels[0] * width), int(in_channels[0] * width), round(3 * depth), False, depthwise=depthwise, act=act)

        # Bottom-up convolutions
        # [-1, 256, 80, 80] -> [-1, 256, 40, 40]
        self.bu_conv2 = Conv(int(in_channels[0] * width), int(in_channels[0] * width), 3, 2, act=act)
        # [-1, 512, 40, 40] -> [-1, 512, 40, 40]
        self.C3_n3 = CSPLayer(int(2 * in_channels[0] * width), int(in_channels[1] * width), round(3 * depth), False, depthwise=depthwise, act=act)
        # [-1, 512, 40, 40] -> [-1, 512, 20, 20]
        self.bu_conv1 = Conv(int(in_channels[1] * width), int(in_channels[1] * width), 3, 2, act=act)
        # [-1, 1024, 20, 20] -> [-1, 1024, 20, 20]
        self.C3_n4 = CSPLayer(int(2 * in_channels[1] * width), int(in_channels[2] * width), round(3 * depth), False, depthwise=depthwise, act=act)

    def forward(self, x):
        out_features = self.backbone(x)
        feat1, feat2, feat3 = [out_features[f] for f in self.in_features]

        # [-1, 1024, 20, 20] -> [-1, 512, 20, 20]
        P5 = self.lateral_conv0(feat3)
        # [-1, 512, 20, 20] -> [-1, 512, 40, 40]
        P5_upsample = self.upsample(P5)
        # [-1, 512, 40, 40] + [-1, 512, 40, 40] -> [-1, 1024, 40, 40]
        P5_upsample = paddle.concat([P5_upsample, feat2], axis=1)
        # [-1, 1024, 40, 40] -> [-1, 512, 40, 40]
        P5_upsample = self.C3_p4(P5_upsample)

        # [-1, 512, 40, 40] -> [-1, 256, 40, 40]
        P4 = self.reduce_conv1(P5_upsample)
        # [-1, 256, 40, 40] -> [-1, 256, 80, 80]
        P4_upsample = self.upsample(P4)
        # [-1, 256, 80, 80] + [-1, 256, 80, 80] -> [-1, 512, 80, 80]
        P4_upsample = paddle.concat([P4_upsample, feat1], axis=1)
        # [-1, 512, 80, 80] -> [-1, 256, 80, 80]
        P3_out = self.C3_p3(P4_upsample)

        # [-1, 256, 80, 80] -> [-1, 256, 40, 40]
        P3_downsample = self.bu_conv2(P3_out)
        # [-1, 256, 40, 40] + [-1, 256, 40, 40] -> [-1, 512, 40, 40]
        P3_downsample = paddle.concat([P3_downsample, P4], axis=1)
        # [-1, 512, 40, 40] -> [-1, 512, 40, 40]
        P4_out = self.C3_n3(P3_downsample)

        # [-1, 512, 40, 40] -> [-1, 512, 20, 20]
        P4_downsample = self.bu_conv1(P4_out)
        # [-1, 512, 20, 20] + [-1, 512, 20, 20] -> [-1, 1024, 20, 20]
        P4_downsample = paddle.concat([P4_downsample, P5], axis=1)
        # [-1, 1024, 20, 20] -> [-1, 1024, 20, 20]
        P5_out = self.C3_n4(P4_downsample)

        return (P3_out, P4_out, P5_out)

In [15]
## Test the YOLOPAFPN module
# YOLOPAFPN contains its own backbone, so it takes the image as input
x = paddle.ones([1, 3, 640, 640])
net2 = YOLOPAFPN()
outs = net2(x)
print(outs[0].shape, outs[1].shape, outs[2].shape)

[1, 256, 80, 80] [1, 512, 40, 40] [1, 1024, 20, 20]
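The channel and resolution bookkeeping of the top-down and bottom-up paths can be traced without running the network. The function below is a sketch with a hypothetical name; it only mirrors the shape comments in the YOLOPAFPN code above and assumes an input size that is a multiple of 32.

```python
def pafpn_trace(in_channels=(256, 512, 1024), width=1.0, size=640):
    """List (name, channels, spatial size) for each step of the PAFPN forward pass."""
    c3, c4, c5 = (int(c * width) for c in in_channels)
    s3, s4, s5 = size // 8, size // 16, size // 32
    return [
        ("P5", c4, s5),                             # lateral_conv0: c5 -> c4
        ("concat(up(P5), feat2)", 2 * c4, s4),      # upsample, then concatenate
        ("P5_upsample", c4, s4),                    # C3_p4
        ("P4", c3, s4),                             # reduce_conv1: c4 -> c3
        ("concat(up(P4), feat1)", 2 * c3, s3),      # upsample, then concatenate
        ("P3_out", c3, s3),                         # C3_p3
        ("concat(down(P3_out), P4)", 2 * c3, s4),   # bu_conv2, then concatenate
        ("P4_out", c4, s4),                         # C3_n3
        ("concat(down(P4_out), P5)", 2 * c4, s5),   # bu_conv1, then concatenate
        ("P5_out", c5, s5),                         # C3_n4
    ]


for name, channels, spatial in pafpn_trace():
    print(f"{name}: [{channels}, {spatial}, {spatial}]")
```

For the default arguments the last entries for P3_out, P4_out, and P5_out are [256, 80, 80], [512, 40, 40], and [1024, 20, 20], the same shapes the test cell prints.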
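2.3 Detection Head: YOLOX Head
YOLOX Head is the detection head of YOLOX and acts as both classifier and regressor. Unlike the traditional YOLO head, the YOLOX Head is decoupled: classification and regression are handled in two separate branches and merged only at prediction time, which strengthens the network's recognition ability.
YOLOX Head structure diagram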
In [16]
## YOLOX Head
class YOLOXHead(nn.Layer):
    def __init__(self, num_classes, width=1.0, in_channels=(256, 512, 1024), act="silu", depthwise=False):
        super().__init__()
        Conv = DWConv if depthwise else BaseConv
        hidden = int(256 * width)

        # nn.LayerList registers the sublayers so that their parameters are tracked;
        # plain Python lists would hide them from parameters() and state_dict()
        self.cls_convs = nn.LayerList()
        self.reg_convs = nn.LayerList()
        self.cls_preds = nn.LayerList()
        self.reg_preds = nn.LayerList()
        self.obj_preds = nn.LayerList()
        self.stems = nn.LayerList()

        for i in range(len(in_channels)):
            # Stem: one 1x1 convolution
            self.stems.append(BaseConv(in_channels=int(in_channels[i] * width), out_channels=hidden, kernel_size=1, stride=1, act=act))
            # Classification features: two 3x3 convolutions
            self.cls_convs.append(nn.Sequential(
                Conv(in_channels=hidden, out_channels=hidden, kernel_size=3, stride=1, act=act),
                Conv(in_channels=hidden, out_channels=hidden, kernel_size=3, stride=1, act=act),
            ))
            # Classification prediction: one 1x1 convolution
            self.cls_preds.append(nn.Conv2D(in_channels=hidden, out_channels=num_classes, kernel_size=1, stride=1, padding=0))
            # Regression features: two 3x3 convolutions
            self.reg_convs.append(nn.Sequential(
                Conv(in_channels=hidden, out_channels=hidden, kernel_size=3, stride=1, act=act),
                Conv(in_channels=hidden, out_channels=hidden, kernel_size=3, stride=1, act=act),
            ))
            # Regression prediction (box position): one 1x1 convolution
            self.reg_preds.append(nn.Conv2D(in_channels=hidden, out_channels=4, kernel_size=1, stride=1, padding=0))
            # Regression prediction (objectness): one 1x1 convolution
            self.obj_preds.append(nn.Conv2D(in_channels=hidden, out_channels=1, kernel_size=1, stride=1, padding=0))

    def forward(self, inputs):
        # Inputs: [P3_out, P4_out, P5_out]
        # P3_out: [-1, 256, 80, 80]
        # P4_out: [-1, 512, 40, 40]
        # P5_out: [-1, 1024, 20, 20]
        outputs = []
        for k, x in enumerate(inputs):
            # 1x1 convolution to unify the channels
            x = self.stems[k](x)

            # Two 3x3 convolutions for classification features
            cls_feat = self.cls_convs[k](x)
            # One 1x1 convolution predicts the classes
            # Outputs: [-1, num_classes, 80, 80], [-1, num_classes, 40, 40], [-1, num_classes, 20, 20]
            cls_output = self.cls_preds[k](cls_feat)

            # Two 3x3 convolutions for regression features
            reg_feat = self.reg_convs[k](x)
            # One 1x1 convolution predicts the box position
            # Outputs: [-1, 4, 80, 80], [-1, 4, 40, 40], [-1, 4, 20, 20]
            reg_output = self.reg_preds[k](reg_feat)
            # One 1x1 convolution predicts whether an object is present
            # Outputs: [-1, 1, 80, 80], [-1, 1, 40, 40], [-1, 1, 20, 20]
            obj_output = self.obj_preds[k](reg_feat)

            # Merge the results
            # Outputs: [-1, num_classes+5, 80, 80], [-1, num_classes+5, 40, 40], [-1, num_classes+5, 20, 20]
            output = paddle.concat([reg_output, obj_output, cls_output], 1)
            outputs.append(output)
        return outputs

In [17]
## Test the YOLOX Head module
features = paddle.ones([1, 256, 80, 80]), paddle.ones([1, 512, 40, 40]), paddle.ones([1, 1024, 20, 20])
net3 = YOLOXHead(10)
outs = net3(features)
print(outs[0].shape, outs[1].shape, outs[2].shape)

[1, 15, 80, 80] [1, 15, 40, 40] [1, 15, 20, 20]
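Each spatial position of a head output holds `num_classes + 5` values in the order used by the concatenation above: 4 box values, 1 objectness value, then the class scores. The sketch below splits one such vector. Since this Notebook defers prediction to a later part, applying a sigmoid to objectness and class scores and multiplying them into a final score is an assumption based on common YOLOX practice, and the function names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def split_prediction(vector, num_classes):
    """Split one head output vector into box, objectness, best class, and score."""
    assert len(vector) == 5 + num_classes
    box = vector[0:4]                                # raw box regression values
    objectness = sigmoid(vector[4])                  # probability that an object is present
    class_probs = [sigmoid(v) for v in vector[5:]]   # independent per-class probabilities
    best_class = max(range(num_classes), key=lambda i: class_probs[i])
    score = objectness * class_probs[best_class]
    return box, objectness, best_class, score


box, objectness, best_class, score = split_prediction(
    [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, 2.0, 0.0], num_classes=3
)
print(box, objectness, best_class)  # [1.0, 2.0, 3.0, 4.0] 0.5 1
print(round(score, 4))              # 0.4404
```

With `num_classes = 10`, as in the test cell above, each vector has 15 entries, which matches the channel dimension of the printed shapes.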
2.4 Putting It Together: YOLO Body
In [18]
class YoloBody(nn.Layer):
    def __init__(self, num_classes, kind):
        super().__init__()
        # Depth and width multipliers of each YOLOX variant
        depth_dict = {'nano': 0.33, 'tiny': 0.33, 's': 0.33, 'm': 0.67, 'l': 1.00, 'x': 1.33}
        width_dict = {'nano': 0.25, 'tiny': 0.375, 's': 0.50, 'm': 0.75, 'l': 1.00, 'x': 1.25}
        depth, width = depth_dict[kind], width_dict[kind]
        # Only the nano variant uses depthwise separable convolutions
        depthwise = kind == 'nano'
        self.backbone = YOLOPAFPN(depth, width, depthwise=depthwise)
        self.head = YOLOXHead(num_classes, width, depthwise=depthwise)

    def forward(self, x):
        fpn_outs = self.backbone(x)
        outputs = self.head(fpn_outs)
        return outputs

In [19]
## Test the YOLO Body module
x = paddle.ones([1, 3, 640, 640])
net4 = YoloBody(10, 'x')
outs = net4(x)
print(outs[0].shape, outs[1].shape, outs[2].shape)

[1, 15, 80, 80] [1, 15, 40, 40] [1, 15, 20, 20]
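The effect of the depth and width multipliers on each variant can be tabulated with the same formulas that CSPDarknet and YOLOXHead use (`int(wid_mul * 64)`, `max(round(dep_mul * 3), 1)`, and `int(256 * width)`). The helper below is a sketch with hypothetical names; the multiplier tables are copied from YoloBody above.

```python
DEPTH = {"nano": 0.33, "tiny": 0.33, "s": 0.33, "m": 0.67, "l": 1.00, "x": 1.33}
WIDTH = {"nano": 0.25, "tiny": 0.375, "s": 0.50, "m": 0.75, "l": 1.00, "x": 1.25}


def variant_config(kind):
    """Derived sizes for one YOLOX variant."""
    depth, width = DEPTH[kind], WIDTH[kind]
    return {
        "base_channels": int(width * 64),        # stem output channels
        "base_depth": max(round(depth * 3), 1),  # Bottlenecks in dark2 and dark5
        "head_channels": int(256 * width),       # hidden channels in the head
        "depthwise": kind == "nano",
    }


for kind in DEPTH:
    print(kind, variant_config(kind))
```

For example, the 'x' variant used in the test cell has 80 stem channels, a base depth of 4, and 320 hidden channels in the head, while the output shape stays [num_classes + 5, H, W] for every variant.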

