一个处女座的程序猿

Paper：《YOLOv4: Optimal Speed and Accuracy of Object Detection》的翻译与解读

YOLOv4的评价

1、四个改进和一个创新

YOLOv4: Optimal Speed and Accuracy of Object Detection

Abstract

1. Introduction

2. Related work

2.1. Object detection models

2.2. Bag of freebies

2.3. Bag of specials

3. Methodology

3.1. Selection of architecture

3.2. Selection of BoF and BoS

3.3. Additional improvements

3.4. YOLOv4

4. Experiments

4.1. Experimental setup

4.2. Influence of different features on Classifier training

4.3. Influence of different features on

4.4. Influence of different backbones and pretrained weightings on Detector training

4.5. Influence of different mini-batch size on Detector training

5. Results

6. Conclusions

7. Acknowledgements

YOLOv4的评价

1、四个改进和一个创新

这篇文章主要有四个改进+一个创新，但组合了大约20项近几年来各种深度学习和目标检测领域的tricks。可以说，这篇论文有创新和改进，但多数是微小的改进。然而这篇文章对比了大量的、近几年新出来的tricks（大约有20多个），肯定得花费大量的时间和精力，工作量巨大。

MosaicDA（马赛克数据增强, Mosaic Data Augmentation）：把四张图拼成一张图来训练，变相的等价于增大了mini-batch。这是从CutMix混合两张图的基础上改进；
CmBN（跨最小批的归一化, Cross mini-batch Normal）：在CBN的基础上改进；
修改的SAM：从SAM的逐空间的attention，到逐点的attention；
修改的PAN：把通道从相加（add）改变为concat，改变很小；
SAT(自对抗训练, Self-Adversarial Training)：这是在一张图上，让神经网络反向更新图像，对图像做改变扰动，然后在这个图像上训练。这个方法，是图像风格化的主要方法，让网络反向更新图像来风格化图像。

来自知乎：https://zhuanlan.zhihu.com/p/135980432

YOLOv4: Optimal Speed and Accuracy of Object Detection

论文地址：https://arxiv.org/abs/2004.10934
代码地址：https://github.com/AlexeyAB/darknet

Abstract

There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some features operate on certain models exclusively and for certain problems exclusively, or only for small-scale datasets; while some features, such as batch-normalization and residual-connections, are applicable to the majority of models, tasks, and datasets. We assume that such universal features include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT) and Mish-activation. We use new features: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, CmBN, DropBlock regularization, and CIoU loss, and combine some of them to achieve state-of-the-art results: 43.5% AP (65.7% AP50) for the MS COCO dataset at a realtime speed of ~65 FPS on Tesla V100. Source code is at this https URL
https://github.com/AlexeyAB/darknet.

有大量的特征被认为可以提高卷积神经网络（CNN）的精度。需要在大型数据集上对这些特性的组合进行实际测试，并对结果进行理论验证。有些特性只对某些模型起作用，只对某些问题起作用，或者只对小规模数据集起作用;而一些特性，如批处理规范化和调整大小连接，则适用于大多数模型、任务和数据集。我们假设这些通用特性包括加权残差连接(WRC)、跨阶段部分连接(CSP)、跨小批量标准化(CmBN)、自反训练(SAT)和Mish激活。我们使用新特性:WRC,CSP, CmBN,SAT,Mish激活,马赛克数据增强,CmBN, DropBlock正规化,CIoU损失,并结合一些实现先进的结果:基于MS COCO数据集，可以得到43.5%的AP(65.7% AP50)，Tesla V100上的实时速度为 ~ 65 FPS。

总结来说：就是什么技巧最先进，我都拿来用用，组合成一个更完美的结果！

1. Introduction

The majority of CNN-based object detectors are largely applicable only for recommendation systems. For example, searching for free parking spaces via urban video cameras is executed by slow accurate models, whereas car collision warning is related to fast inaccurate models. Improving the real-time object detector accuracy enables using them not only for hint generating recommendation systems, but also for stand-alone process management and human input reduction. Real-time object detector operation on conventional Graphics Processing Units (GPU) allows their mass usage at an affordable price. The most accurate modern neural networks do not operate in real time and require large number of GPUs for training with a large mini-batch-size. We address such problems through creating a CNN that operates in real-time on a conventional GPU, and for which training requires only one conventional GPU.	大多数基于cnn的对象检测器主要只适用于推荐系统。例如，通过城市摄像机寻找免费停车位是由缓慢准确的模型执行，而汽车碰撞预警与快速不准确的模型相关。提高实时目标探测器的准确性，不仅可以将其用于提示生成推荐系统，还可以用于独立流程管理和减少人工输入。传统图形处理单元(GPU)上的实时对象检测操作允许它们以可负担的价格大量使用。最精确的现代神经网络不能实时运行，需要大量的gpu来进行小批量的训练。我们通过创建一个在传统GPU上实时运行的CNN来解决这些问题，而训练只需要一个传统GPU。
Figure 1: Comparison of the proposed YOLOv4 and other state-of-the-art object detectors. YOLOv4 runs twice faster than EfficientDet with comparable performance. Improves YOLOv3’s AP and FPS by 10% and 12%, respectively.	图1:所提议的YOLOv4与其他最先进的对象检测器的比较。在性能相当的情况下，YOLOv4的运行速度比EfficientDet快两倍。分别提高YOLOv3的AP和FPS 10%和12%。
The main goal of this work is designing a fast operating speed of an object detector in production systems and optimization for parallel computations, rather than the low computation volume theoretical indicator (BFLOP). We hope that the designed object can be easily trained and used. For example, anyone who uses a conventional GPU to train and test can achieve real-time, high quality, and convincing object detection results, as the YOLOv4 results shown in Figure 1. Our contributions are summarized as follows: 1. We develope an efficient and powerful object detection model. It makes everyone can use a 1080 Ti or 2080 Ti GPU to train a super fast and accurate object detector. 2. We verify the influence of state-of-the-art Bag-ofFreebies and Bag-of-Specials methods of object detection during the detector training. 3. We modify state-of-the-art methods and make them more effecient and suitable for single GPU training, including CBN [89], PAN [49], SAM [85], etc.	这项工作的主要目标是设计一个快速运行的目标探测器在生产系统和并行计算优化，而不是低计算量理论指标(BFLOP)。我们希望设计的对象可以很容易地训练和使用。例如，任何使用传统GPU进行训练和测试的人都可以实现实时、高质量、令人信服的对象检测结果，如图1所示的YOLOv4结果。我们的贡献总结如下:1。我们开发了一个高效、强大的目标检测模型。它使每个人都可以使用1080 Ti或2080 Ti GPU来训练一个超级快速和精确的目标探测器。2. 我们验证了在检测器训练过程中，最先进的袋- offreebies和袋- specials的对象检测方法的影响。3.我们修改了最先进的方法，使其更有效，更适合于单GPU训练，包括CBN[89]、PAN[49]、SAM[85]等。

2. Related work

2.1. Object detection models

A modern detector is usually composed of two parts, a backbone which is pre-trained on ImageNet and a head which is used to predict classes and bounding boxes of objects. For those detectors running on GPU platform, their backbone could be VGG [68], ResNet [26], ResNeXt [86], or DenseNet [30]. For those detectors running on CPU platform, their backbone could be SqueezeNet [31], MobileNet [28, 66, 27, 74], or ShuffleNet [97, 53]. As to the head part, it is usually categorized into two kinds, i.e., one-stage object detector and two-stage object detector. The most representative two-stage object detector is the R-CNN [19] series, including fast R-CNN [18], faster R-CNN [64], R-FCN [9], and Libra R-CNN [58]. It is also possible to make a twostage object detector an anchor-free object detector, such as RepPoints [87]. As for one-stage object detector, the most representative models are YOLO [61, 62, 63], SSD [50], and RetinaNet [45]. In recent years, anchor-free one-stage object detectors are developed. The detectors of this sort are CenterNet [13], CornerNet [37, 38], FCOS [78], etc. Object detectors developed in recent years often insert some layers between backbone and head, and these layers are usually used to collect feature maps from different stages. We can call it the neck of an object detector. Usually, a neck is composed of several bottom-up paths and several topdown paths. Networks equipped with this mechanism include Feature Pyramid Network (FPN) [44], Path Aggregation Network (PAN) [49], BiFPN [77], and NAS-FPN [17]. In addition to the above models, some researchers put their emphasis on directly building a new backbone (DetNet [43], DetNAS [7]) or a new whole model (SpineNet [12], HitDetector [20]) for object detection.

现代探测器通常由两个部分组成，一个是在ImageNet上预先训练的主干，另一个是用来预测物体的类和边界盒的头部。对于那些在GPU平台上运行的检测器，它们的主干可以是VGG[68]、ResNet[26]、ResNeXt[86]或DenseNet[30]。对于那些在CPU平台上运行的检测器，它们的主干可以是SqueezeNet[31]、MobileNet[28,666,27,74]或ShuffleNet[97,53]。至于头部，通常分为两类，即、一级目标探测器和二级目标探测器。最具代表性的两级目标探测器是R-CNN[19]系列，包括fast R-CNN [18]， faster R-CNN [64]， R-FCN [9]， Libra R-CNN[58]。也可以使两级对象检测器成为无锚对象检测器，如RepPoints[87]。单级目标探测器最具代表性的型号有YOLO[61, 62, 63]、SSD[50]、RetinaNet[45]。近年来，无锚单级目标探测器得到了广泛的应用。这类探测器有CenterNet[13]、CornerNet[37,38]、FCOS[78]等。近年来发展起来的物体探测器常常在基干和头部之间插入一些层，这些层通常用来收集不同阶段的特征图。我们可以称之为物体探测器的颈部。通常，一个领由几个自下而上的路径和几个自上而下的路径组成。具有该机制的网络包括特征金字塔网络(Feature Pyramid Network, FPN)[44]、路径汇聚网络(Path Aggregation Network, PAN)[49]、BiFPN[77]和NAS-FPN[17]。除了上述模型外，一些研究者还着重于直接构建一个新的主干(DetNet [43]， DetNAS[7])或一个新的整体模型(SpineNet [12]， HitDetector[20])用于对象检测。

To sum up, an ordinary object detector is composed of several parts:

Input: Image, Patches, Image Pyramid •
Backbones: VGG16 [68], ResNet-50 [26], SpineNet [12], EfficientNet-B0/B7 [75], CSPResNeXt50 [81], CSPDarknet53 [81] •
Neck: • Additional blocks: SPP [25], ASPP [5], RFB [47], SAM [85] • Path-aggregation blocks: FPN [44], PAN [49], NAS-FPN [17], Fully-connected FPN, BiFPN [77], ASFF [48], SFAM [98] •
Heads:: • Dense Prediction (one-stage): ◦ RPN [64], SSD [50], YOLO [61], RetinaNet [45] (anchor based) ◦ CornerNet [37], CenterNet [13], MatrixNet [60], FCOS [78] (anchor free) • Sparse Prediction (two-stage): ◦ Faster R-CNN [64], R-FCN [9], Mask RCNN [23] (anchor based) ◦ RepPoints [87] (anchor free)

综上所述，一个普通的物体探测器由以下几个部分组成:

输入:图像，补丁，图像金字塔•
骨架:VGG16 [68]， ResNet-50 [26]， SpineNet[12]，效率网- b0 /B7 [75]， CSPResNeXt50 [81]， CSPDarknet53 [81]
•路径聚合块:FPN[44]、PAN[49]、NAS-FPN[17]、全连接FPN、BiFPN[77]、ASFF[48]、SFAM [98]
◦RPN [64]， SSD [50]， YOLO [61]， RetinaNet[45](锚网)，CenterNet [13]， MatrixNet [60]， FCOS[78](无锚)•稀疏预测(两阶段):◦更快的R-CNN [64]， R-FCN[9]，口罩RCNN[23](锚网)[87](无锚)

2.2. Bag of freebies

Usually, a conventional object detector is trained offline. Therefore, researchers always like to take this advantage and develop better training methods which can make the object detector receive better accuracy without increasing the inference cost. We call these methods that only change the training strategy or only increase the training cost as “bag of freebies.” What is often adopted by object detection methods and meets the definition of bag of freebies is data augmentation. The purpose of data augmentation is to increase the variability of the input images, so that the designed object detection model has higher robustness to the images obtained from different environments. For examples, photometric distortions and geometric distortions are two commonly used data augmentation method and they definitely benefit the object detection task. In dealing with photometric distortion, we adjust the brightness, contrast, hue, saturation, and noise of an image. For geometric distortion, we add random scaling, cropping, flipping, and rotating.	通常，传统的目标探测器是离线训练的。因此，研究人员总是希望利用这一优势，开发出更好的训练方法，使目标探测器在不增加推理成本的情况下获得更好的精度。我们把这些只会改变培训策略或只会增加培训成本的方法称为“免费包”。“对象检测方法经常采用的，符合免费赠品包定义的是数据扩充。数据扩充的目的是增加输入图像的可变性，使所设计的目标检测模型对不同环境下获得的图像具有更高的鲁棒性。例如，光度畸变和几何畸变是两种常用的数据增强方法，它们对目标检测任务无疑是有益的。在处理光度失真时，我们调整图像的亮度、对比度、色调、饱和度和噪声。对于几何畸变，我们添加了随机缩放、剪切、翻转和旋转。
The data augmentation methods mentioned above are all pixel-wise adjustments, and all original pixel information in the adjusted area is retained. In addition, some researchers engaged in data augmentation put their emphasis on simulating object occlusion issues. They have achieved good results in image classification and object detection. For example, random erase [100] and CutOut [11] can randomly select the rectangle region in an image and fill in a random or complementary value of zero. As for hide-and-seek [69] and grid mask [6], they randomly or evenly select multiple rectangle regions in an image and replace them to all zeros. If similar concepts are applied to feature maps, there are DropOut [71], DropConnect [80], and DropBlock [16] methods. In addition, some researchers have proposed the methods of using multiple images together to perform data augmentation. For example, MixUp [92] uses two images to multiply and superimpose with different coefficient ratios, and then adjusts the label with these superimposed ratios. As for CutMix [91], it is to cover the cropped image to rectangle region of other images, and adjusts the label according to the size of the mix area. In addition to the above mentioned methods, style transfer GAN [15] is also used for data augmentation, and such usage can effectively reduce the texture bias learned by CNN.	上述数据增强方法均为像素级调整，并保留调整区域内的所有原始像素信息。此外，一些从事数据扩充的研究人员将重点放在模拟物体遮挡问题上。在图像分类和目标检测方面取得了较好的效果。例如，随机擦除[100]和CutOut[11]可以随机选择图像中的矩形区域，并填充一个随机的或互补的零值。对于捉迷藏[69]和网格掩码[6]，它们随机或均匀地选择图像中的多个矩形区域，并将其全部替换为零。如果将类似的概念应用于特征图，则有DropOut[71]、DropConnect[80]和DropBlock[16]方法。此外，一些研究者提出了将多幅图像结合在一起进行数据增强的方法。例如，MixUp[92]使用两张图像以不同的系数比率进行相乘和叠加，然后用这些叠加比率调整标签。CutMix[91]是将裁剪后的图像覆盖到其他图像的矩形区域，并根据混合区域的大小调整标签。除了上述方法外，还使用style transfer GAN[15]进行数据扩充，这样可以有效减少CNN学习到的纹理偏差。
Different from the various approaches proposed above, some other bag of freebies methods are dedicated to solving the problem that the semantic distribution in the dataset may have bias. In dealing with the problem of semantic distribution bias, a very important issue is that there is a problem of data imbalance between different classes, and this problem is often solved by hard negative example mining [72] or online hard example mining [67] in two-stage object detector. But the example mining method is not applicable to one-stage object detector, because this kind of detector belongs to the dense prediction architecture. Therefore Lin et al. [45] proposed focal loss to deal with the problem of data imbalance existing between various classes. Another very important issue is that it is difficult to express the relationship of the degree of association between different categories with the one-hot hard representation. This representation scheme is often used when executing labeling. The label smoothing proposed in [73] is to convert hard label into soft label for training, which can make model more robust. In order to obtain a better soft label, Islam et al. [33] introduced the concept of knowledge distillation to design the label refinement network.	与上面提出的各种方法不同，其他一些免费包方法致力于解决数据集中的语义分布可能存在偏差的问题。在处理语义分布偏差问题时，一个非常重要的问题是不同类之间存在数据不平衡的问题，这个问题往往通过两阶段对象检测器中的硬反例挖掘[72]或在线硬例挖掘[67]来解决。但由于该方法属于稠密预测结构，因此不适用于单级目标检测。因此，Lin等人提出了焦损来处理各个类之间存在的数据不平衡问题。另一个非常重要的问题是，很难用一个热硬表示法来表达不同类别之间关联程度的关系。这种表示法常用于执行标记。文献[73]提出的标签平滑是将硬标签转换为软标签进行训练，使模型更加稳健。为了获得更好的软标签，Islam等人引入了知识蒸馏的概念来设计标签细化网络。
The last bag of freebies is the objective function of Bounding Box (BBox) regression. The traditional object detector usually uses Mean Square Error (MSE) to directly perform regression on the center point coordinates and height and width of the BBox, i.e., {xcenter, ycenter, w, h}, or the upper left point and the lower right point, i.e., {xtop lef t, ytop lef t, xbottom right, ybottom right}. As for anchor-based method, it is to estimate the corresponding offset, for example {xcenter of f set, ycenter of f set, wof f set, hof f set} and {xtop lef t of f set, ytop lef t of f set, xbottom right of f set, ybottom right of f set}. However, to directly estimate the coordinate values of each point of the BBox is to treat these points as independent variables, but in fact does not consider the integrity of the object itself. In order to make this issue processed better, some researchers recently proposed IoU loss [90], which puts the coverage of predicted BBox area and ground truth BBox area into consideration. The IoU loss computing process will trigger the calculation of the four coordinate points of the BBox by executing IoU with the ground truth, and then connecting the generated results into a whole code. Because IoU is a scale invariant representation, it can solve the problem that when traditional methods calculate the l1 or l2 loss of {x, y, w, h}, the loss will increase with the scale. Recently, some researchers have continued to improve IoU loss. For example, GIoU loss [65] is to include the shape and orientation of object in addition to the coverage area. They proposed to find the smallest area BBox that can simultaneously cover the predicted BBox and ground truth BBox, and use this BBox as the denominator to replace the denominator originally used in IoU loss. As for DIoU loss [99], it additionally considers the distance of the center of an object, and CIoU loss [99], on the other hand simultaneously considers the overlapping area, the distance between center points, and the aspect ratio. CIoU can achieve better convergence speed and accuracy on the BBox regression problem.	最后一袋赠品是边界盒(BBox)回归的目标函数。传统的目标检测器通常使用均方误差(MSE)直接对BBox的中心点坐标和高度、宽度进行回归，即， {xcenter, ycenter, w, h}，或左上点和右下点，即， {xtop lef t, ytop lef t, xbottom right, ybottom right}。基于锚的方法是估计相应的偏移量，例如{f集合的xcenter, f集合的ycenter, wof f集合，hof f集合}和{xtop f集合的lef t, ytop f集合的lef t, xbottom right off集合，ybottom right off集合}。但是，直接估计BBox中每个点的坐标值，就是把这些点当作自变量，而实际上并不考虑对象本身的完整性。为了更好地处理这一问题，一些研究者最近提出了IoU损失[90]，将预测BBox区域的覆盖范围和地面真实BBox区域考虑在内。IoU损失计算过程通过执行IoU和ground truth，触发BBox四个坐标点的计算，然后将生成的结果连接成一个完整的代码。由于IoU是尺度不变的表示，可以解决传统方法在计算{x, y, w, h}的l1或l2损耗时，损耗会随着尺度的增大而增大的问题。最近，一些研究人员继续改善欠条损失。例如，GIoU loss[65]除了覆盖区域外，还包括了物体的形状和方向。他们提出寻找能够同时覆盖预测BBox和地面真实BBox的最小面积BBox，并以此BBox作为分母来代替IoU损失中原来使用的分母。对于DIoU loss[99]，它额外考虑了物体中心的距离，而CIoU loss[99]则同时考虑了重叠区域、中心点之间的距离和纵横比。对于BBox回归问题，CIoU具有更好的收敛速度和精度。

2.3. Bag of specials

For those plugin modules and post-processing methods that only increase the inference cost by a small amount but can significantly improve the accuracy of object detection, we call them “bag of specials”. Generally speaking, these plugin modules are for enhancing certain attributes in a model, such as enlarging receptive field, introducing attention mechanism, or strengthening feature integration capability, etc., and post-processing is a method for screening model prediction results.

Common modules that can be used to enhance receptive field are SPP [25], ASPP [5], and RFB [47]. The SPP module was originated from Spatial Pyramid Matching (SPM) [39], and SPMs original method was to split feature map into several d × d equal blocks, where d can be {1, 2, 3, ...}, thus forming spatial pyramid, and then extracting bag-of-word features. SPP integrates SPM into CNN and use max-pooling operation instead of bag-of-word operation. Since the SPP module proposed by He et al. [25] will output one dimensional feature vector, it is infeasible to be applied in Fully Convolutional Network (FCN). Thus in the design of YOLOv3 [63], Redmon and Farhadi improve SPP module to the concatenation of max-pooling outputs with kernel size k × k, where k = {1, 5, 9, 13}, and stride equals to 1. Under this design, a relatively large k × k maxpooling effectively increase the receptive field of backbone feature. After adding the improved version of SPP module, YOLOv3-608 upgrades AP50 by 2.7% on the MS COCO object detection task at the cost of 0.5% extra computation. The difference in operation between ASPP [5] module and improved SPP module is mainly from the original k×k kernel size, max-pooling of stride equals to 1 to several 3 × 3 kernel size, dilated ratio equals to k, and stride equals to 1 in dilated convolution operation. RFB module is to use several dilated convolutions of k×k kernel, dilated ratio equals to k, and stride equals to 1 to obtain a more comprehensive spatial coverage than ASPP. RFB [47] only costs 7% extra inference time to increase the AP50 of SSD on MS COCO by 5.7%.

对于那些仅增加少量推理成本，却能显著提高目标检测精度的插件模块和后处理方法，我们称之为“特价包”。一般来说，这些插件模块是为了增强模型中的某些属性，如扩大接受域、引入注意机制、增强特征集成能力等，后处理是筛选模型预测结果的一种方法。

可以用来增强感受野的常用模块有SPP[25]、ASPP[5]和RFB[47]。SPP模块起源于空间金字塔匹配(SPM) [39]， SPMs原始方法是将feature map分割成若干个d×d等块，其中d可以是{1,2,3，…，从而形成空间金字塔，然后提取bag-of-word特征。SPP将SPM集成到CNN中，使用max-pooling操作，而不是bag-of-word操作。由于He等人提出的SPP模块将输出一维特征向量，因此在全卷积网络(FCN)中应用是不可实现的。因此，在YOLOv3[63]的设计中，Redmon和Farhadi将SPP模块改进为最大池输出的串联，其内核大小为k×k，其中k = {1,5,9,13}， stride = 1。在本设计中，较大的k×k maxpooling有效地增加了主干特征的接受域。在添加SPP模块的改进版本后，YOLOv3-608在MS COCO对象检测任务上对AP50进行了2.7%的升级，增加了0.5%的额外计算量。ASPP[5]模块与改进后的SPP模块在运算上的差异主要来自于原始的k×k核大小，stride的最大池等于1到几个3×3核大小，扩展比等于k，在扩展卷积运算中stride等于1。RFB模块是利用k×k kernel的几个膨胀卷积，膨胀比等于k, stride等于1，得到比ASPP更全面的空间覆盖。RFB[47]仅花费7%的额外推断时间，使MS COCO上SSD的AP50增加5.7%。

The attention module that is often used in object detection is mainly divided into channel-wise attention and pointwise attention, and the representatives of these two attention models are Squeeze-and-Excitation (SE) [29] and Spatial Attention Module (SAM) [85], respectively. Although SE module can improve the power of ResNet50 in the ImageNet image classification task 1% top-1 accuracy at the cost of only increasing the computational effort by 2%, but on a GPU usually it will increase the inference time by about 10%, so it is more appropriate to be used in mobile devices. But for SAM, it only needs to pay 0.1% extra calculation and it can improve ResNet50-SE 0.5% top-1 accuracy on the ImageNet image classification task. Best of all, it does not affect the speed of inference on the GPU at all.

In terms of feature integration, the early practice is to use skip connection [51] or hyper-column [22] to integrate lowlevel physical feature to high-level semantic feature. Since multi-scale prediction methods such as FPN have become popular, many lightweight modules that integrate different feature pyramid have been proposed. The modules of this sort include SFAM [98], ASFF [48], and BiFPN [77]. The main idea of SFAM is to use SE module to execute channelwise level re-weighting on multi-scale concatenated feature maps. As for ASFF, it uses softmax as point-wise level reweighting and then adds feature maps of different scales. In BiFPN, the multi-input weighted residual connections is proposed to execute scale-wise level re-weighting, and then add feature maps of different scales.

在目标检测中常用的注意模块主要分为通道式注意和点态注意，这两种注意模型的代表分别是挤压-激励(SE)[29]和空间注意模块(SAM)[85]。尽管SE模块可以改善的力量ResNet50 ImageNet图像分类任务中排名前1%精度的代价只会增加2%的计算工作,但是在GPU通常会增加推理时间约10%,所以它更适合用于移动设备。但是对于SAM来说，它只需要额外支付0.1%的计算量，就可以在ImageNet图像分类任务中提高ResNet50-SE 0.5% top-1准确率。最重要的是，它完全不影响GPU的推理速度。

在特征集成方面，早期的实践是使用跳跃连接[51]或超列[22]将低级物理特征集成到高级语义特征。随着FPN等多尺度预测方法的流行，人们提出了许多融合不同特征金字塔的轻量级预测模块。这类模块包括SFAM[98]、ASFF[48]和BiFPN[77]。SFAM的主要思想是利用SE模块在多尺度的拼接特征图上进行信道级重加权。对于ASFF，它使用softmax作为点向水平重加权，然后添加不同尺度的特征图。在BiFPN中，提出了多输入加权剩余连接来执行按比例的水平重加权，然后添加不同比例的特征图。

In the research of deep learning, some people put their focus on searching for good activation function. A good activation function can make the gradient more efficiently propagated, and at the same time it will not cause too much extra computational cost. In 2010, Nair and Hinton [56] propose ReLU to substantially solve the gradient vanish problem which is frequently encountered in traditional tanh and sigmoid activation function. Subsequently, LReLU [54], PReLU [24], ReLU6 [28], Scaled Exponential Linear Unit (SELU) [35], Swish [59], hard-Swish [27], and Mish [55], etc., which are also used to solve the gradient vanish problem, have been proposed. The main purpose of LReLU and PReLU is to solve the problem that the gradient of ReLU is zero when the output is less than zero. As for ReLU6 and hard-Swish, they are specially designed for quantization networks. For self-normalizing a neural network, the SELU activation function is proposed to satisfy the goal. One thing to be noted is that both Swish and Mish are continuously differentiable activation function.

The post-processing method commonly used in deeplearning-based object detection is NMS, which can be used to filter those BBoxes that badly predict the same object, and only retain the candidate BBoxes with higher response. The way NMS tries to improve is consistent with the method of optimizing an objective function. The original method proposed by NMS does not consider the context information, so Girshick et al. [19] added classification confidence score in R-CNN as a reference, and according to the order of confidence score, greedy NMS was performed in the order of high score to low score. As for soft NMS [1], it considers the problem that the occlusion of an object may cause the degradation of confidence score in greedy NMS with IoU score. The DIoU NMS [99] developers way of thinking is to add the information of the center point distance to the BBox screening process on the basis of soft NMS. It is worth mentioning that, since none of above postprocessing methods directly refer to the captured image features, post-processing is no longer required in the subsequent development of an anchor-free method.

在深度学习的研究中，一些人把重点放在寻找好的激活功能上。一个好的激活函数可以使梯度更有效地传播，同时也不会造成过多的计算开销。2010年，Nair和Hinton[56]提出ReLU从本质上解决传统tanh和sigmoid激活函数中经常遇到的梯度消失问题。随后，LReLU[54]、PReLU[24]、ReLU6[28]、标度指数线性单元(SELU)[35]、Swish[59]、hard-Swish[27]、Mish[55]等也被用于解决梯度消失问题。LReLU和PReLU的主要目的是解决输出小于0时ReLU的梯度为零的问题。对于ReLU6和hard-Swish，它们是专门为量化网络设计的。针对神经网络的自归一化问题，提出了SELU激活函数。需要注意的是，Swish和Mish都是连续可微的激活函数。

基于深度挖掘的对象检测中常用的后处理方法是NMS, NMS可以对预测较差的bbox进行过滤，只保留响应较高的候选bbox。NMS试图改进的方法与优化目标函数的方法是一致的。NMS提出的原始方法没有考虑上下文信息，所以Girshick等人[19]在R-CNN中加入了分类的置信分作为参考，按照置信分的顺序，从高到低依次进行贪心NMS。对于软NMS[1]，它考虑了对象的遮挡可能导致带IoU分数的贪婪NMS的信心分数下降的问题。DIoU NMS[99]开发人员的思路是在软NMS的基础上，将中心点距离信息添加到BBox筛选过程中。值得一提的是，由于以上的后处理方法都没有直接引用捕获的图像特征，因此在后续的无锚方法开发中不再需要后处理。

Paper：《YOLOv4: Optimal Speed and Accuracy of Object Detection》的翻译与解读_第3张图片

3. Methodology

The basic aim is fast operating speed of neural network, in production systems and optimization for parallel computations, rather than the low computation volume theoretical indicator (BFLOP). We present two options of real-time neural networks:

For GPU we use a small number of groups (1 - 8) in convolutional layers: CSPResNeXt50 / CSPDarknet53
For VPU - we use grouped-convolution, but we refrain from using Squeeze-and-excitement (SE) blocks - specifically this includes the following models: EfficientNet-lite / MixNet [76] / GhostNet [21] / MobileNetV3

其基本目标是在生产系统和并行计算优化中提高神经网络的运行速度，而不是低计算量理论指标(BFLOP)。我们提出了实时神经网络的两种选择:

对于GPU，我们在卷积层中使用少量的组(1 - 8):CSPResNeXt50 / CSPDarknet53
对于VPU -我们使用群卷积，但我们不使用挤压和兴奋(SE)块-具体来说，这包括以下模型:效率网-lite / MixNet [76] / GhostNet [21] / MobileNetV3

3.1. Selection of architecture

Our objective is to find the optimal balance among the input network resolution, the convolutional layer number, the parameter number (filter size2 * filters * channel / groups), and the number of layer outputs (filters). For instance, our numerous studies demonstrate that the CSPResNext50 is considerably better compared to CSPDarknet53 in terms of object classification on the ILSVRC2012 (ImageNet) dataset [10]. However, conversely, the CSPDarknet53 is better compared to CSPResNext50 in terms of detecting objects on the MS COCO dataset [46].

我们的目标是在输入网络分辨率、卷积层数、参数数(filter size2 * filters * channel / groups)和层输出数(filters)之间找到最佳平衡。例如，我们的大量研究表明，在ILSVRC2012 (ImageNet)数据集[10]上的对象分类方面，CSPResNext50要比CSPDarknet53好得多。然而，相反地，在MS COCO数据集[46]上检测对象方面，CSPDarknet53比CSPResNext50更好。

The next objective is to select additional blocks for increasing the receptive field and the best method of parameter aggregation from different backbone levels for different detector levels: e.g. FPN, PAN, ASFF, BiFPN.

A reference model which is optimal for classification is not always optimal for a detector. In contrast to the classifier, the detector requires the following:

Higher input network size (resolution) – for detecting multiple small-sized objects.
More layers – for a higher receptive field to cover the increased size of input network.
More parameters – for greater capacity of a model to detect multiple objects of different sizes in a single image.

下一个目标是选择额外的块来增加感受野，并从不同的主干水平对不同的检测器水平(如FPN、PAN、ASFF、BiFPN)进行参数聚合的最佳方法。

对于分类来说是最优的参考模型对于检测器来说并不总是最优的。与分类器相比，检测器需要满足以下条件:

更高的输入网络大小(分辨率)-用于检测多个小型对象。
更多的层-一个更高的接受域，以覆盖增加的输入网络的大小。
更多的参数-更大的能力，一个模型，以检测多个对象的不同大小在一个单一的形象。

Hypothetically speaking, we can assume that a model with a larger receptive field size (with a larger number of convolutional layers 3 × 3) and a larger number of parameters should be selected as the backbone. Table 1 shows the information of CSPResNeXt50, CSPDarknet53, and EfficientNet B3. The CSPResNext50 contains only 16 convolutional layers 3 × 3, a 425 × 425 receptive field and 20.6 M parameters, while CSPDarknet53 contains 29 convolutional layers 3 × 3, a 725 × 725 receptive field and 27.6 M parameters. This theoretical justification, together with our numerous experiments, show that CSPDarknet53 neural network is the optimal model of the two as the backbone for a detector.The influence of the receptive field with different sizes is summarized as follows:

Up to the object size - allows viewing the entire object
Up to network size - allows viewing the context around the object
Exceeding the network size - increases the number of connections between the image point and the final activation

假设，我们可以假设一个模型的主干是一个更大的接受域大小(包含更多的convolutional layers 3×3)和更多的参数。表1显示了CSPResNeXt50, CSPDarknet53，和efficient entnet B3的信息。CSPResNext50只包含16个卷积层3×3，一个425×425的接受域和20.6 M的参数，而CSPDarknet53包含29个卷积层3×3，一个725×725的接受域和27.6 M的参数。这一理论证明，以及我们的大量实验，表明CSPDarknet53神经网络是两个作为骨干的检测器的最佳模型。不同大小的感受野的影响总结如下:

直到对象大小-允许查看整个对象
网络大小-允许查看对象周围的上下文
超过网络大小-增加图像点和最终激活之间的连接数

We add the SPP block over the CSPDarknet53, since it significantly increases the receptive field, separates out the most significant context features and causes almost no reduction of the network operation speed. We use PANet as the method of parameter aggregation from different backbone levels for different detector levels, instead of the FPN used in YOLOv3.

Finally, we choose CSPDarknet53 backbone, SPP additional module, PANet path-aggregation neck, and YOLOv3 (anchor based) head as the architecture of YOLOv4.

In the future we plan to expand significantly the content of Bag of Freebies (BoF) for the detector, which theoretically can address some problems and increase the detector accuracy, and sequentially check the influence of each feature in an experimental fashion.

We do not use Cross-GPU Batch Normalization (CGBN or SyncBN) or expensive specialized devices. This allows anyone to reproduce our state-of-the-art outcomes on a conventional graphic processor e.g. GTX 1080Ti or RTX 2080Ti.

我们在CSPDarknet53上添加了SPP块，因为它显著增加了接受域，分离出了最重要的上下文特性，并且几乎不会降低网络运行速度。我们使用PANet作为不同检测层的不同主干层的参数聚合方法，而不是YOLOv3中使用的FPN。

最后，我们选择CSPDarknet53主干、SPP附加模块、PANet路径聚合颈和基于锚的YOLOv3头部作为YOLOv4的架构。

在未来，我们计划大幅扩展探测器的免费赠品包(BoF)的内容，这在理论上可以解决一些问题，提高探测器的精度，并以实验的方式依次检查每个特征的影响。

我们不使用跨gpu批处理标准化(CGBN或SyncBN)或昂贵的专用设备。这使得任何人都可以在传统的图形处理器(如GTX 1080Ti或RTX 2080Ti)上重现我们最先进的结果。

3.2. Selection of BoF and BoS

For improving the object detection training, a CNN usually uses the following:

Activations: ReLU, leaky-ReLU, parametric-ReLU, ReLU6, SELU, Swish, or Mish
Bounding box regression loss: MSE, IoU, GIoU, CIoU, DIoU
Data augmentation: CutOut, MixUp, CutMix
Regularization method: DropOut, DropPath [36], Spatial DropOut [79], or DropBlock
Normalization of the network activations by their mean and variance: Batch Normalization (BN) [32], Cross-GPU Batch Normalization (CGBN or SyncBN) [93], Filter Response Normalization (FRN) [70], or Cross-Iteration Batch Normalization (CBN) [89]
Skip-connections: Residual connections, Weighted residual connections, Multi-input weighted residual connections, or Cross stage partial connections (CSP)

为了提高目标检测训练，CNN通常使用以下方法:

激活:ReLU、leaky-ReLU、参变ReLU、ReLU6、SELU、Swish或Mish
回归损失:MSE, IoU, GIoU, CIoU, DIoU
数据增强:中断、混合、切断
正则化方法:DropOut, DropPath [36]， Spatial DropOut[79]，或DropBlock
通过均值和方差对网络激活进行归一化:批处理归一化(BN)[32]，交叉gpu批处理归一化(CGBN或SyncBN)[93]，滤波器响应归一化(FRN)[70]，或交叉迭代批处理归一化(CBN)[89]。
跳转连接:剩余连接、加权剩余连接、多输入加权剩余连接或跨级部分连接(CSP)

As for training activation function, since PReLU and SELU are more difficult to train, and ReLU6 is specifically designed for quantization network, we therefore remove the above activation functions from the candidate list. In the method of reqularization, the people who published DropBlock have compared their method with other methods in detail, and their regularization method has won a lot. Therefore, we did not hesitate to choose DropBlock as our regularization method. As for the selection of normalization method, since we focus on a training strategy that uses only one GPU, syncBN is not considered.

对于训练激活函数，由于PReLU和SELU更难训练，而ReLU6是专门为量化网络设计的，因此我们将上述激活函数从候选列表中删除。在reqularization方法中，发表了DropBlock的人将他们的方法与其他方法进行了详细的比较，并且他们的regularization方法取得了很大的成果。因此，我们毫不犹豫的选择了DropBlock作为我们的regularization方法。对于归一化方法的选择，由于我们关注的是只使用一个GPU的训练策略，所以没有考虑syncBN。

3.3. Additional improvements

In order to make the designed detector more suitable for training on single GPU, we made additional design and improvement as follows:

We introduce a new method of data augmentation Mosaic, and Self-Adversarial Training (SAT)
We select optimal hyper-parameters while applying genetic algorithms
We modify some exsiting methods to make our design suitble for efficient training and detection - modified SAM, modified PAN, and Cross mini-Batch Normalization (CmBN)

为了使设计的检测器更适合于单GPU上的训练，我们做了以下额外的设计和改进:

介绍了一种新的数据增强镶嵌和自对抗训练(SAT)方法。
应用遗传算法选择最优超参数
我们修改了一些现有的方法，使我们的设计适合于有效的培训和检测-修改的SAM、修改的PAN和交叉小批量标准化(CmBN)

Figure 3: Mosaic represents a new method of data augmentation.

Figure 4: Cross mini-Batch Normalization

图3:Mosaic表示一种新的数据扩充方法。

图4:跨迷你批处理规范化

Mosaic represents a new data augmentation method that mixes 4 training images. Thus 4 different contexts are mixed, while CutMix mixes only 2 input images. This allows detection of objects outside their normal context. In addition, batch normalization calculates activation statistics from 4 different images on each layer. This significantly reduces the need for a large mini-batch size.

Self-Adversarial Training (SAT) also represents a new data augmentation technique that operates in 2 forward backward stages. In the 1st stage the neural network alters the original image instead of the network weights. In this way the neural network executes an adversarial attack on itself, altering the original image to create the deception that there is no desired object on the image. In the 2nd stage, the neural network is trained to detect an object on this modified image in the normal way.

CmBN represents a CBN modified version, as shown in Figure 4, defined as Cross mini-Batch Normalization (CmBN). This collects statistics only between mini-batches within a single batch. We modify SAM from spatial-wise attention to pointwise attention, and replace shortcut connection of PAN to concatenation, as shown in Figure 5 and Figure 6, respectively.

提出了一种混合4幅训练图像的数据增强方法。因此4个不同的上下文被混合，而CutMix只混合了2个输入图像。这允许检测正常上下文之外的对象。此外，批处理归一化计算激活统计从4个不同的图像在每一层。这大大减少了对大型迷你批处理大小的需求。

自对抗训练(SAT)也代表了一种新的数据增加技术，在两个前后阶段操作。在第一阶段，神经网络改变原始图像而不是网络权值。通过这种方式，神经网络对自己执行一种对抗性攻击，改变原始图像，以制造图像上没有期望对象的假象。在第二阶段，训练神经网络以正常的方式对修改后的图像进行目标检测。

CmBN表示CBN修改后的版本，如图4所示，定义为交叉微批标准化(Cross mini-Batch Normalization, CmBN)。这只在单个批内的小批之间收集统计信息。我们将SAM从空间上的注意修改为点态注意，并将PAN的快捷连接替换为拼接，分别如图5和图6所示。

Paper：《YOLOv4: Optimal Speed and Accuracy of Object Detection》的翻译与解读_第6张图片

3.4. YOLOv4

In this section, we shall elaborate the details of YOLOv4. YOLOv4 consists of:

Backbone: CSPDarknet53 [81]
Neck: SPP [25], PAN [49]
Head: YOLOv3 [63]

YOLO v4 uses:

Bag of Freebies (BoF) for backbone: CutMix and Mosaic data augmentation, DropBlock regularization, Class label smoothing
Bag of Specials (BoS) for backbone: Mish activation, Cross-stage partial connections (CSP), Multiinput weighted residual connections (MiWRC)
Bag of Freebies (BoF) for detector: CIoU-loss, CmBN, DropBlock regularization, Mosaic data augmentation, Self-Adversarial Training, Eliminate grid sensitivity, Using multiple anchors for a single ground truth, Cosine annealing scheduler [52], Optimal hyperparameters, Random training shapes
Bag of Specials (BoS) for detector: Mish activation, SPP-block, SAM-block, PAN path-aggregation block, DIoU-NMS

在本节中，我们将详细介绍YOLOv4。YOLOv4包括:

骨干:CSPDarknet53 [81]
颈部:SPP [25]， PAN [49]
头:YOLOv3 [63]

YOLO v4意思用途:

骨干网的免费包(BoF): CutMix和Mosaic数据增强，DropBlock正则化，类标签平滑
主干专用包(BoS): Mish激活、跨级部分连接(CSP)、多输入加权剩余连接(MiWRC)
检测器的免费包(BoF): ciu -loss, CmBN, DropBlock正则化，马赛克数据增强，自对抗训练，消除网格敏感性，使用多个锚为一个单一的地面真值，余弦退火调度[52]，最优超参数，随机训练形状
检测器专用包(BoS): Mish激活、sp -block、SAM-block、PAN路径聚合块、DIoU-NMS

4. Experiments

We test the influence of different training improvement techniques on accuracy of the classifier on ImageNet (ILSVRC 2012 val) dataset, and then on the accuracy of the detector on MS COCO (test-dev 2017) dataset.

我们在ImageNet (ILSVRC 2012 val)数据集上测试了不同的训练改进技术对分类器精度的影响，然后在MS COCO (test-dev 2017)数据集上测试了检测器的精度。

4.1. Experimental setup

In ImageNet image classification experiments, the default hyper-parameters are as follows: the training steps is 8,000,000; the batch size and the mini-batch size are 128 and 32, respectively; the polynomial decay learning rate scheduling strategy is adopted with initial learning rate 0.1; the warm-up steps is 1000; the momentum and weight decay are respectively set as 0.9 and 0.005. All of our BoS experiments use the same hyper-parameter as the default setting, and in the BoF experiments, we add an additional 50% training steps. In the BoF experiments, we verify MixUp, CutMix, Mosaic, Bluring data augmentation, and label smoothing regularization methods. In the BoS experiments, we compared the effects of LReLU, Swish, and Mish activation function. All experiments are trained with a 1080 Ti or 2080 Ti GPU.

在ImageNet图像分类实验中，默认超参数为:训练步骤为8,000,000;批大小和小批大小分别为128和32;采用多项式衰减学习率调度策略，初始学习率为0.1;热身步长1000;动量衰减为0.9，重量衰减为0.005。我们所有的BoS实验都使用与默认设置相同的超参数，在BoF实验中，我们添加了额外的50%的训练步骤。在BoF实验中，我们验证了混合、切分、镶嵌、Bluring数据增强和标签平滑正则化等方法。在BoS实验中，我们比较了LReLU、Swish和Mish激活函数的作用。所有实验均使用1080 Ti或2080 Ti GPU进行训练。

In MS COCO object detection experiments, the default hyper-parameters are as follows: the training steps is 500,500; the step decay learning rate scheduling strategy is adopted with initial learning rate 0.01 and multiply with a factor 0.1 at the 400,000 steps and the 450,000 steps, respectively; The momentum and weight decay are respectively set as 0.9 and 0.0005. All architectures use a single GPU to execute multi-scale training in the batch size of 64 while mini-batch size is 8 or 4 depend on the architectures and GPU memory limitation. Except for using genetic algorithm for hyper-parameter search experiments, all other experiments use default setting. Genetic algorithm used YOLOv3-SPP to train with GIoU loss and search 300 epochs for min-val 5k sets. We adopt searched learning rate 0.00261, momentum 0.949, IoU threshold for assigning ground truth 0.213, and loss normalizer 0.07 for genetic algorithm experiments. We have verified a large number of BoF, including grid sensitivity elimination, mosaic data augmentation, IoU threshold, genetic algorithm, class label smoothing, cross mini-batch normalization, selfadversarial training, cosine annealing scheduler, dynamic mini-batch size, DropBlock, Optimized Anchors, different kind of IoU losses. We also conduct experiments on various BoS, including Mish, SPP, SAM, RFB, BiFPN, and Gaussian YOLO [8]. For all experiments, we only use one GPU for training, so techniques such as syncBN that optimizes multiple GPUs are not used.

在MS COCO对象检测实验中，默认的超参数为:训练步骤为500500;采用步进衰减学习率调度策略，初始学习率为0.01，在400,000步和450,000步分别乘以因子0.1;动量衰减为0.9，重量衰减为0.0005。所有的架构都使用一个GPU来执行批处理大小为64的多尺度训练，而小批处理大小为8或4取决于架构和GPU内存限制。除超参数搜索实验采用遗传算法外，其他实验均采用默认设置。遗传算法利用YOLOv3-SPP进行带GIoU损失的训练，搜索最小值5k集的300个epoch。遗传算法实验采用搜索学习率0.00261、动量0.949、IoU阈值分配ground truth 0.213和损失正态值0.07。我们已经验证了大量的BoF，包括网格敏感性消除、马赛克数据增强、IoU阈值、遗传算法、类标记平滑、交叉小批量标准化、自对抗训练、余弦退火调度器、动态小批量大小、DropBlock、优化锚，不同类型的IoU损失。我们还对各种BoS进行了实验，包括Mish、SPP、SAM、RFB、BiFPN和Gaussian YOLO[8]。对于所有的实验，我们只使用一个GPU进行训练，所以像syncBN这样的优化多个GPU的技术并没有被使用。

4.2. Influence of different features on Classifier training

First, we study the influence of different features on classifier training; specifically, the influence of Class label smoothing, the influence of different data augmentation techniques, bilateral blurring, MixUp, CutMix and Mosaic, as shown in Fugure 7, and the influence of different activations, such as Leaky-ReLU (by default), Swish, and Mish.

首先，研究了不同特征对分类器训练的影响;具体来说，类标签平滑的影响，不同数据增强技术的影响，双边模糊，混合，CutMix和马赛克的影响，如Fugure 7所示，和不同的激活的影响，如泄漏- relu(默认)，Swish，和Mish。

Figure 7: Various method of data augmentation

图7:各种数据增强方法

In our experiments, as illustrated in Table 2, the classifier’s accuracy is improved by introducing the features such as: CutMix and Mosaic data augmentation, Class label smoothing, and Mish activation. As a result, our BoFbackbone (Bag of Freebies) for classifier training includes the following: CutMix and Mosaic data augmentation and Class label smoothing. In addition we use Mish activation as a complementary option, as shown in Table 2 and Table 3.

在我们的实验中，如表2所示，通过引入特征如:CutMix和Mosaic数据增强、Class label平滑、Mish激活等，提高了分类器的准确率。因此，我们用于分类器训练的bof主干(免费包)包括以下内容:CutMix和Mosaic数据增强和类标签平滑。此外，我们使用Mish激活作为补充选项，如表2和表3所示。

4.3. Influence of different features on

Detector training Further study concerns the influence of different Bag-ofFreebies (BoF-detector) on the detector training accuracy, as shown in Table 4. We significantly expand the BoF list through studying different features that increase the detector accuracy without affecting FPS:

M: Mosaic data augmentation - using the 4-image mosaic during training instead of single image
IT: IoU threshold - using multiple anchors for a single ground truth IoU (truth, anchor) > IoU threshold
GA: Genetic algorithms - using genetic algorithms for selecting the optimal hyperparameters during network training on the first 10% of time periods
LS: Class label smoothing - using class label smoothing for sigmoid activation
CBN: CmBN - using Cross mini-Batch Normalization for collecting statistics inside the entire batch, instead of collecting statistics inside a single mini-batch
CA: Cosine annealing scheduler - altering the learning rate during sinusoid training
DM: Dynamic mini-batch size - automatic increase of mini-batch size during small resolution training by using Random training shapes
OA: Optimized Anchors - using the optimized anchors for training with the 512x512 network resolution
GIoU, CIoU, DIoU, MSE - using different loss algorithms for bounded box regression

进一步的研究关注不同的口袋- offreebies (BoF-detector)对检测器训练精度的影响，如表4所示。我们通过研究在不影响FPS的情况下提高检测精度的不同特征，显著扩展了BoF list:

M:马赛克数据增强——在训练中使用4幅图像的马赛克代替单幅图像
它:IoU阈值-使用多个锚为一个单一的地面真实IoU(真相，锚)> IoU阈值
遗传算法-使用遗传算法选择最佳超参数在网络训练的前10%的时间周期
类标签平滑——使用类标签平滑来激活乙状结肠
CBN: CmBN -使用交叉小批标准化来收集整个批内的统计数据，而不是收集单个小批内的统计数据
余弦退火调度器-改变正弦波训练的学习速率
DM:动态小批量大小-在小分辨率训练过程中使用随机训练形状自动增加小批量大小
优化锚点-使用优化的锚点进行训练，网络分辨率为512x512
GIoU, CIoU, DIoU, MSE -使用不同的损失算法进行有界盒回归

Further study concerns the influence of different Bagof-Specials (BoS-detector) on the detector training accuracy, including PAN, RFB, SAM, Gaussian YOLO (G), and ASFF, as shown in Table 5. In our experiments, the detector gets best performance when using SPP, PAN, and SAM.

进一步研究不同的Bagof-Specials (boss -detector)对检测器训练精度的影响，包括PAN、RFB、SAM、Gaussian YOLO (G)、ASFF，如表5所示。在我们的实验中，当使用SPP、PAN和SAM时，检测器的性能最佳。

4.4. Influence of different backbones and pretrained weightings on Detector training

Further on we study the influence of different backbone models on the detector accuracy, as shown in Table 6. We notice that the model characterized with the best classification accuracy is not always the best in terms of the detector accuracy.

First, although classification accuracy of CSPResNeXt50 models trained with different features is higher compared to CSPDarknet53 models, the CSPDarknet53 model shows higher accuracy in terms of object detection.

进一步研究不同骨架模型对检测精度的影响，如表6所示。我们注意到，在检测器精度方面，具有最佳分类精度的模型并不总是最佳的。

首先，虽然不同特征训练的CSPResNeXt50模型的分类精度要高于CSPDarknet53模型，但CSPDarknet53模型在目标检测方面具有更高的精度。

Second, using BoF and Mish for the CSPResNeXt50 classifier training increases its classification accuracy, but further application of these pre-trained weightings for detector training reduces the detector accuracy. However, using BoF and Mish for the CSPDarknet53 classifier training increases the accuracy of both the classifier and the detector which uses this classifier pre-trained weightings. The net result is that backbone CSPDarknet53 is more suitable for the detector than for CSPResNeXt50.

We observe that the CSPDarknet53 model demonstrates a greater ability to increase the detector accuracy owing to various improvements.

其次，在CSPResNeXt50分类器训练中使用BoF和Mish可以提高分类精度，但是在检测器训练中进一步使用这些预训练权重会降低检测器的精度。然而，在CSPDarknet53分类器训练中使用BoF和Mish提高了分类器和使用预先训练权重的检测器的准确性。结果表明，基干CSPDarknet53比CSPResNeXt50更适合于检测器。

我们观察到，CSPDarknet53模型由于各种改进而显示出更大的提高检测器准确度的能力。

4.5. Influence of different mini-batch size on Detector training

Finally, we analyze the results obtained with models trained with different mini-batch sizes, and the results are shown in Table 7. From the results shown in Table 7, we found that after adding BoF and BoS training strategies, the mini-batch size has almost no effect on the detector’s performance. This result shows that after the introduction of BoF and BoS, it is no longer necessary to use expensive GPUs for training. In other words, anyone can use only a conventional GPU to train an excellent detector.

最后，我们对不同小批尺寸训练模型的结果进行了分析，结果如表7所示。从表7的结果可以看出，加入BoF和BoS训练策略后，小批量大小对检测器的性能几乎没有影响。这一结果表明，在引入BoF和BoS之后，不再需要使用昂贵的gpu进行培训。换句话说，任何人都只能使用传统的GPU来训练优秀的检测器。

Paper：《YOLOv4: Optimal Speed and Accuracy of Object Detection》的翻译与解读_第13张图片

Figure 8: Comparison of the speed and accuracy of different object detectors. (Some articles stated the FPS of their detectors for only one of the GPUs: Maxwell/Pascal/Volta)

图8:不同目标探测器速度和精度的比较。(一些文章指出他们的检测器的FPS只适用于其中一个gpu: Maxwell/Pascal/Volta)

5. Results

Comparison of the results obtained with other stateof-the-art object detectors are shown in Figure 8. Our YOLOv4 are located on the Pareto optimality curve and are superior to the fastest and most accurate detectors in terms of both speed and accuracy.

得到的结果与其他最先进的物体探测器的比较如图8所示。我们的YOLOv4位于Pareto最优曲线上，无论是速度还是精度都优于最快最准确的检测器。

Since different methods use GPUs of different architectures for inference time verification, we operate YOLOv4 on commonly adopted GPUs of Maxwell, Pascal, and Volta architectures, and compare them with other state-of-the-art methods. Table 8 lists the frame rate comparison results of using Maxwell GPU, and it can be GTX Titan X (Maxwell) or Tesla M40 GPU. Table 9 lists the frame rate comparison results of using Pascal GPU, and it can be Titan X (Pascal), Titan Xp, GTX 1080 Ti, or Tesla P100 GPU. As for Table 10, it lists the frame rate comparison results of using Volta GPU, and it can be Titan Volta or Tesla V100 GPU.

由于不同的方法使用不同架构的gpu进行推理时间验证，我们在Maxwell架构、Pascal架构和Volta架构常用的gpu上运行YOLOv4，并与其他最先进的方法进行比较。表8列出了使用Maxwell GPU的帧速率比较结果，可以是GTX Titan X (Maxwell)或者Tesla M40 GPU。表9列出了使用Pascal GPU的帧率比较结果，可以是Titan X (Pascal)、Titan Xp、GTX 1080 Ti或Tesla P100 GPU。表10列出了使用Volta GPU的帧速率比较结果，可以是Titan Volta或者Tesla V100 GPU。

6. Conclusions

We offer a state-of-the-art detector which is faster (FPS) and more accurate (MS COCO AP50...95 and AP50) than all available alternative detectors. The detector described can be trained and used on a conventional GPU with 8-16 GB-VRAM this makes its broad use possible. The original concept of one-stage anchor-based detectors has proven its viability. We have verified a large number of features, and selected for use such of them for improving the accuracy of both the classifier and the detector. These features can be used as best-practice for future studies and developments.

我们提供最先进的检测器，更快(FPS)和更准确(MS COCO AP50…超过了所有可用的替代探测器。所述的检测器可以在传统GPU上训练和使用8-16 GB-VRAM，这使其广泛使用成为可能。基于锚点的单级探测器的概念已经被证明是可行的。我们已经验证了大量的特征，并选择使用这些特征来提高分类器和检测器的准确性。这些特性可以作为未来研究和开发的最佳实践。

7. Acknowledgements

The authors wish to thank Glenn Jocher for the ideas of Mosaic data augmentation, the selection of hyper-parameters by using genetic algorithms and solving the grid sensitivity problem https://github.com/ ultralytics/yolov3.

作者感谢Glenn Jocher提出的镶嵌数据增强、利用遗传算法选择超参数以及解决网格敏感性问题的思路:https://github.com/ultralytics /yolov3。

你可能感兴趣的:(Paper)

python os.environ_python os.environ 读取和设置环境变量 weixin_39605414 python os.environ
>>>importos>>>os.environ.keys()['LC_NUMERIC','GOPATH','GOROOT','GOBIN','LESSOPEN','SSH_CLIENT','LOGNAME','USER','HOME','LC_PAPER','PATH','DISPLAY','LANG','TERM','SHELL','J2REDIR','LC_MONETARY','QT_QPA
自动写论文的网站推荐这5款实用类工具小猪包333 写论文人工智能深度学习计算机视觉 AI写作
在当今学术研究和写作领域，AI论文写作工具的出现极大地提高了写作效率和质量。这些工具不仅能够帮助研究人员快速生成论文草稿，还能进行内容优化、查重和排版等操作。以下是五款实用类工具推荐，特别是千笔-AIPassPaper。1.千笔-AIPassPaper千笔-AIPassPaper是一款功能强大且全面的AI论文写作助手，用户只需输入基本的研究需求和关键词，便能迅速生成一篇完整的论文。该工具利用先进的
推荐3家毕业AI论文可五分钟一键生成！文末附免费教程！小猪包333 写论文人工智能 AI写作深度学习计算机视觉
在当前的学术研究和写作领域，AI论文生成器已经成为许多研究人员和学生的重要工具。这些工具不仅能够帮助用户快速生成高质量的论文内容，还能进行内容优化、查重和排版等操作。以下是三款值得推荐的AI论文生成器：千笔-AIPassPaper、懒人论文以及AIPaperPass。千笔-AIPassPaper千笔-AIPassPaper是一款基于深度学习和自然语言处理技术的AI写作助手，旨在帮助用户快速生成高质
4款毕业论文参考文献格式生成器（附加详细步骤）小猪包333 写论文人工智能深度学习计算机视觉 AI写作
在撰写毕业论文时，参考文献的格式规范是至关重要的。为了帮助学生和学者们更高效地生成符合要求的参考文献格式，本文将详细介绍四款推荐的参考文献格式生成器，并提供详细的使用步骤。1.千笔-AIPassPaper千笔-AIPassPaper是一款先进的AI辅助论文写作工具，不仅能够自动生成大纲、开题报告，还能一键生成参考文献。AI论文，免费大纲，10分钟3万字https://www.aipaperpass
AI论文写作推荐哪个好？分享5款AI论文写作带数据图表网站小猪包333 写论文人工智能深度学习计算机视觉
在当今学术研究和写作领域，AI论文写作工具的出现极大地提高了写作效率和质量。这些工具不仅能够帮助研究人员快速生成论文草稿，还能进行内容优化、查重和排版等操作。以下是五款推荐的AI论文写作工具，包括千笔-AIPassPaper。千笔-AIPassPaper千笔-AIPassPaper是一款功能强大的AI论文写作助手，旨在帮助用户快速生成高质量的论文内容。AI论文，免费大纲，10分钟3万字https:
AI论文题目生成器怎么用？9款论文写作网站简单3步搞定小猪包333 写论文人工智能深度学习计算机视觉
在当今信息爆炸的时代，AI写作工具的出现极大地提高了写作效率和质量。本文将详细介绍9款优秀的论文写作网站，并重点推荐千笔-AIPassPaper。一、千笔-AIPassPaper千笔-AIPassPaper是一款功能强大的AI论文生成器，基于最新的自然语言处理技术，能够一键生成高质量的毕业论文、开题报告等文本内容。它不仅提供智能选题、文献推荐和论文润色等功能，还具有较高的用户评价。其文献综述生成功
毕业论文附录一般都写什么?大学生写论文是干嘛用的写个原创论文人工智能深度学习 AI写作 chatgpt 论文阅读
毕业论文的附录通常包含一些在正文中不便于展示或详细阐述的内容，但对理解论文整体又具有重要意义的资料。具体来说，附录可能包含以下内容：AI论文，免费大纲，10分钟3万字，查重高于15%退费，支持数据图表！！AIPaperPass-AI论文写作指导平台AIPaperPass是AI原创论文写作平台，免费千字大纲，5分钟生成3万字初稿，提供答辩汇报ppt、开题报告、任务书等，40篇真实中英文知网参考文献，
端到端的自动驾驶论文与代码整理大别山伧父自动驾驶
LearningbyCheatinggithubcodearxivpaperconferenceonrobotlearning最新进展(May2021)Checkoutourlatestfollow-upwork:WorldonRails(2020)Checkoutoursubmissiontothe2020CARLAChallenge!pass
EP6 同一组件通过传递不同属性展示不同效果京城五 uniapp壁纸小程序项目实践前端学习脚步 css 前端 html
文件路径：E:/homework/uniappv3tswallpaper/pages/index/index.vue公告文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容文字内容每日推荐专题精选More+.homeLayout{.banner{width:750rpx;padding:30rpx0;swiper{width:10
EP7 底部tab切换页面标签京城五 uniapp壁纸小程序项目实践前端知识杂合前端 uniapp 小程序
文件路径：E:/homework/uniappv3tswallpaper/pages/classify/classify.vue.classify{padding:30rpx;display:grid;grid-template-columns:repeat(3,1fr);gap:15rpx;}文件路径：E:/homework/uniappv3tswallpaper/pages/user/user
探索任务的隐秘世界：推荐Task2Vec 邓越浪Henry
探索任务的隐秘世界：推荐Task2Vecaws-cv-task2vecOfficialcodeforthepaper"Task2Vec:TaskEmbeddingforMeta-Learning"(https://arxiv.org/abs/1902.03545,ICCV2019)项目地址:https://gitcode.com/gh_mirrors/aw/aws-cv-task2vec在机器学习
Coding and Paper Letter（十四） G小调的Qing歌
资源整理。1Coding:1.R语言包ungeviz，ggplot2的拓展包，专门用来作不确定性的可视化。ungeviz2.计算机图形学相关开源项目。计算机图形学光线追踪开源项目C++源码。computergraphicsraytracing计算机图形学格网开源项目C++源码。computergraphicsmeshes计算机图形学介绍开源项目。computergraphics3.R语言包GLMM
05-树9 Huffman Codes（C） L_glonar c语言数据结构
日常，这一次，耗费我三天，其实第二天时便已经将对整个框架有清晰的了解了，（看了解析了），但是一步步排除，确实让我学到了很多。In1953,DavidA.Huffmanpublishedhispaper"AMethodfortheConstructionofMinimum-RedundancyCodes",andhenceprintedhisnameinthehistoryofcomputersci
线性代数|机器学习-P33卷积神经网络ImageNet和卷积规则取个名字真难呐算法机器学习矩阵人工智能线性代数
文章目录1.ImageNet2.卷积计算2.1两个多项式卷积2.2函数卷积2.3循环卷积3.周期循环矩阵和非周期循环矩阵4.循环卷积特征值4.1卷积计算的分解4.2运算量4.3二维卷积公式5.KroneckerProduct1.ImageNetImageNet的论文paper链接如下：详细请直接阅读相关论文即可通过网盘分享的文件：imagenet_cvpr09.pdf链接:https://pan.
IJCAI2024 无脑敲代码，bug漫天飞会议
CallforPapers–IJCAI2024重要日期(所有时间都是地球上的任何地方，UTC-12)摘要提交截止日期:2024年1月10日作者信息截止日期:2024年1月16日论文全文截止日期:2024年1月17日附录和重新提交信息截止日期:2024年1月24日简易拒绝通知:2024年2月22日作者回复时间:2024年3月18日至21日书面通知:2024年4月16日会议:2024年8月3日星期六至
2019-01-12 q若水
Youcan'trewriteyourpast,butyoucangrabacleansheetofpaperandwriteyourfuture.你不能重写过去，但是你可以用一张干净的纸去书写你的未来。
第66期 | GPTSecurity周报云起无垠 GPTSecurity AIGC gpt
GPTSecurity是一个涵盖了前沿学术研究和实践经验分享的社区，集成了生成预训练Transformer（GPT）、人工智能生成内容（AIGC）以及大语言模型（LLM）等安全领域应用的知识。在这里，您可以找到关于GPT/AIGC/LLM最新的研究论文、博客文章、实用的工具和预设指令（Prompts）。现为了更好地知悉近一周的贡献内容，现总结如下。SecurityPapers1.利用高级大语言模型
Bilingual engineering 201707 No.360 Alyee AlyeeBonnie
GamesandDailylife:Makealittlemousewithher.Steps1.Useorangepapertomakeacone2.Maketworoundearsandalongtailwiththeorangepaper3.Cutasmallpieceofblackpapertomakethemouseswhiskers4.Pasteallthepartstogether5
IROS2023 马少爷学术人工智能自然语言处理
1、论文要求论文征集提交给IROS会议文件审查委员会作为同行评审的档案出版物，所有被接受的论文都将在IEEEXplore上托管。邀请潜在作者提交代表原创作品的高质量论文。欢迎就主题以及智能机器人和应用的所有领域提交意见。请通过传统的PaperPlaza流程提交论文。格式指南LaTex模板MSWord模板论文长度应为六页（美国字母大小），最多可多出两页（每多出一页收费205美元，应在验收后付款）。页
探索智能边缘计算：Game-Theoretic-Deep-Reinforcement-Learning 瞿旺晟
探索智能边缘计算：Game-Theoretic-Deep-Reinforcement-LearningGame-Theoretic-Deep-Reinforcement-LearningCodeofPaper"JointTaskOffloadingandResourceOptimizationinNOMA-basedVehicularEdgeComputing:AGame-TheoreticDRL
乡村振兴战略下传统村落文化旅游设计 Paperback – Aug. 1 2022 Chinese edition by XU SHAO HUI (Author) 光明理论旅游人工智能媒体生活科技产品运营内容运营
乡村振兴战略下传统村落文化旅游设计Paperback–Aug.12022ChineseeditionbyXUSHAOHUI(Author)Language:Chinese.paperback.PubDate:2022-08-01.publisher:ChinaBuildingIndustryPress.description:Paperback.PubDate:2022-08-01Pages:20
第65期 | GPTSecurity周报云起无垠 GPTSecurity 人工智能网络安全语言模型
GPTSecurity是一个涵盖了前沿学术研究和实践经验分享的社区，集成了生成预训练Transformer（GPT）、人工智能生成内容（AIGC）以及大语言模型（LLM）等安全领域应用的知识。在这里，您可以找到关于GPT/AIGC/LLM最新的研究论文、博客文章、实用的工具和预设指令（Prompts）。现为了更好地知悉近一周的贡献内容，现总结如下。SecurityPapers1.基于第一性原理的大
特征点提取与匹配原文论文下载长沙有肥鱼视觉SLAM十四讲计算机视觉
ORB原文下载链接：(PDF)ORB:anefficientalternativetoSIFTorSURFSIFT原文下载链接：https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdfSURF原文下载链接:https://www.cs.jhu.edu/~misha/ReadingSeminar/Papers/Bay08.pdfORB和AKAZE对比论文下载链接：h
后端JOIN、LEFT JOIN、RIGHT JOIN的理解 I like Code? java 后端
SELECTf_exam_record.*,f_exam_paper.PaperName,f_exam_paper.PaperTime,exam_class.classnameFROMf_exam_recordJOINf_exam_paperONf_exam_record.PaperId=f_exam_paper.PaperIdLEFTJOINexam_classonf_exam_record.c
仿华为车机功能之--修改Launcher3,增加横向滑动桌面空白处切换壁纸的功能 Kwanvin Android Launcher3深度定制开发华为 java android
本功能基于Android13Launcher3需求：模仿华为问界车机，实现横向滑动桌面空白处，切换壁纸功能（本质只是切换背景，没有切换壁纸）。实现效果：实现思路：第一步首先得增加手势识别第二步切换底图，不切换壁纸是因为切换壁纸动作太大，需要调用到WallpaperManager,耗时且会触发应用activity重启原生系统有识别上滑与下滑的动作，那我们应该增加一个左滑和右滑的动作识别禁止上滑出所有
开源的即时聊天解决方案Papercups 辣码甄源精品开源应用分享开源 github 信息与通信
Papercups：让聊天支持变得简单、私密、实时。-精选真开源，释放新价值。概览Papercups是一款开源的实时客户支持工具，它使用Elixir语言构建，为注重客户数据隐私和安全性的公司提供了一个自托管的解决方案。这款工具的设计理念是简化客户与企业之间的沟通流程，通过一个直观的聊天小部件嵌入到企业的网站中，实现无缝的实时交流。Papercups的聊天小部件不仅易于集成，还提供了丰富的自定义选项
今日欧美圈：Sam Smith专辑改期，The Box狂揽B榜十周冠胡萝卜音乐
新一期Billboard单曲榜上，《TheBox》狂揽十周冠，DuaLipa热单《Don'tStartNow》升至亚军，LilUziVert有三首歌曲进入前十。SamSmith新专辑《ToDieFor》发行日期推迟到6月5日。新单要来啦！LaurenJauregui宣布新单《Lento》将在3月20日发行。HarryStyles登上BeautyPapers写真释出！在《冰雪奇缘2》中为Honeym
Vblog#1 English learning for science research 一粒咖啡
提示：文章写完后，目录可以自动生成，如何生成可参考右边的帮助文档Englishlearningforscienceresearchintroduction一、GOALsin1month二、PlanseverydaySummeryintroductionIstartedtowritepaperinEnglishinordertoimproveabilityofEnglishandunderstand
AIGC：Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis 微风❤水墨 AIGC
代码：GitHub-Kwai-Kolors/Kolors:KolorsTeam论文：Kolors/imgs/Kolors_paper.pdfatmaster·Kwai-Kolors/Kolors·GitHub模型：huaggingface:https://huggingface.co/Kwai-Kolors/Kolors-diffusersmodelscope:https://modelscope
PaperWeekly sapienst Papers PaperwithCode General ML
1.Python软件包解决DL在未见过的数据分布下性能差的问题：（1）神经网络和损失分离的模块化设计（2）强大便捷的基准测试能力（3）易于使用但难以修改（4）github:https://github.com/marrlab/domainlabTrainer和Models之间是什么关系Trainer和Models是DomainLab中的两个核心概念。Trainer是一个用于指导数据流向模型并计算S
Java 并发包之线程池和原子计数 lijingyao8206 Java计数 ThreadPool 并发包 java线程池
对于大数据量关联的业务处理逻辑，比较直接的想法就是用JDK提供的并发包去解决多线程情况下的业务数据处理。线程池可以提供很好的管理线程的方式，并且可以提高线程利用率，并发包中的原子计数在多线程的情况下可以让我们避免去写一些同步代码。这里就先把jdk并发包中的线程池处理器ThreadPoolExecutor 以原子计数类AomicInteger 和倒数计时锁C
java编程思想抽象类和接口百合不是茶 java 抽象类接口
接口c++对接口和内部类只有简介的支持,但在java中有队这些类的直接支持 1 ,抽象类 : 如果一个类包含一个或多个抽象方法,该类必须限定为抽象类(否者编译器报错) 抽象方法 : 在方法中仅有声明而没有方法体 package com.wj.Interface;
[房地产与大数据]房地产数据挖掘系统 comsci 数据挖掘
随着一个关键核心技术的突破,我们已经是独立自主的开发某些先进模块,但是要完全实现,还需要一定的时间... 所以,除了代码工作以外,我们还需要关心一下非技术领域的事件..比如说房地产 &nb
数组队列总结沐刃青蛟数组队列
数组队列是一种大小可以改变，类型没有定死的类似数组的工具。不过与数组相比，它更具有灵活性。因为它不但不用担心越界问题，而且因为泛型（类似c++中模板的东西）的存在而支持各种类型。以下是数组队列的功能实现代码： import List.Student; public class
Oracle存储过程无法编译的解决方法 IT独行者 oracle 存储过程　
今天同事修改Oracle存储过程又导致2个过程无法被编译，流程规范上的东西，Dave 这里不多说，看看怎么解决问题。 1. 查看无效对象 XEZF@xezf(qs-xezf-db1)> select object_name,object_type,status from all_objects where status='IN
重装系统之后oracle恢复文强chu oracle
前几天正在使用电脑，没有暂停oracle的各种服务。突然win8.1系统奔溃，无法修复，开机时系统提示正在搜集错误信息，然后再开机，再提示的无限循环中。无耐我拿出系统u盘准备重装系统，没想到竟然无法从u盘引导成功。晚上到外面早了一家修电脑店，让人家给装了个系统，并且那哥们在我没反应过来的时候，直接把我的c盘给格式化了并且清理了注册表，再装系统。然后的结果就是我的oracl
python学习二（一些基础语法）小桔子 pthon 基础语法
紧接着把！昨天没看继续看django 官方教程，学了下python的基本语法与c类语言还是有些小差别： 1.ptyhon的源文件以UTF-8编码格式 2. / 除结果浮点型 // 除结果整形 % 除取余数 * 乘 ** 乘方 eg 5**2 结果是5的2次方25 _&
svn 常用命令 aichenglong SVN 版本回退
1 svn回退版本 1)在window中选择log,根据想要回退的内容,选择revert this version或revert chanages from this version 两者的区别: revert this version:表示回退到当前版本(该版本后的版本全部作废) revert chanages from this versio
某小公司面试归来 alafqq 面试
先填单子，还要写笔试题，我以时间为急，拒绝了它。。时间宝贵。老拿这些对付毕业生的东东来吓唬我。。面试官很刁难，问了几个问题，记录下； 1，包的范围。。。public,private,protect. --悲剧了 2，hashcode方法和equals方法的区别。谁覆盖谁.结果，他说我说反了。 3，最恶心的一道题，抽象类继承抽象类吗？（察，一般它都是被继承的啊） 4，stru
动态数组的存储速度比较集合框架百合不是茶集合框架
集合框架：自定义数据结构(增删改查等) package 数组; /** * 创建动态数组 * @author 百合 * */ public class ArrayDemo{ //定义一个数组来存放数据 String[] src = new String[0]; /** * 增加元素加入容器 * @param s要加入容器
用JS实现一个JS对象，对象里有两个属性一个方法 bijian1013 js对象
<html> <head> </head> <body> 用js代码实现一个js对象，对象里有两个属性，一个方法 </body> <script> var obj={a:'1234567',b:'bbbbbbbbbb',c:function(x){
探索JUnit4扩展：使用Rule bijian1013 java 单元测试 JUnit Rule
在上一篇文章中，讨论了使用Runner扩展JUnit4的方式，即直接修改Test Runner的实现(BlockJUnit4ClassRunner)。但这种方法显然不便于灵活地添加或删除扩展功能。下面将使用JUnit4.7才开始引入的扩展方式——Rule来实现相同的扩展功能。 1. Rule &n
[Gson一]非泛型POJO对象的反序列化 bit1129 POJO
当要将JSON数据串反序列化自身为非泛型的POJO时，使用Gson.fromJson(String, Class)方法。自身为非泛型的POJO的包括两种： 1. POJO对象不包含任何泛型的字段 2. POJO对象包含泛型字段，例如泛型集合或者泛型类 Data类 a.不是泛型类， b.Data中的集合List和Map都是泛型的 c.Data中不包含其它的POJO
【Kakfa五】Kafka Producer和Consumer基本使用 bit1129 kafka
0.Kafka服务器的配置一个Broker，一个Topic Topic中只有一个Partition（） 1. Producer： package kafka.examples.producers; import kafka.producer.KeyedMessage; import kafka.javaapi.producer.Producer; impor
lsyncd实时同步搭建指南——取代rsync+inotify ronin47
1. 几大实时同步工具比较 1.1 inotify + rsync 最近一直在寻求生产服务服务器上的同步替代方案，原先使用的是 inotify + rsync，但随着文件数量的增大到100W+，目录下的文件列表就达20M，在网络状况不佳或者限速的情况下，变更的文件可能10来个才几M，却因此要发送的文件列表就达20M，严重减低的带宽的使用效率以及同步效率；更为要紧的是，加入inotify
java-9. 判断整数序列是不是二元查找树的后序遍历结果 bylijinnan java
public class IsBinTreePostTraverse{ static boolean isBSTPostOrder(int[] a){ if(a==null){ return false; } /*1.只有一个结点时，肯定是查找树 *2.只有两个结点时，肯定是查找树。例如{5,6}对应的BST是 6 {6,5}对应的BST是
MySQL的sum函数返回的类型 bylijinnan java spring sql mysql jdbc
今天项目切换数据库时，出错访问数据库的代码大概是这样： String sql = "select sum(number) as sumNumberOfOneDay from tableName"; List<Map> rows = getJdbcTemplate().queryForList(sql); for (Map row : rows
java设计模式之单例模式 chicony java设计模式
在阎宏博士的《JAVA与模式》一书中开头是这样描述单例模式的：　　作为对象的创建模式，单例模式确保某一个类只有一个实例，而且自行实例化并向整个系统提供这个实例。这个类称为单例类。单例模式的结构　　单例模式的特点：单例类只能有一个实例。单例类必须自己创建自己的唯一实例。单例类必须给所有其他对象提供这一实例。　　饿汉式单例类 publ
javascript取当月最后一天 ctrain JavaScript
 <script language=javascript> var current = new Date(); var year = current.getYear(); var month = current.getMonth(); showMonthLastDay(year, mont
linux tune2fs命令详解 daizj linux tune2fs 查看系统文件块信息
一.简介： tune2fs是调整和查看ext2/ext3文件系统的文件系统参数，Windows下面如果出现意外断电死机情况，下次开机一般都会出现系统自检。Linux系统下面也有文件系统自检，而且是可以通过tune2fs命令，自行定义自检周期及方式。二.用法： Usage: tune2fs [-c max_mounts_count] [-e errors_behavior] [-g grou
做有中国特色的程序员 dcj3sjt126com 程序员
从出版业说起网络作品排到靠前的，都不会太难看，一般人不爱看某部作品也是因为不喜欢这个类型，而此人也不会全不喜欢这些网络作品。究其原因，是因为网络作品都是让人先白看的，看的好了才出了头。而纸质作品就不一定了，排行榜靠前的，有好作品，也有垃圾。许多大牛都是写了博客，后来出了书。这些书也都不次，可能有人让为不好，是因为技术书不像小说，小说在读故事，技术书是在学知识或温习知识，有
Android：TextView属性大全 dcj3sjt126com textview
android:autoLink 设置是否当文本为URL链接/email/电话号码/map时，文本显示为可点击的链接。可选值(none/web/email/phone/map/all) android:autoText 如果设置，将自动执行输入值的拼写纠正。此处无效果，在显示输入法并输
tomcat虚拟目录安装及其配置 eksliang tomcat配置说明 tomca部署web应用 tomcat虚拟目录安装
转载请出自出处：http://eksliang.iteye.com/blog/2097184 1.-------------------------------------------tomcat 目录结构 config：存放tomcat的配置文件 temp ：存放tomcat跑起来后存放临时文件用的 work ：当第一次访问应用中的jsp
浅谈：APP有哪些常被黑客利用的安全漏洞 gg163 APP
首先，说到APP的安全漏洞，身为程序猿的大家应该不陌生；如果抛开安卓自身开源的问题的话，其主要产生的原因就是开发过程中疏忽或者代码不严谨引起的。但这些责任也不能怪在程序猿头上，有时会因为BOSS时间催得紧等很多可观原因。由国内移动应用安全检测团队爱内测（ineice.com）的CTO给我们浅谈关于Android 系统的开源设计以及生态环境。 1. 应用反编译漏洞：APK 包非常容易被反编译成可读
C#根据网址生成静态页面 hvt Web .net C#asp.net hovertree
HoverTree开源项目中HoverTreeWeb.HVTPanel的Index.aspx文件是后台管理的首页。包含生成留言板首页，以及显示用户名，退出等功能。根据网址生成页面的方法： bool CreateHtmlFile(string url, string path) { //http://keleyi.com/a/bjae/3d10wfax.htm stri
SVG 教程（一）天梯梦 svg
SVG 简介 SVG 是使用 XML 来描述二维图形和绘图程序的语言。学习之前应具备的基础知识：继续学习之前，你应该对以下内容有基本的了解： HTML XML 基础如果希望首先学习这些内容，请在本站的首页选择相应的教程。什么是SVG？ SVG 指可伸缩矢量图形 (Scalable Vector Graphics) SVG 用来定义用于网络的基于矢量
一个简单的java栈 luyulong java 数据结构栈
public class MyStack { private long[] arr; private int top; public MyStack() { arr = new long[10]; top = -1; } public MyStack(int maxsize) { arr = new long[maxsize]; top
基础数据结构和算法八：Binary search sunwinner Algorithm Binary search
Binary search needs an ordered array so that it can use array indexing to dramatically reduce the number of compares required for each search, using the classic and venerable binary search algori
12个C语言面试题，涉及指针、进程、运算、结构体、函数、内存，看看你能做出几个！刘星宇 c 面试
12个C语言面试题，涉及指针、进程、运算、结构体、函数、内存，看看你能做出几个！ 1.gets()函数问：请找出下面代码里的问题： #include<stdio.h> int main(void) { char buff[10]; memset(buff,0,sizeof(buff));
ITeye 7月技术图书有奖试读获奖名单公布 ITeye管理员活动 ITeye 试读
ITeye携手人民邮电出版社图灵教育共同举办的7月技术图书有奖试读活动已圆满结束，非常感谢广大用户对本次活动的关注与参与。 7月试读活动回顾： http://webmaster.iteye.com/blog/2092746 本次技术图书试读活动的优秀奖获奖名单及相应作品如下（优秀文章有很多，但名额有限，没获奖并不代表不优秀）：《Java性能优化权威指南》