ForeverStrong

YOLOv3: An Incremental Improvement

Joseph Redmon, Ali Farhadi

University of Washington

https://pjreddie.com/darknet/yolov1/
https://pjreddie.com/darknet/yolov2/
https://pjreddie.com/darknet/yolo/

University of Washington (UW, Washington, or U-Dub)
Facebook Artificial Intelligence Research (Facebook AI Research, FAIR)
Allen Institute for Artificial Intelligence (Allen Institute for AI, AI2)
Computer Science (CS)
Computer Vision (CV)
interpretation [ɪnˌtɜːprəˈteɪʃn]: n. explanation, translation, rendition (of a performance)
intuition [ˌɪntjuˈɪʃn]: n. instinct, intuitive ability, intuitive knowledge
moderation [ˌmɒdəˈreɪʃn]: n. restraint, temperance, mildness, easing
incremental [,ɪnkrɪ'mentəl]: adj. increasing (by a fixed amount), gradual, step-by-step

arXiv (archive - the X represents the Greek letter chi [χ]) is a repository of electronic preprints approved for posting after moderation, but not full peer review.

Abstract

We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320 $\times$ 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 $AP_{50}$ in 51 ms on a Titan X, compared to 57.5 $AP_{50}$ in 198 ms by RetinaNet, similar performance but 3.8 $\times$ faster. As always, all the code is online at https://pjreddie.com/darknet/yolo/.

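swell [swel]: v. to expand, to bulge, to increase, (of sound) to grow louder, to fill (with emotion) n. a bulge, a gradual increase, a surge of feeling, an ocean swell, a volume control, (informal) a celebrity adj. (informal) excellent, very pleasant, stylish adv. excellently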

Figure 1. We adapt this figure from the Focal Loss paper [9]. YOLOv3 runs significantly faster than other detection methods with comparable performance. Times from either an M40 or Titan X, they are basically the same GPU.

1. Introduction

Sometimes you just kinda phone it in for a year, you know? I didn’t do a whole lot of research this year. Spent a lot of time on Twitter. Played around with GANs a little. I had a little momentum left over from last year [12] [1]; I managed to make some improvements to YOLO. But, honestly, nothing like super interesting, just a bunch of small changes that make it better. I also helped out with other people’s research a little.

Actually, that’s what brings us here today. We have a camera-ready deadline [4] and we need to cite some of the random updates I made to YOLO but we don’t have a source. So get ready for a TECH REPORT!

The great thing about tech reports is that they don’t need intros, y’all know why we’re here. So the end of this introduction will signpost for the rest of the paper. First we’ll tell you what the deal is with YOLOv3. Then we’ll tell you how we do. We’ll also tell you about some things we tried that didn’t work. Finally we’ll contemplate what this all means.

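twitter ['twɪtə(r)]: n. excitement, chirping (of birds), agitation v. to chirp, to chatter, to post on the Twitter social network
generative adversarial network (GAN)
whole [həʊl]: adj. complete, entire n. the whole, totality
kinda ['kaɪndə]: adv. somewhat, kind of
intro [ˈɪntrəʊ]: n. prelude, preface, introduction
contemplate [ˈkɒntəmpleɪt]: vt. to ponder, to gaze at, to consider, to expect vi. to think deeply
signpost [ˈsaɪnpəʊst]: n. road sign, signboard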

2. The Deal

So here’s the deal with YOLOv3: We mostly took good ideas from other people. We also trained a new classifier network that’s better than the other ones. We’ll just take you through the whole system from scratch so you can understand it all.

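This section presents the YOLOv3 design, which borrows many ideas from other researchers and adds a strong new classifier network. Its subsections walk through the whole system: bounding box prediction, class prediction, feature extraction, and so on.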

2.1. Bounding Box Prediction

Following YOLO9000 our system predicts bounding boxes using dimension clusters as anchor boxes [15]. The network predicts 4 coordinates for each bounding box, $t_{x}, t_{y}, t_{w}, t_{h}$ . If the cell is offset from the top left corner of the image by $(c_{x}, c_{y})$ and the bounding box prior has width and height $p_{w}, p_{h}$ , then the predictions correspond to:

$\begin{aligned} b_{x} &= \sigma(t_{x}) + c_{x} \\ b_{y} &= \sigma(t_{y}) + c_{y} \\ b_{w} &= p_{w}e^{t_{w}} \\ b_{h} &= p_{h}e^{t_{h}} \end{aligned}$

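The anchor boxes are obtained by clustering. $(c_{x}, c_{y})$ is the offset of the cell from the top left corner of the image (the image is divided into an S $\times$ S grid of cells). $p_{w}, p_{h}$ are the width and height of the bounding box prior (anchor box).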
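The four equations above can be sketched in a few lines of Python. This is only an illustration: the function name `decode_box` and the sample cell and prior values are made up here, and the units follow whatever units the cell offset and prior are given in.

```python
import math


def sigmoid(x):
    """Logistic function, written sigma(x) in the equations above."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, prior):
    """Map raw outputs (tx, ty, tw, th) to a box (bx, by, bw, bh).

    cell  = (cx, cy): offset of the grid cell from the image's top left corner
    prior = (pw, ph): width and height of the bounding box prior
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = prior
    bx = sigmoid(tx) + cx      # center stays inside the cell: offset in (0, 1)
    by = sigmoid(ty) + cy
    bw = pw * math.exp(tw)     # width and height rescale the prior
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# All-zero outputs give the cell center and the unchanged prior.
print(decode_box((0.0, 0.0, 0.0, 0.0), cell=(6, 6), prior=(116, 90)))
# (6.5, 6.5, 116.0, 90.0)
```

Because the sigmoid keeps the center offset between 0 and 1, a prediction can never leave the cell that produced it.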

During training we use sum of squared error loss. If the ground truth for some coordinate prediction is $\hat{t}_{\ast}$ our gradient is the ground truth value (computed from the ground truth box) minus our prediction: $\hat{t}_{\ast} - t_{\ast}$ . This ground truth value can be easily computed by inverting the equations above.

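Sum of squared error loss is cheap to compute. If the ground truth for a coordinate prediction is $\hat{t}_{\ast}$ , the corresponding gradient is the difference between the ground truth value and the prediction: $\hat{t}_{\ast} - t_{\ast}$ .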
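Inverting the box equations to get the ground truth values $\hat{t}_{\ast}$ might look like the sketch below. The name `encode_box` and the clamping constant `eps` are assumptions made for this example; the paper only says the inversion is easy and does not show code.

```python
import math


def encode_box(box, cell, prior, eps=1e-9):
    """Invert the decoding equations: ground truth box -> (tx, ty, tw, th).

    box   = (bx, by, bw, bh), in the same units as cell and prior
    cell  = (cx, cy), prior = (pw, ph)
    """
    bx, by, bw, bh = box
    cx, cy = cell
    pw, ph = prior
    # sigma(tx) = bx - cx must lie strictly inside (0, 1) for the logit to exist.
    sx = min(max(bx - cx, eps), 1.0 - eps)
    sy = min(max(by - cy, eps), 1.0 - eps)
    tx = math.log(sx / (1.0 - sx))   # inverse of the sigmoid
    ty = math.log(sy / (1.0 - sy))
    tw = math.log(bw / pw)           # inverse of bw = pw * exp(tw)
    th = math.log(bh / ph)
    return tx, ty, tw, th


t_hat = encode_box((6.5, 6.5, 116.0, 90.0), cell=(6, 6), prior=(116, 90))
prediction = (0.2, -0.1, 0.3, 0.0)
# The gradient described above: ground truth minus prediction, per coordinate.
gradient = tuple(g - p for g, p in zip(t_hat, prediction))
print(t_hat)
print(gradient)
```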

YOLOv3 predicts an objectness score for each bounding box using logistic regression. This should be 1 if the bounding box prior overlaps a ground truth object by more than any other bounding box prior. If the bounding box prior is not the best but does overlap a ground truth object by more than some threshold we ignore the prediction, following [17]. We use the threshold of .5. Unlike [17] our system only assigns one bounding box prior for each ground truth object. If a bounding box prior is not assigned to a ground truth object it incurs no loss for coordinate or class predictions, only objectness.

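In yolov3.cfg the training length is max_batches = 500200. When the dataset is small, the loss printed at every iteration may be nan; one possible cause is that the threshold led to the bounding box being ignored outright, so no loss was produced.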

If the bounding box prior is not the best but does overlap a ground truth object by more than some threshold we ignore the prediction, following [17]. We use the threshold of 0.5. An ignored prediction contributes nothing to the loss. A bounding box prior that is not assigned to any ground truth object incurs no loss for coordinate or class predictions, only for objectness.
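The three cases can be sketched as a small labelling function. This follows the wording of the paper for a single ground truth object and is not the Darknet implementation; the function name and the sample overlap values are hypothetical.

```python
def assign_objectness(ious, ignore_thresh=0.5):
    """Label each bounding box prior against one ground truth object.

    ious[i] is the overlap of prior i with the object.
    """
    best = max(range(len(ious)), key=lambda i: ious[i])
    labels = []
    for i, iou in enumerate(ious):
        if i == best:
            # Objectness target 1; coordinate and class losses apply.
            labels.append("positive")
        elif iou > ignore_thresh:
            # Good overlap but not the best: no loss at all.
            labels.append("ignore")
        else:
            # Unassigned: objectness loss only (target 0).
            labels.append("negative")
    return labels


print(assign_objectness([0.72, 0.55, 0.30]))
# ['positive', 'ignore', 'negative']
```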

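invert [ɪnˈvɜːt]: vt. to reverse, to turn upside down, to transpose n. something inverted adj. inverted
gradient [ˈɡreɪdiənt]: n. gradient, steepness, slope, incline, rate of change adj. sloping, adapted for walking
incur [ɪnˈkɜː(r)]: v. to bring upon oneself, to suffer, to give rise to
blatantly [ˈbleɪtəntli]: adv. openly, flagrantly, obviously
plagiarize ['pleɪdʒəraɪz]: vi., vt. to copy another's work and pass it off as one's own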

Figure 2. Bounding boxes with dimension priors and location prediction. We predict the width and height of the box as offsets from cluster centroids. We predict the center coordinates of the box relative to the location of filter application using a sigmoid function. This figure blatantly self-plagiarized from [15].

2.2. Class Prediction

Each box predicts the classes the bounding box may contain using multilabel classification. We do not use a softmax as we have found it is unnecessary for good performance, instead we simply use independent logistic classifiers. During training we use binary cross-entropy loss for the class predictions.

This formulation helps when we move to more complex domains like the Open Images Dataset [7]. In this dataset there are many overlapping labels (i.e. Woman and Person). Using a softmax imposes the assumption that each box has exactly one class which is often not the case. A multilabel approach better models the data.

Softmax suits the case where each object belongs to exactly one class. When an object can belong to several classes, an independent logistic classifier is needed to make a binary decision for each class.

Class prediction in YOLOv3 moves from single-label to multilabel classification. Structurally, the softmax layer used for single-label multi-class classification is replaced with independent logistic classifiers. A softmax layer assumes that an image or object belongs to only one class, but in more complex scenes an object may belong to several. For example, if the label set contains both woman and person and the image contains a woman, the detection result should carry both labels.

During training we use binary cross-entropy loss for the class predictions. YOLOv1 computes the class loss with sum-squared error, which converges less readily during training than cross-entropy, so cross-entropy is generally used for the class loss.
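A small numeric sketch shows why independent logistic classifiers suit overlapping labels. The three class names and the logit values are invented for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Softmax forces the class probabilities to sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def binary_cross_entropy(logits, targets):
    """Sum of per-class binary cross-entropy with independent sigmoids."""
    loss = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        loss -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return loss


# Hypothetical scores for the classes (woman, person, zebra).
logits = [4.0, 4.0, -4.0]
targets = [1, 1, 0]          # a woman is also a person

probs_logistic = [sigmoid(z) for z in logits]
probs_softmax = softmax(logits)
print(probs_logistic)        # both woman and person can be close to 1
print(probs_softmax)         # woman and person must share the probability mass
print(binary_cross_entropy(logits, targets))
```

With softmax, two equally strong labels can never both exceed 0.5, which is the single-class assumption the text describes.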

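multilabel classification: classification in which one example can carry several labels at once
cross entropy: a loss that measures the mismatch between predicted probabilities and target labels
prediction [prɪˈdɪkʃn]: n. forecast, prophecy
independent [ˌɪndɪˈpendənt]: adj. separate, standalone, nonpartisan, unconstrained n. a self-reliant person, a nonpartisan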

2.3. Predictions Across Scales

YOLOv3 predicts boxes at 3 different scales. Our system extracts features from those scales using a similar concept to feature pyramid networks [8]. From our base feature extractor we add several convolutional layers. The last of these predicts a 3-d tensor encoding bounding box, objectness, and class predictions. In our experiments with COCO [10] we predict 3 boxes at each scale so the tensor is $N \times N \times [3 \ast (4 + 1 + 80)]$ for the 4 bounding box offsets, 1 objectness prediction, and 80 class predictions.

Next we take the feature map from 2 layers previous and upsample it by 2 $\times$ . We also take a feature map from earlier in the network and merge it with our upsampled features using concatenation. This method allows us to get more meaningful semantic information from the upsampled features and finer-grained information from the earlier feature map. We then add a few more convolutional layers to process this combined feature map, and eventually predict a similar tensor, although now twice the size.

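YOLOv2 added fine-grained features through a passthrough layer. YOLOv3 instead takes the feature map from two layers back, upsamples it by 2 $\times$ , and concatenates it with a feature map from earlier in the network. This gives semantic information from the upsampled layer and fine-grained information from the earlier layer. The merged feature map is processed by a few more convolutional layers and finally yields a tensor twice the size of the one at the previous scale.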
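The upsample-then-concatenate step can be illustrated on tiny nested lists. This sketch assumes nearest-neighbour upsampling and stores a feature map as [channel][row][col]; the sizes and values are toy numbers, not the real layer shapes.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [channel][row][col] feature map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]   # repeat each column
            rows.append(wide)
            rows.append(list(wide))                      # repeat each row
        out.append(rows)
    return out


def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis."""
    assert len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0]), "spatial sizes differ"
    return a + b


def shape(fmap):
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


coarse = [[[1, 2], [3, 4]]]                              # 1 channel, 2 x 2 (deeper layer)
fine = [[[0] * 4 for _ in range(4)] for _ in range(2)]   # 2 channels, 4 x 4 (earlier layer)

merged = concat_channels(upsample2x(coarse), fine)
print(shape(merged))  # (3, 4, 4)
```

Concatenation keeps both sources intact, so the later convolutions can weigh coarse semantic features against fine spatial ones.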

semantic [sɪˈmæntɪk]: adj. relating to meaning, of semantics (same as semantical)

We perform the same design one more time to predict boxes for the final scale. Thus our predictions for the 3rd scale benefit from all the prior computation as well as fine-grained features from early on in the network.

We still use k-means clustering to determine our bounding box priors. We just sort of chose 9 clusters and 3 scales arbitrarily and then divide up the clusters evenly across scales. On the COCO dataset the 9 clusters were: (10 $\times$ 13), (16 $\times$ 30), (33 $\times$ 23), (30 $\times$ 61), (62 $\times$ 45), (59 $\times$ 119), (116 $\times$ 90), (156 $\times$ 198), (373 $\times$ 326).
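A minimal sketch of clustering box shapes into priors is given below. It assumes the IOU-based similarity described in YOLO9000 [15] (boxes compared by width and height only) and a mean update for the centroids; the four sample boxes and the initial centroids are made up.

```python
def iou_wh(a, b):
    """IOU of two boxes aligned at one corner, each given as (width, height)."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_priors(boxes, centroids, iters=10):
    """Cluster (width, height) pairs, assigning each box to the centroid with highest IOU."""
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for box in boxes:
            best = max(range(len(centroids)), key=lambda k: iou_wh(box, centroids[k]))
            groups[best].append(box)
        # Move each centroid to the mean shape of its group; keep it if the group is empty.
        centroids = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return sorted(centroids, key=lambda c: c[0] * c[1])


boxes = [(10, 12), (12, 14), (100, 90), (110, 100)]
priors = kmeans_priors(boxes, centroids=[(10, 12), (110, 100)])
print(priors)  # [(11.0, 13.0), (105.0, 95.0)]
```

Sorting by area makes it easy to hand the smallest priors to the finest scale and the largest to the coarsest.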

network resolution = 416 $\times$ 416

Layer 82 detection - scale 1
feature map: 416 / 32 = 13, so 13 $\times$ 13 = 169 cells (stride = 32)
boxes: 13 $\times$ 13 $\times$ 3 = 507 large-scale boxes
stride = 32 is the heaviest downsampling. The feature map has a large receptive field, which suits large objects in the image. COCO dataset bounding box priors: (116 $\times$ 90), (156 $\times$ 198), (373 $\times$ 326).

Layer 94 detection - scale 2
feature map: 13 $\times$ 2 = 26, so 26 $\times$ 26 = 676 cells (stride = 16)
boxes: 26 $\times$ 26 $\times$ 3 = 2028 medium-scale boxes
stride = 16 is moderate downsampling. The feature map has a medium receptive field, which suits medium-sized objects in the image. COCO dataset bounding box priors: (30 $\times$ 61), (62 $\times$ 45), (59 $\times$ 119).

Layer 106 detection - scale 3
feature map: 26 $\times$ 2 = 52, so 52 $\times$ 52 = 2704 cells (stride = 8)
boxes: 52 $\times$ 52 $\times$ 3 = 8112 small-scale boxes
stride = 8 is the lightest downsampling. The feature map has a small receptive field, which suits small objects in the image. COCO dataset bounding box priors: (10 $\times$ 13), (16 $\times$ 30), (33 $\times$ 23).

YOLOv3 makes the network deeper while also making it narrower.

In our experiments with COCO [10] we predict 3 boxes at each scale so the tensor is $N \times N \times [3 \ast (4 + 1 + 80)]$ for the 4 bounding box offsets, 1 objectness prediction, and 80 class predictions.
For the COCO dataset, YOLOv3 outputs three tensors: 13 $\times$ 13 $\times$ 3 $\times$ 85, 26 $\times$ 26 $\times$ 3 $\times$ 85, and 52 $\times$ 52 $\times$ 3 $\times$ 85. The 52 $\times$ 52 output therefore benefits from all the earlier computation as well as the fine-grained features from early in the network.

YOLOv2 uses 5 anchor boxes and YOLOv3 uses 3 per scale, but YOLOv3 has 3 detection output layers, so it predicts far more bounding boxes than YOLOv2.
YOLOv2: 13 $\times$ 13 $\times$ 5 = 845
YOLOv3: (13 $\times$ 13 + 26 $\times$ 26 + 52 $\times$ 52) $\times$ 3 = (169 + 676 + 2704) $\times$ 3 = 3549 $\times$ 3 = 10647
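The per-scale arithmetic in these notes can be reproduced with a short script. It assumes a 416 $\times$ 416 input, strides of 32, 16, and 8, and the nine COCO priors listed in the paper, with the largest priors given to the coarsest grid; the function and key names are made up for this sketch.

```python
INPUT_SIZE = 416
NUM_CLASSES = 80
STRIDES = (32, 16, 8)
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119),
           (116, 90), (156, 198), (373, 326)]


def scale_summary(input_size=INPUT_SIZE, num_classes=NUM_CLASSES):
    """Grid size, priors, output tensor shape, and box count for each scale."""
    rows = []
    for i, stride in enumerate(STRIDES):
        n = input_size // stride
        # Coarsest grid (largest stride) takes the three largest priors.
        priors = ANCHORS[len(ANCHORS) - 3 * (i + 1): len(ANCHORS) - 3 * i]
        depth = len(priors) * (4 + 1 + num_classes)   # offsets + objectness + classes
        rows.append({
            "stride": stride,
            "grid": n,
            "priors": priors,
            "shape": (n, n, depth),
            "boxes": n * n * len(priors),
        })
    return rows


rows = scale_summary()
for row in rows:
    print(row["stride"], row["shape"], row["boxes"], row["priors"])

print(sum(row["boxes"] for row in rows))  # 10647 boxes for YOLOv3
print(13 * 13 * 5)                        # 845 boxes for YOLOv2
```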

Feature Pyramid Network (FPN)

made by Levio

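res1, res2, ..., res8 and so on indicate how many res_unit a res_block contains. Each res_unit needs one add layer; there are 1 + 2 + 8 + 8 + 4 = 23 res_unit in total, and therefore 23 add layers. Each res_block uses one zero padding, so the 5 res_block account for 5 Zero Padding layers.

Every Batch Normalization layer is followed by a LeakyReLU layer, so the two occur in exactly the same number (72 layers each).

There are 2 upsample operations and 2 concatenate operations.

YOLOv3 has no pooling layers and no fully connected layers. During the forward pass, the feature map is shrunk by changing the stride of the convolution kernels.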

2.4. Feature Extractor

We use a new network for performing feature extraction. Our new network is a hybrid approach between the network used in YOLOv2, Darknet-19, and that newfangled residual network stuff. Our network uses successive 3 $\times$ 3 and 1 $\times$ 1 convolutional layers but now has some shortcut connections as well and is significantly larger. It has 53 convolutional layers so we call it… wait for it… Darknet-53!

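The YOLOv3 feature extractor is a hybrid of Darknet-19, the network used in YOLOv2, and residual networks. It has 53 convolutional layers, hence the name Darknet-53. Darknet-53 is only the feature extraction network: YOLOv3 uses the convolutional layers before the Avgpool layer to extract features, and the multi-scale feature fusion and prediction branches are not part of Darknet-53.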

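newfangled ['nju:,fæŋɡl]: adj. novel, newly fashionable, of the newest style (same as newfangle) n. a newfangled thing v. to make fashionable
stuff [stʌf]: n. things, material, filling, source material vt. to fill, to cram, to feed full vi. to overeat
residual [rɪˈzɪdjuəl]: adj. remaining (of a quantity), left over (of a state after its cause is gone), residual (of experimental error), residual (of soil) n. remainder, residue, residual error, repeat fee (paid to a performer), (geology) erosional remnant, resale value (of a car some time after purchase)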

Table 1. Darknet-53

1 $\times$ , 2 $\times$ , 4 $\times$ , 8 $\times$ ... indicate how many times a residual unit is repeated. Each residual unit contains two convolutional layers.
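As a sanity check on the name, the layer count can be added up from the repeat counts. The sketch assumes the block repeats 1, 2, 8, 8, 4, one stride-2 convolution in front of each block, one initial convolution, and that the final connected layer is counted as the 53rd layer; that last point is a common reading rather than something the paper states.

```python
RES_UNITS_PER_BLOCK = (1, 2, 8, 8, 4)   # repeat counts of the residual unit

first_conv = 1                                   # initial 3 x 3 convolution
downsample_convs = len(RES_UNITS_PER_BLOCK)      # one stride-2 convolution before each block
residual_convs = 2 * sum(RES_UNITS_PER_BLOCK)    # two convolutions per residual unit
connected = 1                                    # final connected layer (assumed to be counted)

backbone_convs = first_conv + downsample_convs + residual_convs
print(sum(RES_UNITS_PER_BLOCK))      # 23 residual units, one add layer each
print(backbone_convs)                # 52
print(backbone_convs + connected)    # 53
```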

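The blue box is the prior obtained by clustering, the yellow box is the ground truth, and the red box is the grid cell that contains the object's center.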

This new network is much more powerful than Darknet-19 but still more efficient than ResNet-101 or ResNet-152. Here are some ImageNet results:

Table 2. Comparison of backbones. Accuracy, billions of operations, billion floating point operations per second, and FPS for various networks.

accuracy [ˈækjərəsi]: n. precision, correctness

Each network is trained with identical settings and tested at 256 $\times$ 256, single crop accuracy. Run times are measured on a Titan X at 256 $\times$ 256. Thus Darknet-53 performs on par with state-of-the-art classifiers but with fewer floating point operations and more speed. Darknet-53 is better than ResNet-101 and 1.5 $\times$ faster. Darknet-53 has similar performance to ResNet-152 and is 2 $\times$ faster.

Darknet-53 also achieves the highest measured floating point operations per second. This means the network structure better utilizes the GPU, making it more efficient to evaluate and thus faster. That’s mostly because ResNets have just way too many layers and aren’t very efficient.

par [pɑː(r)]: n. standard, face value, average amount adj. standard, at face value

2.5. Training

We still train on full images with no hard negative mining or any of that stuff. We use multi-scale training, lots of data augmentation, batch normalization, all the standard stuff. We use the Darknet neural network framework for training and testing [14].

Hard negative mining selects representative negative samples, namely background samples that the classifier wrongly predicts as positive.

3. How We Do

YOLOv3 is pretty good! See table 3. In terms of COCO's weird average mean AP metric it is on par with the SSD variants but is 3 $\times$ faster. It is still quite a bit behind other models like RetinaNet in this metric though.

weird [wɪəd]: adj. strange, uncanny, supernatural n. (Scottish) fate, prophecy

However, when we look at the “old” detection metric of mAP at IOU= .5 (or $AP_{50}$ in the chart) YOLOv3 is very strong. It is almost on par with RetinaNet and far above the SSD variants. This indicates that YOLOv3 is a very strong detector that excels at producing decent boxes for objects. However, performance drops significantly as the IOU threshold increases indicating YOLOv3 struggles to get the boxes perfectly aligned with the object.
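For reference, IOU between two axis-aligned boxes can be computed as below. The box format (x1, y1, x2, y2) and the example coordinates are assumptions made for this sketch; the example shows a box that counts as correct at a threshold of .5 but not at .75.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


truth = (0, 0, 100, 100)
pred = (20, 0, 120, 100)        # same size, shifted right by 20 pixels

value = iou(truth, pred)        # 8000 / 12000
for thresh in (0.5, 0.75):
    print(thresh, value >= thresh)
```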

In the past YOLO struggled with small objects. However, now we see a reversal in that trend. With the new multi-scale predictions we see YOLOv3 has relatively high $AP_{S}$ performance. However, it has comparatively worse performance on medium and larger size objects. More investigation is needed to get to the bottom of this.

When we plot accuracy vs speed on the $AP_{50}$ metric (see figure 3) we see YOLOv3 has significant benefits over other detection systems. Namely, it’s faster and better.

decent [ˈdiːsnt]: adj. respectable, proper, fairly good
reversal [rɪˈvɜːsl]: n. turnaround, inversion, revocation
comparatively [kəmˈpærətɪvli]: adv. relatively, fairly

Table 3. I’m seriously just stealing all these tables from [9] they take soooo long to make from scratch. Ok, YOLOv3 is doing alright. Keep in mind that RetinaNet has like 3.8 $\times$ longer to process an image. YOLOv3 is much better than SSD variants and comparable to state-of-the-art models on the $AP_{50}$ metric.

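steal [stiːl]: vt. to plagiarize, to do stealthily, to take without permission vi. to thieve, to move stealthily, to steal a base n. theft, bargain, stolen base, interception
alright [ɔːlˈraɪt]: adj. fine, acceptable adv. okay (same as all right)
seriously [ˈsɪəriəsli]: adv. earnestly, gravely, solemnly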

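YOLOv2 was weak at detecting small objects, mainly because of its cell-based prediction stage. With multi-scale prediction added, YOLOv3 does better on small objects, but it now performs less well on medium and large objects, which still needs work.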

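How can two nearby objects of the same class, or two nearby objects of different classes, be detected?
Most algorithms resize the input image to a small resolution. In this situation they produce a single box, because to their feature extraction or regression stage the pair looks like one object (the objects were close to begin with, and after shrinking they are closer still). Small-object detection has long been treated as one measure of a detector. YOLOv3 is quite robust to closely spaced or small objects, so this problem has been largely addressed.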

4. Things We Tried That Didn’t Work

We tried lots of stuff while we were working on YOLOv3. A lot of it didn’t work. Here’s the stuff we can remember.

Anchor box $x, y$ offset predictions. We tried using the normal anchor box prediction mechanism where you predict the $x, y$ offset as a multiple of the box width or height using a linear activation. We found this formulation decreased model stability and didn’t work very well.

Linear $x, y$ predictions instead of logistic. We tried using a linear activation to directly predict the $x, y$ offset instead of the logistic activation. This led to a couple point drop in mAP.

Focal loss. We tried using focal loss. It dropped our mAP about 2 points. YOLOv3 may already be robust to the problem focal loss is trying to solve because it has separate objectness predictions and conditional class predictions. Thus for most examples there is no loss from the class predictions? Or something? We aren’t totally sure.
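For readers who have not seen it, focal loss as defined in [9] scales cross-entropy by a factor that shrinks the loss of well-classified examples. The sketch below is the binary form without the optional class-balancing weight; the default $\gamma = 2$ is the value commonly used with it, not a setting reported in this paper.

```python
import math


def focal_loss(p, y, gamma=2.0):
    """Binary focal loss for predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p           # probability assigned to the true label
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = focal_loss(0.9, 1)     # confident and correct: loss is strongly damped
hard = focal_loss(0.1, 1)     # confident and wrong: loss stays large
plain = focal_loss(0.9, 1, gamma=0.0)   # gamma = 0 recovers cross-entropy
print(easy, hard, plain)
```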

Dual IOU thresholds and truth assignment. Faster RCNN uses two IOU thresholds during training. If a prediction overlaps the ground truth by .7 it is a positive example, by [.3 - .7] it is ignored, less than .3 for all ground truth objects it is a negative example. We tried a similar strategy but couldn’t get good results.
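The dual-threshold rule described here reduces to a three-way decision on the best overlap of a prediction with any ground truth object. The function name is hypothetical and the handling of the exact boundary values is an assumption.

```python
def dual_threshold_label(max_iou, low=0.3, high=0.7):
    """Label a prediction from its highest IOU with any ground truth object."""
    if max_iou >= high:
        return "positive"
    if max_iou < low:
        return "negative"
    return "ignore"


for value in (0.85, 0.5, 0.1):
    print(value, dual_threshold_label(value))
```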

We quite like our current formulation, it seems to be at a local optima at least. It is possible that some of these techniques could eventually produce good results, perhaps they just need some tuning to stabilize the training.

eventually [ɪˈventʃuəli]: adv. finally, at last

5. What This All Means

YOLOv3 is a good detector. It’s fast, it’s accurate. It’s not as great on the COCO average AP between .5 and .95 IOU metric. But it’s very good on the old detection metric of .5 IOU.

Why did we switch metrics anyway? The original COCO paper just has this cryptic sentence: “A full discussion of evaluation metrics will be added once the evaluation server is complete”. Russakovsky et al. report that humans have a hard time distinguishing an IOU of .3 from .5! “Training humans to visually inspect a bounding box with IOU of 0.3 and distinguish it from one with IOU 0.5 is surprisingly difficult.” [18] If humans have a hard time telling the difference, how much does it matter?

cryptic [ˈkrɪptɪk]: adj. mysterious, obscure in meaning, hidden

But maybe a better question is: “What are we going to do with these detectors now that we have them?” A lot of the people doing this research are at Google and Facebook. I guess at least we know the technology is in good hands and definitely won’t be used to harvest your personal information and sell it to… wait, you’re saying that’s exactly what it will be used for?? Oh.

Well the other people heavily funding vision research are the military and they’ve never done anything horrible like killing lots of people with new technology oh wait…

I have a lot of hope that most of the people using computer vision are just doing happy, good stuff with it, like counting the number of zebras in a national park [13], or tracking their cat as it wanders around their house [19]. But computer vision is already being put to questionable use and as researchers we have a responsibility to at least consider the harm our work might be doing and think of ways to mitigate it. We owe the world that much.

In closing, do not @ me. (Because I finally quit Twitter).

The author is funded by the Office of Naval Research and Google.

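collaboration [kəˌlæbəˈreɪʃn]: n. cooperation, collusion, collaboration with the enemy
harvest [ˈhɑːvɪst]: n. crop, yield, result vt. to reap, to gather vi. to bring in crops
military [ˈmɪlətri]: adj. of war, of soldiers, suited to warfare n. armed forces, soldiers
horrible ['hɒrəb(ə)l]: adj. dreadful, extremely unpleasant
questionable [ˈkwestʃənəbl]: adj. doubtful, problematic
owe [əʊ]: vt. to be in debt to, to be grateful to, to be obliged to give, to attribute to vi. to be in debt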

Figure 3. Again adapted from the [9], this time displaying speed/accuracy tradeoff on the mAP at .5 IOU metric. You can tell YOLOv3 is good because it’s very high and far to the left. Can you cite your own paper? Guess who’s going to try, this guy ! [16]. Oh, I forgot, we also fix a data loading bug in YOLOv2, that helped by like 2 mAP. Just sneaking this in here to not throw off layout.

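sneak [sniːk]: vi. to slip away, to act furtively, to tell tales to a teacher vt. to do stealthily, to obtain stealthily n. a furtive person, a furtive act, an informer adj. done in secret
layout [ˈleɪaʊt]: n. arrangement, design, plan, display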

Rebuttal

We would like to thank the Reddit commenters, labmates, emailers, and passing shouts in the hallway for their lovely, heartfelt words. If you, like me, are reviewing for ICCV then we know you probably have 37 other papers you could be reading that you’ll invariably put off until the last week and then have some legend in the field email you about how you really should finish those reviews except it won’t entirely be clear what they’re saying and maybe they’re from the future? Anyway, this paper won’t have become what it will in time be without all the work your past selves will have done also in the past but only a little bit further forward, not like all the way until now forward. And if you tweeted about it I wouldn’t know. Just sayin.

Reviewer #2 AKA Dan Grossman (lol blinding who does that) insists that I point out here that our graphs have not one but two non-zero origins. You’re absolutely right Dan, that’s because it looks way better than admitting to ourselves that we’re all just here battling over 2-3% mAP. But here are the requested graphs. I threw in one with FPS too because we look just like super good when we plot on FPS.

Figure 4. Zero-axis charts are probably more intellectually honest… and we can still screw with the variables to make ourselves look good!

also known as (aka, AKA, or a.k.a.): also called, alias
rebuttal [rɪˈbʌtl]: n. refutation, counter-argument, counter-evidence
commenter ['kɔmentər]: n. critic, commentator
Reddit (/ˈrɛdɪt/) is an American social news aggregation, web content rating, and discussion website.
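aggregation [ˌæɡrɪˈɡeɪʃn]: n. gathering, collection, cluster, aggregate
shout [ʃaʊt]: vi. to cry out, to yell, to speak loudly vt. to call out loudly n. a cry, a call
hallway [ˈhɔːlweɪ]: n. corridor, entrance hall, foyer
heartfelt [ˈhɑːtfelt]: adj. sincere, wholehearted
invariably [ɪnˈveəriəbli]: adv. always, without change, without fail
legend [ˈledʒənd]: n. legendary figure or story, caption, key (of a chart), inscription
tweet [twiːt]: n. chirp of a small bird, high-pitched note from sound reproduction equipment, a post on Twitter vi. to chirp
admit [ədˈmɪt]: vt. to acknowledge, to allow entry, to have room for vi. to concede, to allow
throw [θrəʊ]: vt. to toss, to hurl vi. to toss, to cast n. a toss, a gamble
battle [ˈbætl]: n. a fight, a struggle vi. to struggle, to fight vt. to fight against
laugh(ing) out loud (LOL or lol)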

Reviewer #4 AKA JudasAdventus on Reddit writes “Entertaining read but the arguments against the MSCOCO metrics seem a bit weak”. Well, I always knew you would be the one to turn on me Judas. You know how when you work on a project and it only comes out alright so you have to figure out some way to justify how what you did actually was pretty cool? I was basically trying to do that and I lashed out at the COCO metrics a little bit. But now that I’ve staked out this hill I may as well die on it.

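lash [læʃ]: vt. to whip, to strike, to swing, to bind, to incite, to satirize vi. to whip, to hit hard, to flick quickly n. a whipping, an eyelash, a whip, a rebuke, satire
judas [ˈdʒuːdəs]: n. peephole (in a door); (Judas) a betrayer of friends, traitor
stake [steɪk]: n. post, stick, wager, burning at the stake, prize money vt. to fund, to back, to tie to a post, to wager vi. to bet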

See here’s the thing, mAP is already sort of broken so an update to it should maybe address some of the issues with it or at least justify why the updated version is better in some way. And that’s the big thing I took issue with was the lack of justification. For PASCAL VOC, the IOU threshold was “set deliberately low to account for inaccuracies in bounding boxes in the ground truth data” [2]. Does COCO have better labelling than VOC? This is definitely possible since COCO has segmentation masks maybe the labels are more trustworthy and thus we aren’t as worried about inaccuracy. But again, my problem was the lack of justification.

deliberately [dɪˈlɪbərətli]: adv. intentionally, cautiously, carefully
segmentation [ˌseɡmenˈteɪʃn]: n. division, partition, cell division
justification [ˌdʒʌstɪfɪˈkeɪʃn]: n. reason, defense, showing to be right, exoneration

The COCO metric emphasizes better bounding boxes but that emphasis must mean it de-emphasizes something else, in this case classification accuracy. Is there a good reason to think that more precise bounding boxes are more important than better classification? A misclassified example is much more obvious than a bounding box that is slightly shifted.

mAP is already screwed up because all that matters is per-class rank ordering. For example, if your test set only has these two images, then according to mAP two detectors that produce these results are JUST AS GOOD:

Figure 5. These two hypothetical detectors are perfect according to mAP over these two images. They are both perfect. Totally equal.
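The rank-ordering argument can be checked numerically. The sketch below computes a non-interpolated AP per class and averages it over the classes that have ground truth. The detection lists are invented for illustration and do not reproduce the numbers in Figure 5; the true-positive flags are assumed to come from an earlier IOU matching step, and skipping classes without ground truth is an assumption of this sketch.

```python
from collections import defaultdict


def average_precision(scored_hits, num_gt):
    """Non-interpolated AP from (confidence, is_true_positive) pairs."""
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            true_positives += 1
            precision_sum += true_positives / rank  # precision at this recall step
    return precision_sum / num_gt if num_gt else float("nan")


def mean_ap(detections, gt_counts):
    """Mean of per-class APs; detections are (class, confidence, is_true_positive)."""
    per_class = defaultdict(list)
    for cls, conf, hit in detections:
        per_class[cls].append((conf, hit))
    # Only classes that appear in the ground truth contribute (assumption).
    aps = [average_precision(per_class[c], n) for c, n in gt_counts.items()]
    return sum(aps) / len(aps)


gt_counts = {"dog": 1, "person": 1, "horse": 1}  # hypothetical two-image test set

confident_detector = [("dog", 0.99, True), ("person", 0.99, True), ("horse", 0.99, True)]
noisy_detector = [
    ("dog", 0.99, True), ("dog", 0.10, False),
    ("person", 0.52, True), ("person", 0.11, False),
    ("horse", 0.48, True),
    ("bird", 0.90, False), ("camel", 0.10, False),  # classes absent from the labels
]

map_confident = mean_ap(confident_detector, gt_counts)
map_noisy = mean_ap(noisy_detector, gt_counts)
```

Both detectors score 1.0 here: within each class the true positive outranks every false positive, and confidences are never compared across classes.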

Now this is OBVIOUSLY an over-exaggeration of the problems with mAP, but I guess my newly retconned point is that there are such obvious discrepancies between what people in the “real world” would care about and our current metrics that, if we’re going to come up with new metrics, we should focus on these discrepancies. Also, like, it’s already mean average precision; what do we even call the COCO metric, average mean average precision?

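exaggeration [ɪɡˌzædʒəˈreɪʃn]: n. overstatement, an exaggerated statement, hyperbole
discrepancy [dɪˈskrepənsi]: n. inconsistency, contradiction, difference
retcon: to retell, to revise retroactively
screw [skruː]: vt. to turn, twist, squeeze, coerce; n. spiral, screw, miser; vi. to turn, twist
hypothetical [ˌhaɪpəˈθetɪkl]: adj. assumed, supposed, given to conjecture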

Here’s a proposal: what people actually care about is, given an image and a detector, how well will the detector find and classify objects in the image. What about getting rid of the per-class AP and just doing a global average precision? Or doing an AP calculation per image and averaging over that?
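One possible reading of the two proposals is sketched below; the paper does not define them precisely, so the pooling and grouping choices are assumptions, as are the function names and the invented detections. A "global" AP ranks all detections together regardless of class, and a "per-image" AP ranks the detections of each image separately and then averages over images.

```python
from collections import defaultdict


def average_precision(scored_hits, num_gt):
    """Non-interpolated AP from (confidence, is_true_positive) pairs."""
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, hit) in enumerate(ranked, start=1):
        if hit:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_gt if num_gt else float("nan")


def global_ap(detections, num_gt_total):
    """One ranking over every detection, ignoring the class label."""
    return average_precision([(conf, hit) for _, _, conf, hit in detections], num_gt_total)


def per_image_ap(detections, gt_per_image):
    """AP computed inside each image, then averaged over images."""
    by_image = defaultdict(list)
    for image_id, _, conf, hit in detections:
        by_image[image_id].append((conf, hit))
    aps = [average_precision(by_image[img], n) for img, n in gt_per_image.items()]
    return sum(aps) / len(aps)


# Hypothetical data: (image, class, confidence, is_true_positive).
gt_per_image = {"img1": 1, "img2": 2}
clean = [
    ("img1", "dog", 0.99, True),
    ("img2", "person", 0.99, True), ("img2", "horse", 0.99, True),
]
noisy = [
    ("img1", "dog", 0.99, True), ("img1", "dog", 0.10, False),
    ("img2", "bird", 0.90, False),
    ("img2", "person", 0.52, True), ("img2", "horse", 0.48, True),
    ("img2", "person", 0.11, False), ("img2", "camel", 0.10, False),
]

global_clean, global_noisy = global_ap(clean, 3), global_ap(noisy, 3)
image_clean, image_noisy = per_image_ap(clean, gt_per_image), per_image_ap(noisy, gt_per_image)
```

Under both variants the noisy detector now scores below the clean one, because its confident false positive on an absent class is ranked against the true positives instead of being ignored.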

Boxes are stupid anyway though; I’m probably a true believer in masks, except I can’t get YOLO to learn them.

stupid [ˈstjuːpɪd]: adj. foolish, numb, dull; n. fool, idiot

References

[3] DSSD: Deconvolutional Single Shot Detector
[8] Feature Pyramid Networks for Object Detection
[11] SSD: Single Shot MultiBox Detector

WORDBOOK

mean Average Precision, mAP
floating point operations per second, FLOPS
frame rate or frames per second, FPS
hertz, Hz (unit of frequency)
billion, Bn
operations, Ops
configuration, cfg
AP small, AP_S
AP medium, AP_M
AP large, AP_L
Feature Pyramid Network, FPN

KEY POINTS

https://pjreddie.com/publications/
https://pjreddie.com/publications/yolo/
https://github.com/qqwweee/keras-yolo3
https://github.com/wizyoung/YOLOv3_TensorFlow
