YOLO-series configuration file reference; the content comes from the official darknet documentation.

CFG parameters in the [net] section:

  • batch=1 - number of samples (images, letters, …) that will be processed in one batch
  • subdivisions=1 - number of mini-batches in one batch; mini_batch size = batch/subdivisions, so the GPU processes mini_batch samples at once, and the weights are updated once per batch samples (1 iteration processes batch images)
  • width=416 - network size (width), so every image will be resized to the network size during Training and Detection
  • height=416 - network size (height), so every image will be resized to the network size during Training and Detection
  • channels=3 - network size (channels), so every image will be converted to this number of channels during Training and Detection
  • inputs=256 - network size (inputs) is used for non-image data: letters, prices, any custom data
  • max_chart_loss=20 - max value of Loss in the image chart.png
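The relationship between batch, subdivisions, and the mini-batch actually sent to the GPU can be sketched in a few lines (illustrative only, not darknet source; the values batch=64, subdivisions=16 are typical example settings, not defaults from this document):

```python
# Sketch of how batch and subdivisions from the [net] section
# determine the mini-batch size processed by the GPU at once.
batch = 64          # samples whose gradients accumulate into one weight update
subdivisions = 16   # number of mini-batches the batch is split into

mini_batch = batch // subdivisions  # samples sent to the GPU at once
print(mini_batch)  # 4
```

So one iteration still updates the weights using all 64 images, but only 4 reside on the GPU at a time; raising subdivisions is the usual way to fit a large batch into limited GPU memory.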

For training only

Contrastive loss:

  • contrastive=1 - use Supervised contrastive loss for training Classifier (should be used with [contrastive] layer)
  • unsupervised=1 - use Unsupervised contrastive loss for training Classifier on images without labels (should be used with contrastive=1 parameter and with [contrastive] layer)

Data augmentation:

  • angle=0 - randomly rotates images during training (classification only)
  • saturation = 1.5 - randomly changes saturation of images during training
  • exposure = 1.5 - randomly changes exposure (brightness) during training
  • hue=.1 - randomly changes hue (color) during training https://en.wikipedia.org/wiki/HSL_and_HSV
  • blur=1 - blur is applied randomly 50% of the time: if blur=1, the background (everything except objects) will be blurred with blur_kernel=31; if blur>1, the whole image will be blurred with blur_kernel=blur (detection only, and only if OpenCV is used)
  • min_crop=224 - minimum size of randomly cropped image (classification only)
  • max_crop=448 - maximum size of randomly cropped image (classification only)
  • aspect=.75 - the aspect ratio can be changed during cropping, from 0.75 to 1/0.75 (classification only)
  • letter_box=1 - keeps the aspect ratio of loaded images during training (detection training only; to use it during detection/inference, add the -letter_box flag at the end of the detection command)
  • cutmix=1 - use CutMix data augmentation (for Classifier only, not for Detector)
  • mosaic=1 - use Mosaic data augmentation (4 images in one)
  • mosaic_bound=1 - limits the size of objects when mosaic=1 is used (does not allow bounding boxes to leave the borders of their images when Mosaic-data-augmentation is used)
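The saturation/exposure/hue jitter above can be illustrated on a single pixel. This is a sketch under stated assumptions, not darknet's exact code: darknet scales S and V by a random factor drawn from [1/x, x] and shifts H by a random offset in [-hue, hue]; the pixel representation (RGB floats in [0, 1]) is chosen here for simplicity.

```python
import random
import colorsys

def augment_hsv(pixel, saturation=1.5, exposure=1.5, hue=0.1):
    # Sketch (assumption, not darknet source): jitter one RGB pixel
    # in HSV space the way the saturation/exposure/hue parameters describe.
    def rand_scale(x):
        # random factor in [1/x, x], as darknet's rand_scale() is described
        scale = random.uniform(1.0, x)
        return scale if random.random() < 0.5 else 1.0 / scale

    h, s, v = colorsys.rgb_to_hsv(*pixel)
    h = (h + random.uniform(-hue, hue)) % 1.0   # shift hue, wrap around
    s = min(s * rand_scale(saturation), 1.0)    # scale saturation, clip
    v = min(v * rand_scale(exposure), 1.0)      # scale brightness, clip
    return colorsys.hsv_to_rgb(h, s, v)

r, g, b = augment_hsv((0.8, 0.4, 0.2))
```

In darknet the same transform is applied per-image (via OpenCV when available) rather than per-pixel in Python; the sketch only shows the range of the jitter.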

Data augmentation in the last [yolo] layer:

  • jitter=0.3 - randomly changes the size of the image and its aspect ratio from x(1 - 2*jitter) to x(1 + 2*jitter)
  • random=1 - randomly resizes the network size after every 10 batches (iterations), from x1/1.4 to x1.4, keeping the initial aspect ratio of the network size
  • adversarial_lr=1.0 - changes all detected objects to make them look unlike themselves from the neural network's point of view; the network performs an adversarial attack on itself
  • attention=1 - shows points of attention during training
  • gaussian_noise=1 - add gaussian noise
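The random=1 resize can be sketched as follows. This is an illustration of the range described above, not darknet's exact selection logic; the multiple-of-32 stride (YOLO's downsampling factor) and the uniform choice are assumptions:

```python
import random

def random_network_size(base=416, scale=1.4, stride=32):
    # Sketch (assumption): pick a new square network size between
    # base/scale and base*scale, snapped to a multiple of stride,
    # as random=1 does roughly every 10 iterations.
    lo = int(base / scale) // stride   # 297 // 32 = 9
    hi = int(base * scale) // stride   # 582 // 32 = 18
    return random.randint(lo, hi) * stride

size = random_network_size()  # somewhere in 288..576, multiple of 32
```

Training on varying input sizes this way makes the detector more robust to object scale at inference time.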

Optimizer:

  • momentum=0.9 - accumulation of movement: how strongly the history affects further weight changes (optimizer)
  • decay=0.0005 - a weaker update of the weights for typical features; it reduces imbalance in the dataset (optimizer) http://cs231n.github.io/neural-networks-3/
  • learning_rate=0.001 - initial learning rate for training
  • burn_in=1000 - the learning rate is warmed up for the first 1000 iterations: current_learning_rate = learning_rate * pow(iterations / burn_in, power) = 0.001 * pow(iterations/1000, 4), where power=4 by default
  • max_batches = 500200 - the training will be processed for this number of iterations (batches)
  • policy=steps - policy for changing the learning rate: constant (default), sgdr, steps, step, sig, exp, poly, random (e.g. if policy=random, then current learning rate = learning_rate * pow(rand_uniform(0,1), power))
  • power=4 - if policy=poly - the learning rate will be = learning_rate * pow(1 - current_iteration / max_batches, power)
  • sgdr_cycle=1000 - if policy=sgdr - the initial number of iterations in cosine-cycle
  • sgdr_mult=2 - if policy=sgdr - multiplier for cosine-cycle https://towardsdatascience.com/https-medium-com-reina-wang-tw-stochastic-gradient-descent-with-restarts-5f511975163
  • steps=8000,9000,12000 - if policy=steps - at these numbers of iterations the learning rate will be multiplied by scales factor
  • scales=.1,.1,.1 - if policy=steps - e.g. if steps=8000,9000,12000, scales=.1,.1,.1 and the current iteration number is 10000, then current_learning_rate = learning_rate * scales[0] * scales[1] = 0.001 * 0.1 * 0.1 = 0.00001
  • label_smooth_eps=0.1 - use label smoothing for training Classifier
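Putting the burn_in and policy=steps bullets together, the schedule can be sketched as a single function (a simplified illustration built from the formulas above, not darknet source):

```python
def current_lr(iteration, learning_rate=0.001, burn_in=1000,
               power=4, steps=(8000, 9000, 12000), scales=(0.1, 0.1, 0.1)):
    # Sketch of the learning-rate schedule for policy=steps with burn_in,
    # assembled from the parameter descriptions above.
    if iteration < burn_in:
        # warm-up: learning_rate * (iterations / burn_in) ** power
        return learning_rate * (iteration / burn_in) ** power
    lr = learning_rate
    for step, scale in zip(steps, scales):
        if iteration >= step:
            lr *= scale   # each passed step multiplies the rate by its scale
    return lr

current_lr(500)    # warm-up: 0.001 * (500/1000)**4 = 6.25e-05
current_lr(10000)  # past steps 8000 and 9000: 0.001 * 0.1 * 0.1 = 1e-05
```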

For training Recurrent networks:

  • Object Detection/Tracking on video - if [conv_lstm] or [crnn] layers are used in addition to [connected] and [convolutional] layers
  • Text generation - if [lstm] or [rnn] layers are used in addition to [connected] layers
  • track=1 - if set to 1, training will be performed in Recurrent style on image sequences
  • time_steps=16 - training will be performed on a random image sequence containing 16 images from the train.txt file

for [convolutional]-layers: mini_batch = time_steps*batch/subdivisions
for [conv_lstm]-recurrent-layers: mini_batch = batch/subdivisions and sequence=16
augment_speed=3 - if set to 3, every 1st, 2nd, or 3rd image can be used randomly, i.e. 16 images with indexes 0, 1, 2, … 15 or 110, 113, 116, … 155 from the train.txt file can be used

sequential_subdivisions=8 - a lower value increases the sequence of images: if time_steps=16, batch=16, sequential_subdivisions=8, then time_steps*batch/sequential_subdivisions = 16*16/8 = 32 sequential images will be loaded with the same data augmentation, so the model will be trained on a sequence of 32 video frames

seq_scales=0.5, 0.5 - increases the sequence of images at certain steps, i.e. the coefficients by which the original sequential_subdivisions value will be multiplied (and batch will be divided, so the weights are updated more rarely) at the corresponding steps when policy=steps or policy=sgdr is used
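The sequence-length arithmetic above is easy to verify directly (a sketch using the example values from this section):

```python
# Sketch: effective sequence length for recurrent training,
# using the example values time_steps=16, batch=16, sequential_subdivisions=8.
time_steps = 16
batch = 16
sequential_subdivisions = 8

# number of sequential frames loaded with the same data augmentation
sequence_len = time_steps * batch // sequential_subdivisions
print(sequence_len)  # 32

# applying a seq_scale of 0.5 halves sequential_subdivisions,
# doubling the sequence length the model is trained on
scaled_subdivisions = int(sequential_subdivisions * 0.5)
longer_sequence = time_steps * batch // scaled_subdivisions
print(longer_sequence)  # 64
```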
