yanzhiwen2

yolo3制作数据集并训练模型检测路面

1.labeImg制作数据集
摁了w后再用鼠标框选，然后再选择类别，记得保存
a:前一张图片
d:后一张图片
2.代码下载：https://github.com/Gavin-Tao/yolov3
3.在主目录里创建VOCdevkit文件夹，文件夹的具体组成如下图：

标注好的数据对应存在VOC2007目录Annotations里面（xml文件），标注的图像放在JPEGImages里面（jpg文件）。标注好的数据是一一对应的。

4.在VOC2007目录下创建一个python代码，我这里命名CreateMainDirTex.py
CreateMainDirTex.py的代码如下：

import os
import random
 
trainval_percent = 0.1
train_percent = 0.9
xmlfilepath = r'G:\keras-yolo3-master\VOCdevkit\VOC2007\Annotations'
txtsavepath = r'G:\keras-yolo3-master\VOCdevkit\VOC2007\ImageSets\Main'
total_xml = os.listdir(xmlfilepath)
 
num = len(total_xml)
list = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list, tv)
train = random.sample(trainval, tr)
 
ftrainval = open(os.path.join(txtsavepath,'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath,'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath,'train.txt'), 'w')
fval = open(os.path.join(txtsavepath,'val.txt'), 'w')
 
for i in list:
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftest.write(name)
        else:
            fval.write(name)
    else:
        ftrain.write(name)
 
ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

运行，结果在Main下创建了四个txt文件。

5.下载权重：将下载后的文件放在主目录中：
https://pjreddie.com/media/files/yolov3.weights

6.修改voc_annotation.py分类，更改的代码，有两个位置，一个是数据格式，一个是标注的标签名。classes对应修改为自己的类别。
执行之后，将会得到2007_train.txt、2007_test.txt、2007_val.txt，文本内容是:　img_path、box位置(top left bottom right)、类别索引(0 1 2 …)

import xml.etree.ElementTree as ET
from os import getcwd

sets=[('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

classes = ["grass road","asphalt road","stone road","sand road","brick road"]


def convert_annotation(year, image_id, list_file):
    in_file = open('VOCdevkit/VOC%s/Annotations/%s.xml'%(year, image_id))
    tree=ET.parse(in_file)
    root = tree.getroot()

    for obj in root.iter('object'):
        difficult = obj.find('Difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult)==1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text), int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        list_file.write(" " + ",".join([str(a) for a in b]) + ',' + str(cls_id))

wd = getcwd()

for year, image_set in sets:
    image_ids = open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt'%(year, image_set)).read().strip().split()
    list_file = open('%s_%s.txt'%(year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg'%(wd, year, image_id))
        convert_annotation(year, image_id, list_file)
        list_file.write('\n')
    list_file.close()

遇到’NoneType’ object has no attribute 'text’是因为xml中没有difficult这个属性内容，最后导致difficult = obj.find(‘difficult’).text没起作用
因此解决办法就是看自己的xml文件中是否存在这一个属性，没有的话直接把含有difficult的语句注释掉即可，当然也有的是xml中含有Difficult，其中D是大写，只要把程序中含有difficult的句子中的difficult变为Difficult即可。

7.打开yolov3.cfg文件，Ctrl+f查找yolo（共出现了三次），每次都按下图修改

filter：3*（5+len（classes））
classes：你要训练的类别数（我这里是5类）
random：原来的是1，显存小改为0
8.修改model_data下的voc_classes.txt文件，放入你的类别。

9.修改train.py代码（用下面代码直接替换原来的代码）

"""
Retrain the YOLO model for your own dataset.
"""
import numpy as np
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model
from keras.callbacks import TensorBoard, ModelCheckpoint, EarlyStopping
 
from yolo3.model import preprocess_true_boxes, yolo_body, tiny_yolo_body, yolo_loss
from yolo3.utils import get_random_data
 
 
def _main():
    annotation_path = '2007_train.txt'
    log_dir = 'logs/000/'
    classes_path = 'model_data/voc_classes.txt'
    anchors_path = 'model_data/yolo_anchors.txt'
    class_names = get_classes(classes_path)
    anchors = get_anchors(anchors_path)
    input_shape = (416,416) # multiple of 32, hw
    model = create_model(input_shape, anchors, len(class_names) )
    train(model, annotation_path, input_shape, anchors, len(class_names), log_dir=log_dir)
 
def train(model, annotation_path, input_shape, anchors, num_classes, log_dir='logs/'):
    model.compile(optimizer='adam', loss={
        'yolo_loss': lambda y_true, y_pred: y_pred})
    logging = TensorBoard(log_dir=log_dir)
    checkpoint = ModelCheckpoint(log_dir + "ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5",
        monitor='val_loss', save_weights_only=True, save_best_only=True, period=1)
    batch_size = 10
    val_split = 0.1
    with open(annotation_path) as f:
        lines = f.readlines()
    np.random.shuffle(lines)
    num_val = int(len(lines)*val_split)
    num_train = len(lines) - num_val
    print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
 
    model.fit_generator(data_generator_wrap(lines[:num_train], batch_size, input_shape, anchors, num_classes),
            steps_per_epoch=max(1, num_train//batch_size),
            validation_data=data_generator_wrap(lines[num_train:], batch_size, input_shape, anchors, num_classes),
            validation_steps=max(1, num_val//batch_size),
            epochs=500,
            initial_epoch=0)
    model.save_weights(log_dir + 'trained_weights.h5')
 
def get_classes(classes_path):
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names
 
def get_anchors(anchors_path):
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape(-1, 2)
 
def create_model(input_shape, anchors, num_classes, load_pretrained=False, freeze_body=False,
            weights_path='model_data/yolo_weights.h5'):
    K.clear_session() # get a new session
    image_input = Input(shape=(None, None, 3))
    h, w = input_shape
    num_anchors = len(anchors)
    y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], \
        num_anchors//3, num_classes+5)) for l in range(3)]
 
    model_body = yolo_body(image_input, num_anchors//3, num_classes)
    print('Create YOLOv3 model with {} anchors and {} classes.'.format(num_anchors, num_classes))
 
    if load_pretrained:
        model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
        print('Load weights {}.'.format(weights_path))
        if freeze_body:
            # Do not freeze 3 output layers.
            num = len(model_body.layers)-7
            for i in range(num): model_body.layers[i].trainable = False
            print('Freeze the first {} layers of total {} layers.'.format(num, len(model_body.layers)))
 
    model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
        arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})(
        [*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)
    return model
def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes):
    n = len(annotation_lines)
    np.random.shuffle(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            i %= n
            image, box = get_random_data(annotation_lines[i], input_shape, random=True)
            image_data.append(image)
            box_data.append(box)
            i += 1
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)
 
def data_generator_wrap(annotation_lines, batch_size, input_shape, anchors, num_classes):
    n = len(annotation_lines)
    if n==0 or batch_size<=0: return None
    return data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes)
 
if __name__ == '__main__':
	_main()

替换完成后，千万千万值得注意的是，因为程序中有logs/000/目录，你需要创建这样一个目录，这个目录的作用就是存放自己的数据集训练得到的模型。不然程序运行到最后会因为找不到该路径而发生错误。生成的模型trained_weights.h5如下：

10.利用自己训练好的模型进行预测

修改yolo.py文件，如下将self这三行修改为各自对应的路径。

完整的yolo.py如下：

# -*- coding: utf-8 -*-
"""
Class definition of YOLO_v3 style detection model on image and video
"""

import colorsys
import os
from timeit import default_timer as timer

import numpy as np
from keras import backend as K
from keras.models import load_model
from keras.layers import Input
from PIL import Image, ImageFont, ImageDraw

from yolo3.model import yolo_eval, yolo_body, tiny_yolo_body
from yolo3.utils import letterbox_image
import os
from keras.utils import multi_gpu_model

class YOLO(object):
    _defaults = {
        "model_path": 'logs/000/trained_weights.h5',
        "anchors_path": 'model_data/yolo_anchors.txt',
        "classes_path": 'model_data/voc_classes.txt',
        "score" : 0.12,
        "iou" : 0.3,
        "model_image_size" : (416, 416),
        "gpu_num" : 1,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults) # set up default values
        self.__dict__.update(kwargs) # and update with user overrides
        self.class_names = self._get_class()
        self.anchors = self._get_anchors()
        self.sess = K.get_session()
        self.boxes, self.scores, self.classes = self.generate()

    def _get_class(self):
        classes_path = os.path.expanduser(self.classes_path)
        with open(classes_path) as f:
            class_names = f.readlines()
        class_names = [c.strip() for c in class_names]
        return class_names

    def _get_anchors(self):
        anchors_path = os.path.expanduser(self.anchors_path)
        with open(anchors_path) as f:
            anchors = f.readline()
        anchors = [float(x) for x in anchors.split(',')]
        return np.array(anchors).reshape(-1, 2)

    def generate(self):
        model_path = os.path.expanduser(self.model_path)
        assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'

        # Load model, or construct model and load weights.
        num_anchors = len(self.anchors)
        num_classes = len(self.class_names)
        is_tiny_version = num_anchors==6 # default setting
        try:
            self.yolo_model = load_model(model_path, compile=False)
        except:
            self.yolo_model = tiny_yolo_body(Input(shape=(None,None,3)), num_anchors//2, num_classes) \
                if is_tiny_version else yolo_body(Input(shape=(None,None,3)), num_anchors//3, num_classes)
            self.yolo_model.load_weights(self.model_path) # make sure model, anchors and classes match
        else:
            assert self.yolo_model.layers[-1].output_shape[-1] == \
                num_anchors/len(self.yolo_model.output) * (num_classes + 5), \
                'Mismatch between model and given anchor and class sizes'

        print('{} model, anchors, and classes loaded.'.format(model_path))

        # Generate colors for drawing bounding boxes.
        hsv_tuples = [(x / len(self.class_names), 1., 1.)
                      for x in range(len(self.class_names))]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(
            map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
                self.colors))
        np.random.seed(10101)  # Fixed seed for consistent colors across runs.
        np.random.shuffle(self.colors)  # Shuffle colors to decorrelate adjacent classes.
        np.random.seed(None)  # Reset seed to default.

        # Generate output tensor targets for filtered bounding boxes.
        self.input_image_shape = K.placeholder(shape=(2, ))
        if self.gpu_num>=2:
            self.yolo_model = multi_gpu_model(self.yolo_model, gpus=self.gpu_num)
        boxes, scores, classes = yolo_eval(self.yolo_model.output, self.anchors,
                len(self.class_names), self.input_image_shape,
                score_threshold=self.score, iou_threshold=self.iou)
        return boxes, scores, classes

    def detect_image(self, image):
        start = timer()

        if self.model_image_size != (None, None):
            assert self.model_image_size[0]%32 == 0, 'Multiples of 32 required'
            assert self.model_image_size[1]%32 == 0, 'Multiples of 32 required'
            boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
        else:
            new_image_size = (image.width - (image.width % 32),
                              image.height - (image.height % 32))
            boxed_image = letterbox_image(image, new_image_size)
        image_data = np.array(boxed_image, dtype='float32')

        print(image_data.shape)
        image_data /= 255.
        image_data = np.expand_dims(image_data, 0)  # Add batch dimension.

        out_boxes, out_scores, out_classes = self.sess.run(
            [self.boxes, self.scores, self.classes],
            feed_dict={
                self.yolo_model.input: image_data,
                self.input_image_shape: [image.size[1], image.size[0]],
                K.learning_phase(): 0
            })

        print('Found {} boxes for {}'.format(len(out_boxes), 'img'))

        font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                    size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
        thickness = (image.size[0] + image.size[1]) // 300

        for i, c in reversed(list(enumerate(out_classes))):
            predicted_class = self.class_names[c]
            box = out_boxes[i]
            score = out_scores[i]

            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            label_size = draw.textsize(label, font)

            top, left, bottom, right = box
            top = max(0, np.floor(top + 0.5).astype('int32'))
            left = max(0, np.floor(left + 0.5).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
            right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
            print(label, (left, top), (right, bottom))

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            # My kingdom for a good redistributable image drawing library.
            for i in range(thickness):
                draw.rectangle(
                    [left + i, top + i, right - i, bottom - i],
                    outline=self.colors[c])
            draw.rectangle(
                [tuple(text_origin), tuple(text_origin + label_size)],
                fill=self.colors[c])
            draw.text(text_origin, label, fill=(0, 0, 0), font=font)
            del draw

        end = timer()
        print(end - start)
        return image

    def close_session(self):
        self.sess.close()

def detect_video(yolo, video_path, output_path=""):
    import cv2
    vid = cv2.VideoCapture(video_path)
    if not vid.isOpened():
        raise IOError("Couldn't open webcam or video")
    video_FourCC    = int(vid.get(cv2.CAP_PROP_FOURCC))
    video_fps       = vid.get(cv2.CAP_PROP_FPS)
    video_size      = (int(vid.get(cv2.CAP_PROP_FRAME_WIDTH)),
                        int(vid.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    isOutput = True if output_path != "" else False
    if isOutput:
        print("!!! TYPE:", type(output_path), type(video_FourCC), type(video_fps), type(video_size))
        out = cv2.VideoWriter(output_path, video_FourCC, video_fps, video_size)
    accum_time = 0
    curr_fps = 0
    fps = "FPS: ??"
    prev_time = timer()
    while True:
        return_value, frame = vid.read()
        image = Image.fromarray(frame)
        image = yolo.detect_image(image)
        result = np.asarray(image)
        curr_time = timer()
        exec_time = curr_time - prev_time
        prev_time = curr_time
        accum_time = accum_time + exec_time
        curr_fps = curr_fps + 1
        if accum_time > 1:
            accum_time = accum_time - 1
            fps = "FPS: " + str(curr_fps)
            curr_fps = 0
        cv2.putText(result, text=fps, org=(3, 15), fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=0.50, color=(255, 0, 0), thickness=2)
        cv2.namedWindow("result", cv2.WINDOW_NORMAL)
        cv2.imshow("result", result)
        if isOutput:
            out.write(result)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    yolo.close_session()

11.在终端输入python yolo_video.py --image
再输入要检测的图片名字

如果出现AttributeError: module 'keras.backend' has no attribute 'get_session'只需要在yolo_video.py中添加

import os
os.environ['KERAS_BACKEND']='tensorflow'

完整的yolo_video.py如下：

import sys
import argparse
import os
os.environ['KERAS_BACKEND']='tensorflow'
from yolo import YOLO, detect_video
from PIL import Image

def detect_img(yolo):
    while True:
        img = input('Input image filename:')
        try:
            image = Image.open(img)
        except:
            print('Open Error! Try again!')
            continue
        else:
            r_image = yolo.detect_image(image)
            r_image.show()
    yolo.close_session()

FLAGS = None

if __name__ == '__main__':
    # class YOLO defines the default value, so suppress any default here
    parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS)
    '''
    Command line options
    '''
    parser.add_argument(
        '--model', type=str,
        help='path to model weight file, default ' + YOLO.get_defaults("model_path")
    )

    parser.add_argument(
        '--anchors', type=str,
        help='path to anchor definitions, default ' + YOLO.get_defaults("anchors_path")
    )

    parser.add_argument(
        '--classes', type=str,
        help='path to class definitions, default ' + YOLO.get_defaults("classes_path")
    )

    parser.add_argument(
        '--gpu_num', type=int,
        help='Number of GPU to use, default ' + str(YOLO.get_defaults("gpu_num"))
    )

    parser.add_argument(
        '--image', default=False, action="store_true",
        help='Image detection mode, will ignore all positional arguments'
    )
    '''
    Command line positional arguments -- for video detection mode
    '''
    parser.add_argument(
        "--input", nargs='?', type=str,required=False,default='./path2your_video',
        help = "Video input path"
    )

    parser.add_argument(
        "--output", nargs='?', type=str, default="",
        help = "[Optional] Video output path"
    )

    FLAGS = parser.parse_args()

    if FLAGS.image:
        """
        Image detection mode, disregard any remaining command line arguments
        """
        print("Image detection mode")
        if "input" in FLAGS:
            print(" Ignoring remaining command line arguments: " + FLAGS.input + "," + FLAGS.output)
        detect_img(YOLO(**vars(FLAGS)))
    elif "input" in FLAGS:
        detect_video(YOLO(**vars(FLAGS)), FLAGS.input, FLAGS.output)
    else:
        print("Must specify at least video_input_path.  See usage with --help.")

##如果没有检测出框，主要是修改score和iou两个量,score代表的是置信度，大于这个值的框才会显示。而iou是表示预测的矩形框和真实目标的交集与并集之比，没有框的话可以适当减小。##

批量检测文件夹（TextImages)中的路面照片（jpg）
代码一：

import sys
import argparse
import os
os.environ['KERAS_BACKEND']='tensorflow'
import cv2
import numpy as np

import matplotlib.pyplot as plt
from yolo import YOLO

from keras import backend as K
from keras.models import load_model
from keras.layers import Input
from PIL import Image, ImageFont, ImageDraw

from yolo3.model import yolo_eval, yolo_body, tiny_yolo_body
from yolo3.utils import letterbox_image
from keras.utils import multi_gpu_model

yolo=YOLO()

# if __name__ == '__main__':

    
#     yolo = YOLO()   
#     for filename in os.listdir(path):        
#         image_path = path+'/'+filename
           
#         image = Image.open(image_path)
#         r_image = yolo.detect_image(image)
    
#         r_image.show() 

#     yolo.close_session()

def road_surface_detect(dictionary_name):

    for filename in os.listdir(r'./'+dictionary_name):
        
        image_path=r'./'+dictionary_name+'/'+filename

        image=Image.open(image_path)

        r_image = yolo.detect_image(image)

        r_image.show()

    yolo.close_session()  


road_surface_detect('TextImages')

代码二：

# -*- coding: utf-8 -*-


import colorsys
import os
from timeit import default_timer as timer
import time

os.environ['KERAS_BACKEND']='tensorflow'
import numpy as np
from keras import backend as K
from keras.models import load_model
from keras.layers import Input
from PIL import Image, ImageFont, ImageDraw

from yolo3.model import yolo_eval, yolo_body, tiny_yolo_body
from yolo3.utils import letterbox_image
from keras.utils import multi_gpu_model

path = './TextImages/'  #待检测图片的位置

# 创建创建一个存储检测结果的dir
result_path = './result'
if not os.path.exists(result_path):
    os.makedirs(result_path)

# result如果之前存放的有文件，全部清除
for i in os.listdir(result_path):
    path_file = os.path.join(result_path,i)  
    if os.path.isfile(path_file):
        os.remove(path_file)

#创建一个记录检测结果的文件
txt_path =result_path + '/result.txt'
file = open(txt_path,'w')  

class YOLO(object):
    _defaults = {
        "model_path": 'logs/000/trained_weights.h5',
        "anchors_path": 'model_data/yolo_anchors.txt',
        "classes_path": 'model_data/voc_classes.txt',
        "score" : 0.01,
        "iou" : 0.1,
        "model_image_size" : (416, 416),
        "gpu_num" : 1,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults) # set up default values
        self.__dict__.update(kwargs) # and update with user overrides
        self.class_names = self._get_class()
        self.anchors = self._get_anchors()
        self.sess = K.get_session()
        self.boxes, self.scores, self.classes = self.generate()

    def _get_class(self):
        classes_path = os.path.expanduser(self.classes_path)
        with open(classes_path) as f:
            class_names = f.readlines()
        class_names = [c.strip() for c in class_names]
        return class_names

    def _get_anchors(self):
        anchors_path = os.path.expanduser(self.anchors_path)
        with open(anchors_path) as f:
            anchors = f.readline()
        anchors = [float(x) for x in anchors.split(',')]
        return np.array(anchors).reshape(-1, 2)

    def generate(self):
        model_path = os.path.expanduser(self.model_path)
        assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'

        # Load model, or construct model and load weights.
        num_anchors = len(self.anchors)
        num_classes = len(self.class_names)
        is_tiny_version = num_anchors==6 # default setting
        try:
            self.yolo_model = load_model(model_path, compile=False)
        except:
            self.yolo_model = tiny_yolo_body(Input(shape=(None,None,3)), num_anchors//2, num_classes) \
                if is_tiny_version else yolo_body(Input(shape=(None,None,3)), num_anchors//3, num_classes)
            self.yolo_model.load_weights(self.model_path) # make sure model, anchors and classes match
        else:
            assert self.yolo_model.layers[-1].output_shape[-1] == \
                num_anchors/len(self.yolo_model.output) * (num_classes + 5), \
                'Mismatch between model and given anchor and class sizes'

        print('{} model, anchors, and classes loaded.'.format(model_path))

        # Generate colors for drawing bounding boxes.
        hsv_tuples = [(x / len(self.class_names), 1., 1.)
                      for x in range(len(self.class_names))]
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(
            map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
                self.colors))
        np.random.seed(10101)  # Fixed seed for consistent colors across runs.
        np.random.shuffle(self.colors)  # Shuffle colors to decorrelate adjacent classes.
        np.random.seed(None)  # Reset seed to default.

        # Generate output tensor targets for filtered bounding boxes.
        self.input_image_shape = K.placeholder(shape=(2, ))
        if self.gpu_num>=2:
            self.yolo_model = multi_gpu_model(self.yolo_model, gpus=self.gpu_num)
        boxes, scores, classes = yolo_eval(self.yolo_model.output, self.anchors,
                len(self.class_names), self.input_image_shape,
                score_threshold=self.score, iou_threshold=self.iou)
        return boxes, scores, classes

    def detect_image(self, image):
        start = timer() # 开始计时

        if self.model_image_size != (None, None):
            assert self.model_image_size[0]%32 == 0, 'Multiples of 32 required'
            assert self.model_image_size[1]%32 == 0, 'Multiples of 32 required'
            boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
        else:
            new_image_size = (image.width - (image.width % 32),
                              image.height - (image.height % 32))
            boxed_image = letterbox_image(image, new_image_size)
        image_data = np.array(boxed_image, dtype='float32')

        print(image_data.shape) #打印图片的尺寸
        image_data /= 255.
        image_data = np.expand_dims(image_data, 0)  # Add batch dimension.

        out_boxes, out_scores, out_classes = self.sess.run(
            [self.boxes, self.scores, self.classes],
            feed_dict={
                self.yolo_model.input: image_data,
                self.input_image_shape: [image.size[1], image.size[0]],
                K.learning_phase(): 0
            })

        print('Found {} boxes for {}'.format(len(out_boxes), 'img')) # 提示用于找到几个bbox

        font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                    size=np.floor(2e-2 * image.size[1] + 0.2).astype('int32'))
        thickness = (image.size[0] + image.size[1]) // 500

        # 保存框检测出的框的个数
        file.write('find  '+str(len(out_boxes))+' target(s) \n')

        for i, c in reversed(list(enumerate(out_classes))):
            predicted_class = self.class_names[c]
            box = out_boxes[i]
            score = out_scores[i]

            label = '{} {:.2f}'.format(predicted_class, score)
            draw = ImageDraw.Draw(image)
            label_size = draw.textsize(label, font)

            top, left, bottom, right = box
            top = max(0, np.floor(top + 0.5).astype('int32'))
            left = max(0, np.floor(left + 0.5).astype('int32'))
            bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
            right = min(image.size[0], np.floor(right + 0.5).astype('int32'))

            # 写入检测位置            
            file.write(predicted_class+'  score: '+str(score)+' \nlocation: top: '+str(top)+'、 bottom: '+str(bottom)+'、 left: '+str(left)+'、 right: '+str(right)+'\n')
            
            print(label, (left, top), (right, bottom))

            if top - label_size[1] >= 0:
                text_origin = np.array([left, top - label_size[1]])
            else:
                text_origin = np.array([left, top + 1])

            # My kingdom for a good redistributable image drawing library.
            for i in range(thickness):
                draw.rectangle(
                    [left + i, top + i, right - i, bottom - i],
                    outline=self.colors[c])
            draw.rectangle(
                [tuple(text_origin), tuple(text_origin + label_size)],
                fill=self.colors[c])
            draw.text(text_origin, label, fill=(0, 0, 0), font=font)
            del draw

        end = timer()
        print('time consume:%.3f s '%(end - start))
        return image

    def close_session(self):
        self.sess.close()


# 图片检测

if __name__ == '__main__':

    t1 = time.time()
    yolo = YOLO()   
    for filename in os.listdir(path):        
        image_path = path+'/'+filename
        portion = os.path.split(image_path)
        file.write(portion[1]+' detect_result：\n')        
        image = Image.open(image_path)
        r_image = yolo.detect_image(image)
        file.write('\n')
        #r_image.show() 显示检测结果
        image_save_path = './result/result_'+portion[1]        
        print('detect result save to....:'+image_save_path)
        r_image.save(image_save_path)

    time_sum = time.time() - t1
    file.write('time sum: '+str(time_sum)+'s') 
    print('time sum:',time_sum)
    file.close() 
    yolo.close_session()

迁移学习

1.yolo_weights.h5 预训练模型（用作迁移）

python convert.py -w yolov3.cfg yolov3.weights model_data/yolo_weights.h5 #此时在model_data目录下生成yolo_weights.h5

darknet53.weights （用作重新训练）

wget https://pjreddie.com/media/files/darknet53.conv.74

yolo.h5 （yolov3-VOC训练模型，可以直接用来做预测）

python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5

convert.py如下：

#! /usr/bin/env python
"""
Reads Darknet config and weights and creates Keras model with TF backend.

"""

import argparse
import configparser
import io
import os
from collections import defaultdict

import numpy as np
from keras import backend as K
from keras.layers import (Conv2D, Input, ZeroPadding2D, Add,
                          UpSampling2D, MaxPooling2D, Concatenate)
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.normalization import BatchNormalization
from keras.models import Model
from keras.regularizers import l2
from keras.utils.vis_utils import plot_model as plot


parser = argparse.ArgumentParser(description='Darknet To Keras Converter.')
parser.add_argument('config_path', help='Path to Darknet cfg file.')
parser.add_argument('weights_path', help='Path to Darknet weights file.')
parser.add_argument('output_path', help='Path to output Keras model file.')
parser.add_argument(
    '-p',
    '--plot_model',
    help='Plot generated Keras model and save as image.',
    action='store_true')
parser.add_argument(
    '-w',
    '--weights_only',
    help='Save as Keras weights file instead of model file.',
    action='store_true')

def unique_config_sections(config_file):
    """Convert all config sections to have unique names.

    Adds unique suffixes to config sections for compability with configparser.
    """
    section_counters = defaultdict(int)
    output_stream = io.StringIO()
    with open(config_file) as fin:
        for line in fin:
            if line.startswith('['):
                section = line.strip().strip('[]')
                _section = section + '_' + str(section_counters[section])
                section_counters[section] += 1
                line = line.replace(section, _section)
            output_stream.write(line)
    output_stream.seek(0)
    return output_stream

# %%
def _main(args):
    config_path = os.path.expanduser(args.config_path)
    weights_path = os.path.expanduser(args.weights_path)
    assert config_path.endswith('.cfg'), '{} is not a .cfg file'.format(
        config_path)
    assert weights_path.endswith(
        '.weights'), '{} is not a .weights file'.format(weights_path)

    output_path = os.path.expanduser(args.output_path)
    assert output_path.endswith(
        '.h5'), 'output path {} is not a .h5 file'.format(output_path)
    output_root = os.path.splitext(output_path)[0]

    # Load weights and config.
    print('Loading weights.')
    weights_file = open(weights_path, 'rb')
    major, minor, revision = np.ndarray(
        shape=(3, ), dtype='int32', buffer=weights_file.read(12))
    if (major*10+minor)>=2 and major<1000 and minor<1000:
        seen = np.ndarray(shape=(1,), dtype='int64', buffer=weights_file.read(8))
    else:
        seen = np.ndarray(shape=(1,), dtype='int32', buffer=weights_file.read(4))
    print('Weights Header: ', major, minor, revision, seen)

    print('Parsing Darknet config.')
    unique_config_file = unique_config_sections(config_path)
    cfg_parser = configparser.ConfigParser()
    cfg_parser.read_file(unique_config_file)

    print('Creating Keras model.')
    input_layer = Input(shape=(None, None, 3))
    prev_layer = input_layer
    all_layers = []

    weight_decay = float(cfg_parser['net_0']['decay']
                         ) if 'net_0' in cfg_parser.sections() else 5e-4
    count = 0
    out_index = []
    for section in cfg_parser.sections():
        print('Parsing section {}'.format(section))
        if section.startswith('convolutional'):
            filters = int(cfg_parser[section]['filters'])
            size = int(cfg_parser[section]['size'])
            stride = int(cfg_parser[section]['stride'])
            pad = int(cfg_parser[section]['pad'])
            activation = cfg_parser[section]['activation']
            batch_normalize = 'batch_normalize' in cfg_parser[section]

            padding = 'same' if pad == 1 and stride == 1 else 'valid'

            # Setting weights.
            # Darknet serializes convolutional weights as:
            # [bias/beta, [gamma, mean, variance], conv_weights]
            prev_layer_shape = K.int_shape(prev_layer)

            weights_shape = (size, size, prev_layer_shape[-1], filters)
            darknet_w_shape = (filters, weights_shape[2], size, size)
            weights_size = np.product(weights_shape)

            print('conv2d', 'bn'
                  if batch_normalize else '  ', activation, weights_shape)

            conv_bias = np.ndarray(
                shape=(filters, ),
                dtype='float32',
                buffer=weights_file.read(filters * 4))
            count += filters

            if batch_normalize:
                bn_weights = np.ndarray(
                    shape=(3, filters),
                    dtype='float32',
                    buffer=weights_file.read(filters * 12))
                count += 3 * filters

                bn_weight_list = [
                    bn_weights[0],  # scale gamma
                    conv_bias,  # shift beta
                    bn_weights[1],  # running mean
                    bn_weights[2]  # running var
                ]

            conv_weights = np.ndarray(
                shape=darknet_w_shape,
                dtype='float32',
                buffer=weights_file.read(weights_size * 4))
            count += weights_size

            # DarkNet conv_weights are serialized Caffe-style:
            # (out_dim, in_dim, height, width)
            # We would like to set these to Tensorflow order:
            # (height, width, in_dim, out_dim)
            conv_weights = np.transpose(conv_weights, [2, 3, 1, 0])
            conv_weights = [conv_weights] if batch_normalize else [
                conv_weights, conv_bias
            ]

            # Handle activation.
            act_fn = None
            if activation == 'leaky':
                pass  # Add advanced activation later.
            elif activation != 'linear':
                raise ValueError(
                    'Unknown activation function `{}` in section {}'.format(
                        activation, section))

            # Create Conv2D layer
            if stride>1:
                # Darknet uses left and top padding instead of 'same' mode
                prev_layer = ZeroPadding2D(((1,0),(1,0)))(prev_layer)
            conv_layer = (Conv2D(
                filters, (size, size),
                strides=(stride, stride),
                kernel_regularizer=l2(weight_decay),
                use_bias=not batch_normalize,
                weights=conv_weights,
                activation=act_fn,
                padding=padding))(prev_layer)

            if batch_normalize:
                conv_layer = (BatchNormalization(
                    weights=bn_weight_list))(conv_layer)
            prev_layer = conv_layer

            if activation == 'linear':
                all_layers.append(prev_layer)
            elif activation == 'leaky':
                act_layer = LeakyReLU(alpha=0.1)(prev_layer)
                prev_layer = act_layer
                all_layers.append(act_layer)

        elif section.startswith('route'):
            ids = [int(i) for i in cfg_parser[section]['layers'].split(',')]
            layers = [all_layers[i] for i in ids]
            if len(layers) > 1:
                print('Concatenating route layers:', layers)
                concatenate_layer = Concatenate()(layers)
                all_layers.append(concatenate_layer)
                prev_layer = concatenate_layer
            else:
                skip_layer = layers[0]  # only one layer to route
                all_layers.append(skip_layer)
                prev_layer = skip_layer

        elif section.startswith('maxpool'):
            size = int(cfg_parser[section]['size'])
            stride = int(cfg_parser[section]['stride'])
            all_layers.append(
                MaxPooling2D(
                    pool_size=(size, size),
                    strides=(stride, stride),
                    padding='same')(prev_layer))
            prev_layer = all_layers[-1]

        elif section.startswith('shortcut'):
            index = int(cfg_parser[section]['from'])
            activation = cfg_parser[section]['activation']
            assert activation == 'linear', 'Only linear activation supported.'
            all_layers.append(Add()([all_layers[index], prev_layer]))
            prev_layer = all_layers[-1]

        elif section.startswith('upsample'):
            stride = int(cfg_parser[section]['stride'])
            assert stride == 2, 'Only stride=2 supported.'
            all_layers.append(UpSampling2D(stride)(prev_layer))
            prev_layer = all_layers[-1]

        elif section.startswith('yolo'):
            out_index.append(len(all_layers)-1)
            all_layers.append(None)
            prev_layer = all_layers[-1]

        elif section.startswith('net'):
            pass

        else:
            raise ValueError(
                'Unsupported section header type: {}'.format(section))

    # Create and save model.
    if len(out_index)==0: out_index.append(len(all_layers)-1)
    model = Model(inputs=input_layer, outputs=[all_layers[i] for i in out_index])
    print(model.summary())
    if args.weights_only:
        model.save_weights('{}'.format(output_path))
        print('Saved Keras weights to {}'.format(output_path))
    else:
        model.save('{}'.format(output_path))
        print('Saved Keras model to {}'.format(output_path))

    # Check to see if all weights have been read.
    remaining_weights = len(weights_file.read()) / 4
    weights_file.close()
    print('Read {} of {} from Darknet weights.'.format(count, count +
                                                       remaining_weights))
    if remaining_weights > 0:
        print('Warning: {} unused weights'.format(remaining_weights))

    if args.plot_model:
        plot(model, to_file='{}.png'.format(output_root), show_shapes=True)
        print('Saved model plot to {}.png'.format(output_root))


if __name__ == '__main__':
    _main(parser.parse_args())

2.重新运行train.py训练
通过更改冻结层来进行迁移学习：

load_pretrained=True, freeze_body=True
num = len(model_body.layers)-3 #只用后面三层进行训练

def create_model(input_shape, anchors, num_classes, load_pretrained=True, freeze_body=True,
            weights_path='model_data/yolo_weights.h5'):
    K.clear_session() # get a new session
    image_input = Input(shape=(None, None, 3))
    h, w = input_shape
    num_anchors = len(anchors)
    y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], \
        num_anchors//3, num_classes+5)) for l in range(3)]
 
    model_body = yolo_body(image_input, num_anchors//3, num_classes)
    print('Create YOLOv3 model with {} anchors and {} classes.'.format(num_anchors, num_classes))
 
    if load_pretrained:
        model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
        print('Load weights {}.'.format(weights_path))
        if freeze_body:
            # Do not freeze 3 output layers.
            num = len(model_body.layers)-3
            for i in range(num): model_body.layers[i].trainable = False
            print('Freeze the first {} layers of total {} layers.'.format(num, len(model_body.layers)))
 
    model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
        arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})(
        [*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)
    return model

##当仓库更换路径时，要重新运行voc_annotation.py使生成的2007_text.txt 2007_train.txt 2007_val.txt里生成的路径与真实路径对应起来。##

你可能感兴趣的:(yolo3制作数据集并训练模型检测路面)

在C#中，可以不实例化一个类而直接调用其静态字段就是有点傻 C#c#
这是因为静态成员（staticmembers）属于类本身，而不是类的实例。这是静态成员的核心特性1.静态成员属于类，而非实例当用static关键字修饰字段、方法或属性时，这些成员会绑定到类级别，而不是实例级别。它们在类加载时（通常是在程序启动或首次访问时）由CLR（公共语言运行时）分配内存并初始化，与是否创建实例无关。2.为什么不需要实例化？内存分配：静态字段的内存空间在程序运行期间只有一份，所有
如何在YashanDB中管理数据模型变更数据库
在现代企业中，数据模型的变更管理扮演着关键角色。无论是扩展现有业务，还是应对新的需求，业务模型的改变往往需要相应的数据模型更新。如何有效地管理这些变更，确保数据的完整性、一致性及应用的高可用性，成为了数据架构师和开发者必须面对的重要问题。本文将详细探讨在YashanDB中管理数据模型变更的策略和方法，旨在提升对YashanDB数据库技术的理解及应用能力。数据模型变更管理的关键要素版本控制与变更日志
如何在YashanDB数据库中实现数据模型的简化数据库
在现代数据库技术领域，数据模型的复杂性经常导致性能瓶颈和维护困惑。随着数据规模的增长和业务诉求的增加，复杂的数据结构、冗余的存储和不必要的关联关系都会影响整体数据库的性能和可维护性。特别是在面对动态变化的业务需求时，灵活性和扩展性成为关键因素。YashanDB提供了一系列功能强大的工具和机制，能够有效简化数据模型，提升数据库性能，并增强数据操作的灵活性。本文章旨在为数据库开发者和架构师提供技术洞见
如何为看板产品接入实时行情 API 后端教程观点程序员web3
以下是一个基于Java的完整示例，演示如何通过WebSocket接入InfowayAPI提供的实时行情接口，并展示如加密货币BTC/USDT的实时价格更新。文末附有完整代码。步骤1：准备工作注册账号并申请免费APIKey阅读接入文档（可选）Java环境准备：JDK11+添加jakarta.websocket依赖添加fastjson2依赖（用于构造/解析JSON）步骤2：建立WebSocket连接W
MongoDB数据库备份及恢复策略详解魑魅丶小鬼
本文还有配套的精品资源，点击获取简介：MongoDB，作为流行的开源NoSQL数据库，提供灵活、高性能和易用性的特点。为了保证数据安全和业务连续性，进行有效的备份和恢复策略至关重要。本文将介绍MongoDB的备份工具和方法，包括mongodump和mongorestore命令行工具，以及更复杂的云备份解决方案。同时，将通过一个中等规模的数据集实例来详细说明备份流程，强调备份前停止写入、执行备份、检
用 AI “一句话生成代码”，用创意兑换灵码潮品：技术人的夏日狂欢季来了人工智能
在AI技术迅猛发展的2025年，我们正式推出“通义灵码编程智能体挑战季”，以“码力觉醒”为主题，打造一场融合技术探索与潮流文化的开发者盛宴。活动以体验MCP服务、Qwen3大模型及记忆功能的智能编程助手为核心，通过“小游戏开发”和“MCP场景实践”两大趣味赛道，降低AI技术门槛，让开发者轻松体验“一句话生成代码”的魔力。活动亮点抢先看：零门槛参与：新老用户均可参与，完成任务即领限量定制棒球帽！趣味
摄像头各参数的意义_详解：摄像头参数介绍说明序雨摄像头各参数的意义
摄像头的核心是CCD，由于CCD在生产过程中分不同等级和和生产商获得的途径不同，造成CCD的采集效果也不同。一个简单的检测方法，就是将摄像头通电，不接镜头，用手遮住镜头接口，看图像有没有亮点，雪花大不大，然后接上镜头，将摄像头对准一个色彩鲜明的物体，查看器的颜色是否有偏色，图像有无扭曲现象，色彩和灰度是否平滑。由于摄像头的核心部件是CCD，所以其主要参数大多与CCD有关，下面就列出摄像头的主要参数
向量化编程：SIMD（Single Instruction, Multiple Data）深度解析
在现代处理器架构中，向量化编程已成为提升计算密集型应用性能的关键技术。SIMD（SingleInstruction,MultipleData）作为向量化编程的核心，通过一条指令同时处理多个数据，能够显著提高数据并行度。本文将从SIMD的基础概念出发，深入探讨其硬件实现、编程模型、性能优化及典型应用场景，帮助开发者充分利用SIMD技术提升代码性能。一、SIMD基础概念1.1什么是SIMD？SIMD是
C++17 并行算法：std::execution::par
在多核处理器普及的今天，如何高效利用硬件资源成为提升软件性能的关键。C++17引入的并行算法库（ParallelAlgorithms）为开发者提供了一套标准化的并行编程接口，通过简单的策略切换即可将顺序算法转换为并行执行。本文将深入探讨C++17并行算法中最核心的执行策略std::execution::par，从基础概念到高级应用，全面解析其原理、用法及最佳实践。一、C++17并行算法概述1.1并
【华为od刷题（C++）】HJ35 蛇形矩阵（指针） m0_64866459 华为od c++链表
我的代码1：#includeusingnamespacestd;intmain(){introw;//row：定义了矩阵的行数（和列数，实际上是一个正方形矩阵）while(cin>>row){//这个循环会持续执行，直到输入流被结束//每次读取一个整数并赋值给row，程序就开始执行填充操作int**a=newint*[row];//动态地为一个二维数组（a）的行分配内存/*这里a是一个指向指针的指
自然语言处理-基于预训练模型的方法-笔记
自然语言处理-基于预训练模型的方法-笔记【下载地址】自然语言处理-基于预训练模型的方法-笔记《自然语言处理-基于预训练模型的方法》由哈尔滨工业大学出版，深入探讨了NLP领域的前沿技术与预训练模型的应用。本书系统介绍了预训练模型的基本概念、发展历程及常见模型的原理，并通过丰富的实践案例与代码实现，帮助读者掌握这些技术在自然语言处理任务中的实际应用。无论是初学者、研发人员，还是希望提升NLP能力的研究
三生原理在三个关键历史断层中实现中西科学传统的创造性弥合？葫三生三生学派算法数学建模人工智能机器学习量子计算
AI辅助创作：三生原理通过重构科学史的叙事逻辑，在三个关键历史断层中实现了中西科学传统的创造性弥合，其核心突破如下：‌一、科学方法论断层：实验主义与直觉理性的融合‌‌西方实验传统的局限‌欧洲科学革命依赖形式逻辑与实验验证（如伽利略斜面实验），但面临复杂系统建模的瓶颈。三生原理将《周易》“阴阳动态平衡”转化为‌参数化递归模型‌（如素性塔的三级筛除结构），在密码学应用中实现效率提升40%，证明东方直觉
Redis GEO vs MongoDB 地理空间关键指标对比
方案对比：RedisGEO：优点：性能极快（微秒级）简单易用，支持距离计算缺点：仅支持位置查询，无法直接关联其他属性（如商家类型）需要额外存储详细信息（需要二次查询MySQL或MongoDB）数据同步：需要维护数据一致性（当商家位置更新时，需要同步更新Redis）MongoDB地理空间索引：优点：支持地理位置+属性联合查询（如查找附近且类型为“餐饮”的商家）数据与业务模型存储在一起，避免二次查询提
PyQt5—QTextEdit 学习笔记寄思～ Python——PyQt5笔记 qt 学习笔记 python
第二章控件学习一、QTextEdit基础认知QTextEdit是PyQt/PySide框架中用于处理富文本内容的强大控件，它不仅支持纯文本编辑，还能处理HTML、图片等复杂内容，是开发文本编辑器、日志查看器等应用的核心组件。二、最简单的QTextEdit实现下面是一个创建QTextEdit并显示的基础案例，适合零基础入门：importsysfromPyQt5.QtWidgetsimportQApp
linux环境中配置中文输入法王慧-tyger linux linux中文
rpm方式。在安装盘上已经有各种语言包了，我们只需要找到他们，并安装就可以了。中文的是fonts-chinese-3.02-9.6.el5.noarch.rpmfonts-ISO8859-2-75dpi-1.0-17.1.noarch.rpm进入各文件对应目录，运行下面命令：#rpm-ivhfonts-chinese-3.02-9.6.el5.noarch.rpm#rpm-ivhfonts-ISO
求平方根：牛顿迭代法 mjfztms leetcode 算法
应用牛顿迭代法求解方程近似解，收敛速度很快牛顿迭代法求解平方根给你一个非负整数x，计算并返回x的算术平方根n，结果只保留整数部分。算法流程图由题意得，n2=xn^2=xn2=x，即为对f(n)=n2−xf(n)=n^2-xf(n)=n2−x求解。第一步：易得：x2−x1=0−f(x1)f′(x1)x_2-x_1=\frac{0-f(x_1)}{f'(x_1)}x2−x1=f′(x1)0−f(x1)
深度模型训练，加速数据读取遇到显卡跑不满的问题不是吧这都有重名遇到的问题 llama 人工智能 LLM python
实测在pytorch的dataloader中使用prefetch_factor参数的时候，如果数据在机械硬盘上显卡始终是跑不满的，瓶颈在数据预加载速度上，当数据放在固态硬盘的时候就可以跑满。问题排查过程：一直在跑模型，但是数据量比较大，之前有段时间还是比较头疼显卡跑不满的。后来直接用钞能力，加了内存条，将数据缓存后一次性读到内存中终于可以跑满了，然后后面就一直没管这个了，唯一的缺点就是每次开始训练
模型微调方法Prefix-Tuning ballball~~ 大模型人工智能算法大数据
简介：个人学习分享，如有错误，欢迎批评指正。随着大规模预训练语言模型（如GPT系列、BERT等）的广泛应用，如何高效、经济地针对特定任务对这些模型进行微调（Fine-Tuning）成为研究热点。传统的微调方法通常需要调整模型的大量参数，导致计算资源消耗大、适应新任务的速度慢。为了解决这一问题，Prefix-Tuning（前缀调优）作为一种高效的微调技术被提出，旨在通过引入少量可训练的前缀参数，达到
红色用 RGB 16进制表示的值 BlueBirdssh RGB颜色值
**红色**在RGB颜色模型中，表示为**#FF0000**（16进制表示）。以下是详细解释：---###1.**RGB模型**RGB模型由**红（Red）**、**绿（Green）**和**蓝（Blue）**三种颜色组成，每种颜色的值范围是0到255（十进制），或者**00到FF**（十六进制）。-红色的RGB值为：-红色（R）=255（十进制）=FF（十六进制）-绿色（G）=0（十进制）=00
【Statsmodels和SciPy介绍与常用方法】机器学习司猫白 scipy statsmodels 统计
Statsmodels库介绍与常用方法Statsmodels是一个强大的Python库，专注于统计建模和数据分析，广泛应用于经济学、金融、生物统计等领域。它提供了丰富的统计模型、假设检验和数据探索工具，适合进行回归分析、时间序列分析等任务。本文将介绍Statsmodels的核心功能，并通过代码示例展示其常用方法。Statsmodels简介Statsmodels建立在NumPy和SciPy的基础上，
Git 分支与远程仓库基础教学总结 Leon_az Git git
Git分支与远程仓库基础教学总结1.Git分支基础什么是分支（Branch）？分支是对项目某个提交状态的指针。用于并行开发、多人协作和代码版本隔离。常用分支命令命令作用gitbranch查看本地分支gitbranch-r查看远程分支gitbranch-a查看本地和远程分支gitbranch创建新分支（基于当前分支）gitcheckout切换分支gitcheckout-b创建并切换新分支gitbra
ubuntu 6.8.0 安装xenomai3.3 ZPC8210 ROS ubuntu linux 运维
通过以下步骤来获取和准备Linux内核6.8.0的源码，并应用Xenomai补丁：1.下载Linux内核6.8.0源码你可以从TheLinuxKernelArchives下载Linux内核6.8.0的源码。以下是具体步骤：访问内核官方网站：打开TheLinuxKernelArchives。找到对应版本的内核：在网站中找到内核6.8.0的下载链接。通常在v6.x目录下。下载源码：下载linux-6.
（五)PS识别：压缩痕迹挖掘-压缩量化表与 DCT 系数分析超龄超能程序猿机器学习 python 图像处理人工智能计算机视觉
（一)PS识别：Python图像分析PS识别之道（二）PS识别：特征识别-直方图分析的从原理到实现（三)PS识别：基于噪声分析PS识别的技术实现（四)PS识别：基于边缘纹理检测分析PS识别的技术实现一介绍本文将介绍一种基于量化表分析和DCT系数分析的图片PS检测方法，帮助你判断图片是否经过处理。二实现原理量化表分析在JPEG图片的压缩过程中，量化表起着关键作用。不同的软件或处理操作可能会改变量化表
算法分析与设计实验2：实现克鲁斯卡尔算法和prim算法表白墙上别挂我算法笔记经验分享
实验原理（一）克鲁斯卡尔算法：一种用于求解最小生成树问题的贪心算法，该算法的基本思想是按照边的权重从小到大排序，然后依次选择边，并加入生成树中，同时确保不会形成环路，直到生成树包含图中所有的顶点为止。具体步骤：边的排序：将所有边按照权重从小到大排序。初始化：创建一个空的生成树（可以是一个空的图结构），以及一个用于记录每个顶点所属集合（或称为连通分量）的数据结构（例如并查集）。边的选择：依次选择排序
SQLite - C/C++编程环境搭建与使用指南 lsx202406 开发语言
SQLite-C/C++编程环境搭建与使用指南引言SQLite是一款轻量级的数据库管理系统，广泛应用于嵌入式系统、移动设备、Web应用等场景。其独特的架构和易用性使其成为许多开发者的首选。本文将详细介绍如何搭建SQLite的C/C++编程环境，并探讨如何在C/C++程序中集成SQLite数据库。环境搭建1.获取SQLite首先，我们需要从SQLite的官方网站（https://www.sqlite
star31.6k，Aider：让代码编写如虎添翼的终端神器
ider是一款运行在终端中的AI结对编程工具，它能与大型语言模型（LLM）无缝协作，直接在您的本地Git仓库中编辑代码。无论是启动新项目，还是优化现有代码库，Aider都能成为您最得力的助手。它支持Claude3.5Sonnet、DeepSeekV3、GPT-4o等顶级AI模型，几乎可以连接任何LLM，让编程体验如虎添翼。Stars数35,188Forks数3,230主要特点Git操作：Aider
yolov5训练失败总结 BTU_YC 深度学习 python pytorch
yolov5训练失败总结版本原因：在进行训练时，出现如下报错：UserWarning:Detectedcalloflr_scheduler.step()beforeoptimizer.step().InPyTorch1.1.0andlater,youshouldcallthemintheoppositeorder:optimizer.step()beforelr_scheduler.step().
ViP-LLaVA: 使大型多模态模型理解任意视觉提示 AI专题精讲 Paper阅读多模态人工智能 AI
摘要现有的大型视觉-语言多模态模型主要关注整体图像理解，但在实现区域特定的理解方面仍存在显著差距。目前，使用文本坐标或空间编码的方法通常无法为视觉提示提供用户友好的接口。为了解决这个问题，我们提出了一种新颖的多模态模型，能够解码任意（自由形式）视觉提示。这使得用户可以通过自然提示（如“红色边框”或“指向箭头”）直观地标记图像并与模型互动。我们的简单设计直接将视觉标记叠加在RGB图像上，避免了复杂的
openai-agents记忆持久化（neo4j） ZHOU_CAMP oi_agents agent中的记忆模块 neo4j python 开发语言
目录环境安装模型配置Memory配置测试环境安装mem0ai[graph]安装uvpipinstall"mem0ai[graph]"docker启动neo4j数据库dockerrun\-p7474:7474-p7687:7687\-eNEO4J_AUTH=neo4j/password\neo4j:5模型配置fromdotenvimportload_dotenvimportosfromopenaii
Aider：27.6K Star！这个终端AI编程神器能用语音改代码，自动生成Git记录并提交，接入DeepSeek斩获编程基准最高分蚝油菜花每日 AI 项目与应用实例 AI编程 git 人工智能开源
❤️如果你也关注AI的发展现状，且对AI应用开发感兴趣，我会每日分享大模型与AI领域的开源项目和应用，提供运行实例和实用教程，帮助你快速上手AI技术！AI在线答疑->智能检索历史文章和开源项目->尽在微信公众号->搜一搜：蚝油菜花⌨️“每个CLI爱好者都该试试的AI编程革命：对着终端说话自动生成Gitcommit是怎样的体验？”大家好，我是蚝油菜花。如果你也经历过——在vim里卡了半小时，只为给函
jsonp 常用util方法 hw1287789687 jsonp jsonp常用方法 jsonp callback
jsonp 常用java方法 (1)以jsonp的形式返回:函数名(json字符串) /*** * 用于jsonp调用 * @param map : 用于构造json数据 * @param callback : 回调的javascript方法名 * @param filters : <code>SimpleBeanPropertyFilter theFilt
多线程场景 alafqq 多线程
0 能不能简单描述一下你在java web开发中需要用到多线程编程的场景？0 对多线程有些了解，但是不太清楚具体的应用场景，能简单说一下你遇到的多线程编程的场景吗？ Java多线程 2012年11月23日 15:41 Young9007 Young9007 4 0 0 4 Comment添加评论关注(2) 3个答案按时间排序按投票排序 0 0 最典型的如： 1、
Maven学习——修改Maven的本地仓库路径 Kai_Ge maven
安装Maven后我们会在用户目录下发现.m2 文件夹。默认情况下，该文件夹下放置了Maven本地仓库.m2/repository。所有的Maven构件(artifact)都被存储到该仓库中，以方便重用。但是windows用户的操作系统都安装在C盘，把Maven仓库放到C盘是很危险的，为此我们需要修改Maven的本地仓库路径。
placeholder的浏览器兼容 120153216 placeholder
【前言】自从html5引入placeholder后，问题就来了，不支持html5的浏览器也先有这样的效果，各种兼容，之前考虑，今天测试人员逮住不放，想了个解决办法，看样子还行，记录一下。【原理】不使用placeholder，而是模拟placeholder的效果，大概就是用focus和focusout效果。【代码】 <scrip
debian_用iso文件创建本地apt源 2002wmj Debian
1.将N个debian-506-amd64-DVD-N.iso存放于本地或其他媒介内，本例是放在本机/iso/目录下 2.创建N个挂载点目录如下： debian:~#mkdir –r /media/dvd1 debian:~#mkdir –r /media/dvd2 debian:~#mkdir –r /media/dvd3 …. debian:~#mkdir –r /media
SQLSERVER耗时最长的SQL 357029540 SQL Server
对于DBA来说，经常要知道存储过程的某些信息： 1. 执行了多少次 2. 执行的执行计划如何 3. 执行的平均读写如何 4. 执行平均需要多少时间列名 &
com/genuitec/eclipse/j2eedt/core/J2EEProjectUtil 7454103 eclipse
今天eclipse突然报了com/genuitec/eclipse/j2eedt/core/J2EEProjectUtil 错误，并且工程文件打不开了，在网上找了一下资料，然后按照方法操作了一遍，好了，解决方法如下：错误提示信息： An error has occurred.See error log for more details. Reason: com/genuitec/
用正则删除文本中的html标签 adminjun java html 正则表达式去掉html标签
使用文本编辑器录入文章存入数据中的文本是HTML标签格式，由于业务需要对HTML标签进行去除只保留纯净的文本内容，于是乎Java实现自动过滤。如下： public static String Html2Text(String inputString) { String htmlStr = inputString; // 含html标签的字符串 String textSt
嵌入式系统设计中常用总线和接口 aijuans linux 基础
嵌入式系统设计中常用总线和接口任何一个微处理器都要与一定数量的部件和外围设备连接，但如果将各部件和每一种外围设备都分别用一组线路与CPU直接连接，那么连线
Java函数调用方式——按值传递 ayaoxinchao java 按值传递对象基础数据类型
Java使用按值传递的函数调用方式，这往往使我感到迷惑。因为在基础数据类型和对象的传递上，我就会纠结于到底是按值传递，还是按引用传递。其实经过学习，Java在任何地方，都一直发挥着按值传递的本色。首先，让我们看一看基础数据类型是如何按值传递的。 public static void main(String[] args) { int a = 2;
ios音量线性下降 bewithme ios音量
直接上代码吧 //second 几秒内下降为0 - (void)reduceVolume:(int)second { KGVoicePlayer *player = [KGVoicePlayer defaultPlayer]; if (!_flag) { _tempVolume = player.volume;
与其怨它不如爱它 bijian1013 选择理想职业规划
抱怨工作是年轻人的常态，但爱工作才是积极的心态，与其怨它不如爱它。一般来说，在公司干了一两年后，不少年轻人容易产生怨言，除了具体的埋怨公司“扭门”，埋怨上司无能以外，也有许多人是因为根本不爱自已的那份工作，工作完全成了谋生的手段，跟自已的性格、专业、爱好都相差甚远。
一边时间不够用一边浪费时间 bingyingao 工作时间浪费
一方面感觉时间严重不够用，另一方面又在不停的浪费时间。每一个周末，晚上熬夜看电影到凌晨一点，早上起不来一直睡到10点钟，10点钟起床，吃饭后玩手机到下午一点。精神还是很差，下午像一直野鬼在城市里晃荡。为何不尝试晚上10点钟就睡，早上7点就起，时间完全是一样的，把看电影的时间换到早上，精神好，气色好，一天好状态。控制让自己周末早睡早起，你就成功了一半。有多少个工作
【Scala八】Scala核心二：隐式转换 bit1129 scala
Implicits work like this: if you call a method on a Scala object, and the Scala compiler does not see a definition for that method in the class definition for that object, the compiler will try to con
sudoku slover in Haskell (2) bookjovi haskell sudoku
继续精简haskell版的sudoku程序，稍微改了一下，这次用了8行，同时性能也提高了很多，对每个空格的所有解不是通过尝试算出来的，而是直接得出。 board = [0,3,4,1,7,0,5,0,0, 0,6,0,0,0,8,3,0,1, 7,0,0,3,0,0,0,0,6, 5,0,0,6,4,0,8,0,7,
Java-Collections Framework学习与总结-HashSet和LinkedHashSet BrokenDreams linkedhashset
本篇总结一下两个常用的集合类HashSet和LinkedHashSet。它们都实现了相同接口java.util.Set。Set表示一种元素无序且不可重复的集合；之前总结过的java.util.List表示一种元素可重复且有序
读《研磨设计模式》-代码笔记-备忘录模式-Memento bylijinnan java 设计模式
声明：本文只为方便我个人查阅和理解，详细的分析以及源代码请移步原作者的博客http://chjavach.iteye.com/ import java.util.ArrayList; import java.util.List; /* * 备忘录模式的功能是，在不破坏封装性的前提下，捕获一个对象的内部状态，并在对象之外保存这个状态，为以后的状态恢复作“备忘”
《RAW格式照片处理专业技法》笔记 cherishLC PS
注意，这不是教程！仅记录楼主之前不太了解的一、色彩（空间）管理作者建议采用ProRGB（色域最广），但camera raw中设为ProRGB，而PS中则在ProRGB的基础上，将gamma值设为了1.8（更符合人眼）注意：bridge、camera raw怎么设置显示、输出的颜色都是正确的（会读取文件内的颜色配置文件），但用PS输出jpg文件时，必须先用Edit->conv
使用 Git 下载 Spring 源码编译 for Eclipse crabdave eclipse
使用 Git 下载 Spring 源码编译 for Eclipse 1、安装gradle，下载 http://www.gradle.org/downloads 配置环境变量GRADLE_HOME，配置PATH %GRADLE_HOME%/bin，cmd，gradle -v 2、spring4 用jdk8 下载 https://jdk8.java.
mysql连接拒绝问题 daizj mysql 登录权限
mysql中在其它机器连接mysql服务器时报错问题汇总一、[running][email protected]:~$mysql -uroot -h 192.168.9.108 -p //带-p参数，在下一步进行密码输入 Enter password: //无字符串输入 ERROR 1045 (28000): Access
Google Chrome 为何打压 H.264 dsjt apple html5 chrome Google
Google 今天在 Chromium 官方博客宣布由于 H.264 编解码器并非开放标准，Chrome 将在几个月后正式停止对 H.264 视频解码的支持，全面采用开放的 WebM 和 Theora 格式。 Google 在博客上表示，自从 WebM 视频编解码器推出以后，在性能、厂商支持以及独立性方面已经取得了很大的进步，为了与 Chromium 现有支持的編解码器保持一致，Chrome
yii 获取控制器名和方法名 dcj3sjt126com yii framework
1. 获取控制器名在控制器中获取控制器名: $name = $this->getId(); 在视图中获取控制器名: $name = Yii::app()->controller->id; 2. 获取动作名在控制器beforeAction()回调函数中获取动作名: $name =
Android知识总结（二） come_for_dream android
明天要考试了，速速总结如下 1、Activity的启动模式 standard：每次调用Activity的时候都创建一个（可以有多个相同的实例，也允许多个相同Activity叠加。） singleTop：可以有多个实例，但是不允许多个相同Activity叠加。即，如果Ac
高洛峰收徒第二期：寻找未来的“技术大牛” ——折腾一年，奖励20万元 gcq511120594 工作项目管理
高洛峰，兄弟连IT教育合伙人、猿代码创始人、PHP培训第一人、《细说PHP》作者、软件开发工程师、《IT峰播》主创人、PHP讲师的鼻祖！首期现在的进程刚刚过半，徒弟们真的很棒，人品都没的说，团结互助，学习刻苦，工作认真积极，灵活上进。我几乎会把他们全部留下来，现在已有一多半安排了实际的工作，并取得了很好的成绩。等他们出徒之日，凭他们的能力一定能够拿到高薪，而且我还承诺过一个徒弟，当他拿到大学毕
linux expect heipark expect
1. 创建、编辑文件go.sh #!/usr/bin/expect spawn sudo su admin expect "*password*" { send "13456\r\n" } interact 2. 设置权限 chmod u+x go.sh 3.
Spring4.1新特性——静态资源处理增强 jinnianshilongnian spring 4.1
目录 Spring4.1新特性——综述 Spring4.1新特性——Spring核心部分及其他 Spring4.1新特性——Spring缓存框架增强 Spring4.1新特性——异步调用和事件机制的异常处理 Spring4.1新特性——数据库集成测试脚本初始化 Spring4.1新特性——Spring MVC增强 Spring4.1新特性——页面自动化测试框架Spring MVC T
idea ubuntuxia 乱码 liyonghui160com
1.首先需要在windows字体目录下或者其它地方找到simsun.ttf 这个字体文件。 2.在ubuntu 下可以执行下面操作安装该字体： sudo mkdir /usr/share/fonts/truetype/simsun sudo cp simsun.ttf /usr/share/fonts/truetype/simsun fc-cache -f -v
改良程序的11技巧 pda158 技巧
有很多理由都能说明为什么我们应该写出清晰、可读性好的程序。最重要的一点，程序你只写一次，但以后会无数次的阅读。当你第二天回头来看你的代码时，你就要开始阅读它了。当你把代码拿给其他人看时，他必须阅读你的代码。因此，在编写时多花一点时间，你会在阅读它时节省大量的时间。让我们看一些基本的编程技巧：尽量保持方法简短永远永远不要把同一个变量用于多个不同的
300个涵盖IT各方面的免费资源（下）——工作与学习篇 shoothao 创业免费资源学习课程远程工作
工作与生产效率: A. 背景声音 Noisli:背景噪音与颜色生成器。 Noizio:环境声均衡器。 Defonic:世界上任何的声响都可混合成美丽的旋律。 Designers.mx:设计者为设计者所准备的播放列表。 Coffitivity:这里的声音就像咖啡馆里放的一样。 B. 避免注意力分散 Self Co
深入浅出RPC uule rpc
深入浅出RPC-浅出篇深入浅出RPC-深入篇 RPC Remote Procedure Call Protocol 远程过程调用协议它是一种通过网络从远程计算机程序上请求服务，而不需要了解底层网络技术的协议。RPC协议假定某些传输协议的存在，如TCP或UDP，为通信程序之间携带信息数据。在OSI网络通信模型中，RPC跨越了传输层和应用层。RPC使得开发