VisDrone Dataset | Data Pre- & Post-Processing Series

VisDrone preprocessing

  • 1. Crop the VisDrone dataset into 600×600 image tiles with matching XML files
  • 2. Extract image names
  • 3. Convert XML annotations to the label format YOLO expects
  • 4. Convert VisDrone annotations directly to YOLO format
  • 5. Draw bounding boxes from the XML annotations
  • 6. Convert txt annotations to XML
  • 7. Turn the original images into binary masks (objects in white, background in black)
  • 8. Plot PR curves

This post covers:

  1. Cropping the dataset
  2. Converting the data to YOLO format
  3. Image binarization
  4. Plotting evaluation curves

GitHub: https://github.com/mary-0830/visdrone-dataset
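
For reference, each line of a raw VisDrone-DET annotation txt file is comma-separated as bbox_left,bbox_top,bbox_width,bbox_height,score,object_category,truncation,occlusion; in the ground-truth files a score of 0 marks an ignored region, and object_category indexes the class list used by the scripts below (0 = ignored regions, 1 = pedestrian, ..., 11 = others). A made-up sample line:

684,8,273,116,1,4,0,0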

1. Crop the VisDrone dataset into 600×600 image tiles with matching XML files

train_crop_visdrone.py

import os
from xml.dom.minidom import Document
import numpy as np
import copy, cv2

def save_to_txt(save_path, objects_axis):             
    f = open(save_path,'w')
    objects_list = objects_axis.tolist()
    objects_ = [','.join(map(str, i)) + '\n' for i in objects_list]
    objects_[-1] = objects_[-1][:-1]
    # import pdb
    # pdb.set_trace()
    f.writelines(objects_)
    f.close() 


def save_to_xml(save_path, im_width, im_height, objects_axis, label_name, name, hbb=True):
    im_depth = 0
    object_num = len(objects_axis)
    doc = Document()

    annotation = doc.createElement('annotation')
    doc.appendChild(annotation)

    folder = doc.createElement('folder')
    folder_name = doc.createTextNode('Visdrone')
    folder.appendChild(folder_name)
    annotation.appendChild(folder)

    filename = doc.createElement('filename')
    filename_name = doc.createTextNode(name)
    filename.appendChild(filename_name)
    annotation.appendChild(filename)

    source = doc.createElement('source')
    annotation.appendChild(source)

    database = doc.createElement('database')
    database.appendChild(doc.createTextNode('The Visdrone Database'))
    source.appendChild(database)

    annotation_s = doc.createElement('annotation')
    annotation_s.appendChild(doc.createTextNode('Visdrone'))
    source.appendChild(annotation_s)

    image = doc.createElement('image')
    image.appendChild(doc.createTextNode('flickr'))
    source.appendChild(image)

    flickrid = doc.createElement('flickrid')
    flickrid.appendChild(doc.createTextNode('322409915'))
    source.appendChild(flickrid)

    owner = doc.createElement('owner')
    annotation.appendChild(owner)

    flickrid_o = doc.createElement('flickrid')
    flickrid_o.appendChild(doc.createTextNode('knautia'))
    owner.appendChild(flickrid_o)

    name_o = doc.createElement('name')
    name_o.appendChild(doc.createTextNode('yang'))
    owner.appendChild(name_o)


    size = doc.createElement('size')
    annotation.appendChild(size)
    width = doc.createElement('width')
    width.appendChild(doc.createTextNode(str(im_width)))
    height = doc.createElement('height')
    height.appendChild(doc.createTextNode(str(im_height)))
    depth = doc.createElement('depth')
    depth.appendChild(doc.createTextNode(str(im_depth)))
    size.appendChild(width)
    size.appendChild(height)
    size.appendChild(depth)
    segmented = doc.createElement('segmented')
    segmented.appendChild(doc.createTextNode('0'))
    annotation.appendChild(segmented)
    for i in range(object_num):
        objects = doc.createElement('object')
        annotation.appendChild(objects)
        object_name = doc.createElement('name')
        object_name.appendChild(doc.createTextNode(label_name[int(objects_axis[i][5])]))
        objects.appendChild(object_name)
        pose = doc.createElement('pose')
        pose.appendChild(doc.createTextNode('Unspecified'))
        objects.appendChild(pose)
        truncated = doc.createElement('truncated')
        truncated.appendChild(doc.createTextNode('1'))
        objects.appendChild(truncated)
        difficult = doc.createElement('difficult')
        difficult.appendChild(doc.createTextNode('0'))
        objects.appendChild(difficult)
        bndbox = doc.createElement('bndbox')
        objects.appendChild(bndbox)
        if hbb:
           x0 = doc.createElement('xmin')
           x0.appendChild(doc.createTextNode(str((objects_axis[i][0]))))
           bndbox.appendChild(x0)
           y0 = doc.createElement('ymin')
           y0.appendChild(doc.createTextNode(str((objects_axis[i][1]))))
           bndbox.appendChild(y0)
           x1 = doc.createElement('xmax')
           x1.appendChild(doc.createTextNode(str((objects_axis[i][2]))))
           bndbox.appendChild(x1)
           y1 = doc.createElement('ymax')
           y1.appendChild(doc.createTextNode(str((objects_axis[i][3]))))
           bndbox.appendChild(y1)       
        else:

            x0 = doc.createElement('x0')
            x0.appendChild(doc.createTextNode(str((objects_axis[i][0]))))
            bndbox.appendChild(x0)
            y0 = doc.createElement('y0')
            y0.appendChild(doc.createTextNode(str((objects_axis[i][1]))))
            bndbox.appendChild(y0)

            x1 = doc.createElement('x1')
            x1.appendChild(doc.createTextNode(str((objects_axis[i][2]))))
            bndbox.appendChild(x1)
            y1 = doc.createElement('y1')
            y1.appendChild(doc.createTextNode(str((objects_axis[i][3]))))
            bndbox.appendChild(y1)
            
            x2 = doc.createElement('x2')
            x2.appendChild(doc.createTextNode(str((objects_axis[i][4]))))
            bndbox.appendChild(x2)
            y2 = doc.createElement('y2')
            y2.appendChild(doc.createTextNode(str((objects_axis[i][5]))))
            bndbox.appendChild(y2)

            x3 = doc.createElement('x3')
            x3.appendChild(doc.createTextNode(str((objects_axis[i][6]))))
            bndbox.appendChild(x3)
            y3 = doc.createElement('y3')
            y3.appendChild(doc.createTextNode(str((objects_axis[i][7]))))
            bndbox.appendChild(y3)
        
    f = open(save_path,'w')
    f.write(doc.toprettyxml(indent = ''))
    f.close() 

class_list = ['ignored regions','pedestrian','people','bicycle','car','van','truck','tricycle','awning-tricycle','bus','motor','others']


def format_label(txt_list):
    format_data = []
    for i in txt_list[0:]:
        format_data.append(
        [int(xy) for xy in i.split(',')[:8]] 
        # {'x0': int(i.split(' ')[0]),
        # 'x1': int(i.split(' ')[2]),
        # 'x2': int(i.split(' ')[4]),
        # 'x3': int(i.split(' ')[6]),
        # 'y1': int(i.split(' ')[1]),
        # 'y2': int(i.split(' ')[3]),
        # 'y3': int(i.split(' ')[5]),
        # 'y4': int(i.split(' ')[7]),
        # 'class': class_list.index(i.split(' ')[8]) if i.split(' ')[8] in class_list else 0, 
        # 'difficulty': int(i.split(' ')[9])}
        )
        # if i.split(',')[8] not in class_list :
        #     print ('warning found a new label :', i.split(',')[8])
        #     exit()
    return np.array(format_data)

def clip_image(file_idx, image, boxes_all, width, height, stride_w, stride_h):
    if len(boxes_all) > 0:
        shape = image.shape
        for start_h in range(0, shape[0], stride_h):
            for start_w in range(0, shape[1], stride_w):
                boxes = copy.deepcopy(boxes_all)
                box = np.zeros_like(boxes_all)
                start_h_new = start_h
                start_w_new = start_w
                if start_h + height > shape[0]:
                  start_h_new = shape[0] - height
                if start_w + width > shape[1]:
                  start_w_new = shape[1] - width
                top_left_row = max(start_h_new, 0)
                top_left_col = max(start_w_new, 0)
                bottom_right_row = min(start_h + height, shape[0])
                bottom_right_col = min(start_w + width, shape[1])

                subImage = image[top_left_row:bottom_right_row, top_left_col: bottom_right_col]

                box[:, 0] = boxes[:, 0]- top_left_col
                box[:, 2] = boxes[:, 0] + boxes[:, 2]- top_left_col 
                box[:, 4] = boxes[:, 4]
                box[:, 0] = [max(i, 0) for i in box[:, 0]]  # clamp the box to the crop boundary
                
                # box[:, 6] = boxes[:, 6] - top_left_col

                box[:, 1] = boxes[:, 1] - top_left_row 
                box[:, 3] = boxes[:, 1] + boxes[:, 3] - top_left_row 
                box[:, 5] = boxes[:, 5]
                box[:, 1] = [max(i, 0) for i in box[:, 1]]
                # box[:, 7] = boxes[:, 7] - top_left_row
                # box[:, 8] = boxes[:, 8]
                center_y = 0.5*(box[:, 1] + box[:, 3])
                center_x = 0.5*(box[:, 0] + box[:, 2])
                # print('center_y', center_y)
                # print('center_x', center_x)
                # print ('boxes', boxes)
                # print ('boxes_all', boxes_all)
                # print ('top_left_col', top_left_col, 'top_left_row', top_left_row)
                
                cond1 = np.intersect1d(np.where(center_y[:]>=0)[0], np.where(center_x[:]>=0 )[0])
                cond2 = np.intersect1d(np.where(center_y[:] <= (bottom_right_row - top_left_row))[0],
                                        np.where(center_x[:] <= (bottom_right_col - top_left_col))[0])
                idx = np.intersect1d(cond1, cond2)
                # idx = np.where(center_y[:]>=0 and center_x[:]>=0 and center_y[:] <= (bottom_right_row - top_left_row) and center_x[:] <= (bottom_right_col - top_left_col))[0]
                # save_path, im_width, im_height, objects_axis, label_name
                if len(idx) > 0:
                    name="%s_%04d_%04d.jpg" % (file_idx, top_left_row, top_left_col)
                    print(name)
                    xml = os.path.join(save_dir, 'annotations_600_xml', "%s_%04d_%04d.xml" % (file_idx, top_left_row, top_left_col))
                    save_to_xml(xml, subImage.shape[1], subImage.shape[0], box[idx, :], class_list, str(name))
                    # save_to_txt(xml, box[idx, :])
                    # print ('save xml : ', xml)
                    if subImage.shape[0] > 5 and subImage.shape[1] >5:
                        img = os.path.join(save_dir, 'images_600', "%s_%04d_%04d.jpg" % (file_idx, top_left_row, top_left_col))
                        cv2.imwrite(img, subImage)


print ('class_list', len(class_list))
raw_data = 'D:/datasets/VisDrone/VisDrone2019-DET-val/'
raw_images_dir = os.path.join(raw_data, 'images')
raw_label_dir = os.path.join(raw_data, 'annotations')

save_dir = 'D:/datasets/VisDrone/VisDrone2019-DET-val/' 

images = [i for i in os.listdir(raw_images_dir) if 'jpg' in i]
labels = [i for i in os.listdir(raw_label_dir) if 'txt' in i]

print ('find image', len(images))
print ('find label', len(labels))

min_length = 1e10
max_length = 1
img_h, img_w, stride_h, stride_w = 600, 600, 450, 450 

for idx, img in enumerate(images):
# img = 'P1524.png'
    
    img_data = cv2.imread(os.path.join(raw_images_dir, img))  # scipy.misc.imread was removed in SciPy 1.2+; cv2 also keeps BGR consistent with cv2.imwrite in clip_image
    print (idx, 'read image', img)

    # if len(img_data.shape) == 2:
    #     img_data = img_data[:, :, np.newaxis]
    #     print ('find gray image')

    txt_data = open(os.path.join(raw_label_dir, img.replace('jpg', 'txt')), 'r').readlines()
    # print (idx, len(format_label(txt_data)), img_data.shape)
    # if max(img_data.shape[:2]) > max_length:
        # max_length = max(img_data.shape[:2])
    # if min(img_data.shape[:2]) < min_length:
        # min_length = min(img_data.shape[:2])
    # if idx % 50 ==0:
        # print (idx, len(format_label(txt_data)), img_data.shape)
        # print (idx, 'min_length', min_length, 'max_length', max_length)
    box = format_label(txt_data)
    # box = dele(box)
    clip_image(os.path.splitext(img)[0], img_data, box, img_h, img_w, stride_h, stride_w)
    
#     rm val/images/*   &&   rm val/labeltxt/*
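
The window size and stride above (600 and 450) give adjacent tiles a 150 px overlap, and clip_image shifts the last window back so every crop stays exactly 600 px. A quick illustrative check of that clamping (the 1080 px dimension below is made up):

# For a 1080 px image dimension, the window starts from range(0, 1080, 450)
# are 0, 450, 900; the last one is pulled back to 1080 - 600 = 480.
for start in range(0, 1080, 450):
    print(min(start, 1080 - 600))   # -> 0, 450, 480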

2. Extract image names

extract_name.py

# P02 batch-read the file names (without extensions)

import os

file_path = "D:/datasets/VisDrone/VisDrone2019-DET-val/annotations_600/"
path_list = os.listdir(file_path)  # os.listdir() walks the folder and returns a list of its entries
print(path_list)
path_name = []  # the file names are written to name_600_val.txt below


def saveList(pathName):
    for file_name in pathName:
        with open("name_600_val.txt", "a") as f:
            f.write(file_name.split(".")[0] + "\n")


def dirList(path_list):
    for i in range(0, len(path_list)):
        path = os.path.join(file_path, path_list[i])
        if os.path.isdir(path):
            saveList(os.listdir(path))


dirList(path_list)
saveList(path_list)
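
Note that saveList opens name_600_val.txt in append mode, so running the script twice duplicates every name. A minimal overwrite variant (same file_path as above, assuming the folder contains only files) could be:

with open("name_600_val.txt", "w") as f:
    f.writelines(os.path.splitext(n)[0] + "\n" for n in os.listdir(file_path))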

3. Convert XML annotations to the label format YOLO expects

xml2yolo.py

# Convert box coordinates from xml to YOLO txt

import xml.etree.ElementTree as ET
import os


classes = ['ignored regions','pedestrian','people','bicycle','car','van','truck','tricycle','awning-tricycle','bus','motor','others']  # class names; they must match the names used in the xml annotations


train_file = 'images_val_600_test.txt'  
train_file_txt = ''

wd = os.getcwd()

def convert(size, box):
    dw = 1. / size[0]
    dh = 1. / size[1]
    box = list(box)
    box[1] = min(box[1], size[0])   # clamp the box to the image size
    box[3] = min(box[3], size[1])
    x = ((box[0] + box[1]) / 2.0) * dw
    y = ((box[2] + box[3]) / 2.0) * dh
    w = (box[1] - box[0]) * dw
    h = (box[3] - box[2]) * dh
    return (x, y, w, h)   


def convert_annotation(image_id):
    in_file = open('D:/datasets/VisDrone/VisDrone2019-DET-val/annotations_600_xml/%s.xml' % (image_id))  # path of the xml file to read

    out_file = open('D:/datasets/VisDrone/labels_val_600/%s.txt' % (image_id), 'w')  # path of the txt file to write
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    for obj in root.iter('object'):
        cls = obj.find('name').text
        if cls not in classes:  # skip names that are not in the class list
            continue
        cls_id = classes.index(cls)
        # import pdb
        # pdb.set_trace()
        if cls_id == 0 or cls_id == 11:  # drop 'ignored regions' and 'others'
            continue
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id - 1) + " " + " ".join([str(a) for a in bb]) + '\n')


image_ids_train = open('D:/datasets/VisDrone/name_600_val.txt').read().strip().split()  # read the index of xml file-name stems

for image_id in image_ids_train:
    convert_annotation(image_id)

anns = os.listdir('./VisDrone2019-DET-val/annotations_600_xml/')
for ann in anns:
    ans = ''
    outpath = wd + '/labels_val_600/' + ann
    if ann[-3:] != 'xml':
        continue
    train_file_txt = train_file_txt + wd + '/VisDrone2019-DET-val' +  '/images_600/' + ann[:-3] + 'jpg\n'

with open(train_file, 'w') as outfile:
    outfile.write(train_file_txt)
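
A quick way to sanity-check convert() here (the numbers are made up): on a 600×600 tile, a box with xmin=100, xmax=200, ymin=150, ymax=350 should come out as a normalized (x_center, y_center, width, height) of roughly (0.25, 0.417, 0.167, 0.333):

print(convert((600, 600), (100, 200, 150, 350)))  # -> (0.25, 0.4166..., 0.1666..., 0.3333...)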

4. Convert VisDrone annotations directly to YOLO format

trans_yolo.py

import os
from pathlib import Path
from PIL import Image
import csv


def convert(size, box):
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[2] / 2) * dw
    y = (box[1] + box[3] / 2) * dh
    w = box[2] * dw
    h = box[3] * dh
    return (x, y, w, h)
            
wd = os.getcwd()

if not os.path.exists('labels_val'):
    os.makedirs('labels_val')


train_file = 'images_val.txt'  
train_file_txt = ''
    
anns = os.listdir('./VisDrone2019-DET-val/annotations')
for ann in anns:
    ans = ''
    outpath = wd + '/labels_val/' + ann
    if ann[-3:] != 'txt':
        continue
    with Image.open(wd + '/VisDrone2019-DET-val/images/' + ann[:-3] + 'jpg') as Img:
        img_size = Img.size
    with open(wd + '/VisDrone2019-DET-val/annotations/' + ann, newline='') as csvfile:
        spamreader = csv.reader(csvfile)
        # import pdb
        # pdb.set_trace()
        for row in spamreader:
            if row[4] == '0':  # in the ground-truth files, a score of 0 marks an ignored region
                continue
            bb = convert(img_size, tuple(map(int, row[:4])))
            ans = ans + str(int(row[5])-1) + ' ' + ' '.join(str(a) for a in bb) + '\n'  # shift the 1-based category ids so 'pedestrian' becomes class 0
            with open(outpath, 'w') as outfile:
                outfile.write(ans)
    train_file_txt = train_file_txt + wd + '/images/' + ann[:-3] + 'jpg\n'

with open(train_file, 'w') as outfile:
    outfile.write(train_file_txt)
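
Unlike the convert() in xml2yolo.py, this one takes VisDrone's raw (left, top, width, height) layout. As a made-up sanity check: on a 1920×1080 image the annotation row 684,8,273,116,1,4,0,0 (a car) should end up in the label file as roughly "3 0.4273 0.0611 0.1422 0.1074":

print(convert((1920, 1080), (684, 8, 273, 116)))  # -> (0.4273..., 0.0611..., 0.1421..., 0.1074...)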

5. Draw bounding boxes from the XML annotations

draw_visdrone.py

import os
import os.path
import xml.etree.cElementTree as ET
import cv2
def draw(image_path, xml_path, root_saved_path):
    """
    Draw boxes on each image according to its XML annotation.
    """
    src_img_path = image_path
    src_ann_path = xml_path
    for file in os.listdir(src_ann_path):
        # print(file)
        file_name, suffix = os.path.splitext(file)
        # import pdb
        # pdb.set_trace()
        if suffix == '.xml':
            # print(file)
            xml_path = os.path.join(src_ann_path, file)
            image_path = os.path.join(src_img_path, file_name+'.jpg')
            img = cv2.imread(image_path)
            tree = ET.parse(xml_path)
            root = tree.getroot()
            # import pdb
            # pdb.set_trace()
            for obj in root.iter('object'):
                name = obj.find('name').text
                xml_box = obj.find('bndbox')
                x1 = int(xml_box.find('xmin').text)
                x2 = int(xml_box.find('xmax').text)
                y1 = int(xml_box.find('ymin').text)
                y2 = int(xml_box.find('ymax').text)
                cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), thickness=2)
                # draw the class name in green
                # cv2.putText(img, name, (x1, y1), cv2.FONT_HERSHEY_COMPLEX, 0.7, (0, 255, 0), thickness=2)
            cv2.imwrite(os.path.join(root_saved_path, file_name+'.jpg'), img)


if __name__ == '__main__':
    image_path = "D:/datasets/VisDrone/VisDrone2019-DET-train/images_600"
    xml_path = "D:/datasets/VisDrone/VisDrone2019-DET-train/annotations_600"
    root_saved_path = "D:/datasets/VisDrone/VisDrone2019-DET-train/result"
    draw(image_path, xml_path, root_saved_path)
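
One caveat: cv2.imwrite fails silently when root_saved_path does not exist, so it can help to create the folder before calling draw() (same variable as above):

os.makedirs(root_saved_path, exist_ok=True)  # create the output folder if it is missing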

6. Convert txt annotations to XML

txt2xml.py

# coding: utf-8
# author: HXY
# 2020-4-17

"""
This script handles VisDrone data processing:
it converts the txt label files in the annotations folder into XML files.
Each txt label line is:
<bbox_left>,<bbox_top>,<bbox_width>,<bbox_height>,<score>,<object_category>,<truncation>,<occlusion>
Categories:
ignored regions(0), pedestrian(1),
people(2), bicycle(3), car(4), van(5),
truck(6), tricycle(7), awning-tricycle(8),
bus(9), motor(10), others(11)
"""

import os
import cv2
import time
from xml.dom import minidom

name_dict = {'0': 'ignored regions', '1': 'pedestrian', '2': 'people',
             '3': 'bicycle', '4': 'car', '5': 'van', '6': 'truck',
             '7': 'tricycle', '8': 'awning-tricycle', '9': 'bus',
             '10': 'motor', '11': 'others'}


def transfer_to_xml(pic, txt, file_name):
    xml_save_path = 'D:/datasets/VisDrone/VisDrone2019-DET-val/annotations_xml'  # folder where the generated xml files are stored
    if not os.path.exists(xml_save_path):
        os.mkdir(xml_save_path)

    img = cv2.imread(pic)
    img_w = img.shape[1]
    img_h = img.shape[0]
    img_d = img.shape[2]
    doc = minidom.Document()

    annotation = doc.createElement("annotation")
    doc.appendChild(annotation)
    folder = doc.createElement('folder')
    folder.appendChild(doc.createTextNode('visdrone'))
    annotation.appendChild(folder)

    filename = doc.createElement('filename')
    filename.appendChild(doc.createTextNode(file_name))
    annotation.appendChild(filename)

    source = doc.createElement('source')
    database = doc.createElement('database')
    database.appendChild(doc.createTextNode("Unknown"))
    source.appendChild(database)

    annotation.appendChild(source)

    size = doc.createElement('size')
    width = doc.createElement('width')
    width.appendChild(doc.createTextNode(str(img_w)))
    size.appendChild(width)
    height = doc.createElement('height')
    height.appendChild(doc.createTextNode(str(img_h)))
    size.appendChild(height)
    depth = doc.createElement('depth')
    depth.appendChild(doc.createTextNode(str(img_d)))
    size.appendChild(depth)
    annotation.appendChild(size)

    segmented = doc.createElement('segmented')
    segmented.appendChild(doc.createTextNode("0"))
    annotation.appendChild(segmented)

    with open(txt, 'r') as f:
        lines = [f.readlines()]  # wrapped in a list so the nested loops below iterate over whole lines
        for line in lines:
            for boxes in line:
                box = boxes.strip('\n')
                box = box.split(',')
                x_min = box[0]
                y_min = box[1]
                x_max = int(box[0]) + int(box[2])
                y_max = int(box[1]) + int(box[3])
                object_name = name_dict[box[5]]

                # if object_name is 'ignored regions' or 'others':
                #     continue

                object = doc.createElement('object')
                nm = doc.createElement('name')
                nm.appendChild(doc.createTextNode(object_name))
                object.appendChild(nm)
                pose = doc.createElement('pose')
                pose.appendChild(doc.createTextNode("Unspecified"))
                object.appendChild(pose)
                truncated = doc.createElement('truncated')
                truncated.appendChild(doc.createTextNode("1"))
                object.appendChild(truncated)
                difficult = doc.createElement('difficult')
                difficult.appendChild(doc.createTextNode("0"))
                object.appendChild(difficult)
                bndbox = doc.createElement('bndbox')
                xmin = doc.createElement('xmin')
                xmin.appendChild(doc.createTextNode(x_min))
                bndbox.appendChild(xmin)
                ymin = doc.createElement('ymin')
                ymin.appendChild(doc.createTextNode(y_min))
                bndbox.appendChild(ymin)
                xmax = doc.createElement('xmax')
                xmax.appendChild(doc.createTextNode(str(x_max)))
                bndbox.appendChild(xmax)
                ymax = doc.createElement('ymax')
                ymax.appendChild(doc.createTextNode(str(y_max)))
                bndbox.appendChild(ymax)
                object.appendChild(bndbox)
                annotation.appendChild(object)
                with open(os.path.join(xml_save_path, file_name + '.xml'), 'w') as x:
                    x.write(doc.toprettyxml())
                x.close()
    f.close()


if __name__ == '__main__':
    t = time.time()
    print('Transferring .txt to .xml ...')
    txt_folder = 'D:/datasets/VisDrone/VisDrone2019-DET-val/annotations'  # folder with the VisDrone txt labels
    txt_file = os.listdir(txt_folder)
    img_folder = 'D:/datasets/VisDrone/VisDrone2019-DET-val/images'  # folder with the VisDrone images

    for txt in txt_file:
        txt_full_path = os.path.join(txt_folder, txt)
        img_full_path = os.path.join(img_folder, txt.split('.')[0] + '.jpg')

        try:
            transfer_to_xml(img_full_path, txt_full_path, txt.split('.')[0])
        except Exception as e:
            print(e)

    print("Transfer .txt to .XML sucessed. costed: {:.3f}s...".format(time.time() - t))

7. Turn the original images into binary masks (objects in white, background in black)

img2binary.py

# -*- coding:utf-8 -*-
# Map the annotated objects onto a binary image

import matplotlib.pyplot as plt
import cv2, os
from xml.dom import minidom
import xml.etree.ElementTree as ET
import numpy as np

def tobinary(img_path):
    
    img_list = os.listdir(img_path)
    for img_name_id in img_list:
        # import pdb
        # pdb.set_trace()
        img_id, _ = os.path.splitext(img_name_id)
        img_file = os.path.join(img_path, img_id + ".jpg")
        image = cv2.imread(img_file)
        xml_file = os.path.join(xml_path, img_id + ".xml")
        height, width = image.shape[0], image.shape[1]
        bg = np.zeros((height, width), dtype=np.uint8)
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for obj in root.iter("object"):
            name = obj.find("name").text
            xmlbox = obj.find("bndbox")
            xmin = int(xmlbox.find('xmin').text)
            ymin = int(xmlbox.find('ymin').text)
            xmax = int(xmlbox.find('xmax').text)
            ymax = int(xmlbox.find('ymax').text)
            color = (255, 255, 255)
            cv2.rectangle(bg, (xmin, ymin), (xmax, ymax), color, -1)  # thickness=-1 fills the box in white


        img_name = img_id + ".jpg"
        out_file = os.path.join(out_path, img_name)
        cv2.imwrite(out_file, bg)


if __name__ == "__main__":
    img_path = "D:/datasets/VisDrone/VisDrone2019-DET-train/images/"
    xml_path = "D:/datasets/VisDrone/VisDrone2019-DET-train/annotations_xml/"
    out_path = "D:/datasets/VisDrone/VisDrone2019-DET-train/output/"

    tobinary(img_path)
    # xml = open("D:/datasets/VisDrone/VisDrone2019-DET-train/annotations_xml/0000001_02999_d_0000005.xml")
    # import pdb
    # pdb.set_trace()
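
If you want to eyeball a result, a small helper like the one below (not part of the original script; it reuses the cv2 and os imports above, and the file name is just an example) blends a saved mask over its source image:

def overlay_mask(img_file, mask_file, out_file, alpha=0.5):
    """Blend a saved binary mask over its source image for a quick visual check."""
    img = cv2.imread(img_file)
    mask = cv2.imread(mask_file)  # the saved mask is re-read as a 3-channel image
    cv2.imwrite(out_file, cv2.addWeighted(img, 1 - alpha, mask, alpha, 0))

# overlay_mask(img_path + "0000001_02999_d_0000005.jpg",
#              out_path + "0000001_02999_d_0000005.jpg",
#              out_path + "0000001_02999_d_0000005_overlay.jpg")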
    

8. Plot PR curves

draw.py

from pr import *
clas = ['pedestrian','people','bicycle','car','van','truck',
        'tricycle','awning-tricycle','bus','motor']  # class names
visdrone_file = ["yolov5.txt", "centernet.txt", "ucgnet.txt"]  # prediction files, placed in the Prediction folder (e.g. "Faster_RCNN.txt")
visdrone_algorithm = ["yolov5", "centernet", "ucgnet"]  # algorithm names matching the files above
use_07_metric = [False, False, False]
ground_truth_file = ["visdrone_gt.txt", "visdrone_gt.txt", "visdrone_gt.txt"]  # placed in the Ground_Truth folder


# DOTA_file=["MKD-Net-128.txt"]  # placed in the Prediction folder (e.g. "Faster_RCNN.txt")
# DOTA_algorithm=["MKD-Net-128"]  # algorithm names matching the files above
# use_07_metric=[False]
# ground_truth_file=["dota_gt.txt"]  # placed in the Ground_Truth folder
# Draw the Precision-Recall curve of every class
draw_pr(visdrone_file, visdrone_algorithm, ground_truth_file, clas, use_07_metric)
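
pr.py below reads both files as space-separated lines. From the field indexing in the code, a ground-truth line carries an image id, a 1-based class id and a box, and a prediction line inserts a confidence score before the box; for example (made-up values):

# Ground_Truth/visdrone_gt.txt : <image_id> <class_id> <xmin> <ymin> <xmax> <ymax>
0000001_02999_d_0000005 1 684 8 957 124
# Prediction/yolov5.txt        : <image_id> <class_id> <confidence> <xmin> <ymin> <xmax> <ymax>
0000001_02999_d_0000005 1 0.93 690 10 950 120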

pr.py

# -*- coding: UTF-8 -*-
# Example using Faster R-CNN and YOLO results
import numpy as np
import math
import matplotlib.pyplot as plt

def get_pr_data_map(prediction_file, ground_truth_file, cls_name, use_07_metric, ovthresh=0.4):
    with open(ground_truth_file, 'r') as f:    # read the algorithm's gt.txt file
        lines_gt = f.readlines()               # lines_gt holds every line of gt.txt
    with open(prediction_file, 'r') as f:      # read the algorithm's prediction txt file
        lines_pre = f.readlines()              # lines_pre holds every line of the prediction file
    # split every line on spaces; splitlines_gt is a 2-D list (rows = lines, columns = fields)
    splitlines_gt = [x.strip().split(' ') for x in lines_gt]
    # imagenames_gt is the image id (first field) of every ground-truth line
    imagenames_gt = [x[0] for x in splitlines_gt]
    # print(imagenames_gt)
    class_recs = {}
    # BB_gt is a 2-D array of the box fields (everything after the second column) of every ground-truth line
    BB_gt = np.array([[math.ceil(float(z)) for z in x[2:]] for x in splitlines_gt])
    for i in range(len(imagenames_gt)):
        # create an entry the first time an image id is seen
        if imagenames_gt[i] not in class_recs:
            class_recs.update({imagenames_gt[i]: {"bbox": [], "det": [], "difficult": []}})
        # then append BB_gt[i] to that image's box records
        class_recs[imagenames_gt[i]]["bbox"].append(BB_gt[i])
        class_recs[imagenames_gt[i]]["det"].append(False)
        class_recs[imagenames_gt[i]]["difficult"].append(False)
    # npos is the total number of ground-truth boxes
    npos = len(imagenames_gt)
    splitlines_pre = [x.strip().split(' ') for x in lines_pre]
    # image_ids is the image id (first field) of every prediction line
    image_ids = [x[0] for x in splitlines_pre]
    # confidence of every predicted box (third field)
    confidence = np.array([float(x[2]) for x in splitlines_pre])
    # predicted bounding boxes (2-D array, everything after the third field)
    BB_pre = np.array([[float(z) for z in x[3:]] for x in splitlines_pre])
    # indices that sort the detections by descending confidence
    sorted_ind = np.argsort(-confidence)
    # the confidences themselves, sorted descending
    sorted_scores = np.sort(-confidence)
    # reorder the predicted boxes by confidence
    BB_pre = BB_pre[sorted_ind, :]
    # reorder image_ids by confidence
    image_ids = [image_ids[x] for x in sorted_ind]
    nd = len(image_ids)
    tp = np.zeros(nd)
    fp = np.zeros(nd)
    pr_data_map = []
    # print(nd)
    for d in range(nd):
        # look up the ground-truth record for this detection's image
        #print(image_ids[d].get(image_ids[d]))
        # if class_recs[image_ids[d]] not in class_recs:
        # print(class_recs)
        if class_recs.get(image_ids[d]):
            #print(image_ids[d])
            R = class_recs[image_ids[d]]
            print(R)
            bb = np.array(BB_pre[d, :]).astype(float)   # the detection box, in confidence order
            ovmax = -np.inf                             # best IoU so far, starts at negative infinity
            BBGT = np.array(R['bbox']).astype(float)    # the ground-truth boxes of this image

            # compute the IoU between the detection and every ground-truth box
            if BBGT.size > 0:
                ixmin = np.maximum(BBGT[:, 0], bb[0])
                iymin = np.maximum(BBGT[:, 1], bb[1])
                ixmax = np.minimum(BBGT[:, 2], bb[2])
                iymax = np.minimum(BBGT[:, 3], bb[3])
                iw = np.maximum(ixmax - ixmin + 1., 0.)
                ih = np.maximum(iymax - iymin + 1., 0.)
                inters = iw * ih

                uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
                       (BBGT[:, 2] - BBGT[:, 0] + 1.) *
                       (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)

                overlaps = inters / uni          # IoU with every ground-truth box
                ovmax = np.max(overlaps)         # largest IoU
                jmax = np.argmax(overlaps)       # index of the best-matching ground-truth box

            if ovmax > ovthresh:                 # the IoU threshold decides whether this is a tp or an fp
                if not R['difficult'][jmax]:
                    if not R['det'][jmax]:
                        tp[d] = 1.
                        print(tp)
                        R['det'][jmax] = 1
                    else:
                        fp[d] = 1.
            else:
                fp[d] = 1.
        else:
            #print("*************************************************************"+str(d))
            fp[d]=1.


        
    fp = np.cumsum(fp)
    tp = np.cumsum(tp)
    # rec = tp / number of ground-truth boxes
    rec = tp / float(npos)
    print(len(rec))
    # print(type(rec))
    # prec = tp / (tp + fp)
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    print(len(prec))
    # print(type(prec))
    ap = voc_ap(rec, prec, use_07_metric)
    print(ap)
    ###################################################
    
    # mAP_rec+=rec
    # print("mAP_rec:"+str(mAP_rec)+"\n")
    # mAP_prec+=prec
    # print("mAP_prec:"+str(mAP_prec)+"\n")
    # mAP+=ap
    # print("mAP:"+str(mAP)+"\n")
    ###################################################
    # if not pr_data.has_key(cls_name[cls_num]):
    # rec=np.array(rec)
    # prec=np.array(prec)
    # ap=np.array(ap)
    pr_data_map=np.array([rec.tolist(),prec.tolist()])#,ap.tolist()]
    pr_data_map=pr_data_map.tolist()

    #print(pr_data_map)
    return pr_data_map 


def get_pr_data(prediction_file,ground_truth_file,cls_name,use_07_metric,ovthresh=0.5):
    with open(ground_truth_file, 'r') as f:    # read the algorithm's gt.txt file
        lines_gt = f.readlines()               # every line of gt.txt
    with open(prediction_file, 'r') as f:      # read the algorithm's prediction txt file
        lines_pre = f.readlines()              # every line of the prediction file

    pr_data={}
    
    for cls_num in range(len(cls_name)):  # loop over every class index
    # ground_truth
        # split every ground-truth line on spaces; splitlines_gt is a 2-D list (rows = lines, columns = fields)
        splitlines_gt = [x.strip().split(' ') for x in lines_gt]
        # print(splitlines_gt)
        # keep the image id (first field) of every line whose class (second field) matches the current class
        # imagenames_gt = []
        # for x in splitlines_gt:
        #     if int(x[1]) - 1 == cls_num:
        #         imagenames_gt = x[0]
        imagenames_gt = [x[0] for x in splitlines_gt if int(x[1]) - 1 == cls_num]
        # print(imagenames_gt)



        # BB_gt holds the box fields (everything after the second column) of lines whose class matches the current class
        BB_gt = np.array([[math.ceil(float(z)) for z in x[2:]] for x in splitlines_gt if int(x[1]) - 1 == cls_num])
        # build the class_recs dict
        class_recs = {}
        for i in range(len(imagenames_gt)):
            # import pdb
            # pdb.set_trace()
            # create an entry the first time an image id is seen
            if imagenames_gt[i] not in class_recs:
                class_recs.update({imagenames_gt[i]: {"bbox": [], "det": [], "difficult": []}})
            # then append BB_gt[i] to that image's box records
            class_recs[imagenames_gt[i]]["bbox"].append(BB_gt[i])
            class_recs[imagenames_gt[i]]["det"].append(False)
            class_recs[imagenames_gt[i]]["difficult"].append(False)
            # print(class_recs[imagenames_gt])
        # npos is the number of ground-truth boxes of this class
        npos = len(imagenames_gt)
        # print("##################")
        # print(class_recs)
    
    # prediction
        # split every prediction line on spaces; splitlines_pre is a 2-D list
        splitlines_pre = [x.strip().split(' ') for x in lines_pre]
        # print(splitlines_pre)
        # keep the image id (first field) of predictions whose class (second field) matches the current class
        image_ids = [x[0] for x in splitlines_pre if int(x[1]) - 1 == cls_num]
        # print(cls_num)
        # confidence (third field) of the predictions of this class
        confidence = np.array([float(x[2]) for x in splitlines_pre if int(x[1]) - 1 == cls_num])
        # print("********", confidence)
        # predicted bounding boxes of this class (2-D array, everything after the third field)
        BB_pre = np.array([[float(z) for z in x[3:]] for x in splitlines_pre if int(x[1]) - 1 == cls_num])
        # indices that sort the detections by descending confidence
        sorted_ind = np.argsort(-confidence)
        # print(sorted_ind, "*************")
        # the confidences themselves, sorted descending
        sorted_scores = np.sort(-confidence)
        # reorder the predicted boxes by confidence
        BB_pre = BB_pre[sorted_ind, :]
        # reorder image_ids by confidence
        image_ids = [image_ids[x] for x in sorted_ind]
        # number of predicted boxes of this class
        nd = len(image_ids)
        tp = np.zeros(nd)  # tp / fp arrays of length nd, initialised to zero
        fp = np.zeros(nd)

        # print(nd)
        for d in range(nd):
            # look up the ground-truth record for this detection's image
        	#print(image_ids[d].get(image_ids[d]))
            # if class_recs[image_ids[d]] not in class_recs:
            
            if class_recs.get(image_ids[d]):
                
                # print(image_ids[d])
                R = class_recs[image_ids[d]]
                #print(R)
                bb = np.array(BB_pre[d, :]).astype(float)   # the detection box, in confidence order
                ovmax = -np.inf                             # best IoU so far, starts at negative infinity
                BBGT = np.array(R['bbox']).astype(float)    # the ground-truth boxes of this image

                # compute the IoU between the detection and every ground-truth box
                if BBGT.size > 0:
                    ixmin = np.maximum(BBGT[:, 0], bb[0])
                    iymin = np.maximum(BBGT[:, 1], bb[1])
                    ixmax = np.minimum(BBGT[:, 2], bb[2])
                    iymax = np.minimum(BBGT[:, 3], bb[3])
                    # ixmin = max(BBGT[:, 0], bb[0])
                    # iymin = max(BBGT[:, 1], bb[1])
                    # ixmax = min(BBGT[:, 2], bb[2])
                    # iymax = min(BBGT[:, 3], bb[3])
                    # import pdb
                    # pdb.set_trace()
                    # if ixmin >= ixmax or iymax <= iymin:
                    #     return 0
                    # else:
                    # S1 = (BBGT[2]-BBGT[0])*(BBGT[3]-BBGT[1])
                    # S2 = (bb[2]-bb[0])*(bb[3]-bb[1])
                    iw = np.maximum(ixmax - ixmin + 1., 0.)
                    ih = np.maximum(iymax - iymin + 1., 0.)
                    inters = iw * ih

                    uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
                           (BBGT[:, 2] - BBGT[:, 0] + 1.) *
                           (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)

                    overlaps = inters / uni          # IoU with every ground-truth box
                    ovmax = np.max(overlaps)         # largest IoU
                    jmax = np.argmax(overlaps)       # index of the best-matching ground-truth box

                if ovmax > ovthresh:                 # the IoU threshold (0.5 here) decides whether this is a tp or an fp
                    if not R['difficult'][jmax]:
                        if not R['det'][jmax]:
                            tp[d] = 1.
                            R['det'][jmax] = 1
                        else:
                            fp[d] = 1.
                else:
                    fp[d] = 1.
            else:
                #print("*************************************************************"+str(d))
                fp[d]=1.


        # import pdb
        # pdb.set_trace()    
        print(cls_name[cls_num],  np.sum(tp), np.sum(fp), npos)
        fp = np.cumsum(fp)
        tp = np.cumsum(tp)
        # rec = tp / number of ground-truth boxes
        rec = tp / float(npos)
        # print(rec)
        # print(type(rec))
        # prec = tp / (tp + fp)
        prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
        # print(prec)
        # print(type(prec))
        ap = voc_ap(rec, prec, use_07_metric)
        #print(ap)
        ###################################################
        
        # mAP_rec+=rec
        # print("mAP_rec:"+str(mAP_rec)+"\n")
        # mAP_prec+=prec
        # print("mAP_prec:"+str(mAP_prec)+"\n")
        # mAP+=ap
        # print("mAP:"+str(mAP)+"\n")
        ###################################################
        if cls_name[cls_num] not in pr_data:
            pr_data.update({cls_name[cls_num]:[rec,prec,ap]})
    return pr_data #,mAP/7,np.array(mAP_rec/7),np.array(mAP_prec/7)

def voc_ap(rec, prec, use_07_metric):  # use_07_metric=True tends to give results closer to commonly reported numbers
    # compute AP; use_07_metric=True uses the pre-2010 VOC rule: average the precision
    # sampled at 11 evenly spaced recall points (0.0, 0.1, ..., 1.0)
    if use_07_metric:
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):  # [0.0, 0.1, ..., 1.0]
            # print(11111111111111111111111111)
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    # use_07_metric=False uses the post-2010 rule: average precision over all distinct recall values
    else:
        mrec = np.concatenate(([0.], rec, [1.]))
        mpre = np.concatenate(([0.], prec, [0.]))
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
        i = np.where(mrec[1:] != mrec[:-1])[0]
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap
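
A tiny worked check of voc_ap() with made-up values: a recall curve [0.5, 1.0] with precision [1.0, 0.5] gives 0.5*1.0 + 0.5*0.5 = 0.75 under the post-2010 "all points" rule, and about 8.5/11 ≈ 0.77 under the 11-point VOC07 rule:

print(voc_ap(np.array([0.5, 1.0]), np.array([1.0, 0.5]), use_07_metric=False))  # 0.75
print(voc_ap(np.array([0.5, 1.0]), np.array([1.0, 0.5]), use_07_metric=True))   # ~0.77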

def draw_pr(prediction_file,prediction_algorithm,ground_truth_file,cls,use_07_metric):
    # for i in range(len(prediction_file)):
    #     pr_data=get_pr_data("Prediction/"+prediction_file[i],"Ground_Truth/"+ground_truth_file,cls)
    #     for cls_name in pr_data:
    #         plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' mAP='+str(round(pr_data[cls_name][2],3)))
    #         title='PR Curve of '+cls_name
    #         plt.title(title)
    #         plt.xlabel('Recall')
    #         plt.ylabel('Precision')
    #         plt.ylim([0.0, 1.0])
    #         plt.xlim([0.0, 1.0])
    #         plt.grid(ls='-.')
    #         plt.legend()
    #         plt.savefig("Images/"+title+'.png', dpi=300)
    #         plt.show()
    #         plt.close()
    # for i in range(len(prediction_file)):
    #     pr_data=get_pr_data("Prediction/"+prediction_file[i],"Ground_Truth/"+ground_truth_file,cls)

    # draw per-class PR curves for several algorithms
    MAP = [0, 0, 0, 0, 0, 0]  # accumulates each algorithm's AP so its mAP can be reported
    for cls_name in cls:
        for i in range(len(prediction_file)):
            
            pr_data=get_pr_data("Prediction/"+prediction_file[i],"Ground_Truth/"+ground_truth_file[i],cls,use_07_metric[i])
            # import pdb
            # pdb.set_trace()
            MAP[i]=MAP[i]+pr_data[cls_name][2]
            # print(pr_data)

            if i==0:
                plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#054E9F")
            elif i==1:
                plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#FFA500")
            elif i==2:
                plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#B0C4DE")
            elif i==3:
                plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#008000")
            elif i==4:
                plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#BA55D3")
            elif i==5:
                plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#FF0000")
            # plt.plot(pr_data[cls_name][0],pr_data[cls_name][1],label=prediction_algorithm[i]+' AP='+str(round(pr_data[cls_name][2],3)),color="#054E9F")
            title=cls_name+' PR Curve'
            plt.title(title,fontsize=15)
            plt.xlabel('Recall',fontsize=15)
            plt.ylabel('Precision',fontsize=15)
            plt.ylim([0.0, 1.0])
            plt.xlim([0.0, 1.0])
            plt.grid(ls='-.')
            plt.legend()
            plt.savefig("Image_PR/"+title+'.png', dpi=600)
            print("save....ok!!!")
        plt.show()
        plt.close() 

    # draw the overall (mAP) PR curve of each algorithm
    # for i in range(len(prediction_file)): 
    #     pr_data_map=get_pr_data_map("Prediction/"+prediction_file[i],"Ground_Truth/"+ground_truth_file[i],cls,use_07_metric[i])
    #     if i==0:
    #         plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#054E9F")
    #     elif i==1:
    #         plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#FFA500")
    #     elif i==2:
    #         plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#B0C4DE")
    #     elif i==3:
    #         plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#008000")
    #     elif i==4:
    #         plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#BA55D3")
    #     elif i==5:
    #         plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#FF0000")
    #     #plt.plot(pr_data_map[0],pr_data_map[1],label=prediction_algorithm[i]+' mAP='+str(round(MAP[i]/len(cls),3)),color="#054E9F")
    #     title='Precision-Recall Curve'
    #     plt.title(title,fontsize=15)
    #     plt.xlabel('Recall',fontsize=15)
    #     plt.ylabel('Precision',fontsize=15)
    #     plt.ylim([0.0, 1.0])
    #     plt.xlim([0.0, 1.0])
    #     plt.grid(ls='-.')
    #     plt.legend()
    #     plt.savefig("Image_PR/"+title+'.png', dpi=600)
    #     print("save....ok!!!")
    # plt.show()
    # plt.close() 
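
For completeness, draw.py and pr.py assume the relative layout below (folder names taken from the hard-coded paths in the code; Image_PR must exist beforehand, since plt.savefig does not create it):

.
├── draw.py
├── pr.py
├── Prediction/      # yolov5.txt, centernet.txt, ucgnet.txt
├── Ground_Truth/    # visdrone_gt.txt
└── Image_PR/        # the PR-curve PNGs are written here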
