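I have recently been trying to do pose clustering on a self-built human keypoint dataset. Annotating the whole dataset with labelme takes too long, so I annotated part of it by hand and planned to expand the rest with data augmentation.
On CSDN I found an article that augments both the images and the JSON files. Thanks to the original author, we34dfg; the article is "基于labelme的图片和json文件的扩增" (augmenting labelme images and JSON files) on we34dfg's CSDN blog. I ran into a few problems while running the code, so I fixed them on top of the original. The modified code is at the end of this post.
1. The code raises "TypeError: Object of type float32 is not JSON serializable"
The cause is that the values written to the JSON file have the wrong type: as the error message says, the augmented coordinates are float32 values, which the json module cannot serialize. Thanks to southerx in the comment section for the answer. The fix is to wrap x and y in float() inside write_points_to_json: new_point = [float(aug_points.keypoints[k].x), float(aug_points.keypoints[k].y)]. In the code it looks like this:
Original code, lines 64-67: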
for j in range(len(shapes[i]["points"])):
    new_point = [aug_points.keypoints[k].x, aug_points.keypoints[k].y]
    new_json['shapes'][i]["points"][j] = new_point
    k = k + 1
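Copyright notice: the snippet above is from an original article by CSDN blogger "we34dfg", released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/we34dfg/article/details/104752336
Change it to: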
for j in range(len(shapes[i]["points"])):
    new_point = [float(aug_points.keypoints[k].x), float(aug_points.keypoints[k].y)]
    new_json['shapes'][i]["points"][j] = new_point
    k = k + 1
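An alternative to wrapping every coordinate in float() is to hand json.dump a default= hook that converts NumPy-style scalars wherever they appear. The following is only a minimal sketch: to_builtin and FakeFloat32 are hypothetical names, and FakeFloat32 merely stands in for np.float32 so that the example runs without NumPy installed. It relies on NumPy scalars exposing an .item() method that returns the equivalent built-in Python value.

```python
import json


def to_builtin(obj):
    """Fallback for json.dump/json.dumps: convert NumPy-style scalars."""
    # NumPy scalars such as np.float32 expose .item(), which returns the
    # matching built-in Python value (here, a float).
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


class FakeFloat32:
    """Stand-in for np.float32 so this sketch runs without NumPy."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return float(self._value)


# One keypoint whose coordinates are not plain Python floats.
points = [[FakeFloat32(12.5), FakeFloat32(40.0)]]

# json calls to_builtin only for objects it cannot serialize by itself.
text = json.dumps({"points": points}, default=to_builtin)
print(text)
```

With this approach save_jsonfile could pass default=to_builtin to json.dump, so a float32 that slips in elsewhere in the dictionary would also be converted. The explicit float() fix above is simpler when only the keypoints are affected.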
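2. The generated files look blue when opened in labelme
With the first error fixed, the code ran. But when I opened the generated JSON files in labelme, the image had turned blue. (Augmentation had already been applied at this point; I only posted a screenshot, and the augmented part is not visible in it.)
The generated image files, however, looked unchanged.
From this I concluded that the problem was in how the JSON file is saved. The original image is read with cv2, and the image data written to the JSON file is that same cv2 array, so I looked up how cv2 handles images. It turns out that cv2 reads images in BGR channel order, the reverse of the usual RGB. cv2's own display and save functions also expect BGR, so the image they show or write still looks like the original. When the array is handed to other code that assumes RGB, however, the channels are in the wrong order and the image turns blue. The image therefore has to be converted before it is stored in the JSON file's imageData, using cv2.cvtColor(image_aug, cv2.COLOR_BGR2RGB). The change is as follows: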
for idx_aug in range(aug_times):
    image_aug, kps_aug = seq(image=idx_img, keypoints=kps)
    new_img_path = os.path.join(out_img_dir, idx_jpg_path.split(os.sep)[-1][:-4] + str(idx_aug) + '.jpg')
    # astype returns a new array, so the result has to be assigned
    image_aug = image_aug.astype(np.uint8)
    # cv2.imwrite expects BGR, so save the image before converting the channels
    cv2.imwrite(new_img_path, image_aug)
    # write aug_points in json file
    # imageData must be RGB, so convert only after the image has been saved
    image_aug = cv2.cvtColor(image_aug, cv2.COLOR_BGR2RGB)
    idx_new_json = write_points_to_json(idx_json, kps_aug)
    idx_new_json["imagePath"] = idx_jpg_path.split(os.sep)[-1][:-4] + str(idx_aug) + '.jpg'
    # img_arr_to_b64 returned bytes in the labelme version used here, hence the decode to str
    idx_new_json["imageData"] = str(utils.img_arr_to_b64(image_aug), encoding='utf-8')
    # save
    new_json_path = os.path.join(out_json_dir, idx_jpg_path.split(os.sep)[-1][:-4] + str(idx_aug) + '.json')
    save_jsonfile(idx_new_json, new_json_path)
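This corresponds to the last part of the code in the original article. Note that, because cv2's save function expects BGR as described above, the augmented image must be written to disk before the channel conversion. Accordingly, the step that stores imageData in the JSON file has to move to after the conversion.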
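To make the blue tint concrete, here is a small sketch on a single pixel, written in plain Python so that it runs without cv2. The colour value and the helper name swap_bgr_rgb are illustrative assumptions. For a 3-channel image, cv2.cvtColor with cv2.COLOR_BGR2RGB performs this same channel reversal on every pixel.

```python
def swap_bgr_rgb(pixel):
    """Reverse the channel order of one 3-channel pixel (BGR <-> RGB)."""
    return tuple(reversed(pixel))


# A skin-tone-like orange written in RGB order (illustrative value).
orange_rgb = (230, 160, 120)

# What a BGR reader such as cv2.imread holds for that pixel:
# the same three values, stored blue-first.
as_loaded_bgr = swap_bgr_rgb(orange_rgb)

# A consumer that assumes RGB reads those values as red=120, blue=230,
# which is a bluish pixel instead of an orange one.
misread_as_rgb = as_loaded_bgr

# Swapping the channels before handing the pixel over restores the colour.
converted = swap_bgr_rgb(as_loaded_bgr)

print("misread:", misread_as_rgb, "converted:", converted)
```

Because warm tones such as skin have a large red component, reading them in the wrong order moves that large value into the blue channel, which matches the blue images seen in labelme.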
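3. The newly generated JSON files are too large (unresolved)
This problem comes from the comment section of the original article, and I ran into it as well: a JSON file that was 60 KB before processing became 600 KB afterwards. A folder of 2,000 JSON files already takes up more than 1 GB, and my target is 20,000 files...
I tried several different ways of saving and reading the files, but none of them solved the problem. If all else fails, I will just have to annotate more data by hand.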
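A possible cause, not verified on the data above: as far as I can tell, labelme's utils.img_arr_to_b64 re-encodes the pixel array as PNG before base64-encoding it, and a PNG of a photograph is usually several times larger than the JPEG it came from. If that is what happens here, one workaround is to base64-encode the bytes of the JPEG file that cv2.imwrite has already written, instead of the pixel array. The sketch below assumes this. The helper name jpeg_file_to_b64 is hypothetical and not part of labelme, and the demo uses stand-in bytes rather than a real image.

```python
import base64
import os
import tempfile


def jpeg_file_to_b64(path):
    """Return the raw bytes of an image file as a base64 string (hypothetical helper)."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


# Demo with stand-in bytes; in the script, the path would be new_img_path,
# i.e. the JPEG that cv2.imwrite has just saved.
demo_bytes = b"\xff\xd8\xff\xe0 stand-in for JPEG data"
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.jpg")
    with open(demo_path, "wb") as f:
        f.write(demo_bytes)
    encoded = jpeg_file_to_b64(demo_path)

# Base64 adds roughly one third on top of the file size and nothing more.
print(len(demo_bytes), len(encoded))
```

In the script this would replace the img_arr_to_b64 call, for example idx_new_json["imageData"] = jpeg_file_to_b64(new_img_path). The stored bytes are then an ordinary JPEG file, so the BGR-to-RGB conversion should no longer be needed for imageData. Whether this actually brings the files back to their original size would have to be checked on real data.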
Finally, here is the complete code. Just change the folder paths to suit your setup.
# -*- coding: utf-8 -*-
import sys
import os
import glob
import cv2
import numpy as np
import json
#---below---imgaug module
import imgaug as ia
import imgaug.augmenters as iaa
from imgaug.augmentables import Keypoint, KeypointsOnImage
from labelme import utils
'''
tips:
1) image type: jpg;
2) when augmenting, keep the annotated points inside the image bounds;
3) errors may occur if the data types are not correct.
'''
def mkdir(path):
    # Create the output directory (and any missing parents) if it does not exist.
    if not os.path.exists(path):
        os.makedirs(path)
        print('====================')
        print('create path : ', path)
        print('====================')
    return 0

def check_json_file(path):
    # Every image must have a JSON file with the same base name next to it.
    for i in path:
        json_path = i[:-3] + 'json'
        if not os.path.exists(json_path):
            print('error')
            print(json_path, ' not exist !!!')
            sys.exit(1)

def read_jsonfile(path):
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)

def save_jsonfile(object, save_path):
    # Use a context manager so the file is flushed and closed.
    with open(save_path, 'w', encoding='utf-8') as f:
        json.dump(object, f, ensure_ascii=True, indent=2)

def get_points_from_json(json_file):
    # Flatten the points of all shapes into one list, in order.
    point_list = []
    shapes = json_file['shapes']
    for i in range(len(shapes)):
        for j in range(len(shapes[i]["points"])):
            point_list.append(shapes[i]["points"][j])
    return point_list

def write_points_to_json(json_file, aug_points):
    # Write the augmented keypoints back in the same order they were read.
    # Note: new_json is the same object as json_file, so the input is modified in place.
    k = 0
    new_json = json_file
    shapes = new_json['shapes']
    for i in range(len(shapes)):
        for j in range(len(shapes[i]["points"])):
            # float() turns the float32 coordinates into JSON-serializable values
            new_point = [float(aug_points.keypoints[k].x), float(aug_points.keypoints[k].y)]
            new_json['shapes'][i]["points"][j] = new_point
            k = k + 1
    return new_json
#-----------------------------Sequential-augment choose here-----
ia.seed(1)
# Define our augmentation pipeline.
seq = iaa.Sequential([
    iaa.Affine(
        rotate=(-5, 5),                 # rotate by a random angle between -5 and 5 degrees
        translate_px={"x": 3, "y": 3},  # shift by 3 pixels along x and y
    )
], random_order=True)
if __name__ == '__main__':
    # TO-DO-BELOW
    aug_times = 3
    in_dir = "../test/data1"  # input folder containing the image files and JSON files
    out_img_dir = "../test/out1/img"  # output folder for images
    out_json_dir = "../test/out1/json"  # output folder for JSON files
    #---check-------------
    mkdir(out_img_dir)
    mkdir(out_json_dir)
    imgs_dir_list = glob.glob(os.path.join(in_dir, '*.jpg'))
    check_json_file(imgs_dir_list)
    # for : image
    for idx_jpg_path in imgs_dir_list:
        idx_json_path = idx_jpg_path[:-3] + 'json'
        # get image file
        #idx_img = cv2.imdecode(np.fromfile(idx_jpg_path, dtype=np.uint8), 1)
        idx_img = cv2.imread(idx_jpg_path)
        # get json file
        idx_json = read_jsonfile(idx_json_path)
        # get point_list from json file
        points_list = get_points_from_json(idx_json)
        # convert to Keypoint(imgaug mode)
        kps = KeypointsOnImage([Keypoint(x=p[0], y=p[1]) for p in points_list], shape=idx_img.shape)
        # Augment keypoints and images
        for idx_aug in range(aug_times):
            image_aug, kps_aug = seq(image=idx_img, keypoints=kps)
            new_img_path = os.path.join(out_img_dir, idx_jpg_path.split(os.sep)[-1][:-4] + str(idx_aug) + '.jpg')
            # astype returns a new array, so the result has to be assigned
            image_aug = image_aug.astype(np.uint8)
            # cv2.imwrite expects BGR, so save the image before converting the channels
            cv2.imwrite(new_img_path, image_aug)
            # write aug_points in json file
            # imageData must be RGB, so convert only after the image has been saved
            image_aug = cv2.cvtColor(image_aug, cv2.COLOR_BGR2RGB)
            idx_new_json = write_points_to_json(idx_json, kps_aug)
            idx_new_json["imagePath"] = idx_jpg_path.split(os.sep)[-1][:-4] + str(idx_aug) + '.jpg'
            # img_arr_to_b64 returned bytes in the labelme version used here, hence the decode to str
            idx_new_json["imageData"] = str(utils.img_arr_to_b64(image_aug), encoding='utf-8')
            # save
            new_json_path = os.path.join(out_json_dir, idx_jpg_path.split(os.sep)[-1][:-4] + str(idx_aug) + '.json')
            save_jsonfile(idx_new_json, new_json_path)
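The script builds each output name with idx_jpg_path.split(os.sep)[-1][:-4], which works here because only *.jpg files are globbed, but it assumes a three-letter extension. A slightly more robust way to build the same names is sketched below with os.path.basename and os.path.splitext. The helper name augmented_name is hypothetical and not part of the original code.

```python
import os


def augmented_name(image_path, idx_aug, ext):
    """Build '<stem><idx_aug><ext>' from an image path (hypothetical helper)."""
    # basename drops the directories, splitext drops the extension whatever its length.
    stem = os.path.splitext(os.path.basename(image_path))[0]
    return f"{stem}{idx_aug}{ext}"


# Same naming scheme as the script: original stem, then the augmentation index.
jpg_name = augmented_name(os.path.join("data1", "person.jpg"), 0, ".jpg")
json_name = augmented_name(os.path.join("data1", "person.jpeg"), 2, ".json")
print(jpg_name, json_name)
```

Computing the stem once per image and reusing it for the image path, imagePath, and the JSON path would also remove the three repeated slicing expressions in the inner loop.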