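I have recently been working on image annotation and keep running into images like the one below, where a frame drawn on the picture has to be removed.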
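1. Approach
- Manually mark the range P that contains the frame, and use the annotation tool to place a point A on the frame.
- Read the coordinates and color of point A. Within range P, replace the color of every point x whose color is similar to that of point A with the average color of several points above and below (or to the left and right of) x. At the edges of the frame, however, the problem shown in the figure below can occur:
- Give the edge region Q of the frame extra treatment. The color at the edge can differ considerably from the color of point A, so a different method is needed there. I use an anomaly-detection algorithm: the residual points in Q that step 2 failed to remove differ markedly from the original image, and combining this with frequency information makes it possible to pick them out, which gives the result shown in the figure.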
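As a rough illustration of step 2, here is a minimal sketch on a synthetic NumPy image rather than the real annotated picture. The names `is_similar` and `fill_from_column` and the tiny test image are assumptions made for this example, not part of the code in section 2, and for brevity only the neighbors above and below each pixel are used.

```python
import numpy as np

def is_similar(a, b, threshold=15):
    """True when the mean absolute per-channel difference is below threshold."""
    diff = np.abs(np.asarray(a, dtype=int) - np.asarray(b, dtype=int))
    return diff.mean() < threshold

def fill_from_column(img, frame_color, radius=3, threshold=15):
    """Replace frame-colored pixels with the mean of the non-frame pixels
    found within `radius` rows above and below them."""
    out = img.copy()
    height, width, _ = img.shape
    for r in range(height):
        for c in range(width):
            if not is_similar(img[r, c], frame_color, threshold):
                continue
            lo, hi = max(0, r - radius), min(height, r + radius + 1)
            column = img[lo:hi, c].astype(int)
            # Ignore neighbors that are frame-colored themselves.
            keep = np.array([not is_similar(p, frame_color, threshold) for p in column])
            if keep.any():
                out[r, c] = column[keep].mean(axis=0).astype(img.dtype)
    return out

# Synthetic 7x5 image: uniform gray background crossed by a red horizontal line.
img = np.full((7, 5, 3), 100, dtype=np.uint8)
img[3, :] = (255, 0, 0)
cleaned = fill_from_column(img, frame_color=(255, 0, 0))
```

On this toy input the red line is filled with the surrounding gray. A real photograph has texture and soft frame edges, which is why the edge region Q needs the extra treatment described in step 3.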
2. Implementation
- Imports, required parameters, and the model:
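from PIL import Image
import numpy as np
from sklearn.ensemble import IsolationForest
from collections import Counter

threhold = 15          # max mean per-channel difference for "similar to point A"
threhold_border = 15   # same threshold, used when matching edge-residue colors
redius = 20            # half-width of the neighborhood sampled around a pixel
redius_border = 20     # a third of this is the run of pixels overwritten next to each replaced pixel
step_add = 37          # run outlier detection on every 37th replaced pixel
step_max = 2000        # stop collecting outliers after this many replaced pixels
step = 0
all_outliers = []      # outlier colors accumulated across all sampled neighborhoods
top_outlier_num = 10   # number of most frequent outlier colors to keep
model = IsolationForest(contamination=0.05)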
- Helper functions
def my_similar(a, b):
    # True when the mean absolute per-channel difference is below threhold.
    if sum([abs(x - y) for (x, y) in zip(list(a), list(b))]) / 3 < threhold:
        return True
    return False

def my_similar_border(a, b_s):
    # b_s holds (color, count) pairs from Counter.most_common, so list(b)[0] is the color.
    for b in b_s:
        if sum([abs(x - y) for (x, y) in zip(list(a), list(b)[0])]) / 3 < threhold_border:
            return True
    return False

def get_outlier(tmp, model):
    # Fit the model on the samples and split them into inliers (label 1) and outliers (label -1).
    normal_point = []
    unnormal_point = []
    model.fit(tmp)
    labels = model.predict(tmp)
    for i, label in enumerate(labels):
        if label == -1:
            unnormal_point.append(tmp[i])
        if label == 1:
            normal_point.append(tmp[i])
    return normal_point, unnormal_point

def get_right_neighborhood(target_color, neighborhood_x, neighborhood_y):
    # Return the neighborhood that differs more from target_color, plus a
    # direction flag: 1 = horizontal (x), 2 = vertical (y).
    diff_x = 0
    diff_y = 0
    for neighborhood in neighborhood_x:
        diff_x += abs(sum([x - y for x, y in zip(list(neighborhood), list(target_color))]) / 3)
    for neighborhood in neighborhood_y:
        diff_y += abs(sum([x - y for x, y in zip(list(neighborhood), list(target_color))]) / 3)
    if diff_x > diff_y:
        return neighborhood_x, 1
    return neighborhood_y, 2
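To make the "anomaly detection plus frequency" idea from step 3 easier to follow in isolation, here is a small sketch on synthetic color samples. The data, the `random_state` values, and the variable names are assumptions chosen for this example; which samples get flagged depends on the data and on `contamination`, so no particular result is claimed.

```python
from collections import Counter

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 95 background colors near (100, 100, 100) plus 5 copies of a reddish residue color.
background = rng.integers(95, 106, size=(95, 3))
residue = np.tile([250, 10, 10], (5, 1))
samples = np.vstack([background, residue])

model = IsolationForest(contamination=0.05, random_state=0)
labels = model.fit_predict(samples)  # -1 marks an outlier, 1 an inlier

# Count how often each flagged color occurs and keep the most frequent ones.
outlier_colors = [tuple(int(v) for v in row) for row in samples[labels == -1]]
top_colors = Counter(outlier_colors).most_common(3)  # list of (color, count) pairs
```

Counting the flagged colors across many neighborhoods is what turns individual detections into a short list of colors that are likely to belong to the frame edge.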
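- Main function and the data it needs
# Top-left (x_1, y_1) and bottom-right (x_2, y_2) corners of the annotated range P
x_1 = 1427
y_1 = 723
x_2 = 2061
y_2 = 1363
# Coordinates of point A, placed on the frame
x = 1495
y = 1294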
def replace_color_around_point(image_path, x, y, radius=redius):
    step = 0
    # Force RGB so every pixel is a 3-tuple, as the /3 in the helpers assumes.
    image = Image.open(image_path).convert("RGB")
    pixels = image.load()
    width, height = image.size
    target_color = pixels[x, y]  # color of point A
    # Pass 1: replace every pixel in range P whose color is close to point A.
    for i in range(x_1, x_2):
        for j in range(y_1, y_2):
            if my_similar(pixels[i, j], target_color):
                neighborhood_x = []
                neighborhood_y = []
                for m in range(i - radius, i + radius + 1):
                    if 0 <= m < width:
                        neighborhood_x.append(pixels[m, j])
                for n in range(j - radius, j + radius + 1):
                    if 0 <= n < height:
                        neighborhood_y.append(pixels[i, n])
                neighborhood, direction = get_right_neighborhood(target_color, neighborhood_x, neighborhood_y)
                # Keep only neighbors that are not frame-colored themselves.
                neighborhood = [n for n in neighborhood if not my_similar(n, target_color)]
                if not neighborhood:
                    continue  # no usable neighbor; leave this pixel unchanged
                neighborhood = np.array(neighborhood)
                average_color = tuple(np.mean(neighborhood, axis=0, dtype=int))
                half = len(neighborhood) // 2
                part_1 = neighborhood[:half]
                part_2 = neighborhood[half + 1:]
                # Fall back to the overall mean when one half is empty.
                average_color_part_1 = tuple(np.mean(part_1, axis=0, dtype=int)) if len(part_1) else average_color
                average_color_part_2 = tuple(np.mean(part_2, axis=0, dtype=int)) if len(part_2) else average_color
                pixels[i, j] = average_color
                # Also overwrite a short run of pixels on each side along the chosen direction.
                if direction == 1:
                    for m in range(i - int(redius_border / 3), i):
                        if 0 <= m < width:
                            pixels[m, j] = average_color_part_1
                    for m in range(i, i + int(redius_border / 3) + 1):
                        if 0 <= m < width:
                            pixels[m, j] = average_color_part_2
                if direction == 2:
                    for n in range(j - int(redius_border / 3), j):
                        if 0 <= n < height:
                            pixels[i, n] = average_color_part_1
                    for n in range(j, j + int(redius_border / 3) + 1):
                        if 0 <= n < height:
                            pixels[i, n] = average_color_part_2
                step += 1
                # Periodically collect outlier colors from the current neighborhood.
                if step % step_add == 0 and step < step_max:
                    normal_point, unnormal_point = get_outlier(neighborhood, model)
                    unnormal_point = [tuple(int(v) for v in p) for p in unnormal_point]
                    all_outliers.extend(unnormal_point)
    all_outlier_counts = Counter(all_outliers)
    top_outliers = all_outlier_counts.most_common(top_outlier_num)  # (color, count) pairs
    top_outlier_colors = {color for color, _ in top_outliers}
    # Pass 2: replace edge residue whose color is close to a frequent outlier color.
    for i in range(x_1, x_2):
        for j in range(y_1, y_2):
            if my_similar_border(pixels[i, j], top_outliers):
                neighborhood_x = []
                neighborhood_y = []
                for m in range(i - int(radius / 3), i + int(radius / 3) + 1):
                    if 0 <= m < width:
                        neighborhood_x.append(pixels[m, j])
                for n in range(j - int(radius / 3), j + int(radius / 3) + 1):
                    if 0 <= n < height:
                        neighborhood_y.append(pixels[i, n])
                neighborhood, direction = get_right_neighborhood(target_color, neighborhood_x, neighborhood_y)
                # top_outliers holds (color, count) pairs, so compare against the colors only.
                neighborhood = [n for n in neighborhood if n not in top_outlier_colors]
                if not neighborhood:
                    continue
                neighborhood = np.array(neighborhood)
                average_color = tuple(np.mean(neighborhood, axis=0, dtype=int))
                pixels[i, j] = average_color
    image.save(r"../data/bbb.jpg")
- Calling it
image_path = r"../data/aaa.jpg"
replace_color_around_point(image_path, x, y)
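3. Open issues
- The color replacement uses nothing more than a plain average (the average_color parts of the code); another approach, such as linear interpolation, might work better.
- The parameters need careful tuning. Otherwise the image inside the frame can show the "burr" artifacts below, and the frame is not removed completely:
After lowering the threhold parameter, the burrs disappeared. The other parameters can be tuned as well.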
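The first issue above mentions linear interpolation as a possible alternative to the plain average. The following is a minimal sketch of that idea for a single row or column of pixels, using `np.interp`. The function name `interpolate_gap` and the sample data are assumptions for illustration; this has not been tried on the images from this post.

```python
import numpy as np

def interpolate_gap(line, mask):
    """Linearly interpolate the masked pixels of one row or column.

    line: (n, 3) array of colors.
    mask: (n,) boolean array, True where the pixel belongs to the frame.
    """
    line = np.asarray(line, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    filled = line.copy()
    # Interpolation needs at least one masked and one unmasked pixel.
    if mask.any() and not mask.all():
        positions = np.arange(len(line))
        for ch in range(line.shape[1]):
            filled[mask, ch] = np.interp(
                positions[mask], positions[~mask], line[~mask, ch]
            )
    return filled.round().astype(np.uint8)

# One row: a dark pixel, three red frame pixels, then a brown pixel.
row = np.array([[0, 0, 0], [255, 0, 0], [255, 0, 0], [255, 0, 0], [80, 40, 20]])
mask = np.array([False, True, True, True, False])
result = interpolate_gap(row, mask)
```

Here the three masked pixels become (20, 10, 5), (40, 20, 10), and (60, 30, 15), a gradient between the two unmasked endpoints instead of a single flat average color.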