Python Image Segmentation

Welcome back!

This is the second part of a three-part series on image classification. If you haven’t already, I recommend going through Part 1 of the series first (link below). There I covered setting up the environment and working with image data from Google Drive in Google Colab. We’ll be using the output from that code here.

Image segmentation is the process of “partitioning a digital image into multiple segments”. Since we are just concerned about background removal here, we will just be dividing the images into the foreground and the background.

This consists of five basic steps:

Convert the image to grayscale.
Apply thresholding to the image.
Find the image contours (edges).
Create a mask using the largest contour.
Apply the mask on the original image to remove the background.

I’ll be explaining and coding each step. Onward!

Setting Up the Workspace

If you have gone through Part I and have executed the code till the end, you can skip this section.

For those who haven’t and are here just to learn image segmentation, I’m assuming that you know how Colab works. If you don’t, please go through Part I.

The data set is available here. This is the result of the code from Part I. Open the link while signed in to your Google account so that it’s available in the “Shared with me” folder of your Google Drive. Then open Google Colab, connect to a runtime, and mount your Google Drive to it:

Follow the URL, select the Google account which you used to access the data set, and grant Colab permissions to your Drive. Paste the authorization code into the text box in the cell output and you’ll get the message Mounted at /gdrive.

Then we import all the necessary libraries:

import cv2
import glob
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import random
from tqdm.notebook import tqdm

np.random.seed(1)

Our notebook is now set up!

Reading Images from Drive

If you’re using data from the link shared in this article, the path for you will be ‘/gdrive/Shared with me/LeafImages/color/Grape*/*.JPG’.

Those who followed Part I and used the entire training set should see 4062 paths.
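
The cell that builds the list of paths is not reproduced in this copy, so here is a minimal sketch of how a wildcard pattern like the one above could be expanded with glob. The helper name find_images and the throwaway folder layout are assumptions made purely for illustration; only the Grape*/*.JPG pattern comes from the article.

```python
import glob
import os
import tempfile


def find_images(root, pattern="Grape*/*.JPG"):
    """Return the sorted list of files under `root` that match `pattern`."""
    return sorted(glob.glob(os.path.join(root, pattern)))


# Build a tiny stand-in for the "color" folder so the sketch runs anywhere.
demo_root = tempfile.mkdtemp()
for folder, name in [("Grape___healthy", "a.JPG"),
                     ("Grape___Black_rot", "b.JPG"),
                     ("Apple___healthy", "c.JPG")]:
    os.makedirs(os.path.join(demo_root, folder), exist_ok=True)
    open(os.path.join(demo_root, folder, name), "w").close()

paths = find_images(demo_root)
print(len(paths))  # only files inside the two Grape* folders match
```

On the real data you would pass the Drive folder as root and check len(paths) against the count mentioned above.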

Next we load the images from the paths and save them to a NumPy array:

A shape of (20, 256, 256, 3) signifies that we have 20 images of 256x256 pixels, each with three color channels.
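
The loading cell is also missing from this copy. As a rough sketch of where such a shape comes from, the snippet below stacks a list of equally sized images into one array. Random arrays stand in for the leaf photos, and stack_images is a hypothetical helper name, not something from the original notebook.

```python
import numpy as np


def stack_images(images):
    """Stack equally sized H x W x 3 images into one (N, H, W, 3) array."""
    return np.stack([np.asarray(img, dtype=np.uint8) for img in images])


rng = np.random.default_rng(1)
# 20 synthetic 256x256 RGB images standing in for the real photos.
fake_images = [rng.integers(0, 256, size=(256, 256, 3)) for _ in range(20)]

batch = stack_images(fake_images)
print(batch.shape)  # (20, 256, 256, 3)
```

The first axis indexes the images, the next two are height and width, and the last one holds the color channels.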

Let’s see how these images look:

plt.figure(figsize=(9,9))

for i, img in enumerate(orig[0:16]):
    plt.subplot(4,4,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(img)

plt.suptitle("Original", fontsize=20)
plt.show()

Original Images

Grayscaling

The first step in segmenting is converting the images to grayscale. Grayscaling is the process of removing colors from an image and representing each pixel only by its intensity, with 0 being black and 255 being white.

OpenCV makes this easy:

We can see from the shape that the color channels have been removed.
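
The conversion cell itself is not shown in this copy. To make the idea concrete, the sketch below computes in plain NumPy the weighted sum that OpenCV documents for its RGB-to-gray conversion (0.299 R + 0.587 G + 0.114 B). The function name to_gray is hypothetical, the input is assumed to be in RGB order, and the result may differ from cv2.cvtColor by a rounding step.

```python
import numpy as np


def to_gray(rgb):
    """Collapse an H x W x 3 RGB image into an H x W intensity image."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)


img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = (255, 255, 255)  # white pixel
img[0, 1] = (255, 0, 0)      # pure red pixel

gray = to_gray(img)
print(gray.shape)  # (2, 2): the channel axis is gone
print(gray)
```

White stays at 255, black stays at 0, and colored pixels land somewhere in between depending on how bright they appear.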

Display the converted images:

plt.figure(figsize=(9,9))

for i, img in enumerate(gray[0:16]):
    plt.subplot(4,4,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_GRAY2RGB))

plt.suptitle("Grayscale", fontsize=20)
plt.show()

Grayscale images

In the first image, we can see that the first pixel (top-left) is white, while the ones at the bottom-left are black. This can be verified by checking the pixel array for the first image:

Indeed, the pixels at the top left are white (255), and the ones at the bottom-left are black (0).

Thresholding

In image processing, thresholding is the process of creating a binary image from a grayscale image. A binary image is one whose pixels can have only two values — 0 (black) or 255 (white).

In the simplest case of thresholding, you select a value as a threshold and any pixel above this value becomes white (255), while any below becomes black (0). Check out the OpenCV documentation for image thresholding for more types and the parameters involved.

thresh = [cv2.threshold(img, np.mean(img), 255, cv2.THRESH_BINARY_INV)[1] for img in tqdm(gray)]

The first parameter passed to cv2.threshold() is the grayscale image to be converted, the second is the threshold value, the third is the maximum value, which is assigned to pixels that satisfy the thresholding condition, and the last is the type of thresholding.

cv2.threshold() returns two values: the threshold that was actually used (computed automatically if you use cv2.THRESH_OTSU) and the thresholded image. Since we only need the image, we index the result with [1] so that only the second returned value goes into our thresh list.

You can choose a static threshold, but then it won’t be able to take the different lighting conditions of different photos into account. I’ve chosen np.mean(), which gives the average value of a pixel for the image. Lighter images will have a value greater than 127.5 (255/2), while darker images will have a lower value. This lets you threshold images based on their lighting conditions.

For the first image, the threshold is 126.34, which means that the image is slightly darker than the midpoint of 127.5. With a normal threshold, any pixel with a value greater than this is converted to white, and any pixel with a lower value is made black. But wait! If you look at the grayscale image, the leaf is darker than the background. A normal threshold would therefore turn the leaf black and the background white, putting the black mask on the leaf rather than on the background. To deal with this, we use the THRESH_BINARY_INV method, which inverts the thresholding process. Now, pixels with an intensity greater than the threshold are made black, and the rest are made white.
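
To see the inverted rule on actual numbers, here is a small NumPy sketch of what THRESH_BINARY_INV does with a mean-based threshold. The function threshold_binary_inv and the toy array are made up for illustration; in the notebook the work is done by cv2.threshold().

```python
import numpy as np


def threshold_binary_inv(gray, thresh, maxval=255):
    """Pixels above `thresh` become 0; all others become `maxval`."""
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)


# Toy "image": a light top row (background) and a dark bottom row (leaf).
gray = np.array([[250, 240, 230],
                 [ 40,  30,  20]], dtype=np.uint8)

t = gray.mean()  # per-image threshold, 135.0 for this array
binary = threshold_binary_inv(gray, t)
print(binary)
```

The light row ends up black and the dark row white, which is the polarity we want for a leaf that is darker than its background.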

Let’s have a look at the pixel intensities for the first thresholded image:

As you can see, the pixels which were lighter (top row) in the grayscale array are now black, while those which were darker (bottom row), are now white.

Let’s see the thresholded images to verify:

Edge Detection

Edge detection, as the name suggests, is the process of finding boundaries (edges) of objects within an image. In our case, it will be the boundary between the white and black pixels.

OpenCV lets you implement this using the Canny algorithm.

edges = [cv2.dilate(cv2.Canny(img, 0, 255), None) for img in tqdm(thresh)]

Dilation is a morphological operation, used here as a cleanup step: it thickens the detected edges, which helps join broken parts of an edge together so that they form a continuous contour. Read more about some other noise removal techniques in edge detection here. You can also experiment with them and see if the results look better.

0 and 255 here are the lower and upper threshold values respectively. You can read about their use in the documentation. In our case, since the images are already thresholded, these values don’t really matter.

Let’s plot the edges:

plt.figure(figsize=(9,9))

for i, edge in enumerate(edges[0:16]):
    plt.subplot(4,4,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(cv2.cvtColor(edge, cv2.COLOR_GRAY2RGB))

plt.suptitle("Edges", fontsize=20)
plt.show()

Edge detection

Masking and Segmenting

Here at last. This involves quite a few steps, so I’ll be taking a break from list comprehensions for easy comprehension.

Masking is the process of creating a mask from an image to be applied to another. We take the mask and apply it on the original image to get the final segmented image.

We want to mask out the background from our images. For this, we first need to find the edges (already done), and then find the largest contour by area. The assumption is that this will be the edge of the foreground object.

cnt = sorted(cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2], key=cv2.contourArea)[-1]

This is already a handful — let’s dissect!

First we find all the contours with cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE). The documentation is your friend again if you want to get into the details of the second and third parameters. In OpenCV 3.x this call returns the image, the contours, and the contour hierarchy; in OpenCV 4.x it returns only the contours and the hierarchy. Since we want only the contours, we index the result with [-2] to retrieve the second-to-last item, which works in both versions. To find the contour with the largest area, we wrap the call in sorted() and use cv2.contourArea as the key. Since sorted() sorts in ascending order by default, we pick the last item with [-1], which gives us the largest contour.

Then we create a black canvas of the same size as our images: mask = np.zeros((256,256), np.uint8). I call this “mask” because it becomes the mask once the foreground region has been filled in on it.

To visualize this, we draw the largest contour on the mask and fill it with white: cv2.drawContours(mask, [cnt], -1, 255, -1). The third parameter, -1 in our case, is the index of the contour to draw, and a negative value draws all contours in the list. Since our list contains only the largest contour, you can use either 0 or -1 here. The fourth parameter is the color. Since we have a single channel and want to fill with white, it is 255. Last is the thickness, where a negative value fills the contour instead of outlining it.

Since a picture is worth a thousand words, the hastily made illustration below should make the process a bit simpler to understand:

I think you can guess the final step — superimposing the final mask on the original image to effectively remove the background.

This can be done using the bitwise_and operation. This tutorial will help you to understand how that actually works.

dst = cv2.bitwise_and(orig, orig, mask=mask)
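
As a rough picture of what that call does: ANDing an image with itself gives back the same image, and the mask argument restricts the result to pixels where the mask is non-zero, which leaves the rest black in a freshly created output. The NumPy sketch below mimics that behaviour on a toy image; apply_mask is a hypothetical helper, not an OpenCV function.

```python
import numpy as np


def apply_mask(image, mask):
    """Keep pixels where `mask` is non-zero and black out the rest."""
    return np.where(mask[..., None] > 0, image, 0).astype(image.dtype)


# Toy 2x2 color image where every channel value is 200.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.array([[255, 0],
                 [0, 255]], dtype=np.uint8)

dst = apply_mask(image, mask)
print(dst[:, :, 0])  # first channel: 200 where the mask is white, 0 elsewhere
```

Pixels under the white part of the mask keep their original values, and everything under the black part is zeroed out in all three channels.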

Now we just wrap this section inside a loop to append all the masks and segmented images to their respective arrays, so we can finally see what our work looks like:

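masked = []
segmented = []

for i, img in tqdm(enumerate(edges)):
    # Largest contour by area; [-2] works with both OpenCV 3.x and 4.x
    cnt = sorted(cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2], key=cv2.contourArea)[-1]
    mask = np.zeros((256,256), np.uint8)
    # drawContours fills the contour on mask in place and also returns it
    masked.append(cv2.drawContours(mask, [cnt],-1, 255, -1))
    dst = cv2.bitwise_and(orig[i], orig[i], mask=mask)
    # Swap the channel order (assuming orig is RGB) so cv2.imwrite, which expects BGR, saves correct colors
    segmented.append(cv2.cvtColor(dst, cv2.COLOR_BGR2RGB))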

Plotting the masks:

plt.figure(figsize=(9,9))

for i, maskimg in enumerate(masked[0:16]):
    plt.subplot(4,4,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(maskimg, cmap='gray')

plt.suptitle("Mask", fontsize=20)
plt.show()

Masks

And the final segmented images:

plt.figure(figsize=(9,9))

for i, segimg in enumerate(segmented[0:16]):
    plt.subplot(4,4,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(cv2.cvtColor(segimg, cv2.COLOR_BGR2RGB))

plt.suptitle("Segmented", fontsize=20)
plt.show()

Segmented images

We can then finally save these images in a separate “segmented” folder:

import os

for i, image in tqdm(enumerate(segmented)):
    directory = paths[i].rsplit('/', 3)[0] + '/segmented/' + paths[i].rsplit('/', 2)[1]+ '/'
    os.makedirs(directory, exist_ok = True)
    cv2.imwrite(directory + paths[i].rsplit('/', 2)[2], image)
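
The rsplit() calls are a little dense, so here is the same path arithmetic pulled out into a small function and run on one example. The function name segmented_target and the example path are hypothetical; the splitting logic mirrors the loop above.

```python
def segmented_target(path):
    """Return (output directory, file name) for one original image path."""
    # Drop the last three components: "color", the class folder, the file name.
    root = path.rsplit('/', 3)[0]
    # The last two components are the class folder and the file name.
    class_folder, filename = path.rsplit('/', 2)[1:]
    return root + '/segmented/' + class_folder + '/', filename


# Hypothetical path, shaped like the ones produced by the glob pattern.
example = '/gdrive/My Drive/LeafImages/color/Grape___healthy/leaf_001.JPG'
directory, filename = segmented_target(example)
print(directory)  # /gdrive/My Drive/LeafImages/segmented/Grape___healthy/
print(filename)   # leaf_001.JPG
```

In other words, the "color" folder is swapped for a sibling "segmented" folder while the class subfolder and file name are kept.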

Expecting Better Results?

This was just an introduction to the process — we didn’t delve too deeply into the parameters so the results are far from perfect. Try to figure out which step introduced distortion and think of how you can improve this step. As I said earlier, the OpenCV Image Processing tutorial is a great place to start.

Grayscale > Threshold > Edge > Mask

I would love to see your results in the comments and learn how you achieved them!

Stay Tuned…

You can find the Colab notebook I used here.

This concludes the second part of the trilogy. Stay tuned for the final part where we use these segmented images to train a very basic image classifier. The link to it will be added here once published.

Thanks for reading. Bouquets and brickbats are welcome!

Translated from: https://medium.com/better-programming/image-segmentation-python-7a838a464a84