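风华明远

Thirteen Histogram-Based Global Image Binarization Algorithms: Principles, Implementation, Code, and Results

1. What is histogram-based global image binarization?
2. Gray-level mean
3. Percentile threshold (P-Tile method)
3. Bimodal threshold
- 3.1 Threshold based on the mean of the peaks
- 3.2 Threshold based on the minimum between the peaks
4. Iterative best threshold
5. Otsu's method
6. 1D maximum entropy
7. Moment-preserving method
8. Fuzzy sets
9. Kittler minimum error thresholding
10. ISODATA
11. Shanbhag method
12. Yen method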

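1. What is histogram-based global image binarization?

This article is based on the post "Thirteen Histogram-Based Global Image Binarization Algorithms: Principles, Implementation, Code, and Results". The algorithms here segment an image by thresholding its grayscale histogram.
The grayscale histogram of an image is the distribution of gray values after the image has been converted to grayscale, that is, how many pixels take each gray value. Gray values normally range from 0 to 255, so the histogram is an array with 256 elements.
Threshold segmentation means finding a suitable threshold and applying it to the grayscale image: pixels above the threshold are set to 255 and the rest are set to 0. Other output values can be used if needed. Processing the image this way separates foreground from background.
The key to the approach is computing the threshold. The post referenced above describes 13 methods and provides C implementations; this article rewrites those algorithms in Python.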
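As a compact illustration of the two building blocks described above, here is a minimal sketch of a histogram and a fixed-threshold binarization. The function names `gray_hist` and `binarize` and the tiny `demo` array are made up for this example; it uses `np.bincount` instead of the explicit loop in the listings that follow, which should give the same counts for 8-bit images.

```python
import numpy as np


def gray_hist(img):
    """Return the 256-bin histogram of an 8-bit grayscale image."""
    # np.bincount counts how often each integer value occurs.
    return np.bincount(np.asarray(img, dtype=np.uint8).ravel(), minlength=256)


def binarize(img, thresh, maxval=255):
    """Set pixels above thresh to maxval and all other pixels to 0."""
    return np.where(np.asarray(img) > thresh, maxval, 0).astype(np.uint8)


# Tiny synthetic image standing in for a real grayscale photo.
demo = np.array([[0, 10, 200], [200, 255, 10]], dtype=np.uint8)
demo_hist = gray_hist(demo)
demo_binary = binarize(demo, 100)
```

Building the output with `np.where` from the untouched input avoids modifying an array in place while it is still being compared against the threshold.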

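2. Gray-level mean

The gray-level mean is the sum of all pixel values divided by the number of pixels:
$\text{sum of pixel values}=\displaystyle \sum_{g=0}^{255}g\times h(g)$
$\text{number of pixels}=\displaystyle \sum_{g=0}^{255}h(g)$
The number of pixels is simply the image width times the image height, in other words the size of the image.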

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetMeanThreshold(H):
    Amount = np.sum(H)
    Gray_Sum = np.sum(H*np.arange(256))
    return int(float(Gray_Sum/Amount))


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetMeanThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":

    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 146, 255, cv2.THRESH_BINARY)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

For the bird image, the gray-level mean threshold is 146.
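Since the histogram-weighted mean is just the mean of the pixels, the same threshold can be cross-checked directly on the image. The sketch below assumes a small synthetic array (`demo`) in place of the bird image, and `mean_threshold` is a hypothetical helper name.

```python
import numpy as np


def mean_threshold(img):
    """Gray-level mean threshold computed directly from the pixels."""
    return int(np.asarray(img, dtype=np.float64).mean())


# Tiny synthetic image standing in for a real grayscale photo.
demo = np.array([[0, 10, 200], [200, 255, 10]], dtype=np.uint8)

# The same value obtained from the histogram, as in the formulas above.
hist = np.bincount(demo.ravel(), minlength=256)
from_hist = int(np.sum(hist * np.arange(256)) / np.sum(hist))
```

Both routes truncate 675 / 6 = 112.5 to 112 for this array.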

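3. Percentile threshold (P-Tile method)

This method first fixes a proportion p (a prior probability), for example 50%. It then accumulates the histogram counts starting from gray level 0. As soon as the accumulated count over [0, Y] reaches the total number of pixels times p, gray level Y is taken as the threshold.
The percentage has to be chosen from prior knowledge of the image, so human experience matters a great deal.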

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetPTileThreshold(H,ptile=0.5):
    Amount = np.sum(H)
    A = Amount * ptile
    psum = 0
    for Y in range(256):
        psum += H[Y]
        if psum >= A:
            return Y
    return -1  # No gray level satisfies the condition


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetPTileThreshold(g,0.25)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":

    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 127, 255, cv2.THRESH_BINARY)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

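With the percentage set to 25%, the threshold is 123. The percentage has to be tuned by hand; otherwise it is hard to get a satisfactory result.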
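The accumulation loop can also be written with a cumulative sum. This is a minimal sketch, not a drop-in replacement: `ptile_threshold` is a hypothetical name, and for a `ptile` above 1 it returns 256 where the loop version returns -1.

```python
import numpy as np


def ptile_threshold(hist, ptile=0.5):
    """First gray level whose cumulative count reaches ptile of all pixels."""
    cum = np.cumsum(np.asarray(hist, dtype=np.float64))
    target = cum[-1] * ptile
    # side="left" returns the first index i with cum[i] >= target.
    return int(np.searchsorted(cum, target, side="left"))


# Synthetic histogram: 2 pixels at gray 10, 2 at gray 50, 4 at gray 200.
demo_hist = np.zeros(256)
demo_hist[10], demo_hist[50], demo_hist[200] = 2, 2, 4
```

With this histogram, 25% of the pixels are reached at gray level 10, 50% at 50, and 75% at 200.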

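3. Bimodal threshold

This method suits images whose histogram has two clear peaks; it is not suitable for unimodal or flat histograms.
It works by repeatedly testing whether the histogram is bimodal. If it is not, the histogram is smoothed with a window of 3. If no bimodal histogram has been obtained after a fixed number of iterations, say 1000, the function fails. If it succeeds, the threshold is derived from the two peaks, typically as the valley between them.
Window-3 smoothing replaces the count at gray level p with the average of the counts at p-1, p, and p+1.
Once the histogram has been smoothed into a bimodal shape, there are two ways to obtain the threshold:
(1) the mean of the two peaks
(2) the valley between the two peaks
The first takes the integer mean of the two peak positions. The second takes the position of the minimum in the valley.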
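The window-3 smoothing and the peak count described above can be sketched with array operations. This is an illustrative sketch only: `smooth3` and `count_peaks` are hypothetical names, and the edge bins are handled by repeating the first and last values, which matches how the listing below treats bins 0 and 255.

```python
import numpy as np


def smooth3(hist):
    """One pass of three-point mean smoothing with replicated edges."""
    h = np.asarray(hist, dtype=np.float64)
    padded = np.pad(h, 1, mode="edge")  # repeat the first and last bin
    return np.convolve(padded, np.ones(3) / 3.0, mode="valid")


def count_peaks(hist):
    """Number of interior bins strictly higher than both neighbours."""
    h = np.asarray(hist, dtype=np.float64)
    inner = h[1:-1]
    return int(np.sum((inner > h[:-2]) & (inner > h[2:])))


# Six-bin toy histogram with two isolated spikes.
demo = np.array([0, 3, 0, 0, 6, 0], dtype=np.float64)
smoothed = smooth3(demo)
```

Floating-point data matters here: with integer division the repeated averaging would flatten small counts to zero and the peak test would stop being meaningful.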

3.1 Threshold based on the mean of the peaks

# coding:utf8
import numpy as np
import cv2
from matplotlib import pyplot as plt


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetIntermodesThreshold(H,method=0):
    Iter = 0
    HistGramC = np.array(H, dtype=np.float64)  # Use floating point: integer arithmetic loses precision and gives wrong results
    HistGramCC = np.array(H, dtype=np.float64)  # Smoothing overwrites its input, so two copies are needed

    # Smooth the histogram with a three-point mean
    while not IsDimodal(HistGramCC):  # Check whether the histogram is already bimodal
        HistGramCC[0] = (HistGramC[0] + HistGramC[0] + HistGramC[1]) / 3  # First bin
        for Y in range(1, 255):
            HistGramCC[Y] = (HistGramC[Y - 1] + HistGramC[Y] + HistGramC[Y + 1]) / 3  # Interior bins
        HistGramCC[255] = (HistGramC[254] + HistGramC[255] + HistGramC[255]) / 3  # Last bin
        HistGramC = np.array(HistGramCC, dtype=np.float64)  # Back up the smoothed data
        Iter += 1
        if Iter >= 1000:
            return -1  # The histogram cannot be smoothed into a bimodal shape; return an error code

    if method > 0:
        Peakfound = False
        for Y in range(1,255):
            if HistGramCC[Y - 1] < HistGramCC[Y] and HistGramCC[Y + 1] < HistGramCC[Y]:
                Peakfound = True
            if Peakfound and HistGramCC[Y - 1] >= HistGramCC[Y] and HistGramCC[Y + 1] >= HistGramCC[Y]:
                return Y - 1
        return -1
    else:
        # The threshold is the mean of the two peak positions
        Peak = np.zeros(2,dtype=np.uint16)
        Index = 0
        for Y in range(1, 255):
            if HistGramCC[Y - 1] < HistGramCC[Y] and HistGramCC[Y + 1] < HistGramCC[Y]:
                Peak[Index] = Y - 1
                Index += 1
        return int((Peak[0] + Peak[1]) / 2)


def IsDimodal(H):  # Check whether the histogram is bimodal

    # Count the peaks; the histogram is bimodal only when there are exactly 2
    Count = 0
    for Y in range(1, 255):
        if H[Y - 1] < H[Y] and H[Y + 1] < H[Y]:
            Count += 1
            if Count > 2: return False
    if Count == 2:
        return True
    else:
        return False


def GrayThreshold(image, maxval=255):
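    g = GrayHist(image)
    # method=0 uses the mean of the two peaks; method>0 uses the valley between them
    thresh = GetIntermodesThreshold(g,1)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0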
    return thresh, threshImage_out


if __name__ == "__main__":
    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_TRIANGLE)
    th, img_new = GrayThreshold(img_gray)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

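This method gives a threshold of 159 for the bird image. The figure in the original post showed the raw histogram together with the smoothed histogram (in yellow).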

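3.2 Threshold based on the minimum between the peaks

This method looks for the minimum between the two peaks. To use it, call GetIntermodesThreshold(g,1), that is, set the last argument to 1.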

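4. Iterative best threshold

This method first scans HistGram from the left to find the first non-zero bin, min, and from the right to find the first non-zero bin, max. The initial threshold is the midpoint of the two. It then computes the mean gray level of [min, threshold] and of (threshold, max], takes the midpoint of the two means as the new threshold, and iterates until the new threshold equals the previous one.
The mean gray levels are computed as:
$G_{min}=\frac{\sum_{g=min}^{T_k}g\times h(g)}{\sum_{g=min}^{T_k}h(g)}$
$G_{max}=\frac{\sum_{g=T_k+1}^{max}g\times h(g)}{\sum_{g=T_k+1}^{max}h(g)}$
$T_{k+1}=\frac{G_{min}+G_{max}}{2}$
The computation is repeated until $T_k=T_{k+1}$.
Note that the iteration may fail to converge.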

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetIterativeBestThreshold(HistGram):
    Iter = 0
    value = np.where(HistGram > 0)[0]  # Indices of non-zero bins
    index = len(value)
    if index == 0:
        return 0
    elif index == 1 or index == 2:
        return value[0]
    else:
        MinValue = value[0]
        MaxValue = value[-1]

    Threshold = MinValue
    NewThreshold = int(MaxValue + MinValue) >> 1
    while Threshold != NewThreshold:  # Stop once two successive iterations give the same threshold

        Threshold = NewThreshold
        # Mean gray level over [MinValue, Threshold]
        SUmInteralOne = np.sum(HistGram[MinValue:Threshold + 1] * np.arange(MinValue, Threshold + 1))
        SumOne = np.sum(HistGram[MinValue:Threshold + 1])
        MeanValueOne = SUmInteralOne / SumOne
        # Mean gray level over [Threshold+1, MaxValue]
        SUmInteralTwo = np.sum(HistGram[Threshold + 1:MaxValue + 1] * np.arange(Threshold + 1, MaxValue + 1))
        SumTwo = np.sum(HistGram[Threshold + 1:MaxValue + 1])
        MeanValueTwo = SUmInteralTwo / SumTwo

        NewThreshold = int(MeanValueOne + MeanValueTwo) >> 1  # New threshold
        Iter += 1
        if Iter >= 1000:
            return -1
    return Threshold


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetIterativeBestThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":
    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

This method gives a threshold of 126.

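5. Otsu's method

Otsu's method was proposed by the Japanese researcher Nobuyuki Otsu in 1979 and is also known as the maximum between-class variance method. It works well when the histogram is clearly bimodal and less well when the histogram is unimodal or fairly flat. The method is as follows:
Let t be the threshold separating foreground from background. The foreground pixels make up a fraction $w_0$ of the image and have mean gray level $u_0$; the background pixels make up a fraction $w_1$ and have mean gray level $u_1$.
The overall mean gray level of the image is then $u=w_0 u_0+w_1 u_1$.
The between-class variance of foreground and background is $g=w_0(u_0-u)^2+w_1(u_1-u)^2=w_0 w_1(u_0-u_1)^2$.
Every candidate threshold t is tried in turn. The t that maximizes g is where foreground and background differ the most, and it is taken as the segmentation threshold.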

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetOSTUThreshold(H):
    # Handle the cases of one gray level, two gray levels, and many gray levels
    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]

    MinValue = V[0]
    MaxValue = V[-1]

    PixelBack = 0
    PixelIntegralBack = 0
    Threshold = 0

    Amount = np.sum(H)
    PixelIntegral = np.sum(H * np.arange(256))
    SigmaB = -1
    for Y in range(MinValue, MaxValue):
        PixelBack = PixelBack + H[Y]
        PixelFore = Amount - PixelBack
        OmegaBack = PixelBack / Amount
        OmegaFore = PixelFore / Amount
        PixelIntegralBack += H[Y] * Y
        PixelIntegralFore = PixelIntegral - PixelIntegralBack
        MicroBack = PixelIntegralBack / PixelBack
        MicroFore = PixelIntegralFore / PixelFore
        Sigma = OmegaBack * OmegaFore * (MicroBack - MicroFore) * (MicroBack - MicroFore)
        if Sigma > SigmaB:
            SigmaB = Sigma
            Threshold = Y

    return Threshold


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetOSTUThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":
    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

The custom Otsu implementation and OpenCV's Otsu method give the same threshold, 125.
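The search for the maximum between-class variance can also be written with cumulative sums, which evaluates every candidate threshold at once. This is a sketch under my own naming (`otsu_threshold` is hypothetical); it works with raw pixel counts instead of the fractions $w_0$ and $w_1$, which only scales the variance by a constant and so should not move the maximum.

```python
import numpy as np


def otsu_threshold(hist):
    """Gray level t that maximizes w0 * w1 * (u0 - u1)^2."""
    h = np.asarray(hist, dtype=np.float64)
    g = np.arange(h.size)
    n0 = np.cumsum(h)          # pixel count with gray <= t
    n1 = n0[-1] - n0           # pixel count with gray > t
    s0 = np.cumsum(h * g)      # gray-value sum with gray <= t
    s1 = s0[-1] - s0           # gray-value sum with gray > t
    valid = (n0 > 0) & (n1 > 0)  # both classes must be non-empty
    sigma = np.zeros_like(h)
    u0 = s0[valid] / n0[valid]
    u1 = s1[valid] / n1[valid]
    sigma[valid] = n0[valid] * n1[valid] * (u0 - u1) ** 2
    # argmax returns the first maximum, like a strict '>' comparison in a loop.
    return int(np.argmax(sigma))


# Synthetic histogram with a dark cluster (10, 12) and a bright one (200, 202).
demo_hist = np.zeros(256)
demo_hist[[10, 12, 200, 202]] = 5
```

For this histogram the variance is constant for every t between the two clusters, and the first such t, 12, is returned.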

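6. 1D maximum entropy

The idea of maximum entropy is to split the image into two parts (think of them as foreground and background), compute the entropy of the gray-level distribution of each part for every candidate threshold, and maximize the sum of the two. The gray level at which the sum is largest is the threshold.
The entropies are defined as follows, where $\epsilon$ is a tiny positive floating-point value (the code uses machine epsilon):
$hist(g)=\frac{h(g)}{\sum_{x=0}^{255}h(x)}+\epsilon$

$S_1=\sum_{g=0}^{T}hist(g)$

$E_1=-\sum_{g=0}^{T}\frac{hist(g)}{S_1}\times \log\left(\frac{hist(g)}{S_1}\right)$

$E_2=-\sum_{g=T+1}^{255}\frac{hist(g)}{1-S_1}\times \log\left(\frac{hist(g)}{1-S_1}\right)$
The gray level T that maximizes $E_1+E_2$ is the threshold.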

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math

def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def Get1DMaxEntropyThreshold(H):

    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]

    MinValue = V[0]
    MaxValue = V[-1]
    Threshold=MinValue
    HistGramD = np.zeros(256,dtype=np.float64)
    Amount = np.sum(H[MinValue:MaxValue + 1])
    eps = np.finfo(np.float64).eps
    HistGramD[MinValue:MaxValue+1] = H[MinValue:MaxValue+1]/Amount + eps
    MaxEntropy = 0.0
    for Y in range(MinValue+1,MaxValue):
        SumIntegral = 0.
        for X in range(MinValue,Y+1):
            SumIntegral += HistGramD[X]
        EntropyBack = 0.
        for X in range(MinValue,Y+1):
            EntropyBack += -HistGramD[X] / SumIntegral * np.log(HistGramD[X] / SumIntegral)
        EntropyFore = 0.
        for X in range(Y+1,MaxValue+1):
            EntropyFore += -HistGramD[X] / (1 - SumIntegral) * np.log(HistGramD[X] / (1 - SumIntegral))
        if MaxEntropy < EntropyBack + EntropyFore:

            Threshold = Y
            MaxEntropy = EntropyBack + EntropyFore
    return Threshold


def GrayThreshold(image,maxval=255):
    g = GrayHist(image)
    thresh = Get1DMaxEntropyThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out

if __name__ == "__main__":

    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1,img_new_1 = cv2.threshold(img_gray, 0, 255,  cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th,th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

The computed threshold is 104.
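The triple loop in the listing above recomputes each sum from scratch. A lighter variant is sketched below using the identity $-\sum \frac{p}{S}\log\frac{p}{S}=\log S-\frac{1}{S}\sum p\log p$, so that two cumulative sums are enough. It is an assumption-laden sketch rather than an exact port: `max_entropy_threshold` is a hypothetical name, empty bins are skipped instead of being padded with epsilon, and every split with two non-empty sides is considered, so results may differ slightly from the listing at the extremes.

```python
import numpy as np


def max_entropy_threshold(hist):
    """Gray level T maximizing the summed entropy of the two classes."""
    p = np.asarray(hist, dtype=np.float64)
    p = p / p.sum()
    plogp = np.zeros_like(p)
    nz = p > 0
    plogp[nz] = p[nz] * np.log(p[nz])   # treat 0 * log(0) as 0
    P = np.cumsum(p)                    # S_1 for every T
    Q = np.cumsum(plogp)                # sum of p*log(p) up to T
    best_t, best_e = -1, -np.inf
    for t in range(p.size - 1):
        s1 = P[t]
        s2 = P[-1] - s1
        if s1 <= 0.0 or s2 <= 0.0:
            continue                    # one class would be empty
        e1 = np.log(s1) - Q[t] / s1
        e2 = np.log(s2) - (Q[-1] - Q[t]) / s2
        if e1 + e2 > best_e:
            best_t, best_e = t, e1 + e2
    return best_t


# Four equally populated gray levels: the balanced split is the best one.
demo_hist = np.zeros(256)
demo_hist[10:14] = 1
```

With four equally likely levels 10 to 13, splitting after 11 gives two classes of entropy log 2 each, which beats the unbalanced splits.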

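7. Moment-preserving method

The moment-preserving method is described in an English-language paper, which the original post linked to. The method is relatively involved.
The implementation is as follows: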

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def AH(H, index):
    return np.sum(H[:index+1])

def BH(H, index):
    K = np.arange(index+1)
    return np.sum(H[:index+1]*K)

def CH(H, index):
    K = np.float_power(np.arange(index+1),2)
    return np.sum(H[:index + 1] * K)


def DH(H, index):
    K = np.float_power(np.arange(index+1),3)
    return np.sum(H[:index + 1] * K)


def GetMomentPreservingThreshold(H):
    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]

    MaxValue = V[-1]
    Avec = np.zeros(256, dtype=np.float64)
    Amount = np.sum(H)
    for Y in range(256):
        Avec[Y] = AH(H, Y) / Amount
    Index = 0
    B = BH(H,255)
    A = AH(H,255)
    C = CH(H,255)
    D = DH(H,255)
    X2 = (B * C - A * D) / (A * C - B * B)
    X1 = (B * D - C * C) / (A * C - B * B)
    X0 = 0.5 - (B / A + X2 / 2) / math.sqrt(X2 * X2 - 4 * X1)

    Min = float(MaxValue)
    for Y in range(256):
        if math.fabs(Avec[Y] - X0) < Min:
            Min = math.fabs(Avec[Y] - X0)
            Index = Y

    return int(Index)


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetMomentPreservingThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":
    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()
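The listing above gives the formulas without comment, so here is my reading of them as a hedged sketch. The binarized image is modelled as two gray levels z0 and z1, with a fraction p0 of the pixels at z0, chosen so that the first three moments of the histogram are preserved. In the listing, X1 and X2 appear to be the coefficients c0 and c1 of the quadratic z^2 + c1*z + c0 = 0 whose roots are z0 and z1, and X0 is p0; the threshold is then the gray level whose cumulative fraction is closest to p0. The name `moment_preserving_levels` is made up for this example.

```python
import numpy as np


def moment_preserving_levels(hist):
    """Return (p0, z0, z1) that preserve the first three histogram moments."""
    h = np.asarray(hist, dtype=np.float64)
    g = np.arange(h.size, dtype=np.float64)
    p = h / h.sum()
    m1, m2, m3 = (np.sum(p * g ** k) for k in (1, 2, 3))
    det = m2 - m1 * m1                      # variance of the histogram
    c0 = (m1 * m3 - m2 * m2) / det          # X1 in the listing
    c1 = (m1 * m2 - m3) / det               # X2 in the listing
    root = np.sqrt(c1 * c1 - 4.0 * c0)
    z0 = 0.5 * (-c1 - root)                 # lower representative level
    z1 = 0.5 * (-c1 + root)                 # upper representative level
    p0 = (z1 - m1) / (z1 - z0)              # X0 in the listing
    return p0, z0, z1


# Two-level histogram: 3 pixels at gray 10 and 1 pixel at gray 200.
demo_hist = np.zeros(256)
demo_hist[10], demo_hist[200] = 3, 1
p0, z0, z1 = moment_preserving_levels(demo_hist)
```

For a histogram that already has only two levels, the method recovers them exactly: z0 = 10, z1 = 200, and p0 = 0.75.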

8. Fuzzy sets

For details, see the article "An Image Binarization Algorithm Based on Fuzzy Set Theory: Principles, Results, and Code".

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetHuangFuzzyThreshold(H):

    Threshold = -1
    BestEntropy = np.finfo(np.float32).max
    # Find the first and last non-zero gray levels
    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]

    First = V[0]
    Last = V[-1]

    # Compute the cumulative histogram and the intensity-weighted cumulative histogram
    S = np.zeros(Last+1,dtype=np.int64)
    W = np.zeros(Last+1,dtype=np.int64)

    S[0] = H[0]
    start = First if First>1 else 1
    for Y in range(start,Last+1):
        S[Y] = S[Y - 1] + H[Y]
        W[Y] = W[Y - 1] + Y * H[Y]

    # Build the lookup table used by equations (4) and (6)
    Smu = np.zeros(Last+1-First,dtype=np.float32)

    for Y in range(1,Last+1-First):
        mu = 1. / (1. + float(Y) / (Last - First))               # equation (4)
        Smu[Y] = -mu * math.log(mu) - (1 - mu) * math.log(1 - mu)      # equation (6)

    # Search every candidate threshold for the best one
    for Y in range(First,Last+1):
        Entropy = 0
        mu = int(np.round(float(W[Y] / S[Y])))          # equation (17)
        for X in range(First,Y+1):
            Entropy += Smu[abs(X - mu)] * H[X]
        mu = int(np.round(float((W[Last] - W[Y]) / (S[Last] - S[Y])))) if Y < Last else 0
        for X in range(Y + 1,Last+1):
            Entropy += Smu[abs(X - mu)] * H[X]       # equation (8)
        if BestEntropy > Entropy:
            BestEntropy = Entropy       # The minimum-entropy position is the best threshold
            Threshold = Y
    return Threshold


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetHuangFuzzyThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":
    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

9. Kittler minimum error thresholding

For details, see Kittler, J & Illingworth, J (1986), "Minimum error thresholding", Pattern Recognition 19: 41-47

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math

def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetKittlerMinError(HistGram):

    value = np.where(HistGram>0)[0]  # Indices of non-zero bins
    index = len(value)
    if index == 0:
        return 0
    elif index == 1:
        return value[0]
    elif index == 2:
        return value[0]
    else:
        MinValue = value[0]
        MaxValue = value[-1]

    Threshold = -1
    MinSigma = np.finfo(np.float64).max  # Start from the largest float so that any valid Sigma is accepted
    for Y in range(MinValue,MaxValue):
        PixelBack = 0
        PixelFore = 0
        OmegaBack = 0
        OmegaFore = 0
        for X in range(MinValue,Y+1):
            PixelBack += HistGram[X]
            OmegaBack = OmegaBack + X * HistGram[X]
        for X in range(Y+1,MaxValue+1):
            PixelFore += HistGram[X]
            OmegaFore = OmegaFore + X * HistGram[X]
        OmegaBack = OmegaBack / PixelBack
        OmegaFore = OmegaFore / PixelFore
        SigmaBack = 0
        SigmaFore = 0
        for X in range(MinValue,Y+1):
            SigmaBack = SigmaBack + (X - OmegaBack) * (X - OmegaBack) * HistGram[X]
        for X in range(Y+1,MaxValue+1):  # Include MaxValue, matching the range used for PixelFore above
            SigmaFore = SigmaFore + (X - OmegaFore) * (X - OmegaFore) * HistGram[X]

        if SigmaBack == 0 or SigmaFore == 0:
            if Threshold == -1:
                Threshold = Y
        else:
            SigmaBack = math.sqrt(SigmaBack / PixelBack)
            SigmaFore = math.sqrt(SigmaFore / PixelFore)
            Sigma = 1 + 2 * (PixelBack * math.log(SigmaBack / PixelBack) + PixelFore * math.log(SigmaFore / PixelFore))
            if Sigma < MinSigma:
                MinSigma = Sigma
                Threshold = Y
    return Threshold



def GrayThreshold(image,maxval=255):
    g = GrayHist(image)
    thresh = GetKittlerMinError(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out

if __name__ == "__main__":

    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1,img_new_1 = cv2.threshold(img_gray, 0, 255,  cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th,th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

10. ISODATA

Reference:

Ridler, TW & Calvard, S (1978), “Picture thresholding using an iterative selection method”, IEEE Transactions on Systems, Man and Cybernetics 8: 630-632, http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4310039

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math

def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetIsoDataThreshold(H):

    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]
    MinValue = V[0]
    MaxValue = V[-1]
    threshold = MinValue

    while (True):
        SumOne = 0
        SumInteralOne = 0
        SumTwo = 0
        SumInteralTwo = 0
        for i in range(threshold):
            SumOne += H[i]
            SumInteralOne += H[i] * i
        for i in range(threshold+1,MaxValue):
            SumTwo += H[i]
            SumInteralTwo += (H[i] * i)

        if SumOne > 0 and SumTwo > 0:
            SumInteralOne /= SumOne
            SumInteralTwo /= SumTwo

            if threshold == int(np.round((SumInteralOne + SumInteralTwo) / 2.0)):
                break
        threshold += 1
        if threshold > 253:
            return 0
    return threshold


def GrayThreshold(image,maxval=255):
    g = GrayHist(image)
    thresh = GetIsoDataThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out

if __name__ == "__main__":


    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1,img_new_1 = cv2.threshold(img_gray, 0, 255,  cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th,th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

11. Shanbhag method

Reference:

  Shanbhag, Abhijit G. (1994), "Utilization of information measure as a means of image thresholding", Graph. Models Image Process. (Academic Press, Inc.) 56 (5): 414--419, ISSN 1049-9652, DOI 10.1006/cgip.1994.1037

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetShanbhagThreshold(H):
    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]

    P1 = np.zeros(256, dtype=np.float64)
    P2 = np.zeros(256, dtype=np.float64)

    Amount = np.sum(H)
    norm_histo = H / Amount

    P1[0] = norm_histo[0]
    P2[0] = 1.0 - P1[0]

    for ih in range(1, 256):
        P1[ih] = P1[ih - 1] + norm_histo[ih]
        P2[ih] = 1.0 - P1[ih]

    first_bin = 0
    for ih in range(1, 256):
        if not (math.fabs(P1[ih]) < np.finfo(np.float64).eps):
            first_bin = ih
            break

    last_bin = 255
    for ih in range(255, first_bin - 1, -1):
        if not (math.fabs(P2[ih]) < np.finfo(np.float64).eps):
            last_bin = ih
            break

    threshold = -1
    min_ent = np.finfo(np.float64).max
    for it in range(first_bin, last_bin + 1):
        ent_back = 0.0
        term = 0.5 / P1[it]
        for ih in range(1, it + 1):
            ent_back -= norm_histo[ih] * math.log(1.0 - term * P1[ih - 1])
        ent_back *= term
        ent_obj = 0.0
        term = 0.5 / P2[it]
        for ih in range(it + 1, 256):
            ent_obj -= norm_histo[ih] * math.log(1.0 - term * P2[ih])

        ent_obj *= term

        tot_ent = math.fabs(ent_back - ent_obj)

        if tot_ent < min_ent:
            min_ent = tot_ent
            threshold = it

    return threshold


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetShanbhagThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":

    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

12. Yen method

References:

 1) Yen J.C., Chang F.J., and Chang S. (1995) "A New Criterion  for Automatic Multilevel Thresholding" IEEE Trans. on Image  Processing, 4(3): 370-378
 2) Sezgin M. and Sankur B. (2004) "Survey over Image Thresholding Techniques and Quantitative Performance Evaluation" Journal of  Electronic Imaging, 13(1): 146-165

# coding:utf8

import numpy as np
import cv2
from matplotlib import pyplot as plt
import math


def GrayHist(img):
    grayHist = np.zeros(256, dtype=np.uint64)
    for v in range(256):
        grayHist[v] = np.sum(img == v)
    return grayHist


def GetYenThreshold(H):
    V = np.where(H > 0)[0]  # Indices of non-zero bins
    I = len(V)
    if I == 0: return 0
    if I == 1: return V[0]
    if I == 2: return V[0] if H[V[0]] < H[V[1]] else V[1]

    P1 = np.zeros(256, dtype=np.float64)
    P1_sq = np.zeros(256, dtype=np.float64)
    P2_sq = np.zeros(256, dtype=np.float64)

    norm_histo = H/ np.sum(H)

    P1[0] = norm_histo[0]

    for ih in range(1, 256):
        P1[ih] = P1[ih - 1] + norm_histo[ih]
    P1_sq[0] = norm_histo[0] * norm_histo[0]
    for ih in range(1, 256):
        P1_sq[ih] = P1_sq[ih - 1] + norm_histo[ih] * norm_histo[ih]

    P2_sq[255] = 0.0
    for ih in range(254, -1, -1):
        P2_sq[ih] = P2_sq[ih + 1] + norm_histo[ih + 1] * norm_histo[ih + 1]

    threshold = -1
    max_crit = np.finfo(np.float64).eps

    for it in range(256):
        l1 = math.log(P1_sq[it] * P2_sq[it]) if P1_sq[it] * P2_sq[it] > 0.0 else 0.0
        l2 = math.log(P1[it] * (1.0 - P1[it])) if P1[it] * (1.0 - P1[it]) > 0.0 else 0.0
        crit = -1.0 * l1 + 2 * l2
        if crit > max_crit:
            max_crit = crit
            threshold = it

    return threshold


def GrayThreshold(image, maxval=255):
    g = GrayHist(image)
    thresh = GetYenThreshold(g)
    threshImage_out = image.copy()
    # Pixels above the threshold are set to maxval
    threshImage_out[threshImage_out > thresh] = maxval
    # Pixels at or below the threshold are set to 0
    threshImage_out[threshImage_out <= thresh] = 0
    return thresh, threshImage_out


if __name__ == "__main__":

    img = cv2.imread('bird.png')
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    th, img_new = GrayThreshold(img_gray)
    th1, img_new_1 = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_TRIANGLE)
    print(th, th1)
    plt.subplot(131), plt.imshow(img_gray, cmap='gray')
    plt.title('Original Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(132), plt.imshow(img_new, cmap='gray')
    plt.title('Image'), plt.xticks([]), plt.yticks([])
    plt.subplot(133), plt.imshow(img_new_1, cmap='gray')
    plt.title('CV2 Image1'), plt.xticks([]), plt.yticks([])
    plt.show()

python mongo异步操作_让python调用mongo读写速度加速10倍的方法 weixin_39867125 python mongo异步操作
1.把mongo读写封装成api2.在api初始化时保持数据库长链接；并且用线程每2分钟遍历一次所有的表并count一次importsysimporttimeimportpymongoimportjsonimportlogimporttracebackimportthreading//库名test，表名test_tableserver_list=['test-mongos.all.serv:636
2025年渗透测试面试题总结-快某手-安全实习生（一面、二面）（题目+回答）独行soc 2025年渗透测试面试指南安全科技网络面试护网 2015年
网络安全领域各种资源，学习文档，以及工具分享、前沿信息分享、POC、EXP分享。不定期分享各种好玩的项目及好用的工具，欢迎关注。目录快某手-安全实习生一面一、Linux操作：查看进程PID的5种方法二、Elasticsearch（ES）核心要点三、HTTPS建立过程（TLS1.3优化版）四、Python内存管理机制五、深拷贝与浅拷贝对比六、Python多线程局限性七、XSS防御方案八、SQL注入防
Python product函数介绍无尽的沉默函数用法 python
通过fromitertoolsimportproduct引入product函数。Product函数可以实现对矩阵做笛卡尔积importitertoolsforiteminitertools.product([1,2],[10,20]):print(item)'''(1,10)(1,20)(2,10)(2,20)'''iterables是可迭代对象,repeat指定iterable重复几次,即:pr
基于大数据架构的就业岗位推荐系统的设计与实现【java或python】—计算机毕业设计源码+LW文档 qq_375279829 大数据架构 python 课程设计算法
摘要随着互联网技术的迅猛发展和大数据时代的到来，就业市场日益复杂多变，求职者与招聘方之间的信息不对称问题愈发突出。为解决这一难题，本文设计并实现了一个基于大数据架构的就业岗位推荐系统。该系统通过收集、整合并分析大量求职者简历信息、企业招聘信息以及市场动态数据，运用先进的机器学习算法，为求职者提供个性化的岗位推荐服务，同时帮助企业快速定位到合适的候选人。本文将从系统设计的背景与意义、技术基础、需求分
句子改写器在线转换的原创性提升策略 hjehheje 算法人工智能 python
在文本处理领域，"句子改写器在线转换"的原创性提升并非单纯依赖工具升级，而是需要融合算法优化、人工干预与策略设计的系统工程。以下从技术底层到应用层拆解核心方法，辅以实验数据验证其可行性：一、语义拓扑重构技术（SemanticTopologyReconstruction）原理突破传统同义词替换仅影响表层词汇（LexicalLevel），而STR技术通过依存句法分析，构建句子的语义网络拓扑图，对主谓宾
玩转Mysql系列 - 第26篇：聊聊mysql如何实现分布式锁？「已注销」 mysql 分布式数据库 java 服务器
Mysql系列的目标是：通过这个系列从入门到全面掌握一个高级开发所需要的全部技能。欢迎大家加我微信itsoku一起交流java、算法、数据库相关技术。这是Mysql系列第26篇。本篇我们使用mysql实现一个分布式锁。分布式锁的功能分布式锁使用者位于不同的机器中，锁获取成功之后，才可以对共享资源进行操作锁具有重入的功能：即一个使用者可以多次获取某个锁获取锁有超时的功能：即在指定的时间内去尝试获取锁
1:1精准还原！用Python+Adobe Acrobat DC实现PDF转Word全自动化朴拙Python交易猿 python pdf word
以下是您请求的博客文章，包含详细的代码注释及分步解析：1:1精准还原！用Python+AdobeAcrobatDC实现PDF转Word全自动化一、为什么要选择AdobeAcrobatDC？作为PDF标准的制定者，AdobeAcrobatDC在格式转换领域具有无可比拟的优势：精准还原-保持原始布局、字体和格式表格保留-完整保留表格结构和数据批量处理-支持自动化执行重复任务OCR支持-自动识别扫描件中
后台运行python脚本 ch_atu #python之路 python linux
运行nohuppython-usocket_api.py>data.out2>&1&注：data.out是输出文件
搜索插入位置（js实现，LeetCode：35）充气大锤算法 leetcode 算法数据结构学习笔记 javascript 二分查找
给定一个排序数组和一个目标值，在数组中找到目标值，并返回其索引。如果目标值不存在于数组中，返回它将会被按顺序插入的位置。请必须使用时间复杂度为O(logn)的算法。示例1:输入:nums=[1,3,5,6],target=5输出:2示例2:输入:nums=[1,3,5,6],target=2输出:1示例3:输入:nums=[1,3,5,6],target=7输出:4提示:1<=nums.lengt
Vue中vfor循环创建DOM时Key的理解之Vue中的diff算法充气大锤前端性能优化 vue.js javascript 前端学习笔记算法 ecmascript
在Vue开发过程中vfor遍历数组创建Dom是最常见的方式，在vfor时，标签中有一个key值，key值的作用是啥呢？这就不得不提到Vue中的diff算法。一、什么是diff算法Vue会用虚拟DOM来表述真实DOM，这样的目的是为了计算出DOM的最小的变化从而更加快速的更新真实DOM二、diff算法的计算过程1、遍历老虚拟DOM2、遍历新虚拟DOM3、重新排序这样做会有个问题，就是节点数越多，计算
Python批量Word转PDF神器，让你从此轻松转换文档！码无止尽 Python办公自动化 python word pdf
大家好！今天我们来聊聊工作中可能遇到的一个“头大”问题：如何批量将Word文档转成PDF？是不是光听听都感觉头皮发麻？不用担心，今天我们就来分享一个Python小技巧，让你在批量转换文档时再也不用抓狂！为什么需要批量Word转PDF？想象一下，你是公司的行政小能手，每天面对成堆的合同、报告需要转换格式，手动操作简直不敢想象的累。关键是，老板还老催！Python作为技术潮人必备的技能之一，这时候就派
一周学会Flask3 Python Web开发-使用SQLAlchemy动态创建数据库表 java1234_小锋 Flask3视频教程 python 数据库开发语言 flask3 flask
锋哥原创的Flask3PythonWeb开发Flask3视频教程：2025版Flask3Pythonweb开发视频教程(无废话版)玩命更新中~_哔哩哔哩_bilibili前面我们定义了模型，我们可以通过sqlalchemy对象提供的create_all()方法来映射和动态创建数据库表。因为我们用到了模块化蓝图blueprint，这个sqlalchemy对象会在app.py和蓝图模块之间互相调用，导
用Python实现PDF转Doc格式小程序 Bruce_xiaowei 总结经验笔记编程 python pdf 小程序
用Python实现PDF转Doc格式小程序以下是一个使用Python实现PDF转DOC格式的GUI程序，采用Tkinter和pdf2docx库：importtkinterastkfromtkinterimportfiledialog,messageboxfrompdf2docximportConverterimportosclassPDFtoDOCConverter:def__init__(sel
python 使用flask+sqlalchemy 实现简单数据查询接口 darling331 python flask 开发语言后端
数据库表结构和部分数据SETNAMESutf8mb4;SETFOREIGN_KEY_CHECKS=0;--------------------------------Tablestructureforuser------------------------------DROPTABLEIFEXISTS`user`;CREATETABLE`user`(`id`int(11)NOTNULLAUTO_I
《Python实战进阶》No20: 网络爬虫开发：Scrapy框架详解带娃的IT创业者 Python实战进阶 python 爬虫 scrapy
No20:网络爬虫开发：Scrapy框架详解摘要本文深入解析Scrapy核心架构，通过中间件链式处理、布隆过滤器增量爬取、Splash动态渲染、分布式指纹策略四大核心技术，结合政府数据爬取与动态API逆向工程实战案例，构建企业级爬虫系统。提供完整代码与运行结果，包含法律合规设计与反爬对抗方案。Scrapy是适用于Python的一个快速、高层次的屏幕抓取和web抓取框架，用于抓取web站点并从页面中
Enum 枚举 120153216 enum 枚举
原文地址：http://www.cnblogs.com/Kavlez/p/4268601.html Enumeration 于Java 1.5增加的enum type...enum type是由一组固定的常量组成的类型，比如四个季节、扑克花色。在出现enum type之前，通常用一组int常量表示枚举类型。比如这样： public static final int APPLE_FUJI = 0
Java8简明教程 bijian1013 java jdk1.8
Java 8已于2014年3月18日正式发布了，新版本带来了诸多改进，包括Lambda表达式、Streams、日期时间API等等。本文就带你领略Java 8的全新特性。一.允许在接口中有默认方法实现 Java 8 允许我们使用default关键字，为接口声明添
Oracle表维护快速备份删除数据 cuisuqiang oracle 索引快速备份删除
我知道oracle表分区，不过那是数据库设计阶段的事情，目前是远水解不了近渴。当前的数据库表，要求保留一个月数据，且表存在大量录入更新，不存在程序删除。为了解决频繁查询和更新的瓶颈，我在oracle内根据需要创建了索引。但是随着数据量的增加，一个半月数据就要超千万，此时就算有索引，对高并发的查询和更新来说，让然有所拖累。为了解决这个问题，我一般一个月会进行一次数据库维护，主要工作就是备
java多态内存分析麦田的设计者 java 内存分析多态原理接口和抽象类
“ 时针如果可以回头，熟悉那张脸，重温嬉戏这乐园，墙壁的松脱涂鸦已经褪色才明白存在的价值归于记忆。街角小店尚存在吗？这大时代会不会牵挂，过去现在花开怎么会等待。但有种意外不管痛不痛都有伤害，光阴远远离开，那笑声徘徊与脑海。但这一秒可笑不再可爱，当天心
Xshell实现Windows上传文件到Linux主机被触发 windows
经常有这样的需求，我们在Windows下载的软件包，如何上传到远程Linux主机上？还有如何从Linux主机下载软件包到Windows下；之前我的做法现在看来好笨好繁琐，不过也达到了目的，笨人有本方法嘛；我是怎么操作的： 1、打开一台本地Linux虚拟机，使用mount 挂载Windows的共享文件夹到Linux上，然后拷贝数据到Linux虚拟机里面；（经常第一步都不顺利，无法挂载Windo
类的加载ClassLoader 肆无忌惮_ ClassLoader
类加载器ClassLoader是用来将java的类加载到虚拟机中，类加载器负责读取class字节文件到内存中，并将它转为Class的对象（类对象），通过此实例的 newInstance()方法就可以创建出该类的一个对象。其中重要的方法为findClass(String name)。如何写一个自己的类加载器呢？首先写一个便于测试的类Student
html5写的玫瑰花知了ing html5
<html> <head> <title>I Love You!</title> <meta charset="utf-8" /> </head> <body> <canvas id="c"></canvas>
google的ConcurrentLinkedHashmap源代码解析矮蛋蛋 LRU
原文地址： http://janeky.iteye.com/blog/1534352 简述 ConcurrentLinkedHashMap 是google团队提供的一个容器。它有什么用呢？其实它本身是对 ConcurrentHashMap的封装，可以用来实现一个基于LRU策略的缓存。详细介绍可以参见 http://code.google.com/p/concurrentlinke
webservice获取访问服务的ip地址 alleni123 webservice
1. 首先注入javax.xml.ws.WebServiceContext, @Resource private WebServiceContext context; 2. 在方法中获取交换请求的对象。 javax.xml.ws.handler.MessageContext mc=context.getMessageContext(); com.sun.net.http
菜鸟的java基础提升之道——————>是否值得拥有百合不是茶
1，c++，java是面向对象编程的语言，将万事万物都看成是对象；java做一件事情关注的是人物，java是c++继承过来的，java没有直接更改地址的权限但是可以通过引用来传值操作地址，java也没有c++中繁琐的操作，java以其优越的可移植型，平台的安全型，高效性赢得了广泛的认同，全世界越来越多的人去学习java，我也是其中的一员 java组成：
通过修改Linux服务自动启动指定应用程序 bijian1013 linux
Linux中修改系统服务的命令是chkconfig (check config)，命令的详细解释如下: chkconfig 功能说明：检查，设置系统的各种服务。语　　法：chkconfig [ -- add][ -- del][ -- list][系统服务] 或 chkconfig [ -- level <</SPAN>
spring拦截器的一个简单实例 bijian1013 java spring 拦截器 Interceptor
Purview接口 package aop; public interface Purview { void checkLogin(); } Purview接口的实现类PurviesImpl.java package aop; public class PurviewImpl implements Purview { public void check
[Velocity二]自定义Velocity指令 bit1129 velocity
什么是Velocity指令在Velocity中，#set,#if, #foreach, #elseif, #parse等，以#开头的称之为指令，Velocity内置的这些指令可以用来做赋值，条件判断，循环控制等脚本语言必备的逻辑控制等语句，Velocity的指令是可扩展的，即用户可以根据实际的需要自定义Velocity指令自定义指令(Directive)的一般步骤 &nbs
【Hive十】Programming Hive学习笔记 bit1129 programming
第二章 Getting Started 1.Hive最大的局限性是什么？一是不支持行级别的增删改(insert, delete, update)二是查询性能非常差(基于Hadoop MapReduce）,不适合延迟小的交互式任务三是不支持事务2. Hive MetaStore是干什么的？Hive persists table schemas and other system metadata.
nginx有选择性进行限制 ronin47 nginx 动静　限制
http { limit_conn_zone $binary_remote_addr zone=addr:10m; limit_req_zone $binary_remote_addr zone=one:10m rate=5r/s;... server {... location ~.*\.(gif|png|css|js|icon)$ {
java-4.-在二元树中找出和为某一值的所有路径 . bylijinnan java
/* * 0.use a TwoWayLinkedList to store the path.when the node can't be path,you should/can delete it. * 1.curSum==exceptedSum:if the lastNode is TreeNode,printPath();delete the node otherwise
Netty学习笔记 bylijinnan java netty
本文是阅读以下两篇文章时： http://seeallhearall.blogspot.com/2012/05/netty-tutorial-part-1-introduction-to.html http://seeallhearall.blogspot.com/2012/06/netty-tutorial-part-15-on-channel.html 我的一些笔记 ===
js获取项目路径 cngolon js
//js获取项目根路径，如： http://localhost:8083/uimcardprj function getRootPath(){ //获取当前网址，如： http://localhost:8083/uimcardprj/share/meun.jsp var curWwwPath=window.document.locati
oracle 的性能优化 cuishikuan oracle SQL Server
在网上搜索了一些Oracle性能优化的文章，为了更加深层次的巩固[边写边记]，也为了可以随时查看，所以发表这篇文章。 1.ORACLE采用自下而上的顺序解析WHERE子句，根据这个原理，表之间的连接必须写在其他WHERE条件之前，那些可以过滤掉最大数量记录的条件必须写在WHERE子句的末尾。（这点本人曾经做过实例验证过，的确如此哦！
Shell变量和数组使用详解 daizj linux shell 变量数组
Shell 变量定义变量时，变量名不加美元符号（$，PHP语言中变量需要），如： your_name="w3cschool.cc" 注意，变量名和等号之间不能有空格，这可能和你熟悉的所有编程语言都不一样。同时，变量名的命名须遵循如下规则：首个字符必须为字母（a-z，A-Z）。中间不能有空格，可以使用下划线（_）。不能使用标点符号。不能使用ba
编程中的一些概念，KISS、DRY、MVC、OOP、REST dcj3sjt126com REST
KISS、DRY、MVC、OOP、REST （1）KISS是指Keep It Simple,Stupid（摘自wikipedia），指设计时要坚持简约原则，避免不必要的复杂化。（2）DRY是指Don't Repeat Yourself（摘自wikipedia），特指在程序设计以及计算中避免重复代码，因为这样会降低灵活性、简洁性，并且可能导致代码之间的矛盾。（3）OOP 即Object-Orie
[Android]设置Activity为全屏显示的两种方法 dcj3sjt126com Activity
1. 方法1：AndroidManifest.xml 里，Activity的 android:theme 指定为" @android:style/Theme.NoTitleBar.Fullscreen" 示例: <application
solrcloud 部署方式比较 eksliang solrCloud
solrcloud 的部署其实有两种方式可选，那么我们在实践开发中应该怎样选择呢？第一种：当启动solr服务器时，内嵌的启动一个Zookeeper服务器，然后将这些内嵌的Zookeeper服务器组成一个集群。第二种：将Zookeeper服务器独立的配置一个集群，然后将solr交给Zookeeper进行管理谈谈第一种：每启动一个solr服务器就内嵌的启动一个Zoo
Java synchronized关键字详解 gqdy365 synchronized
转载自：http://www.cnblogs.com/mengdd/archive/2013/02/16/2913806.html 多线程的同步机制对资源进行加锁，使得在同一个时间，只有一个线程可以进行操作，同步用以解决多个线程同时访问时可能出现的问题。同步机制可以使用synchronized关键字实现。当synchronized关键字修饰一个方法的时候，该方法叫做同步方法。当s
js实现登录时记住用户名 hw1287789687 记住我记住密码 cookie 记住用户名记住账号
在页面中如何获取cookie值呢? 如果是JSP的话,可以通过servlet的对象request 获取cookie,可以参考:http://hw1287789687.iteye.com/blog/2050040 如果要求登录页面是html呢?html页面中如何获取cookie呢? 直接上代码了页面:loginInput.html 代码: <!DOCTYPE html PUB
开发者必备的 Chrome 扩展 justjavac chrome
Firebug：不用多介绍了吧https://chrome.google.com/webstore/detail/bmagokdooijbeehmkpknfglimnifench ChromeSnifferPlus：Chrome 探测器，可以探测正在使用的开源软件或者 js 类库https://chrome.google.com/webstore/detail/chrome-sniffer-pl
算法机试题李亚飞 java 算法机试题
在面试机试时，遇到一个算法题，当时没能写出来，最后是同学帮忙解决的。这道题大致意思是：输入一个数，比如4,。这时会输出： &n
正确配置Linux系统ulimit值字符串 ulimit
在Linux下面部署应用的时候，有时候会遇上Socket/File: Can’t open so many files的问题；这个值也会影响服务器的最大并发数，其实Linux是有文件句柄限制的，而且Linux默认不是很高，一般都是1024，生产服务器用其实很容易就达到这个数量。下面说的是，如何通过正解配置来改正这个系统默认值。因为这个问题是我配置Nginx+php5时遇到了，所以我将这篇归纳进
hibernate调用返回游标的存储过程 Supanccy2013 java DAO oracle Hibernate jdbc
注：原创作品，转载请注明出处。上篇博文介绍的是hibernate调用返回单值的存储过程，本片博文说的是hibernate调用返回游标的存储过程。此此扁博文的存储过程的功能相当于是jdbc调用select 的作用。 1，创建oracle中的包，并在该包中创建的游标类型。 ---创建oracle的程
Spring 4.2新特性-更简单的Application Event wiselyman application
1.1 Application Event Spring 4.1的写法请参考10点睛Spring4.1-Application Event 请对比10点睛Spring4.1-Application Event 使用一个@EventListener取代了实现ApplicationListener接口,使耦合度降低; 1.2 示例包依赖 <p