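This project uses a genetic algorithm (GA) to find approximate solutions of a quadratic equation in one variable.
Language: Python 3.6.8
Operating system: Windows 10
Python packages used: random, re, time, matplotlib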
A solution is represented as a binary code, 24 bits long by default. The first 8 bits form the integer part, and the first of them is the sign bit (0 = negative, 1 = positive), which leaves 7 bits for the magnitude. The last 16 bits form the fractional part. The whole search space is therefore [-127.65535, +127.65535].
$$
INT = 2^6 + 2^5 + 2^4 + \dots + 2^0 = 127 \\
Float = 2^{15} + 2^{14} + 2^{13} + \dots + 2^0 = 65535
$$
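The encoding above can be sketched as a small stand-alone decoder. This is only an illustration under the layout just described (1 sign bit, 7 integer bits, 16 fractional bits); the function name `decode` is hypothetical and is not part of the article's script. Following the article's convention, the 16-bit value supplies the digits after the decimal point.

```python
def decode(code, int_size=8, dec_size=16):
    """Decode [sign bit | integer bits | fractional bits] into a float.

    `decode` is a hypothetical helper written for this illustration.
    """
    assert len(code) == int_size + dec_size
    sign = 1 if code[0] == 1 else -1      # 1 = positive, 0 = negative
    int_bits = code[1:int_size]           # int_size - 1 magnitude bits
    dec_bits = code[int_size:]            # dec_size fractional bits
    int_part = int("".join(map(str, int_bits)), 2)
    dec_part = int("".join(map(str, dec_bits)), 2)
    # The fractional bits are read as an integer and written after the point
    return sign * float("%d.%d" % (int_part, dec_part))


largest = decode([1] * 24)                # all bits set, positive sign
smallest = decode([0] + [1] * 23)         # all magnitude bits set, negative sign
print(largest, smallest)
```

With all bits set the decoder reaches the two ends of the search space quoted above.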
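A child solution takes its code from a pair of parent solutions (a father and a mother) with good fitness: each bit of the child is copied, at random, from the same position in either the father or the mother. By default, the 20 solutions that rank first by fitness (sorted in ascending order) are paired at random into parents, and each pair produces 10 children.
Note: the figure only marks where the child's bits come from at positions where the father and mother differ. In practice every bit is drawn at random from the father or the mother.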
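The crossover rule described above can be sketched as follows. This is a minimal illustration, not the article's `getChild` method: the name `uniform_crossover` and the `rng` parameter are assumptions added so the example can be seeded.

```python
import random


def uniform_crossover(father, mother, rate=0.5, rng=random):
    """Build a child by taking each bit from the father or the mother.

    A bit comes from the father when the random draw exceeds `rate`,
    otherwise from the mother (hypothetical helper for illustration).
    """
    assert len(father) == len(mother)
    return [f if rng.random() > rate else m for f, m in zip(father, mother)]


rng = random.Random(0)                    # seeded only to make the demo repeatable
father = [1] * 24
mother = [0] * 24
child = uniform_crossover(father, mother, rng=rng)
print(child)
```

Because every bit is inherited at the same position, two parents with identical codes can only produce an exact copy of themselves.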
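To make mutations more disruptive, a single mutation flips five randomly chosen bits of the child at once by default, turning each 0 into 1 and each 1 into 0.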
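A minimal sketch of this mutation step, assuming the code is a list of 0/1 integers. The name `mutate` and the `rng` parameter are hypothetical and chosen only for this example.

```python
import random


def mutate(code, n_flips=5, rng=random):
    """Return a copy of `code` with `n_flips` distinct random bits inverted."""
    positions = rng.sample(range(len(code)), n_flips)   # distinct positions
    mutated = list(code)
    for p in positions:
        mutated[p] = 1 - mutated[p]                     # 0 -> 1, 1 -> 0
    return mutated


original = [0] * 24
mutated = mutate(original, rng=random.Random(1))
print(mutated)
```

Sampling positions without replacement guarantees that exactly five bits differ from the original code.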
$$
Target = T(x) = coef_1 \cdot x^2 + coef_2 \cdot x \\
Fitness = (Target - T(x_1))^2
$$
coef1 and coef2 are constant coefficients, and Target is also a constant. Because the goal of this genetic algorithm is to approach a solution of the quadratic equation, I define fitness as the squared error between Target and the value T(x_1) produced by a candidate solution x_1. The smaller an individual's fitness, the higher its chance of survival.
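As a worked example of this definition, the sketch below evaluates the fitness for the demo equation used later in the article, 1000 = 12x^2 + 25x. The function names `target_value` and `fitness` are hypothetical and not taken from the script.

```python
def target_value(x, coef1=12.0, coef2=25.0):
    """T(x) = coef1 * x^2 + coef2 * x."""
    return coef1 * x ** 2 + coef2 * x


def fitness(x, target=1000.0, coef1=12.0, coef2=25.0):
    """Squared error between the target constant and T(x); smaller is better."""
    return (target - target_value(x, coef1, coef2)) ** 2


# x = 8 gives T(8) = 12 * 64 + 25 * 8 = 968, so the error is 32 and the fitness 1024
print(fitness(8.0))
print(fitness(0.0))
```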
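For a detailed introduction to genetic algorithms, see 《遗传算法(Genetic Algorithm)从了解到实例运用(上)(python)》.
Here I briefly summarize the algorithm's characteristics and my own understanding of it.
A genetic algorithm is a form of directed random optimization. Implementing one requires the four elements covered in Section 1: 1. encoding; 2. crossover; 3. mutation; 4. a fitness definition. Fitness sets the direction of evolution, so a good fitness definition is crucial. Fitness decides which individuals survive: we pick the "high-quality" individuals, pair them up, and produce a new population through crossover and mutation. We then select the fittest individuals again and pair them to produce the next generation. Repeating this for a suitable number of iterations yields a population that matches our expectations.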
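Genetic algorithms embody the idea of "natural selection, survival of the fittest". They are commonly described as global optimizers, but I do not fully agree: if crossover and mutation are arranged poorly, the search will not reach the global optimum. My own arrangement is flawed in this respect. I take the 20 individuals that rank first by fitness, pair them at random, and let each pair produce 10 children. When only a few individuals carry the best fitness, crossover and mutation are quite likely to produce children that lose their parents' high fitness, so the best code vanishes somewhere in the long history of the population. A better approach would be to copy the codes of several of the fittest individuals directly into the next generation, preserving the parents' traits unchanged.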
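The improvement suggested above is usually called elitism. The sketch below is one possible way to do it and is not part of the article's script: the name `select_with_elitism` and all of its parameters are assumptions, and individuals are plain floats here to keep the example short.

```python
def select_with_elitism(parents, children, fitness_fn, pop_size, n_elite=2):
    """Carry the `n_elite` best parents over unchanged, then fill up with children.

    Hypothetical helper: lower fitness is better, as in the article.
    """
    elites = sorted(parents, key=fitness_fn)[:n_elite]
    rest = sorted(children, key=fitness_fn)[:pop_size - n_elite]
    return sorted(elites + rest, key=fitness_fn)


def squared_error(x):
    # Fitness for the demo equation 1000 = 12x^2 + 25x
    return (1000.0 - (12.0 * x ** 2 + 25.0 * x)) ** 2


parents = [8.1, 3.0, -10.2]
children = [5.0, 6.0, 7.0]               # all worse than the best parent
next_gen = select_with_elitism(parents, children, squared_error, pop_size=3, n_elite=1)
print(next_gen)
```

In this toy run every child is worse than the best parent, yet the best parent still heads the next generation, which is exactly the loss the paragraph above describes.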
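The distance the search can jump within the solution space is also key to reaching the global optimum, and that jumping ability depends on the crossover and mutation strategies. My crossover strategy is closer to "same-position inheritance". After a number of generations, the individuals in the population often have nearly identical fitness, or even identical codes. Same-position inheritance then cannot produce children that differ from their parents; the children are simply copies. At that point the population can rely only on mutation, which is not very dependable. With my current crossover strategy I would need to guarantee diversity among the selected parents. Unfortunately I did not, and I only realized this afterwards. Still, the script runs correctly and does find a close approximation of the optimal solution.

Crossover can also be done in fancier ways, for example by inverting a stretch of the code or by merging arbitrary differing segments of the parents' codes. Carried over to real chromosomes this gets downright wild, yet real chromosomes do undergo crossing over, inversion, translocation and deletion, although with fairly low probability.

So the crossover and mutation strategies must keep the children diverse and different from their parents, while also producing some individuals identical to their parents so that the parents' traits are preserved and the population carries on. The individuals whose codes differ are the pioneers: through them the population searches for better codes and adapts better to its environment.

A population's long and enduring history of survival is built on the sacrifice of many forebears. Our present well-being was won by revolutionary martyrs and by those who made great contributions to the country. I hope that all of you study hard, keep asking questions and press forward, so that when it is our turn to step up we have the ability and the courage to shoulder the gate of darkness and let the next generation pass through to a wide and bright place. I hope each of us contributes what light and warmth we can: a firefly can still give a firefly's glow, and if there is no torch in the world, then I will be the only light.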
Detail mode and non-detail mode are selected by setting the DETAIL variable to True or False.
if __name__ == "__main__":
    # Define an expression
    f="1000=12x**2+25x"
    # Individual code information
    XINTSIZE = 9
    XDECSIZE = 16
    # Population information
    POPULATION_SIZE = 100
    CHILD_SIZE = 10
    MAX_GENERATION_SIZE = 50
    MUTATE_SIZE = 5
    MUTATE_RATE = 0.05
    CROSS_MERGE_RATE = 0.5
    # visualization detail
    X_RANGE = [-100,100]
    Y_RANGE = [-500,6000]
    # Print detail
    DETAIL = False
    PAUSE = True
    # Create a population
    numPopulation = Population(
        popSize = POPULATION_SIZE,
        childsSize = CHILD_SIZE,
        maxGenerationSize = MAX_GENERATION_SIZE,
        formulate=f,
    )
    numPopulation.live()
    # visualization
    if not DETAIL:
        numPopulation.visualization()
    else:
        numPopulation.visualizationDetail()
    print("-"*80)
    best = numPopulation.species[0]
    bestX = best.getDecimal()
    bestTarget = numPopulation.getTarget(bestX)
    print("Best x = %.3f"%bestX)
    print("Best Target = %.3f"%bestTarget)
    print("%.3f = %.3fx**2 + %.3fx"%(numPopulation.target,numPopulation.xCoef,numPopulation.yCoef))
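This demo uses a 9-bit integer part and a 16-bit fractional part, a population size of 100, 10 children per pair of parents, at most 50 generations, 5 bits flipped per mutation, a mutation rate of 0.05, and an equal chance for each bit to come from the father or the mother. The script output below is for the quadratic equation:
$$
1000 = 12x^2 + 25x
$$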
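Detail mode prints the information for the top 20 individuals of every generation, and at the end visualizes the values taken in each generation.
[Figure: console output]
[Figure: visualization of the values in each generation]
Two optimal solutions appear after about five generations, but one of them gradually disappears as the generations go on.
In non-detail mode, the mean over the top 20 individuals of each generation is computed and printed.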
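For reference, the two solutions the population is approaching can be computed exactly with the quadratic formula. This check is not part of the article's script; `quadratic_roots` is a hypothetical helper, and the equation is rewritten as 12x^2 + 25x - 1000 = 0.

```python
import math


def quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0, in ascending order (empty tuple if none)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))


roots = quadratic_roots(12.0, 25.0, -1000.0)
print("exact roots: %.4f and %.4f" % roots)
```

The roots lie near -10.23 and 8.15, one on each side of zero, which is why two clusters of good individuals can coexist for a while before one of them dies out.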
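This part explains the sections of the code I consider most important, along with the overall flow.
NumberSpecies is the definition of a solution. It holds a list variable, code, which stores the solution's encoding (24 bits by default), and it has 5 methods in total; see the docstrings in the corresponding code for details. It is the container for an individual solution in the code.
Population is the population. It manages the individuals, which are instances of the NumberSpecies class, through the list variable species. Its live method drives the iteration of the whole population: it is a loop that runs until the maximum number of generations is reached.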
def live(self):
    """
    Life cycle of the population
    """
    while self.generation < self.maxGenerationSize:
        self.nextGeneration()

def nextGeneration(self):
    """
    Produce the next generation
    """
    # calculate current generation fitness
    fitnessList = self.calFitness(self.species)
    # order by fitness
    self.sortByFitness(fitnessList, self.species)
    # child born
    childs = self.getChildPopulation()
    # choose survival child as next generation
    fitnessListChild = self.calFitness(childs)
    self.sortByFitness(fitnessListChild, childs)
    self.species = childs[:self.popSize]
    # advance the generation counter
    self.generation += 1
    # print this generation's information
    self.show(detail = DETAIL)
This is the processing logic of the genetic algorithm. Its steps are as follows:
def sortByFitness(self, fitnessList, speciesList):
    """
    Sort by fitness in ascending order (selection-style sort), in place
    """
    L = len(fitnessList)
    for i in range(L):
        for j in range(i,L):
            if fitnessList[i] > fitnessList[j]:
                fitnessList[i], fitnessList[j] = fitnessList[j], fitnessList[i]
                speciesList[i], speciesList[j] = speciesList[j], speciesList[i]
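A simple exchange sort (a selection-sort variant rather than a true bubble sort, since it does not compare adjacent elements) puts the individuals and the fitness list in ascending order of fitness.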
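The same ordering can be obtained with Python's built-in `sorted`, which runs in O(n log n) instead of O(n^2). The sketch below is an alternative to the method above, not the article's implementation; `sort_by_fitness` is a hypothetical stand-alone function that, like the original, reorders both lists in place.

```python
def sort_by_fitness(fitness_list, species_list):
    """Reorder both lists in place by ascending fitness."""
    assert len(fitness_list) == len(species_list)
    # Indices of the individuals, ordered by their fitness value
    order = sorted(range(len(fitness_list)), key=fitness_list.__getitem__)
    fitness_list[:] = [fitness_list[i] for i in order]
    species_list[:] = [species_list[i] for i in order]


demo_fitness = [3.0, 1.0, 2.0]
demo_species = ["a", "b", "c"]            # stand-ins for NumberSpecies instances
sort_by_fitness(demo_fitness, demo_species)
print(demo_fitness, demo_species)
```

Sorting an index list avoids comparing the individuals themselves, so the species objects do not need to define any ordering.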
def getChildPopulation(self):
    """
    Birth of the child generation
    return child
    """
    # selectParent
    fathersI, mothersI = self.selectParent()
    L = len(fathersI)
    # get childs
    childs = []
    for i in range(L):
        for j in range(self.childsSize):
            # child born
            fI = fathersI[i]
            mI = mothersI[i]
            child = self.getChild(self.species[fI], self.species[mI])
            # add child to child population
            childs.append(child)
    return childs
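By default, the 20 individuals that rank first by fitness are paired at random into parents. selectParent returns fathersI, the list of indices of the father individuals in the species variable, and mothersI, the corresponding list for the mothers. Each pair then produces 10 children through crossover and mutation. The children are added to the child population, which is returned.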
def getChild(self,f,m):
    """
    1. Cross-merge the binary codes
    2. Mutate the code
    Produces a single child
    """
    assert isinstance(f,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class"
    assert isinstance(m,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class"
    seed = random.uniform(0,1)
    # do crossMerge?
    # decide cross position
    childCode = []
    for i in range(f.totalSize):
        fromSeed = random.uniform(0,1)
        if fromSeed > self.crossMergeRate:
            childCode.append(f.code[i])
        else:
            childCode.append(m.code[i])
    # do mutate?
    # randomly choose x position to mutate
    if seed < self.mutateRate:
        tempPosIndex = [i for i in range(f.totalSize)]
        mutatePos = random.sample(tempPosIndex,self.mutatePosSize)
        # Change code
        for i in mutatePos:
            if childCode[i] == 0:
                childCode[i] = 1
            else:
                childCode[i] = 0
    # child born
    child = NumberSpecies(XINTSIZE,XDECSIZE)
    child.assignCode(childCode)
    return child
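The inputs are the father and mother instances. Random numbers are drawn from a uniform distribution between 0 and 1 with random.uniform(0, 1). By default, a bit comes from the father if its random number is greater than 0.5 and from the mother otherwise. If the child's mutation draw is below 0.05 (the default), 5 randomly chosen bits are flipped. The resulting code is assigned to a new NumberSpecies instance, which is returned as a single child.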
def selectParent(self,size = 20):
    """
    Select the top `size` individuals by fitness (20 by default) and pair them at random
    return index list of select species one is father another is mother
    """
    assert size < len(self.species), "selectParent Func size=%d par must less population%d"%(size, len(self.species))
    # get size of couple
    coupleSize = size // 2
    # get total index of couple
    total = set(range(coupleSize*2))
    # father and mother
    # random.sample needs a sequence (sets are rejected from Python 3.11 on), so sort first
    father = random.sample(sorted(total),coupleSize)
    mother = sorted(total - set(father))
    mother = random.sample(mother,coupleSize)
    return father, mother
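Parents are assigned at random with random.sample(population, k), which draws k distinct samples from population. The sampled indices are returned as the father and mother index lists.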
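The same random pairing can also be written with a single shuffle. This is a sketch of an equivalent approach, not the article's method; `select_parent_pairs` and its `rng` parameter are hypothetical names.

```python
import random


def select_parent_pairs(size=20, rng=random):
    """Split the indices 0..size-1 of the fittest individuals into random couples.

    Returns (fathers, mothers); fathers[i] is paired with mothers[i].
    """
    couple_size = size // 2
    indices = list(range(couple_size * 2))
    rng.shuffle(indices)                  # random order, every index used once
    return indices[:couple_size], indices[couple_size:]


fathers, mothers = select_parent_pairs(20, rng=random.Random(42))
print(list(zip(fathers, mothers)))
```

Each of the top 20 indices appears exactly once, so no individual is paired with itself and no individual is left out.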
#python3
#--------------------------
# Author: little shark
# Date: 2022/4/13
"""
Solving a quadratic equation in one variable with a genetic algorithm
"""
import random
from matplotlib import pyplot as plt
from time import sleep
import re

def getRange(x,y):
    """
    x: coefficient of x^2
    y: coefficient of x
    Returns the values of the expression over the visualization range X_RANGE
    """
    targets = []
    scale = [i for i in range(X_RANGE[0], X_RANGE[1], 1)]
    for i in scale:
        t = x * i**2 + y*i
        targets.append(t)
    return targets

class NumberSpecies:
    """
    A species that approximates a solution of a quadratic equation in one variable
    """
    def __init__(self, xIntSize = 8, xDecSize = 16):
        """
        xIntSize: number of integer bits in the binary code of x, 8 by default; the first bit is the sign bit
        xDecSize: number of fractional bits in the binary code of x, 16 by default
        24 bits in total by default, laid out as
        8 integer bits of x, then 16 fractional bits of x
        """
        # define a bit size to x
        self.xIntSize = xIntSize
        # define a bit size to decimal
        self.xDecSize = xDecSize
        # total size
        self.totalSize = xIntSize + xDecSize
        # define code
        self.code=[0 for i in range(self.totalSize)]
        # random it code
        self.random()

    def assignCode(self,code):
        """
        Assign a binary code to this individual directly
        """
        self.code = code.copy()

    def show(self):
        """
        Print the binary code and the decimal value of x it encodes
        """
        print("code = ",self.code)
        print("numb = ",self.getDecimal())
        #print("fitness = ",self.fitness)

    def random(self):
        """
        Randomize the code
        """
        self.code=[random.choice([0,1]) for i in range(self.totalSize)]

    def getDecimal(self):
        """
        Turn the binary code into a decimal number
        """
        #---------------------------------
        # X part
        # part of x int
        xIntNum = 0
        start = 1
        signXIndex = 0
        end = self.xIntSize
        xIntCode = self.code[start: end]
        for i in range(self.xIntSize-1):
            # xIntCode holds xIntSize - 1 bits, so the highest weight is 2**(xIntSize - 2)
            # (the exponent was xIntSize - i - 1, which skipped 2**0 and doubled the range)
            xIntNum += pow(2,self.xIntSize - i - 2)*xIntCode[i]
        # part of x decimal
        xDecNum = 0
        start = end
        end = end + self.xDecSize
        xDecCode = self.code[start: end]
        for i in range(self.xDecSize):
            xDecNum += pow(2,self.xDecSize - i - 1)*xDecCode[i]
        # x str -> float: the fractional value supplies the digits after the point,
        # e.g. 127 and 65535 give 127.65535
        xDecStr = str(xIntNum) + "." + str(xDecNum)
        xFinalNum = float(xDecStr)
        # sign bit: 0 = negative, 1 = positive
        if self.code[signXIndex] == 0:
            xFinalNum *= -1
        return xFinalNum

    def codeClone(self):
        """
        return a code clone
        """
        return self.code.copy()

class Population:
    """
    Manages NumberSpecies:
    1. controls mutation;
    2. controls crossover;
    3. computes fitness;
    4. keeps renewing the population: discards individuals with poor fitness and
       pairs the fittest individuals as fathers/mothers to produce the next generation.
    """
    def __init__(self, popSize = 100, childsSize = 4, maxGenerationSize = 50, formulate = "" ):
        """
        popSize: population size
        childsSize: number of children per couple
        maxGenerationSize: maximum number of generations
        formulate: formula, constant = c1*x**n + c2*y
        """
        # Is expression legal?
        assert formulate != "", "You must input a formulate like target = c1*x**n + c2*y"
        # Extract information from formulate express (raw string avoids invalid-escape warnings)
        m = re.search(r"([\d\.-]+)=([\d\.-]+)x\*\*2\+([\d\.-]+)x",formulate)
        # Is formulate legal?
        assert m is not None,"Your formulate is not legal like: target = c1x**n + c2y "
        # Assign, formulate
        self.target = float(m.group(1))
        self.xCoef = float(m.group(2))
        self.xPower = 2
        self.yCoef = float(m.group(3))
        # Define the index of current generation
        self.generation = 0
        # Assign, generation information
        self.popSize = popSize
        self.childsSize = childsSize
        self.maxGenerationSize = maxGenerationSize
        # Assign, about mutate and cross merge
        self.mutateRate = MUTATE_RATE
        self.crossMergeRate = CROSS_MERGE_RATE
        self.mutatePosSize = MUTATE_SIZE # number of bits flipped per mutation
        # Produce population
        self.species = [NumberSpecies(xIntSize = XINTSIZE, xDecSize = XDECSIZE) for i in range(popSize)]
        # history of population: fitness & x & y & predict constant "Target"
        self.historicalMeanFitness = []
        self.historicalMeanXDecimal = []
        self.historicalMeanYDecimal = []
        self.historicalMeanTargetDecimal = []
        self.XhistoricalTopPopulation = []
        self.TargethistoricalTopPopulation = []

    def getTop20Mean(self,xDecimalList,fitnessList):
        """
        Compute the mean x and the mean fitness of the top 20 individuals
        return the mean x and the mean fitness
        """
        assert len(xDecimalList) > 20, "Your population must more than 20"
        meanXDecimal = sum(xDecimalList[:20])/len(xDecimalList[:20])
        meanfitness = sum(fitnessList[:20])/len(fitnessList[:20])
        return meanXDecimal, meanfitness

    def calFitness(self, speciesList):
        """
        Compute fitness (squared error against the defined target number)
        """
        fitnessList = []
        for i,num in enumerate(speciesList):
            xDecNum = num.getDecimal()
            # get fitness base on express
            fitness = (self.target - self.getTarget(xDecNum)) ** 2
            # save
            fitnessList.append(fitness)
        return fitnessList

    def sortByFitness(self, fitnessList, speciesList):
        """
        Sort by fitness in ascending order (selection-style sort), in place
        """
        L = len(fitnessList)
        for i in range(L):
            for j in range(i,L):
                if fitnessList[i] > fitnessList[j]:
                    fitnessList[i], fitnessList[j] = fitnessList[j], fitnessList[i]
                    speciesList[i], speciesList[j] = speciesList[j], speciesList[i]

    def getTarget(self,x):
        """
        Evaluate the formula for the given value of x.
        """
        return self.xCoef * x ** self.xPower + self.yCoef * x

    def show(self, top=20, detail = False, pause = False):
        """
        Print information about the top N of a generation: generation index,
        fitness, x, predicted result, real result
        There are two modes: detail mode and non-detail mode
        Detail mode: detail = True
            prints the information above for the top N individuals
        Non-detail mode: detail = False
            prints the mean of the information above over the top 20 individuals
        The per-generation information is saved so that it can be visualized later
        """
        # Get decimal of x
        xDecNums = []
        for i in self.species:
            xDecNum = i.getDecimal()
            xDecNums.append(xDecNum)
        # Get fitness
        fitnessList = self.calFitness(self.species)
        # Start show information
        if detail:
            print()
            print("="*80)
            print(" "*30,"*- Top %d -*"%top)
            print(" "*25,"*- %d generation -*"%self.generation)
            print("="*80)
            print("%-12s %-12s %-12s %-12s %-12s"%("Index","Fitness","XDecimal","MyTarget","RealTarget"))
            print("-"*80)
            myTargets = []
            for i in range(top):
                my_target = self.getTarget(xDecNums[i])
                myTargets.append(my_target)
                print("%-12d %-12.3f %-12.3f %-12.3f %-12.3f"%(i, fitnessList[i], xDecNums[i], my_target, self.target))
            print("-"*80)
            # Save
            self.XhistoricalTopPopulation.append(xDecNums[:top])
            self.TargethistoricalTopPopulation.append(myTargets.copy())
            if pause:
                sleep(0.5)
        else:
            xDecimal, fitness = self.getTop20Mean(xDecNums, fitnessList)
            my_target = self.getTarget(xDecimal)
            if self.generation == 1:
                print("%-12s %-12s %-12s %-12s %-12s"%("Generation","Fitness","XDecimal","MyTarget","RealTarget"))
                print("-"*100)
            print("%-12d %-12.3f %-12.3f %-12.3f %-12.3f"%(self.generation,fitness, xDecimal, my_target, self.target))
            # Save history
            self.historicalMeanFitness.append(fitness)
            self.historicalMeanXDecimal.append(xDecimal)
            self.historicalMeanTargetDecimal.append(my_target)
            if pause:
                sleep(0.5)

    def visualization(self):
        """
        Visualize the per-generation history
        """
        # Fitness information about
        plt.figure(figsize=(8,5))
        plt.subplot(2,1,1)
        plt.plot([i+1 for i in range(self.generation)],self.historicalMeanFitness,linestyle="--",marker="o")
        plt.ylabel("Fitness")
        # My Target information about
        plt.subplot(2,1,2)
        plt.plot([i+1 for i in range(self.generation)],self.historicalMeanTargetDecimal,linestyle="--",marker="o")
        plt.ylabel("My Target real = %.3f"%self.target)
        plt.show()

    def visualizationDetail(self):
        """
        Visualize the values taken in each generation
        """
        plt.ion()
        plt.figure(figsize=(8,5))
        xScale = [i for i in range(X_RANGE[0], X_RANGE[1])]
        yScale = getRange(self.xCoef, self.yCoef)
        for i in range(len(self.XhistoricalTopPopulation)):
            plt.cla()
            plt.plot(xScale,yScale, alpha=0.7, linestyle=":",label="Express Space",color="blue")
            plt.axhline(self.target,color="red",alpha=0.7,linestyle="--",label="Real Target Base Line")
            plt.scatter(
                self.XhistoricalTopPopulation[i],
                self.TargethistoricalTopPopulation[i],
                alpha=0.7,
                color="red",
                label = "My Target"
            )
            plt.title("Generation %d max is %d"%(i,self.maxGenerationSize))
            plt.ylim(Y_RANGE[0],Y_RANGE[1])
            plt.xlim(X_RANGE[0], X_RANGE[1])
            plt.legend()
            plt.pause(1)
        plt.close()

    def selectParent(self,size = 20):
        """
        Select the top `size` individuals by fitness (20 by default) and pair them at random
        return index list of select species one is father another is mother
        """
        assert size < len(self.species), "selectParent Func size=%d par must less population%d"%(size, len(self.species))
        # get size of couple
        coupleSize = size // 2
        # get total index of couple
        total = set(range(coupleSize*2))
        # father and mother
        # random.sample needs a sequence (sets are rejected from Python 3.11 on), so sort first
        father = random.sample(sorted(total),coupleSize)
        mother = sorted(total - set(father))
        mother = random.sample(mother,coupleSize)
        return father, mother

    def live(self):
        """
        Life cycle of the population
        """
        while self.generation < self.maxGenerationSize:
            self.nextGeneration()

    def nextGeneration(self):
        """
        Produce the next generation
        """
        # calculate current generation fitness
        fitnessList = self.calFitness(self.species)
        # order by fitness
        self.sortByFitness(fitnessList, self.species)
        # child born
        childs = self.getChildPopulation()
        # choose survival child as next generation
        fitnessListChild = self.calFitness(childs)
        self.sortByFitness(fitnessListChild, childs)
        # advance the generation counter
        self.generation += 1
        # print the sorted current population before it is replaced
        self.show(detail = DETAIL, pause = PAUSE)
        self.species = childs[:self.popSize]

    def getChildPopulation(self):
        """
        Birth of the child generation
        return child
        """
        # selectParent
        fathersI, mothersI = self.selectParent()
        L = len(fathersI)
        # get childs
        childs = []
        for i in range(L):
            for j in range(self.childsSize):
                # child born
                fI = fathersI[i]
                mI = mothersI[i]
                child = self.getChild(self.species[fI], self.species[mI])
                # add child to child population
                childs.append(child)
        return childs

    def getChild(self,f,m):
        """
        1. Cross-merge the binary codes
        2. Mutate the code
        Produces a single child
        """
        assert isinstance(f,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class"
        assert isinstance(m,NumberSpecies),"crossMerge(f,m) f and m must be NumberSpecies class"
        seed = random.uniform(0,1)
        # do crossMerge?
        # decide cross position
        childCode = []
        for i in range(f.totalSize):
            fromSeed = random.uniform(0,1)
            if fromSeed > self.crossMergeRate:
                childCode.append(f.code[i])
            else:
                childCode.append(m.code[i])
        # do mutate?
        # randomly choose x position to mutate
        if seed < self.mutateRate:
            tempPosIndex = [i for i in range(f.totalSize)]
            mutatePos = random.sample(tempPosIndex,self.mutatePosSize)
            # Change code
            for i in mutatePos:
                if childCode[i] == 0:
                    childCode[i] = 1
                else:
                    childCode[i] = 0
        # child born
        child = NumberSpecies(XINTSIZE,XDECSIZE)
        child.assignCode(childCode)
        return child

if __name__ == "__main__":
    # Define an expression
    f="1000=12x**2+25x"
    # Individual code information
    XINTSIZE = 9
    XDECSIZE = 16
    # Population information
    POPULATION_SIZE = 100
    CHILD_SIZE = 10
    MAX_GENERATION_SIZE = 50
    MUTATE_SIZE = 5
    MUTATE_RATE = 0.05
    CROSS_MERGE_RATE = 0.5
    # visualization detail
    X_RANGE = [-100,100]
    Y_RANGE = [-500,6000]
    # Print detail
    DETAIL = False
    PAUSE = True
    # Create a population
    numPopulation = Population(
        popSize = POPULATION_SIZE,
        childsSize = CHILD_SIZE,
        maxGenerationSize = MAX_GENERATION_SIZE,
        formulate=f,
    )
    numPopulation.live()
    # visualization
    if not DETAIL:
        numPopulation.visualization()
    else:
        numPopulation.visualizationDetail()
    print("-"*80)
    best = numPopulation.species[0]
    bestX = best.getDecimal()
    bestTarget = numPopulation.getTarget(bestX)
    print("Best x = %.3f"%bestX)
    print("Best Target = %.3f"%bestTarget)
    print("%.3f = %.3fx**2 + %.3fx"%(numPopulation.target,numPopulation.xCoef,numPopulation.yCoef))
Thank you for reading. If you find any mistakes, please point them out.