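OneMax is the introductory problem for genetic algorithms: given a binary string of fixed length, how do we maximize the sum of the digits at all positions?
Take a binary string of length 5 as an example. To a person it is obvious that the sum is largest when every digit is 1 (11111, with a sum of 5), but the genetic algorithm has no such knowledge when we use it to solve the problem. In what follows we solve OneMax with a genetic algorithm written from scratch, without relying on any genetic algorithm package.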
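To make the length-5 case concrete, the search space is small enough (32 strings) to check by brute force. The sketch below is only an illustration; the helper name `best_onemax_string` is hypothetical and not part of this article's code, and the genetic algorithm itself never enumerates the search space like this.

```python
from itertools import product


def best_onemax_string(length):
    """Enumerate every binary string of the given length and
    return the one whose digits have the largest sum."""
    candidates = product([0, 1], repeat=length)
    return max(candidates, key=sum)


best = best_onemax_string(5)
print(best, sum(best))  # (1, 1, 1, 1, 1) 5
```

Brute force stops being practical quickly: a string of length 100 has 2**100 candidates, which is why a search heuristic is used instead.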
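First, we have to turn the problem into a genetic algorithm problem, that is, we have to define the individual, the population, the selection, crossover, and mutation methods, and the fitness function. For a string of length 100, the code in this article uses the following definitions: an individual is a list of 100 digits, each 0 or 1; the population is a list of individuals; selection is tournament selection; crossover is single-point crossover; mutation flips one randomly chosen gene; and the fitness of an individual is the sum of its digits.
If you are not familiar with these definitions, see the second installment of this genetic algorithm series.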
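As a quick refresher on two of these operators, here is a small worked example on 5-bit individuals. It is a sketch with the crossover point and mutation position passed in explicitly so the result is deterministic; the helper names `crossover_at` and `flip_at` are hypothetical and differ from the randomized functions defined later in this article.

```python
def crossover_at(parent1, parent2, loc):
    """Single-point crossover: swap the tails of the two parents from index loc on."""
    child1 = parent1[:loc] + parent2[loc:]
    child2 = parent2[:loc] + parent1[loc:]
    return child1, child2


def flip_at(individual, loc):
    """Mutation: return a copy of the individual with the bit at index loc flipped."""
    mutated = individual.copy()
    mutated[loc] = 1 - mutated[loc]
    return mutated


a = [1, 1, 1, 0, 0]
b = [0, 0, 0, 1, 1]
c1, c2 = crossover_at(a, b, 3)
print(c1, sum(c1))     # [1, 1, 1, 1, 1] 5
print(c2, sum(c2))     # [0, 0, 0, 0, 0] 0
print(flip_at(c2, 0))  # [1, 0, 0, 0, 0]
```

Crossover can combine good segments from two parents, while mutation can reintroduce a 1 at a position where the whole population has lost it.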
The code is explained step by step below; the complete code is given at the end of this article.
Import all the required packages and fix the random seed so the run is reproducible:
import random
import matplotlib.pyplot as plt

random.seed(39)
# 1. define individual and population
def CreateIndividual():
    return [random.randint(0, 1) for _ in range(100)]


def CreatePopulation(size):
    return [CreateIndividual() for _ in range(size)]
# 2.1. define select function
def tournament(population, size):
    # draw `size` distinct participants and keep the fittest one
    participants = random.sample(population, size)
    # evaluate function is defined in 2.4
    winner = max(participants, key=evaluate)
    return winner.copy()


def select(population, size):
    # run one tournament per slot so the population size stays constant
    return [tournament(population, size) for _ in range(len(population))]
# 2.2. define mate function
def SinglePointCrossover(ind1, ind2):
    # swap the tails of the two individuals starting at a random position
    loc = random.randint(0, len(ind1) - 1)
    genes1 = ind1[loc:]
    genes2 = ind2[loc:]
    ind1[loc:] = genes2
    ind2[loc:] = genes1
    return [ind1.copy(), ind2.copy()]


def mate(population, probability):
    # neighbours are paired as (0, 1), (2, 3), ...; this assumes an even population size
    new_population = []
    for i in range(0, len(population), 2):
        ind1 = population[i].copy()
        ind2 = population[i + 1].copy()
        if random.random() < probability:
            new_population.extend(SinglePointCrossover(ind1, ind2))
        else:
            new_population.extend([ind1, ind2])
    return new_population
# 2.3. define mutate function
def flipOneGene(ind):
    loc = random.randint(0, len(ind) - 1)
    ind[loc] = 1 - ind[loc]  # 0->1 or 1->0
    return ind.copy()


def mutate(population, probability):
    new_population = []
    for ind in population:
        if random.random() < probability:
            new_population.append(flipOneGene(ind))
        else:
            new_population.append(ind.copy())
    return new_population
# 2.4. define evaluate function
def evaluate(individual):
    return sum(individual)
The fitness function for OneMax is simply the sum of all the digits in the list.
# 2.5. define statistical metrics to monitor algorithm performance
def population_score_max(population):
    return max([evaluate(ind) for ind in population])


def population_score_mean(population):
    return sum([evaluate(ind) for ind in population]) / len(population)
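To track the algorithm's progress and to catch possible errors in it, we record the maximum and the mean fitness of the population in every iteration.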
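If more detail is needed while debugging, the same idea extends to other summary statistics. The sketch below is an optional variation, not part of this article's code: the helper name `population_summary` is hypothetical, and the minimum and standard deviation are added here on the assumption that a collapsing spread of fitness values is a useful sign that the population is losing diversity.

```python
import statistics


def population_summary(population):
    """Return max, mean, min, and population standard deviation of the fitness values."""
    scores = [sum(ind) for ind in population]  # OneMax fitness: sum of digits
    return {
        "max": max(scores),
        "mean": statistics.mean(scores),
        "min": min(scores),
        "stdev": statistics.pstdev(scores),
    }


# made-up toy population of 4-bit individuals, for illustration only
toy_population = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 0, 0], [1, 1, 1, 1]]
print(population_summary(toy_population))
```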
# 3. Run genetic algorithm
def main(
    POPULATION_SIZE=100,
    TOURNAMENT_SIZE=3,
    CROSSOVER_PROB=0.9,
    MUTATE_PROB=0.1,
    MAX_GENERATIONS=100,
):
    generation = 0
    population = CreatePopulation(POPULATION_SIZE)
    max_scores = [population_score_max(population)]
    mean_scores = [population_score_mean(population)]
    best_individual = []

    while generation < MAX_GENERATIONS:
        population = select(population, TOURNAMENT_SIZE)
        population = mate(population, CROSSOVER_PROB)
        population = mutate(population, MUTATE_PROB)

        # collect statistics
        max_scores.append(population_score_max(population))
        mean_scores.append(population_score_mean(population))
        # compare by fitness; max() without a key would compare the lists lexicographically
        best_individual = max(
            best_individual,
            max(population, key=evaluate),
            key=evaluate,
        ).copy()

        generation += 1

    print("Best Solution:")
    print(best_individual)

    plt.plot(max_scores, color='red', label="Max Score")
    plt.plot(mean_scores, color='green', label="Mean Score")
    plt.legend()
    plt.xlabel("Generations")
    plt.ylabel("Fitness Score")
    plt.grid()
    plt.show()


if __name__ == "__main__":
    main()
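Before running the algorithm, we first have to define some parameters. They are the default arguments of main: a population size of 100 (POPULATION_SIZE), a tournament size of 3 (TOURNAMENT_SIZE), a crossover probability of 0.9 (CROSSOVER_PROB), a mutation probability of 0.1 (MUTATE_PROB), and a maximum of 100 generations (MAX_GENERATIONS).
In every generation of the loop we apply selection (select), crossover (mate), and mutation (mutate), and collect the maximum and mean fitness of the population in order to track the algorithm's progress or to spot problems in it. Depending on the problem, we can also record the fittest individual of each generation (best individual) so that it is not lost to crossover and mutation.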
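Recording the best individual keeps a copy of it outside the population. A related technique, elitism, goes one step further and puts the best individual back into the next generation. The sketch below is one possible variation under stated assumptions, not what this article's code does: the helper name `keep_elite` is hypothetical, and replacing the worst member of the new population is just one of several reasonable choices.

```python
def keep_elite(old_population, new_population, fitness=sum):
    """Return a copy of new_population in which the worst member is replaced
    by the best member of old_population, but only if that elite is strictly
    fitter than everything in new_population."""
    elite = max(old_population, key=fitness)
    result = [ind.copy() for ind in new_population]
    if fitness(elite) > max(fitness(ind) for ind in result):
        worst_index = min(range(len(result)), key=lambda i: fitness(result[i]))
        result[worst_index] = elite.copy()
    return result


# made-up toy populations, for illustration only
old = [[1, 1, 1, 0], [0, 1, 0, 0]]
new = [[1, 0, 1, 0], [0, 0, 0, 0]]
print(keep_elite(old, new))  # [[1, 0, 1, 0], [1, 1, 1, 0]]
```

With this step applied at the end of each generation, the maximum fitness of the population can never decrease from one generation to the next.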
最后,我们观察最优解与种群的数据。
我们成功获得了OneMax的最优解(所有位置上都是1):[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
种群的进化过程如下图所示,我们可以观察到种群的进化速度由快到慢,且在61代左右停止进化,且在第58代时已经产生了最优解。
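Generation numbers like these can also be read from the recorded scores instead of from the plot. The sketch below assumes a list shaped like `max_scores` in `main`, where index 0 holds the initial population and index k the population after k generations; the helper names and the toy data are hypothetical.

```python
def first_generation_reaching(scores, target):
    """Return the index of the first score that is >= target, or None if never reached."""
    for generation, score in enumerate(scores):
        if score >= target:
            return generation
    return None


def last_improvement(scores):
    """Return the index of the last generation whose score beats every earlier score."""
    best_index = 0
    for generation, score in enumerate(scores):
        if score > scores[best_index]:
            best_index = generation
    return best_index


toy_max_scores = [3, 4, 4, 5, 5, 5]  # made-up values, not the article's results
print(first_generation_reaching(toy_max_scores, 5))  # 3
print(last_improvement(toy_max_scores))              # 3
```

For the run in this article, the target would be 100, the fitness of the all-ones string.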
import random
import matplotlib.pyplot as plt

random.seed(39)


# 1. define individual and population
def CreateIndividual():
    return [random.randint(0, 1) for _ in range(100)]


def CreatePopulation(size):
    return [CreateIndividual() for _ in range(size)]


# 2. define select, mate, mutate, evaluate function ----
# 2.1. define select function
def tournament(population, size):
    participants = random.sample(population, size)
    # evaluate function is defined in 2.4
    winner = max(participants, key=evaluate)
    return winner.copy()


def select(population, size):
    return [tournament(population, size) for _ in range(len(population))]


# 2.2. define mate function
def SinglePointCrossover(ind1, ind2):
    loc = random.randint(0, len(ind1) - 1)
    genes1 = ind1[loc:]
    genes2 = ind2[loc:]
    ind1[loc:] = genes2
    ind2[loc:] = genes1
    return [ind1.copy(), ind2.copy()]


def mate(population, probability):
    # neighbours are paired as (0, 1), (2, 3), ...; this assumes an even population size
    new_population = []
    for i in range(0, len(population), 2):
        ind1 = population[i].copy()
        ind2 = population[i + 1].copy()
        if random.random() < probability:
            new_population.extend(SinglePointCrossover(ind1, ind2))
        else:
            new_population.extend([ind1, ind2])
    return new_population


# 2.3. define mutate function
def flipOneGene(ind):
    loc = random.randint(0, len(ind) - 1)
    ind[loc] = 1 - ind[loc]  # 0->1 or 1->0
    return ind.copy()


def mutate(population, probability):
    new_population = []
    for ind in population:
        if random.random() < probability:
            new_population.append(flipOneGene(ind))
        else:
            new_population.append(ind.copy())
    return new_population


# 2.4. define evaluate function
def evaluate(individual):
    return sum(individual)


# 2.5. define statistical metrics to monitor algorithm performance
def population_score_max(population):
    return max([evaluate(ind) for ind in population])


def population_score_mean(population):
    return sum([evaluate(ind) for ind in population]) / len(population)


# 3. Run genetic algorithm
def main(
    POPULATION_SIZE=100,
    TOURNAMENT_SIZE=3,
    CROSSOVER_PROB=0.9,
    MUTATE_PROB=0.1,
    MAX_GENERATIONS=100,
):
    generation = 0
    population = CreatePopulation(POPULATION_SIZE)
    max_scores = [population_score_max(population)]
    mean_scores = [population_score_mean(population)]
    best_individual = []

    while generation < MAX_GENERATIONS:
        population = select(population, TOURNAMENT_SIZE)
        population = mate(population, CROSSOVER_PROB)
        population = mutate(population, MUTATE_PROB)

        # collect statistics
        max_scores.append(population_score_max(population))
        mean_scores.append(population_score_mean(population))
        # compare by fitness; max() without a key would compare the lists lexicographically
        best_individual = max(
            best_individual,
            max(population, key=evaluate),
            key=evaluate,
        ).copy()

        generation += 1

    print("Best Solution:")
    print(best_individual)

    plt.plot(max_scores, color='red', label="Max Score")
    plt.plot(mean_scores, color='green', label="Mean Score")
    plt.legend()
    plt.xlabel("Generations")
    plt.ylabel("Fitness Score")
    plt.grid()
    plt.show()


if __name__ == "__main__":
    main()
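To explain genetic algorithms in depth, this article solved the OneMax problem without using any genetic algorithm package. The next installment introduces the DEAP (Distributed Evolutionary Algorithms in Python) package, and later articles will use the DEAP framework to solve genetic algorithm problems.
I have only recently started writing articles, so discussion and feedback are welcome!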