An Introduction to Genetic Algorithms and Hill Climbing
Genetic Algorithms (GAs) are part of Evolutionary Computing (EC), a rapidly growing area of Artificial Intelligence (AI). They are inspired by the process of biological evolution described by Charles Darwin’s theory of natural selection, in which fitter individuals are more likely to pass on their genes to the next generation. We human beings are also the result of millions of years of evolution.
History of Genetic Algorithms
The GA was developed by John Holland and his collaborators in the 1960s and 1970s.
- As early as 1962, John Holland’s work on adaptive systems¹ laid the foundation for later developments.
- In 1975, Holland published the book “Adaptation in Natural and Artificial Systems”², which drew on work with his students and colleagues.
- The GA became popular in the late 1980s, when it was applied to a broad range of problems that are not easy to solve using other techniques.
- In 1992, John Koza used genetic algorithms to evolve programs that perform certain tasks. He called his method “genetic programming” (GP)³.
What is evolution in the real world?
For thousands of years, humans have acted as agents of genetic selection by breeding offspring with desired traits. All our domesticated animals and food crops are the result. Let’s review the genetic terms used in nature:
- Each cell of a living thing contains chromosomes, which are strings of DNA.
- Each chromosome contains a set of genes, which are blocks of DNA.
- Each gene determines some aspect of the organism (like eye colour).
- A collection of genes is sometimes called a genotype.
- A collection of aspects (like eye colour) is sometimes called a phenotype.
- Reproduction (crossover) involves recombination of genes from the parents, followed by small amounts of mutation (errors) in copying.
- The fitness of an organism is how much it can reproduce before it dies.
- Evolution is based on “survival of the fittest”.
What is a Genetic Algorithm in Computer Science?
Genetic algorithms are categorized as global search heuristics. A genetic algorithm is a search technique used in computing to find exact or approximate solutions to optimization and search problems. It uses techniques inspired by biological evolution, such as inheritance, mutation, selection, and crossover.
The basic process behind a genetic algorithm is as follows.
Initialize population: genetic algorithms begin by initializing a population of candidate solutions. This is typically done randomly to provide even coverage of the entire search space. Each candidate solution is a chromosome, characterized by a set of parameters known as genes.
Evaluate: next, the population is evaluated by assigning a fitness value to each individual in the population. At this stage we often want to take note of the current fittest solution and the average fitness of the population.
Check termination: after evaluation, the algorithm decides whether it should terminate the search, depending on the termination conditions that were set. Usually the search stops because the algorithm has reached a fixed number of generations or an adequate solution has been found. When the termination condition is finally met, the algorithm breaks out of the loop and typically returns its final search results to the user.
Selection: if the termination condition is not met, the population goes through a selection stage in which individuals are selected based on their fitness score. The higher the fitness, the better the chance an individual has of being selected. The selected individuals, usually taken in pairs, are called parents.
Crossover and mutation: the next stage is to apply crossover and mutation to the selected individuals. This stage is where new individuals (children) are created for the next generation.
Next generation: at this point the new population goes back to the evaluation step and the process starts again. We call each cycle of this loop a generation.
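The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the toy objective (maximize the number of 1 bits in a bit string), the function names (toy_fitness, select_parents, crossover, mutate_bits, evolve) and every constant are hypothetical choices made for this example.

```python
import random

GENES = "01"            # assumption: chromosomes are bit strings
LENGTH = 20             # assumption: genes per chromosome
POP_SIZE = 30           # assumption: individuals per generation
MUTATION_RATE = 0.02    # assumption: per-gene mutation probability
MAX_GENERATIONS = 200   # assumption: fixed generation budget


def toy_fitness(chromosome):
    # Toy objective: the more 1 bits, the fitter the individual.
    return chromosome.count("1")


def select_parents(population):
    # Fitness-proportionate selection; the +1 keeps every weight positive.
    weights = [toy_fitness(c) + 1 for c in population]
    return random.choices(population, weights=weights, k=2)


def crossover(mother, father):
    # Single-point crossover: head of one parent, tail of the other.
    point = random.randrange(1, LENGTH)
    return mother[:point] + father[point:]


def mutate_bits(chromosome):
    # Each gene is replaced by a random gene with a small probability.
    return "".join(
        random.choice(GENES) if random.random() < MUTATION_RATE else gene
        for gene in chromosome
    )


def evolve():
    # Initialize: a random population of candidate solutions.
    population = ["".join(random.choice(GENES) for _ in range(LENGTH))
                  for _ in range(POP_SIZE)]
    for generation in range(MAX_GENERATIONS + 1):
        # Evaluate: note the current fittest individual.
        best = max(population, key=toy_fitness)
        # Terminate on a perfect solution or when the budget is spent.
        if toy_fitness(best) == LENGTH or generation == MAX_GENERATIONS:
            return best, generation
        # Selection, crossover and mutation build the next generation.
        population = [mutate_bits(crossover(*select_parents(population)))
                      for _ in range(POP_SIZE)]


if __name__ == "__main__":
    best, generation = evolve()
    print(best, toy_fitness(best), generation)
```

Fitness-proportionate selection and single-point crossover are only one common combination; other selection and recombination schemes fit the same loop.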
Implementing an example of GA in Python
Now, let’s see how to crack a password using a genetic algorithm. Imagine that a friend asks you to solve the following challenge: “You must find the three-letter word I set up as a password on my computer”.
In my example, we’ll start with a password of length 3, where each character in the password is a letter. An example password would be “nkA”. We will start with a randomly generated initial sequence of letters, then change one random letter at a time until the word is “Anh”.
At first, we guess some randomly generated three-letter words, such as “Ink”, “aNj” and “cDh”. The words “Ink” and “cDh” each have exactly one letter in the same position as in the password, “Anh”, so we say they have a score of 1. The word “aNj” has a score of 0, since none of its letters match the password (matching is case-sensitive).
Since we haven’t found the solution, we can produce a new generation of words by combining some of the words we already have, for example “Inh” and “aDj”. Of these two new words, “Inh” has a score of 2 and is very close to the password. We say that this second generation is better than the first, since it is closer to the solution.
A third generation can then be formed, in which the word “Inh” produces the word “Anh” when the “I” is randomly mutated into “A”. This simple example captures how a GA works.
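The scores and the two recombination steps in this walkthrough can be checked with a few lines of Python. The helper name score and the hand-picked slice points are assumptions made for illustration; an actual run would choose the crossover and mutation positions at random.

```python
target = "Anh"


def score(guess):
    # Count the letters that match the target in the same position.
    return sum(1 for expected, actual in zip(target, guess)
               if expected == actual)


# First generation: three random guesses.
for word in ["Ink", "aNj", "cDh"]:
    print(word, score(word))          # Ink 1, aNj 0, cDh 1

# Second generation: combine pieces of existing words by hand.
inh = "Ink"[:2] + "cDh"[2:]           # "In" + "h"
adj = "aNj"[0] + "cDh"[1] + "aNj"[2]  # "a" + "D" + "j"
print(inh, score(inh))                # Inh 2
print(adj, score(adj))                # aDj 0

# Third generation: mutate the first letter of "Inh" into "A".
anh = "A" + inh[1:]
print(anh, score(anh))                # Anh 3
```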
Pseudocode of this algorithm
_letters = [a..zA..Z]
target = "Anh"
guess = get 3 random letters from _letters
while guess != target:
    index = get random value from [0..length of target)
    guess[index] = get 1 random letter from _letters
The example implementation in Python
Now we will implement the example in Python. Each character in the password is considered a gene. We need a gene set to use for building guesses. For this example, that will be a generic set of letters.
geneSet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
The algorithm also needs a target password to guess.
target = "Anh"
Next, the algorithm needs a way to generate a random string from the gene set.
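import random

def generate_parent(length):
    genes = []
    # random.sample draws without replacement, so sample repeatedly
    # in case the requested length exceeds the size of the gene set.
    while len(genes) < length:
        sampleSize = min(length - len(genes), len(geneSet))
        genes.extend(random.sample(geneSet, sampleSize))
    return ''.join(genes)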
The fitness value is the only feedback the engine gets to guide it toward a solution. In this project, the fitness value is the number of letters in the guess that match the letter in the same position of the password.
def get_fitness(guess):
    return sum(1 for expected, actual in zip(target, guess)
               if expected == actual)
Next, the engine needs a way to produce a new guess by mutating the current one.
def mutate(parent):
    index = random.randrange(0, len(parent))
    childGenes = list(parent)
    newGene, alternate = random.sample(geneSet, 2)
    childGenes[index] = alternate \
        if newGene == childGenes[index] \
        else newGene
    return ''.join(childGenes)
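One detail of mutate is worth spelling out: it samples two distinct genes so that, if the first happens to equal the letter being replaced, the second can be used instead. The child therefore always differs from its parent in exactly one position. The snippet below is a small check of that property; it repeats geneSet and mutate only so that it runs on its own, and the parent word and the loop count are arbitrary choices for illustration.

```python
import random

geneSet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def mutate(parent):
    # Same logic as above, repeated so this snippet is self-contained.
    index = random.randrange(0, len(parent))
    childGenes = list(parent)
    newGene, alternate = random.sample(geneSet, 2)
    childGenes[index] = alternate if newGene == childGenes[index] else newGene
    return ''.join(childGenes)


parent = "Ink"
for _ in range(5):
    child = mutate(parent)
    # Positions where the child differs from the parent.
    changed = [i for i, (p, c) in enumerate(zip(parent, child)) if p != c]
    print(parent, "->", child, "changed positions:", changed)
```

Each printed line lists a single changed position, because the fallback gene guarantees the replacement is never the original letter.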
We also need a display function to show information. Normally, the display function also outputs the fitness value and how much time has elapsed.
import datetime

def display(guess):
    # startTime is set just before the search begins.
    timeDiff = datetime.datetime.now() - startTime
    fitness = get_fitness(guess)
    print("{0}\t{1}\t{2}".format(guess, fitness, str(timeDiff)))
Finally, we run the solution using the functions above, as follows.
random.seed()
geneSet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
target = "Anh"
startTime = datetime.datetime.now()
bestParent = generate_parent(len(target))
bestFitness = get_fitness(bestParent)
display(bestParent)

# Keep mutating until every letter matches the target. Checking here also
# covers the rare case where the first random guess is already correct.
while bestFitness < len(target):
    child = mutate(bestParent)
    childFitness = get_fitness(child)
    # Discard children that are no better than the current best guess.
    if bestFitness >= childFitness:
        continue
    display(child)
    bestFitness = childFitness
    bestParent = child
Run the code above and it prints each improved guess together with its fitness value and the elapsed time, ending with the target password.
Genetic algorithms are quite easy to understand, right?
Translated from: https://towardsdatascience.com/an-introduction-to-genetic-algorithms-c07a81032547