python - 为什么遗传算法有效?他们什么时候停止工作?
问题描述
我一直在学习一些化学开发算法,我遇到了遗传算法。所以我写了一个简单的遗传算法,它试图从给定的一组符号(基因)中将目标字符串归零。
所以,
genes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 1234567890, .-;:_!\"#%&/()=?@${[]}"
和target = to be or not be that is the question
。
这是我的头文件:
#a chromosome in a genetic algorithm is a possible solution to the problem
import random
genes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP \
QRSTUVWXYZ 1234567890, .-;:_!\"#%&/()=?@${[]}"
class Individual:
#define some properties
def __init__(self, chromosome):
self.chromosome = chromosome #the actual solution
self.fitness = self.calc_fitness()
def mutated_genes(self):
#mutate the genes you have in an Individual
gene = random.choice(genes)
return gene
def create_gnome(self, target):
gnome_len = len(target)
return [self.mutated_genes() for _ in range(gnome_len)]
def mate(self, par2):
#mate with another individual
#child chromosome
child_chromosome = []
for gp1, gp2 in zip(self.chromosome, par2.chromosome):
#generate a random number
prob = random.random()
#if prob is less than 0.45, accept gene from parent 1
if prob<0.45:
child_chromosome.append(gp1)
elif prob < 0.9:
child_chromosome.append(gp2) #if between 0.45 and 0.9, accept gene from parent 2
else:
child_chromosome.append(self.mutated_genes())
return Individual(child_chromosome)
def calc_fitness(self):
#calculate a fitness score
#this is the number of characters in the string which
#match the target
fitness = 0
for gs,gt in zip(self.chromosome, target):
if gs!=gt:
fitness += 1
return fitness
这是我的驱动程序代码:
%run "string_search.ipynb"
population_size = 100
target = "to be or not to be that is the question"
generation = 1 #setting up generations to evolve
found = False #boolean
population = []
#generate a population
for i in range(population_size):
indiv = Individual([])
gnome = indiv.create_gnome(target)
population.append(Individual(gnome))
while not found:
#sort the population in increasing order of fitness score
population = sorted(population, key = lambda x:x.fitness)
#if the individual having lowest fitness score is 0, then we stop the search
if population[0].fitness == 0:
found = True
break
#otherwise, create a new generation
new_generation = []
#10% of the fittest population goes to the next generation
s = int(0.1*population_size)
new_generation.extend(population[:s])
#from 50% of fittest population, individuals will mate to produce offspring
s = int(0.9*population_size)
for i in range(s):
parent1 = random.choice(population[:50]) #choose some individual from the top 50%
parent2 = random.choice(population[:50]) #choose another individual from the top 50%
child = parent1.mate(parent2)
new_generation.append(child)
population = new_generation
generation += 1
str_chr = "".join(population[0].chromosome)
print("Generation: {} \t String: {} \t Fitness: {}".format(generation, str_chr, population[0].fitness))
这是我的结果:
.
.
.
Generation: 14465 String: to be or not to be that is the 8uestion Fitness: 1
Generation: 14466 String: to be or not to be that is the 8uestion Fitness: 1
Generation: 14467 String: to be or not to be that is the 8uestion Fitness: 1
Generation: 14468 String: to be or not to be that is the 8uestion Fitness: 1
Generation: 14469 String: to be or not to be that is the 8uestion Fitness: 1
Generation: 14470 String: to be or not to be that is the question Fitness: 1
我有点着迷于这个算法如何设法将字符串输出到一个字符之外(question
is 8uestion
)。
我的问题是,为什么遗传算法和它们一样有效?为什么Fitness = 1
在真正收敛到真正的解决方案之前,它们会在很长一段时间内停滞不前?为什么增加人口规模会提高收敛速度?如果我通过适应度分数改变接受父母 1 而不是父母 2 的概率,算法会变得更好吗?
我一直试图理解这一点,但大多数博客只是简单地实现了一些代码和经验状态,即人口越多,收敛时间越短。我会很感激你对我的任何建议。
解决方案
在我看来,这项任务非常适合遗传算法,因为有明显的适应度、交叉和变异选择。您保留具有最正确字符的个人并重新组合它们,这使得以非常相似的字符串结束的机会非常高。每个角色独立地对适应度做出贡献,“基因”之间没有相互作用,这意味着每个角色都可以独立优化。
当谈到最后一点保持很长时间的适应度损失时,这是由于您实施的随机突变造成的。如果父母双方只有一个与目标不同的角色,那么后代将很有可能与目标完全匹配。然而,每个新染色体的 10% 是随机生成的,可能会引入额外的损失。虽然突变对于探索至关重要,但您可能希望在优化过程接近尾声时降低突变率,至少对于一部分人口而言,以便更具开发性和更少探索性。这可能会让你达到你想要的0
.
推荐阅读
- vba - 像运算符一样访问 vba 只返回第一个结果
- xslt-1.0 - 使用 XSLT 版本 1 以 UTC 获取当前日期时间
- reactjs - 有没有内置 JSX 编译器的浏览器?
- string - 二进制字符串减少
- android - 部分屏幕上的 admob 智能横幅
- oracle - ORA-06502: PL/SQL: 在空游标的情况下出现数字或值错误
- node.js - 将标头发送到客户端(Nodejs、MongoDb、Express)后无法设置标头
- ansible - git remote set-url origin 的 Ansible 等价物
- python - 我的模型的验证准确性卡住了!我如何解决它?[张量流/Keras]
- r - 在 R 中显示 ggplot 中所有 bin 的 x 刻度值