首页 > 解决方案 > 为什么遗传算法有效?他们什么时候停止工作?

问题描述

我一直在学习一些化学开发算法,我遇到了遗传算法。所以我写了一个简单的遗传算法,它试图从给定的一组符号(基因)中将目标字符串归零。

所以, genes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 1234567890, .-;:_!\"#%&/()=?@${[]}"target = to be or not be that is the question

这是我的头文件:

#a chromosome in a genetic algorithm is a possible solution to the problem

import random 

genes = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP \
QRSTUVWXYZ 1234567890, .-;:_!\"#%&/()=?@${[]}"


class Individual: 
    #define some properties 
    def __init__(self, chromosome):
        self.chromosome = chromosome #the actual solution 
        self.fitness = self.calc_fitness() 
        
    def mutated_genes(self): 
        #mutate the genes you have in an Individual 
        gene = random.choice(genes) 
        return gene 
    
    def create_gnome(self, target):
        gnome_len = len(target) 
        return [self.mutated_genes() for _ in range(gnome_len)]
    
    def mate(self, par2):
        #mate with another individual 
        #child chromosome 
        child_chromosome = [] 
        for gp1, gp2 in zip(self.chromosome, par2.chromosome):
            #generate a random number 
            prob = random.random() 
            
            #if prob is less than 0.45, accept gene from parent 1 
            if prob<0.45:
                child_chromosome.append(gp1)
            elif prob < 0.9:
                child_chromosome.append(gp2) #if between 0.45 and 0.9, accept gene from parent 2
            else: 
                child_chromosome.append(self.mutated_genes()) 
                
        return Individual(child_chromosome)
                
    def calc_fitness(self):
        #calculate a fitness score
        #this is the number of characters in the string which 
        #match the target 
        fitness = 0
        for gs,gt in zip(self.chromosome, target):
            if gs!=gt:
                fitness += 1
        return fitness 

这是我的驱动程序代码:

%run "string_search.ipynb"
population_size = 100 
target = "to be or not to be that is the question" 
generation = 1 #setting up generations to evolve  
found = False #boolean 
population = [] 
#generate a population 
for i in range(population_size):
    indiv = Individual([])
    gnome = indiv.create_gnome(target)
    population.append(Individual(gnome))
while not found: 
    #sort the population in increasing order of fitness score 
    population = sorted(population, key = lambda x:x.fitness)
    #if the individual having lowest fitness score is 0, then we stop the search
    if population[0].fitness == 0:
        found = True 

        break 
    #otherwise, create a new generation 
    new_generation = [] 
    #10% of the fittest population goes to the next generation 
    s = int(0.1*population_size)
    new_generation.extend(population[:s])
    
    #from 50% of fittest population, individuals will mate to produce offspring 
    s = int(0.9*population_size)
    for i in range(s):
        parent1 = random.choice(population[:50]) #choose some individual from the top 50%
        parent2 = random.choice(population[:50]) #choose another individual from the top 50%
        child = parent1.mate(parent2) 
        new_generation.append(child) 
    population = new_generation 
    generation += 1 
    str_chr = "".join(population[0].chromosome)
    print("Generation: {} \t String: {} \t Fitness: {}".format(generation, str_chr, population[0].fitness))

这是我的结果:

.
.
.
Generation: 14465    String: to be or not to be that is the 8uestion     Fitness: 1
Generation: 14466    String: to be or not to be that is the 8uestion     Fitness: 1
Generation: 14467    String: to be or not to be that is the 8uestion     Fitness: 1
Generation: 14468    String: to be or not to be that is the 8uestion     Fitness: 1
Generation: 14469    String: to be or not to be that is the 8uestion     Fitness: 1
Generation: 14470    String: to be or not to be that is the question     Fitness: 1

我有点着迷于这个算法如何设法将字符串输出到一个字符之外(questionis 8uestion)。

我的问题是,为什么遗传算法和它们一样有效?为什么Fitness = 1在真正收敛到真正的解决方案之前,它们会在很长一段时间内停滞不前?为什么增加人口规模会提高收敛速度?如果我通过适应度分数改变接受父母 1 而不是父母 2 的概率,算法会变得更好吗?

我一直试图理解这一点,但大多数博客只是简单地实现了一些代码和经验状态,即人口越多,收敛时间越短。我会很感激你对我的任何建议。

标签: pythonalgorithmtime-complexitygenetic-algorithm

解决方案


在我看来,这项任务非常适合遗传算法,因为有明显的适应度、交叉和变异选择。您保留具有最正确字符的个人并重新组合它们,这使得以非常相似的字符串结束的机会非常高。每个角色独立地对适应度做出贡献,“基因”之间没有相互作用,这意味着每个角色都可以独立优化。

当谈到最后一点保持很长时间的适应度损失时,这是由于您实施的随机突变造成的。如果父母双方只有一个与目标不同的角色,那么后代将很有可能与目标完全匹配。然而,每个新染色体的 10% 是随机生成的,可能会引入额外的损失。虽然突变对于探索至关重要,但您可能希望在优化过程接近尾声时降低突变率,至少对于一部分人口而言,以便更具开发性和更少探索性。这可能会让你达到你想要的0.


推荐阅读