python遗传算法(GA)DEAP-Overview学习摘要

442 查看

DEAP-Overview

DEAP是一个python遗传算法框架,这里是它的简介。DEAP documentation
今天整理一下DEAP的概览,大体了解一下它的流程。初学,不严谨,仅作为自己的备忘学习笔记。

一. Types

The first thing to do is to think of the appropriate type for your problem.This is done with the creator module.
第一件事就是思考你问题的合适类型(是什么样的优化问题,最小值?最大值?单目标?多目标?)。在creator模块中可以实现。

For example, the following creates a FitnessMin class for a minimization problem and an Individual class that is derived from a list with a fitness attribute set to the just created fitness.
例如,下面为一个最小化问题创建一个FitnessMin类和个体类,它来自于刚刚创建的适应度带有fitness属性值集合的列表。

from deap import base, creator
creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", list, fitness=creator.FitnessMin)

That’s it. More on creating types can be found in the Creating Types tutorial.
更多关于创建类型的问题可以在Creating Types tutorial查看。

二. Initialization

Once the types are created you need to fill them with sometimes random values, sometime guessed ones. Again, DEAP provides an easy mechanism to do just that.
一旦这些类型被创建,你就需要用一些随机的值来填充它们,有时是猜测值。同样的, DEAP极大的提供了这样的一个简单的机制来做这件事儿。

The Toolbox is a container for tools of all sorts including initializers that can do what is needed of them. The following takes on the last lines of code to create the initializers for individuals containing random floating point numbers and for a population that contains them.
toolbox是一个工具容器,可用来包含各种“初始化体”(可以做它们需要的事儿)。以下是以代码来创建包含随机浮点数的初始化和个人人口包含他们最后的线。下面最后几行代码用来创建“初始化体”,为包含那些随机值的“个体”和包含它们的种群。

import random
from deap import tools

IND_SIZE = 10

toolbox = base.Toolbox()
toolbox.register("attribute", random.random)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attribute, n=IND_SIZE)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)    #是吧,创建的individual放入了population中

This creates functions to initialize populations from individuals that are themselves initialized with random float numbers. The functions are registered in the toolbox with their default arguments under the given name.
这将创建函数来初始化种群(来自那些用随机浮点数初始化而成的个体们)。这些函数在给定名字的默认参数下被“注册到”toolbox中,

register感觉这个单词在这儿的理解可以看做是“创建”的意思,然后toolbox中就有了toolbox.individual。具体理解,还得看后面的详细介绍。
toolbox貌似和function有关,里边的第一个参数就是函数名。因为在Algorithm步骤里时,出现了这样的用法toolbox.population()
More initialization methods are found in the Creating Types tutorial and the various Examples.
更多初始化的方法可以在 Creating Types tutorialExamples中查看。

三. Operator

toolbox.register("mate", tools.cxTwoPoint)

Operators都在toolbox模块里,要给每个算子选择合适算法。在Operator里边可以给每个步骤的算子分别选择合适的算法。

In addition you must create your evaluation function. This is how it is done in DEAP.
就是这个评价功能函数得自己写,其实,也就是适应度函数了!

Note also that fitness values must be iterable, that is why we return a tuple in the evaluate function.
就是注意评价函数返回值必须是可迭代的。

def evaluate(individual):
    return sum(individual),

toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=0.1)
toolbox.register("select", tools.selTournament, tournsize=3)
toolbox.register("evaluate", evaluate)

四. Algorithms

Now that everything is ready, we can start to write our own algorithm. It is usually done in a main function. For the purpose of completeness we will develop the complete generational algorithm.
现在一切具备,我们可以开始写自己的算法了。通常在主函数里边写。完整性的目的,我们将开发完整的分代算法。

def main():
    pop = toolbox.population(n=50)
    CXPB, MUTPB, NGEN = 0.5, 0.2, 40

    # Evaluate the entire population
    fitnesses = map(toolbox.evaluate, pop)
    for ind, fit in zip(pop, fitnesses):
        ind.fitness.values = fit

    for g in range(NGEN):
        # Select the next generation individuals
        offspring = toolbox.select(pop, len(pop))
        # Clone the selected individuals
        offspring = map(toolbox.clone, offspring)

        # Apply crossover and mutation on the offspring
        for child1, child2 in zip(offspring[::2], offspring[1::2]):
            if random.random() < CXPB:
                toolbox.mate(child1, child2)
                del child1.fitness.values
                del child2.fitness.values

        for mutant in offspring:
            if random.random() < MUTPB:
                toolbox.mutate(mutant)
                del mutant.fitness.values

        # Evaluate the individuals with an invalid fitness
        invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
        fitnesses = map(toolbox.evaluate, invalid_ind)
        for ind, fit in zip(invalid_ind, fitnesses):
            ind.fitness.values = fit

        # The population is entirely replaced by the offspring
        pop[:] = offspring

    return pop

纵观整个步骤,可以发现DEAP在实现GA过程中的思路一般是很清晰的。

  1. 确定Types--creator创建,选好解决问题的类型

  2. 初始化Initialization--toolbox注册个体啊,种群啊等等函数。

  3. Operator--算子选择,交叉,变异,选择,进化等等。每个算子都有不同的算法,可以选择的!

  4. Algorithms--算法就是具体将上面注册的函数啊,等等应用结合起来,编写流程。

例子

#    This file is part of DEAP.
#
#    DEAP is free software: you can redistribute it and/or modify
#    it under the terms of the GNU Lesser General Public License as
#    published by the Free Software Foundation, either version 3 of
#    the License, or (at your option) any later version.
#
#    DEAP is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#    GNU Lesser General Public License for more details.
#
#    You should have received a copy of the GNU Lesser General Public
#    License along with DEAP. If not, see <http://www.gnu.org/licenses/>.


#    example which maximizes the sum of a list of integers
#    each of which can be 0 or 1

import random

from deap import base
from deap import creator
from deap import tools

creator.create("FitnessMax", base.Fitness, weights=(1.0,))   #这里这个base.Fitness是干嘛的???
creator.create("Individual", list, fitness=creator.FitnessMax)    #这里的list,fitness是参数,干嘛的???

toolbox = base.Toolbox()    #base是个很基本的类啊!!!看来很重要

# Attribute generator: define 'attr_bool' to be an attribute ('gene')
#                      which corresponds to integers sampled uniformly
#                      from the range [0,1] (i.e. 0 or 1 with equal
#                      probability)
toolbox.register("attr_bool", random.randint, 0, 1)   #包含了0,1的随机整数。不明白这里是干嘛的???

# Structure initializers: define 'individual' to be an individual
#                         consisting of 100 'attr_bool' elements ('genes')
toolbox.register("individual", tools.initRepeat, creator.Individual,    #tools.initRepeat是干嘛的???
    toolbox.attr_bool, 100)

# define the population to be a list of 'individual's
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

# the goal ('fitness') function to be maximized    注意!!!这里就定义了我们的适应度fitness函数啦!!!因为我们要解决的就是求和问题
# 只要返回一个值给我们的这个适应度函数啊!利用自带的sum函数!
# 这里取名为evalOneMax是因为这里的适应度函数就是我们后面要用来评价的依据,evaluate

def evalOneMax(individual):
    return sum(individual),

#----------
# Operator registration
#----------
# register the goal / fitness function
# 这里的toolbox register语句的理解:注册了一个函数evaluae依据的是后面的evalOneMax 理通了!!!
toolbox.register("evaluate", evalOneMax)

# 瞧瞧,这里的交叉函数,尼玛,crossover不用,非得用个mate,还以为是华为mate呢!你看,这里的交叉算子采用的函数,也是已经停供的,可以选择的
# register the crossover operator
toolbox.register("mate", tools.cxTwoPoint)

# register a mutation operator with a probability to
# flip each attribute/gene of 0.05
toolbox.register("mutate", tools.mutFlipBit, indpb=0.05)

# operator for selecting individuals for breeding the next
# generation: each individual of the current generation
# is replaced by the 'fittest' (best) of three individuals
# drawn randomly from the current generation.
toolbox.register("select", tools.selTournament, tournsize=3)    #这里选择的tournsize又是什么意思呢?

#----------

def main():
    random.seed(64)
    # hash(64)is used
    
    # random.seed方法的作用是给随机数对象一个种子值,用于产生随机序列。
    # 对于同一个种子值的输入,之后产生的随机数序列也一样。
    # 通常是把时间秒数等变化值作为种子值,达到每次运行产生的随机系列都不一样
    
    # create an initial population of 300 individuals (where
    # each individual is a list of integers)
    pop = toolbox.population(n=300)    #定义了300个个体的种群!!!

    # CXPB  is the probability with which two individuals
    #       are crossed
    #
    # MUTPB is the probability for mutating an individual
    #
    # NGEN  is the number of generations for which the
    #       evolution runs   进化运行的代数!果然,运行40代之后,就停止计算了
    CXPB, MUTPB, NGEN = 0.5, 0.2, 40
    
    print("Start of evolution")
    
    # Evaluate the entire population
    fitnesses = list(map(toolbox.evaluate, pop))
    for ind, fit in zip(pop, fitnesses):
        ind.fitness.values = fit
    
    print("  Evaluated %i individuals" % len(pop))    #这时候,pop的长度还是300呢
    
    # Begin the evolution      开始进化了哈!!!注意注意注意!就是一个for循环里了!40次--代数
    for g in range(NGEN):
        print("-- Generation %i --" % g)
        
        # Select the next generation individuals
        offspring = toolbox.select(pop, len(pop))
        # Clone the selected individuals
        offspring = list(map(toolbox.clone, offspring))
    
        # Apply crossover and mutation on the offspring
        for child1, child2 in zip(offspring[::2], offspring[1::2]):

            # cross two individuals with probability CXPB
            if random.random() < CXPB:
                toolbox.mate(child1, child2)

                # fitness values of the children
                # must be recalculated later
                del child1.fitness.values
                del child2.fitness.values

        for mutant in offspring:

            # mutate an individual with probability MUTPB
            if random.random() < MUTPB:
                toolbox.mutate(mutant)
                del mutant.fitness.values
    
        # Evaluate the individuals with an invalid fitness
        invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
        fitnesses = map(toolbox.evaluate, invalid_ind)
        for ind, fit in zip(invalid_ind, fitnesses):
            ind.fitness.values = fit
        
        print("  Evaluated %i individuals" % len(invalid_ind))
        
        # The population is entirely replaced by the offspring
        pop[:] = offspring
        
        # Gather all the fitnesses in one list and print the stats
        fits = [ind.fitness.values[0] for ind in pop]
        
        length = len(pop)
        mean = sum(fits) / length
        sum2 = sum(x*x for x in fits)
        std = abs(sum2 / length - mean**2)**0.5
        
        print("  Min %s" % min(fits))
        print("  Max %s" % max(fits))
        print("  Avg %s" % mean)
        print("  Std %s" % std)
    
    print("-- End of (successful) evolution --")
    
    best_ind = tools.selBest(pop, 1)[0]
    print("Best individual is %s, %s" % (best_ind, best_ind.fitness.values))

if __name__ == "__main__":
    main()