Individuals reproduce and mutate

Submitted by: @import:stackexchange-codereview

Problem

How can I improve the performance (in terms of computation time) of this code?

# Settings
const nbloci = 100         # length of the genome
const N = 100              # Number individuals in the population
const nbgenerations = 100  # number of generations
const mu = 1/10^5          # mutation rate
const s = 0.01             # effect of a given mutation on fitness

# Packages
using Distributions.Binomial
using Distributions.wsample

# Type and Functions
type Ind
    Genome
end

function makepopulation(N)
    Pop = Array(Ind, N)
    for i = 1:N
        Pop[i] = Ind(ones(Float64, nbloci))
    end
    return(Pop)
end

function calculatefitnesses(Pop)
    fitnesses = Array(Float64, N)
    for (i,I) = enumerate(Pop)
        fit = 1
        for l = I.Genome
            fit *= l
        end
        fitnesses[i] = fit
    end
    return(fitnesses)
end

function mutate(I)
    nbmut = rand(Binomial(nbloci, mu), 1)[1]
    mutposs = rand(1:nbloci, nbmut)
    for mutpos = mutposs
        I.Genome[mutpos] == 1.0 ? I.Genome[mutpos] = mutfit : I.Genome[mutpos] = 1.0
    end
end

function reproduce(Pop)
    fitnesses = calculatefitnesses(Pop)
    newPop = Array(Ind, N)
    for i = 1:N
        newPop[i] = deepcopy(wsample(Pop, fitnesses, 1)[1])
        mutate(newPop[i])
    end
    return(newPop)
end

##### MAIN #####

# Just a useful constant
const mutfit = 1.0-s

# Simulation
Pop = makepopulation(N)             # Create population
for generation = 1:nbgenerations    # Iterate over all generations
    Pop = reproduce(Pop)           
end


The above code simulates a population of individuals (Ind) that each have a genome of nbloci independent loci (a locus, the singular of loci, is a position on a chromosome). makepopulation creates a population of clones at the beginning of the simulation. Then, iterating over all generations (the for loop), the function reproduce is called on the population. The function reproduce calls calcfitnesses, which returns an array of fitnesses.

Solution

You are giving the constants very short names and then explaining their meaning in a comment next to each one:

const nbloci = 100         # length of the genome
const N = 100              # Number individuals in the population
const nbgenerations = 100  # number of generations
const mu = 1/10^5          # mutation rate
const s = 0.01             # effect of a given mutation on fitness


It would be better to give long and self-explanatory names without comments:

const genome_length = 100
const population_size = 100 
const no_generations = 100
const mutation_rate = 1/10^5
const s = 0.01                # effect of a given mutation on fitness # Too cryptic for me to understand


Many beginners think:


If I write many many many comments everywhere in my code, it will for sure become better!

But this is not true. You should remove all the meaningless comments, such as:

  • # Settings
  • # Packages
  • # Type and Functions


Maybe using functions will slow down my code very much?

No, quite the opposite: in Julia, code inside functions runs much faster than the same code at global scope.
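
As a minimal sketch of that point, the generation loop from the question's MAIN section can be moved into a function. This assumes a current Julia (1.x) rather than the older syntax used in the question. The names `next_generation` and `run_simulation` are hypothetical, and `next_generation` is only a placeholder for the question's `reproduce` so that the sketch stays self-contained.

```julia
# Hypothetical placeholder for the question's `reproduce`: it only copies
# each genome, so no selection or mutation happens in this sketch.
next_generation(population) = [copy(genome) for genome in population]

# The generation loop lives inside a function instead of at global scope,
# so the compiler can specialize it on the concrete type of `population`.
function run_simulation(population, generations)
    for _ in 1:generations
        population = next_generation(population)
    end
    return population
end

initial_population = [ones(Float64, 100) for _ in 1:100]
final_population = run_simulation(initial_population, 100)
```

Timing both variants with `@time` on your own machine is the way to confirm how much difference this makes for the real simulation.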

But you should


Avoid writing overly-specific types

(Reference is again the link above).
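
A short sketch of what this advice can look like for the fitness calculation, assuming Julia 1.x syntax. Both function names are hypothetical and only serve to contrast a narrow signature with a more general one.

```julia
# Overly specific: this method only accepts a Vector{Float64}.
fitness_specific(genome::Vector{Float64}) = prod(genome)

# More general: any vector of real numbers is accepted, and Julia still
# compiles a specialized version for each concrete argument type.
fitness_general(genome::AbstractVector{<:Real}) = prod(genome)

fitness_general([1.0, 0.99, 0.99])         # Vector{Float64}
fitness_general(Float32[1.0, 0.99])        # Vector{Float32}
fitness_general(view([1.0, 0.99], 1:1))    # SubArray of a Vector{Float64}
```

The general signature costs nothing at run time, but it keeps the function usable if the genome representation changes later.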

for l = I.Genome


Readability counts: avoid names like l and I, because it is easy to mistake one for the other.

mutpos = mutposs


The same applies here: mutpos and mutposs differ by a single character.
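
One possible renaming of the two loops, as a sketch in Julia 1.x syntax. All names here (`mutant_fitness`, `genome_fitness`, `flip_loci!`, `locus_effect`, `position`) are suggestions rather than names from the question, and the mutated positions are passed in directly so that the example does not need the Distributions package.

```julia
const mutant_fitness = 0.99   # 1 - s, called `mutfit` in the question

# Multiply the fitness effects of all loci of one genome.
function genome_fitness(genome)
    product = 1.0
    for locus_effect in genome            # was: for l = I.Genome
        product *= locus_effect
    end
    return product
end

# Toggle each listed locus between the wild-type and the mutant value.
function flip_loci!(genome, mutated_positions)
    for position in mutated_positions     # was: for mutpos = mutposs
        genome[position] = genome[position] == 1.0 ? mutant_fitness : 1.0
    end
    return genome
end
```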


Is there a way to improve the calcfitness step, maybe by coding
true/false instead of 1/1-s?

No. The fitness must be a number: if you use a true/false value for the fitness, then you are no longer searching genetically but randomly.
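
If the idea is instead to store each locus as true/false while the fitness stays a number, a sketch could look like the following. This reading of the question is an assumption, the names `selection_coefficient` and `bool_fitness` are hypothetical, and whether it is actually faster than the Float64 genome would have to be measured.

```julia
const selection_coefficient = 0.01   # `s` in the question

# Each locus is `true` if it carries a mutation. Because the loci are
# independent, the product over loci reduces to (1 - s)^(number of mutations),
# so the fitness is still a Float64.
bool_fitness(genome::AbstractVector{Bool}) =
    (1 - selection_coefficient)^count(genome)

example_genome = falses(100)   # BitVector with no mutations
example_genome[3] = true
example_genome[42] = true
two_mutations_fitness = bool_fitness(example_genome)
```

Flipping a locus then becomes `genome[position] = !genome[position]`, and selection can still weight individuals by the numeric value returned here.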

Context

StackExchange Code Review Q#77345, answer score: 2