HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Gramatical evolution doesn't result on creating good constants

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
resultevolutiongramaticalcreatingdoesngoodconstants

Problem

I'm using Grammatical Evolution for symbolic regression tasks (i.e. searching the space of mathematical expressions to find the model that best fits a given dataset).

I'm evolving solutions according to a user-specified grammar:

(0)
  ::=         (0)
          |                  (1)
          |                    (2)
          |                   (3)

 (1)     
  ::= *     (0)
        | +     (1)
        | -     (2) 
        | /     (3)

 (2)
  ::= 0.1            (0)
           | 1.0            (1)
           | 10.0           (2)

 (3)
  ::= x                (0) 

 (4)
  ::= sin             (0)
          | cos             (1) 
          | -               (2)


The population is a collection of arrays of integers. For example:

[0, 1, 2, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1]


is the genotype of the individual +(x, +(1, +(1, 1)) (function $x + 3$).

The translation is performed using the integers to look up the table which describe the BNF grammar:

Step  Sequence  Rule   State
  1      0       0.0     
  2      1       1.1    + 
  3      2       0.2    + 
  4      0       3.0   x + 
  5      0       0.0   x +   
  6      1       1.1   x +  + 
  7      1       0.1   x +  + 
  8      1       0.1   x + 1.0 + 
  9      0       0.1   x + 1.0 +   
 10      1       1.1   x + 1.0 +  + 
 11      1       0.1   x + 1.0 +  + 
 12      1       2.1   x + 1.0 + 1.0 + 
 13      1       0.1   x + 1.0 + 1.0 + 
 14      1       2.1   x + 1.0 + 1.0 + 1.0


The fitness function is the reciprocal of the sum, taken over the fitness
cases, of the absolute error between the evolved and the target function.

My problem is that, while the algorithm finds the general model, it often has trouble finding the right constants.

For example, if the target function is $x^2 + x - 12$, it finds something like:

$$x^2 + x - (10 + \sin(x))$$

or

$$x^2 + x - (1 + 1) \times (10 - 1 - 1 - 1 - 1 - 1 - 1 - 1 - 1) \times ((10 - (10 / (1 + 1)) - (10 / (1 + 1 + 1 + 1 + 1)))$$

which is corr

Solution

It sounds like you're saying that your approach is able to effectively find the shape of the expression, but it takes it a lot longer to find the right constants.

One possible approach is to generate an expression of the correct shape using your genetic programming method, then fill in the constants using a different method. I would recommend trying numerical optimization techniques. Replace each constant in your expression with a symbolic value $\alpha$. Each different constant (at each different syntactic position) gets its own symbolic value. Then, use numerical optimization to find the value of the constants.

Example 1. For instance, consider your complex expression $x^2 + x - (\text{stuff})$. Replace this with $x^2 + x - \alpha$ (when you have an entire subtree that contains only constants, no variables, replace the entire subtree with a single symbolic value). We now get a function $f(x,\alpha) = x^2 + x - \alpha$. Use the training set $(x_1,y_n),\dots,(x_n,y_n)$ to define an objective function:

$$\Phi(\alpha) = \sum_{i=1}^n (f(x_i,\alpha) - y_i)^2.$$

Finally, use numerical optimization methods to search for $\alpha$ that minimizes $\Phi(\alpha)$. You could try gradient descent, Newton's method, or other techniques. The value for that subtree deriving using genetic programming supplies a reasonable starting point for the search.

Example 2. In some cases, you will have more than one constant. For instance, you might get the expression $x^2 + 2x - 3$. Replace each constant with a symbolic value, so we get $f(x,\alpha,\beta) = x^2 + \alpha x - \beta$. Now you can derive an objective function as above:

$$\Phi(\alpha,\beta) = \sum_{i=1}^n (f(x_i,\alpha,\beta)-y_i)^2,$$

and use optimization methods to search for $\alpha,\beta$ that minimize $\Phi(\alpha,\beta)$.

Context

StackExchange Computer Science Q#60712, answer score: 3

Revisions (0)

No revisions yet.