Ch. Hafner's Generalized Genetic Program (GGP)

Last update 21/02/2017

Galina Mikolsic's translation to Beorussian

Translation to Indonesian (by ChamelonJohn.com)

Translation to Portuguese (by Travel Ticker)

Overview

Download and run

Default example 

Chart analysis and stock value prediction

Some results

 

Overview

The approximation and extrapolation of functions is simple and very demanding at the same time. Well-known series expansions (Fourier series, power series, etc.) allow to efficiently approximate many of the functions encountered by engineers, but they are completely useless for extrapolating almost any function known within an interval. Genetic Algorithms (GA), Genetic Programming (GP), and Evolutionary Strategies (ES) are essentially based on the observation that nature performs some optimization with several interesting strategies. Although John Koza's original GP algorithms allow to find a formula that 1) fits the data of a given function quite well and that 2) allows extrapolation, these algorithms are neither robust nor efficient.

Together with Jürg Fröhlich (Extended Genetic Programming and other optimization techniques) and Hansueli Gerber (Java and Oberon versions of the traditional GP algorithm), I am developing Generalized GP codes that are efficient and robust. The main idea of these codes is the approximation of the given function by a series expansion with arbitrary basis functions containing at least one parameter. Thus, we have 1) a set of linear parameters (the amplitudes of the basis functions) that can easily be computed by matrix methods, 2) a set of nonlinear parameters (contained in the basis functions) that can be optimized by known optimization techniques, 3) a set of basis functions that are selected and improved by genetic operations.

One of the most interesting features of GGP is its ability to find approximations of a given data set that also allow to obtain good extrapolations. This allows one to often obtain good time series predictions, even when the problem is ill-posed. Time series predictions have many interesting applications in physics, engineering, economy, etc.

 

Download and run

Download the Demo Version 1.2 of the Generalized Genetic Program (GGP) for Intel platforms under Windows NT/95, contained in the compressed files GGP-DEMO.ZIP (size 579k). When you uncompress GGP-DEMO with WinZip, you will obtain the executable file GGP.EXE (size 1.57M). You can put GGP.EXE in any directory and run it as any EXE file. 

To run GGP, use "My computer" or the "Explorer" to navigate to GGP.EXE and double click it. You may also create a shortcut and double click the shortcut icon. GGP contains a graphic window with the function to be analyzed and an info window that contains some information on the standard procedure for using the GGP Demo version. Follow the instructions in the info window. For a short, but more detailed description (including how to predict the NASDAQ, other indices, and stock values), click here.

For more information, you can also download a paper on GGP and similar methods (Word document GGPPAPER.DOC, size 3750k, contained in the zip file GGPPAPER.ZIP, size 218k or PostScript file ggppaper.ps, size 7808k, contained in the zip file GGPPS.ZIP, size 519k).

Default example

John Koza has presented the following example: The given function is y1=x*x/2. This function is approximated by a formula consisting of the elementary operations +,-,*,/, the two types of terminals, the variable x and random constants. Since the given function is extremely simple, it is no wonder that Koza's GP algorithm finds it.

If we replace y1 by the even simpler formula y2=2.1, the GP algorithm becomes highly inefficient because it has difficulties to construct the number 2.1 by random numbers and the elementary operations. This problem can easily be removed by implementing a nonlinear parameter p as a third terminal type. If p is contained in a formula, its value must be determined by a nonlinear parameter optimization algorithm.

y3=sin(3.1416*x)/(3.1416*x) is not much more complicated than y1 and y2, but it has some properties that can be misleading for the GP algorithm. First of all, the function sin must be added to the elementary operations of the GP algorithm because constructing sin using +,-,*,/ is too complicated. If you watch at y3 in the interval x=-1...+1, you get the impression that cos(3.1416*x) is not far away. Of course, you need to know the result, wen you decide to only add sin to the elementary functions. Otherwise, you probably prefer cos. If you add both, cos and sin, cos is fitter than sin because it has the same symmetry as y3. Moreover, the division in y3 causes numerical problems x is close to zero. These problems are avoided by the implementation of protected operations. However, both the division and the fact that y3 looks more like cos than like sin are misleading - even for an extended GP algorithm which includes a parameter optimization. With a C++ implementation of Koza's GP algorithm, we were not able to find y3 (defined in the interval x=0...2) with a population of 500 individuals (formula) within 200 generations. The GGP code works with a population of only 100 individuals, replaces only 50 of them to generate a new generation, and finds useful results within 10-30 generations, even if only 3-5 fitness computations are performed for the parameter optimization. Depending on the initial (random) population and on the system parameter, GGP needs 500-5000 fitness computations, whereas the original GP requires probably much more than 100'000 fitness computations. Note that the code does not find exactly y3, but excellent approximations that allow excellent extrapolations even for x toward infinity. Three typical solutions are:

7.9513E3*sin(sin(e*1.4727E-5)*sin(Pi*x))/x)
-87.339*sin(sin(2*1.5702)*sin(Pi*x))/(sin(1.5702*e)*x/e)
1.0004*sim(3.1389*x)/(Pi*x)

Note that all of these solutions are more complicated than y3 itself. Nonetheless, these are excellent approximations and extrapolations even for large x. 

y3 is the default function to be analyzed with the Demo Version 1.2 of GGP. Note that this version allows you to graphically watch the GGP search, but it does not allow you to visualize the formulae created by GGP and it also does not allow you to modify the settings of the GGP search parameters.

Chart analysis and stock value prediction

Note that chart analysis is a tricky task for specialists. GGP may be helpful because it does not suffer from expectations and other emotions, but often a chart does simple not provide enough information with a sufficient accuracy for obtaining good predictions. Theoretically, GGP could be used for short-term as well as for long-term predictions, but there seems to be much more noise on short-term charts than on long-term charts. Therefore, GGP long-term predictions usually are much more reliable. For a short description how to predict the NASDAQ, other indices, and stock values, click here.

Some results

You can watch some plots that demonstrate how GGP works.

Back to Ch. Hafner's home page

Back to the Electromagnetic Code Designers home page