SharpNEAT Project Roadmap

Last reviewed / updated 2017-05-21



Short / Near Term

Activation Functions Review

  • Review the set of activation functions provided. Add new functions and remove those that have proven to be of little use.
  • Review function naming to align with the terminology adopted in recent years, e.g. ReLU, Softmax.
  • Modify activation function curves where necessary; in particular, ensure there is a gradient to follow over a wide input range, rather than hard truncation at the output range extremes (see the sketch after this list).
  • Performance tuning.
  • Efficacy sampling tests.
    • May result in changing the global default activation function in SharpNEAT, and/or the defaults for each task. In turn this may result in performance improvements, e.g. ReLU versus the logistic function.
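
As an illustration of the point about gradients above, the sketch below shows a clipped linear activation that keeps a small slope outside the nominal [0,1] output range rather than truncating hard. This is a minimal sketch only; the function name and leak constant are illustrative, not SharpNEAT's actual implementation.

    // Sketch only: a clipped linear activation with a small 'leak' slope outside [0,1],
    // so that a weight change always produces some change in output (a gradient to follow).
    public static double LeakyClippedLinear(double x)
    {
        const double leak = 0.001; // illustrative value, not a SharpNEAT constant
        if(x < 0.0) return x * leak;
        if(x > 1.0) return 1.0 + (x - 1.0) * leak;
        return x;
    }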


Performance Tuning

  • Building on recent improvements, i.e. the tuning of the multi-threaded speciation algorithm and of activation function performance.
  • ANNs operating on single precision floats.
    • Should improve performance by halving the memory bandwidth required to move arrays of connection weights, improving the CPU cache hit rate, and doubling the number of weights processed per vector/SIMD CPU instruction (see below).
    • Consider switching the precision level used as the global default and as the per-task defaults.
    • Genomes will continue to use double precision pending a broader review and restructure of the entire code base (see below).
    • May require new classes, e.g. SinglePrecisionFastCyclic (see the sketch after this list).
  • Use of the built-in SIMD/Vectorization support in .NET (see Performance - SIMD & Vectorization below).
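
By way of illustration, a single precision cyclic network activation pass might look something like the sketch below. The class and member names are hypothetical (SinglePrecisionFastCyclic is mentioned above only as a possible name) and are not existing SharpNEAT types; the point is simply that the hot loop operates on float arrays rather than double arrays.

    using System;

    // Hypothetical sketch of a single precision cyclic network activation pass.
    // Connections are stored as parallel arrays: source node index, target node index, weight.
    public sealed class SinglePrecisionCyclicSketch
    {
        readonly int[] _srcIdx;             // connection source node indexes
        readonly int[] _tgtIdx;             // connection target node indexes
        readonly float[] _weights;          // connection weights (single precision)
        readonly float[] _preActivation;    // per node accumulated input signal
        readonly float[] _postActivation;   // per node output signal

        public SinglePrecisionCyclicSketch(int[] srcIdx, int[] tgtIdx, float[] weights, int nodeCount)
        {
            _srcIdx = srcIdx;
            _tgtIdx = tgtIdx;
            _weights = weights;
            _preActivation = new float[nodeCount];
            _postActivation = new float[nodeCount];
        }

        public void ActivateOneTimestep()
        {
            // Propagate signals along connections.
            for(int i = 0; i < _weights.Length; i++) {
                _preActivation[_tgtIdx[i]] += _postActivation[_srcIdx[i]] * _weights[i];
            }

            // Apply the activation function and reset the accumulators.
            for(int j = 0; j < _preActivation.Length; j++) {
                _postActivation[j] = Math.Max(0f, _preActivation[j]); // e.g. ReLU
                _preActivation[j] = 0f;
            }
        }
    }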

Mid Term

Full code review, restructure and refactor

Most code bases have room for improvement, and SharpNEAT is no exception. One area I would like to revisit is the size of some of the classes, in terms of lines of code (LOC) and cyclomatic complexity, and also the size of some methods/functions.

Over the last few years I've adopted a rough heuristic of keeping classes to fewer than 400 LOC (300 ideally), and methods substantially smaller than that. This isn't a hard rule, but larger classes are generally a good proxy for poor design choices and the presence of defects. So a revisit of the code with these ideas in mind is planned.

  • The functional division used in the evolution algorithm class(es) can be improved.
  • The IExperiment and IGuiExperiment abstractions could be greatly clarified and tidied up.
  • Overall API and functional division review, tidy-up and improvements.
  • Use of IoC (inversion of control) and DI (dependency injection).

Long Term

Integration with Native Math Libraries

  • The ANN code can tap into highly optimized matrix-vector multiplication subroutines provided by natively compiled math libraries. In particular, NEAT should benefit from sparse matrix sub-routines that fully utilise CPU and GPU capabilities such as vector/SIMD instructions, FMA (fused multiply-add) instructions, and massive parallelism. Several natively compiled math libraries are candidates here.
  • Of particular note is the plug-in native math library support in mathnet-numerics. SharpNEAT may use this to gain access to the abstraction that mathnet provides, thus supporting a wide range of options rather than tying SharpNEAT to one or two libraries. However, a recent check found that the Intel MKL provider did not support MKL's sparse matrix sub-routines, so this may have to be addressed (the status of the other providers is unclear; CUDA in particular warrants strong attention).
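
For reference, the kind of usage being considered is sketched below using the mathnet-numerics API; the matrix layout and values are purely illustrative, and whether the native provider actually accelerates the sparse path is exactly the open question noted above.

    using MathNet.Numerics;
    using MathNet.Numerics.LinearAlgebra;

    class SparseMatrixSketch
    {
        static void Main()
        {
            // Opt in to the native MKL provider, if the native binaries are installed.
            Control.UseNativeMKL();

            // Sketch: a sparse weight matrix multiplied by a node activation vector.
            // Illustrative values only; not SharpNEAT code.
            var weights = Matrix<double>.Build.Sparse(4, 4);
            weights[0, 1] = 0.5;
            weights[2, 3] = -1.2;

            var activations = Vector<double>.Build.Dense(new[] { 1.0, 0.25, 0.0, 0.75 });
            var result = weights * activations;
            System.Console.WriteLine(result);
        }
    }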


Speciation Research

I have this vague idea that speciation by comparing genomes is possibly flawed in some significant ways; this fits the narrative around novelty search research, i.e. that following an objective function may not lead you to the desired objective.

For now the ideas under this heading are best covered by a number of fairly rambling blog posts, which I hope to condense into something more concrete at some future time...



Python bindings

Provide Python bindings to allow evolutionary runs to be set up and executed from Python.



Investigate Alea GPU

Alea GPU allows GPU kernels to be written in C# and run from the .NET runtime environment. It may therefore be possible to obtain significantly better performance by rewriting carefully chosen portions of SharpNEAT to run on a GPU.
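
For a first impression of the programming model, a delegate-style parallel-for along the lines of the Alea GPU samples is sketched below; the arrays and per-element work are purely illustrative and not SharpNEAT code, and the exact API details should be checked against the Alea GPU documentation.

    using System;
    using Alea;
    using Alea.Parallel;

    class AleaGpuSketch
    {
        // Sketch only: GPU parallel-for with automatic memory management.
        [GpuManaged]
        static void Main()
        {
            const int n = 1000000;
            var x = new float[n];
            var y = new float[n];
            var z = new float[n];

            Gpu.Default.For(0, n, i => z[i] = x[i] + y[i]);
            Console.WriteLine(z[0]);
        }
    }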



Notes / Miscellany

Performance - ReLU (Rectified Linear Units)

The ReLU activation function is currently in widespread use in AI neural net models; it is considerably simpler to compute than the logistic function, which has been the default in SharpNEAT since its inception.

The simplicity of the ReLU computation will likely yield a significant performance improvement over the logistic sigmoid. Some investigation is required to evaluate the scale of the improvement (with some thought also given to SIMD based code), and to understand the implications for the wider efficacy of neuro-evolution when using an activation function that differs so markedly from the function that has formed the basis of most neuro-evolution research to date.

Notes

Leaky ReLU is defined as y = x for x > 0, and y = 0.01x otherwise. It is similar to ReLU but has a non-zero gradient for negative inputs, so learning can still proceed in that region.
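
For reference, both functions are trivial to express in C#; the definitions below are illustrative only and are not SharpNEAT's activation function classes.

    // Illustrative definitions only.
    public static double ReLU(double x)
    {
        return System.Math.Max(0.0, x);
    }

    public static double LeakyReLU(double x)
    {
        return x > 0.0 ? x : 0.01 * x;
    }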



Performance - Floating Point Precision

The relative merits of single versus double precision floats.

There is an open question regarding how much precision is required in neuro-evolution methods. For AI methods that learn weights by following gradients there is an argument for allowing very small gradients to be followed, otherwise learning may stop. In NEAT a very similar question arises: what is the smallest useful weight change/delta when mutating a connection weight? And can that smallest level of precision be represented by single precision floats?

A single precision float is 32 bits (4 bytes); a double precision float is 64 bits (8 bytes). Therefore there is a potential speed improvement to be gained by using less precision, in terms of fitting more weights into the CPU caches, more efficient use of memory bandwidth, and the ability to apply SIMD instructions to twice as many weights in one operation (at the time of writing 256 bit SIMD instructions are common and 512 bit instructions will be available soon; 512/32 = 16 floats per instruction).

A broader question might be whether the precision could be reduced further still, given the existence of half precision (16 bit) floats.
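
As a rough yardstick for the "smallest useful delta" question: near a weight value of 1.0, single precision can resolve changes down to about 2^-23 (~1.2e-7), whereas double precision resolves down to about 2^-52 (~2.2e-16). The snippet below illustrates this (assuming RyuJIT/SSE, where float arithmetic is performed at single precision).

    class PrecisionSketch
    {
        static void Main()
        {
            // Illustrative only: a 1e-8 weight delta is lost at single precision but
            // retained at double precision.
            float  wf = 1f;   float  df = 1e-8f;
            double wd = 1.0;  double dd = 1e-8;

            System.Console.WriteLine(wf + df == wf); // True:  delta below float resolution near 1.0
            System.Console.WriteLine(wd + dd == wd); // False: delta retained at double precision
        }
    }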

Notes / Links
Response from Brandyn Webb

Fwiw, I seem to recall productively reducing at least some of the weights in Inkwell (Apple's handwriting recognition engine) to 8 bit logarithmic values (expanded on the fly via a small lookup table). We'd periodically discretize them this way during training so that over time the weights could compensate for each others' discretization errors (otherwise if you just lop blindly at the end the odds are too high of systematic shifts that add up over large numbers of weights). In general I think the more abstract the representation, the less granularity you need. That is, whether a pixel is brighter than its neighbor or not can require quite a precise measurement, and this can matter, as can precise averages over a large number of noisy pixels; but usually whether something is a dog or a tree, and the relative implications thereof, is relatively high contrast--it's not that borderline cases never exist, but just that they're increasingly rare as you ascend the hierarchy of abstraction. You also generally have far fewer examples of more abstract concepts, so there's often little basis for a high precision tally thereof.
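
A minimal sketch of the kind of scheme described above follows; the table size, magnitude range and rounding are arbitrary choices made purely to illustrate the idea of an 8 bit logarithmic encoding expanded via a small lookup table.

    using System;

    // Illustrative 8 bit logarithmic weight encoding: 1 sign bit + 7 bit magnitude index,
    // decoded via a 128 entry lookup table. Constants are arbitrary, for illustration only.
    static class LogQuantSketch
    {
        const double MinMag = 1e-4;   // smallest representable magnitude
        const double MaxMag = 8.0;    // largest representable magnitude
        static readonly double[] Table = BuildTable();

        static double[] BuildTable()
        {
            var tbl = new double[128];
            double logMin = Math.Log(MinMag), logMax = Math.Log(MaxMag);
            for(int i = 0; i < 128; i++) {
                tbl[i] = Math.Exp(logMin + ((logMax - logMin) * i / 127.0));
            }
            return tbl;
        }

        public static byte Encode(double w)
        {
            byte sign = (byte)(w < 0.0 ? 0x80 : 0x00);
            double mag = Math.Min(Math.Max(Math.Abs(w), MinMag), MaxMag);
            double logMin = Math.Log(MinMag), logMax = Math.Log(MaxMag);
            int idx = (int)Math.Round((Math.Log(mag) - logMin) / (logMax - logMin) * 127.0);
            return (byte)(sign | idx);
        }

        public static double Decode(byte b)
        {
            double mag = Table[b & 0x7F];
            return (b & 0x80) != 0 ? -mag : mag;
        }
    }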



Performance - SIMD & Vectorization

SharpNEAT does not currently utilise any of the vector instructions available in modern CPUs. A key issue has been the lack of support for SIMD in the .NET framework. As of .NET 4.6 the new RyuJIT compiler will generate some SIMD instructions when running in 64-bit mode and when the .NET System.Numerics.Vector classes are in use. Only a limited set of SIMD functionality is exposed via .NET 4.6, but it is worth investigating, not least because it ought to be a relatively simple exercise to use the available Vector classes where possible, e.g. in the neural network classes.
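
A minimal sketch of how System.Numerics.Vector<T> might be applied to one of the neural network hot loops follows; the method and parameter names are illustrative and are not existing SharpNEAT code.

    using System.Numerics;

    // Illustrative sketch: SIMD dot product over connection weights and source node signals.
    // Assumes the two arrays are the same length. Hardware acceleration requires RyuJIT
    // (64 bit) and Vector.IsHardwareAccelerated == true.
    public static float WeightedSum(float[] weights, float[] signals)
    {
        int width = Vector<float>.Count;
        var acc = Vector<float>.Zero;
        int i = 0;

        // Process 'width' weights per iteration using SIMD multiply and add.
        for(; i <= weights.Length - width; i += width) {
            acc += new Vector<float>(weights, i) * new Vector<float>(signals, i);
        }

        // Horizontal sum of the accumulator, then handle any remaining elements.
        float sum = Vector.Dot(acc, Vector<float>.One);
        for(; i < weights.Length; i++) {
            sum += weights[i] * signals[i];
        }
        return sum;
    }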

C++/CLI

To more fully utilise SIMD in modern CPUs (including the coming 512-bit wide vector instructions due in Intel CPUs in 2016), it is necessary to run code outside of the .NET managed runtime. The most promising option here appears to be the C++/CLI language/environment; this provides an environment in which C++ targeting the native CPU can be written, while interaction with the managed runtime remains possible.

Also worth noting here is that in recent years server CPUs (e.g. Intel Xeons) have tended to spend their increasing transistor budget on more cores, with up to 20 cores possible at the time of writing, whereas desktop CPUs have followed a different path of integrating a GPU and other systems that previously lived on dedicated chips/circuitry off the CPU. At this time (2016) it is worth investigating the very significantly higher CPU resources available on server CPUs. This is perhaps not widely discussed because in the past CPU evolution was mostly on a single path, but it now appears to have forked into at least two very different paths (server and consumer CPUs).


Review 2D Physics Engine

Some of the experiments shipped with SharpNEAT use the box2dx physics engine.

There is a detailed overview of the situation regarding the use of this 2D physics engine, and the rendering library being used with it, at github.com/colgreen/box2dx. Ultimately it may be wise to switch to another 2D engine such as the Farseer Physics Engine, although there is also some debate about the status of that project.



Unity Game Engine Integration

There is at least one github repository that has integrated SharpNEAT and the Unity game engine (not to be confused with the Microsoft Unity IoC framework), located at github.com/lordjesus/UnityNEAT.

The changes required to make this integration work appear to be relatively modest, and therefore it's worth looking into whether the SharpNEAT core library classes could be re-organized slightly and Unity integration provided as part of the main SharpNEAT project.



Miscellany

  • Replace roulette wheel selection with stochastic universal sampling (SUS); see the sketch after this list.
  • Regularized k-means speciation.
  • Species visualisation(s).
  • Simple primer tutorial(s).
  • Unit tests.
  • Periodic innovation ID defragmentation.
  • Population-wide integrity checks, e.g. ensure a given innovation ID is not used by both nodes *and* connections.
  • Distributed NEAT. Island model. (+ fast binary serialization/IO).
  • 3D HyperNEAT substrate/network visualisation. Where the number of connections is large, visualisation is possible by randomly thinning out the connections that are shown. This approach is used in TrackVis (http://www.trackvis.org/) to visualise brain fibers.
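
A minimal sketch of stochastic universal sampling, as referenced in the first item above, follows; the fitness handling and RNG usage are illustrative and are not SharpNEAT's selection code.

    using System;

    static class SusSketch
    {
        // Stochastic universal sampling: N equally spaced pointers laid over the cumulative
        // fitness range, giving the same expected selection pressure as roulette wheel
        // selection but with much lower variance. Fitness values must be non-negative.
        public static int[] Select(double[] fitness, int selectionCount, Random rng)
        {
            double total = 0.0;
            foreach(double f in fitness) { total += f; }

            double step = total / selectionCount;
            double pointer = rng.NextDouble() * step;   // single random start point

            var selected = new int[selectionCount];
            double cumulative = 0.0;
            int idx = 0;

            for(int i = 0; i < selectionCount; i++)
            {
                // Advance to the individual whose cumulative fitness interval spans the pointer.
                while(idx < fitness.Length - 1 && cumulative + fitness[idx] <= pointer) {
                    cumulative += fitness[idx];
                    idx++;
                }
                selected[i] = idx;
                pointer += step;
            }
            return selected;
        }
    }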


Contact: Colin Green
SharpNEAT is a Heliosphan.org project.