Con-x: Connectionist Backprop Language and Simulator


This site describes Con-x, the Connectionist Backprop Language and Simulator. The idea behind this system is to allow experimenters to quickly and easily create, train, and test feedforward neural networks using backprop or quickprop. Basically, Con-x (pronounced "kun ex") is a neural network scripting language and environment.

Who's it for? It was designed to be used by serious backprop researchers, as well as a teaching tool for use in introductory AI courses. There are many simulators out there, and many concentrate on the novice student. Unfortunately, these systems usually sacrifice advanced functionality (such as flexibility and the ability to process large datasets) for ease of use and a nice graphical interface. Con-x attempts to be a tool that is easy to use, but one that the user will not outgrow.

Con-x works quite well with very large data sets, but has a simple scripting interface. We have also used it inside a fuzzy logic/artificial neural network robot controller (see papers here and also XRCL). All sources and related files are distributed under an open-source license.


Getting Con-x

The system was designed and written by Doug Blank. This is the latest (beta) version, although there have been others written in Scheme (too slow) and C (too limited). This version is written in C++. It defines a programming language that runs networks quite efficiently (about 1 million connection transactions a second on a 180 MHz Pentium Pro). Check back here often, as it is currently evolving. If you send me mail, I'll let you know when something you are interested in changes.

Current Version: 4 (see source for subversion #)
Last updated: Aug 25, 1999
Source Available: Con-x, Cxbox, and Cxplot. Browse C++ files here.
Compiles: On most ANSI C++ compilers (VC++, g++), although the latest versions include readline, which is not available in the DOS version. Here is a basic Makefile that works for Linux.
Executables: Really old Windows version Win-95 DOS, Zipped, Linux ELF, GNU Zipped (latest), Tcl/Tk tools Cxbox, and Cxplot
Additional Resources: the cluster program is invaluable for analyzing hidden layer activations.
Site Map :
     cx/
samples/ (sample code and data)
source/ (Con-x sources)
bin/ (binaries for old Win95 DOS and current Linux)
hw/ (Homework assignments - under construction)


What's new

Con-x supports many enhancements that make it truly interactive: control-c now pauses execution and allows interactive changes before continuing, the command prompt is editable with a history of previous commands, improved error handling, new functions to make creating networks easier, user-defined functions with arguments and recursion, if/else/endif, the ability to "include" files, shell-style #! execution and # comments, cross validation, and many other new commands and functions.

For a complete list of historic changes, see here.


Overview

Con-x is a neural network simulator that runs in a terminal window with no built-in graphics, although it does have a set of associated graphical tools to help with visualization. You issue commands, like:

layer input 2
layer hidden 2
layer output 1
connect input hidden
connect hidden output

In fact, the above is a Con-x program that creates a 2-2-1 (2 unit input layer, 2 unit hidden layer, and 1 unit output layer) feedforward network. This also creates the weights between the layers so that all of the units in one layer are connected to all of the units in the next layer. (Note: "input", "hidden", and "output" are arbitrary names; you can call them anything you like).

Commands can be given interactively at the Con-x command prompt ([#] Con-x>), or read in from a file. For a quick demo, change to the Con-x demo directory (usually cd cx/samples/), and type the following commands (shown in bold) at the OS prompt (signified by the percent sign):

% ../bin/cx xor1.cx

(You could also add the cx program to your path).

This runs the Con-x script named xor1.cx which is a 2-2-1 network designed to learn the xor function (see Table 1). To exit the program, type the letter q at the Con-x prompt.

Table 1: Xor
Input #1 Input #2 Output
0 0 0
0 1 1
1 0 1
1 1 0

Con-x supports the following commands:

! <numeric exp> backprop break connect <args>
cont copy <args> dim <array element> exit
go <numeric exp> help history init
layer <args> learning (l) list load <string>
load_pattern <numeric exp> loaddata loadwts pause
poke <args> print (?) <exp> propagate quit (q)
init (i) reportnet savewts set <args>
show shownet system (s) <string> unconnect
var <args> vars    

Abbreviations are shown in parenthesis. Each command will be described below.

In general, <exp> can be a string, function, expression, or number. Here is the language's basic BNF:

<args> => <exp> <exp> ...
<exp> => [ <exp> <op> <exp> ] OR <string> OR <number> OR <variable> OR <function> OR <array element>
<string> => "text" or 'text'
<function> => &functionname( <args> )
<variable> => $varname
<array element> => @arrayname[ <numeric exp> ]
<op> => + - * / % < <= > >= = == != and or
<comment> => /* Comments ... */
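For example, the following short fragment exercises most of these forms (a sketch only; the variable and array names here are invented for illustration):

var $rate 0.25                        /* a <variable> */
var $rate [$rate * 2]                 /* a bracketed <exp> using an <op> */
dim @errs[10]                         /* an array, referenced as an <array element> */
var @errs[0] &round( [$rate * 100] )  /* a <function> call inside an assignment */
? "rate is $rate\n"                   /* <string>s use double or single quotes */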

Below is a list of current Con-x functions. These are described in detail here.

&424(  ) check two numbers to see if in 40-20-40 range
&abs(  ) absolute value
&act(  ) activation value at unit
&argv(  ) access command line arguments
&bias(  ) bias value at unit
&cos(  ) cos of value
&eof(  ) EOF of a file handle
&error(  ) error value at unit
&layer_type(  ) layer type
&layer_num(  ) layer number
&layer_name(  ) layer name
&layer_size(  ) layer size
&left(  ) left characters of a string
&peek(  ) look at a value at a unit (DEPRECATED)
&rand(  ) a random integer value between 0 and number
&right(  ) right characters of a string
&round(  ) rounds off to nearest integer
&setweight(  ) set the value of a weight between two units
&sin(  ) sine of value
&string(  ) convert number to string
&target(  ) target value at unit
&trim(  ) trim spaces off of string
&user(  ) get information from user
&value(  ) convert string to number
&weight(  ) value of a weight between two units

Before heading into the details, there are a few things to know and keep in mind about the Con-x scripting language:
  1. Spaces around operators, numbers, and comment markers are important, so pay attention to them in the examples below.
  2. All commands are in lowercase. The system is case sensitive with respect to your variable and array names. Constants (like FALSE and ON) are always typed in uppercase.
  3. The order that layers are created and connected is important and meaningful. Always create and connect in the order you wish the activations to flow (i.e., from input, through the hidden layer(s), and finally to the output layer(s)).
  4. Variables, layer names, and array names are composed of the characters 'a' through 'z', 'A' through 'Z', '0' through '9', '_', '.', and '/'. The first character must be a letter or underscore, however.
  5. File names are surrounded by double quotes, but layer names are not. All names need quotes (double or single) when passed to a function.
  6. Single characters are self-evaluating (they don't need quotes).
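The following short fragment respects the spacing, naming, and quoting rules above (a sketch; the layer name in_layer is made up for illustration):

layer in_layer 3                 /* a layer name: no quotes needed here */
var $half [3 / 2]                /* spaces around the operator and inside the brackets */
? &act( "in_layer" 0 )           /* the same layer name passed to a function: quotes required */
? "\n"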

Simulator Details

This simulator currently uses either Backprop in its vanilla form (derived in principle from the PDP books) or Quickprop. Weight updating occurs after each pattern or after each epoch. It takes inputs in any range, but targets must be between 0 and 1 (support for symmetrical outputs is nearing completion).

Currently, only fully-connected weight matrices are available. The activation function (set actfun) is now alterable; it can be the standard asymmetrical sigmoid centered around 0.5 (ASYM) or symmetrical around 0 (SYM). (Note: now that I have user-definable functions, I should allow for the user to define their own actfun.)

The error function (set errfun) can be set to the difference between desired and actual output (DIFF) or the hyperbolic arctangent of that difference (ATANH).
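For example (a sketch using the values named above; check your version for the actual defaults):

set actfun SYM           /* sigmoid symmetrical around 0 */
set errfun ATANH         /* use atanh of the (target - output) difference */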

Recurrent connections are not permitted, but there are many supporting commands to implement Elman's Simple Recurrent Network (SRN) architecture, or even sequential recursive auto-associative memory (sRAAM).

Con-x was written to handle really large files. For example, input and target patterns are not actually loaded into memory, but are kept on disk and loaded one at a time as needed. Also, the data and its configuration are kept in two separate files, so the configuration can be edited without touching the huge data file.


Getting Started

To train a neural network in Con-x, follow these basic steps:
  1. Create a data file. This is an ASCII file that contains all of the input values (usually between 0 and 1) and all of the output (target) values (also usually between 0 and 1) for each pattern to train. The input and target values for each pattern are stored together. For example, to train on xor data, the file "xor.dat" would contain:
    0 0 0
    1 0 1
    0 1 1
    1 1 0
    
    The main part of the filename can be anything, but the extension should be ".dat".

    Newlines in the file are ignored; just make sure there are spaces between all values, and that each pattern has the correct number of inputs and outputs.

  2. Next, you need to describe the data. This is done by creating a file named "xor.cfg". The main part of the filename must match the one picked earlier, and the extension must be ".cfg". This ".cfg" file contains 5 numbers, as follows:
    {number of patterns to skip}
    {number of patterns to train}
    {number of patterns to test}
    {number of input values}
    {number of output values}
    
    For xor we would have 0 4 0 2 1: skip none, train on all 4, save none for testing, 2 inputs, and 1 output.

  3. Next, we write the Con-x script for training. Samples are described below.

  4. Usually, you will have to experiment with hidden layer size, learning rate, and other variables.

That's it! Now, let's look at some specific scripts.

Samples

This section examines a few examples: a simple XOR (#1) network which takes full advantage of the Con-x system; a more sophisticated XOR (#2) that does more of the processing manually; a sequential XOR (#3) which provides an example of complex network manipulations; and some other variations.

Sample #1: xor1.cx

File: xor1.cx

This network demonstrates the basic functions of the backpropagation simulator. First, the code:

/* XOR Con-x Sample File: xor1.cx */
/* Con-x Version 4                */
/* D.S. Blank 1998                */
/* dangermouse.brynmawr.edu           */

/* DESCRIPTION: basic xor backprop */

set session "xor"        /* Sets the .wts, .dat, and .cfg files */

var $insize  2
var $hidsize 2
var $outsize 1

layer input   $insize  /* create the layers; order is meaningful */
layer hidden $hidsize
layer output $outsize

connect input hidden   /* connect them up; order is meaningful */
connect hidden output

set epsilon  .5        /* learning rate */
set momentum .9

var $report_rate 25
var $current_error 0
var $epoch 1

while [[$current_error < 1] and [$epoch < 1000]]  
/* while not all 4 patterns learned */
  cycle
  if [[$epoch % $report_rate] = 0]         /* report every once in a while */
	var $p [correct / [correct + wrong]]
	var $e error
  	? " Epoch # $epoch percentage $p error $e\n"
  endif
  var $epoch [$epoch + 1]      /* all 4 seen */
  var $current_error [correct / [correct + wrong]]
endwhile
var $p [correct / [correct + wrong]]
var $e error
? "Final  # $epoch percentage $p error $e\n"
/* save the weights! */
savewts  
quit 

This Con-x script documents the program, defines the data files, creates a fully connected 2-2-1 backprop network, sets the learning parameters, trains the network to 100% accuracy, and saves the learned weights. Not bad for 22 commands!

Let's take a look at each of these steps in detail. First, the script documents the program using the comment symbols /* and */:

/* XOR Con-x Sample File: xor1.cx */
/* Con-x Version 4                */
/* D.S. Blank 1998                */
/* http://dangermouse.brynmawr.edu/   */
Notice that there are spaces before the */ and after the /*. The comments could have been contained in a single pair of /* and */ symbols, but it looks nice this way. Next, the data files were set:

set session "xor"        /* Sets the .wts, .dat, and .cfg files */
This command sets the data file to xor.dat and the data configuration file to xor.cfg. It also determines where the weights will be saved, namely the file xor.wts.

Let's take a quick detour to examine the data files. Here they are: xor.dat, xor.cfg. We will use these data files for all of the XOR examples, showing the separation of data and control.

The .dat file defines the inputs and target values for the XOR problem, and is really just Table 1, given in floating point values. Each input and target output pair is called a "pattern".

The .cfg file defines the .dat file. The format is:

skip# train# test# input# output#

Here, skip# is the number of patterns to skip over, train# is the number of patterns to use in training the network, test# is the number of patterns to save for testing, input# is the number of input values in the pattern, and output# is the number of output targets.

We see for this example that we are skipping zero patterns, training on 4, saving zero for testing, using 2 inputs and 1 output.
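In other words, the xor.cfg file used by these examples should contain just these five numbers (shown here on a single line; presumably, as with the .dat file, the exact whitespace layout is not important):

0 4 0 2 1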

Now, back to the program. Next, the network was defined:

layer input   $insize  /* create the layers; order is meaningful */
layer hidden $hidsize
layer output $outsize

connect input hidden   /* connect them up; order is meaningful */
connect hidden output
The layer command takes two arguments: a name and a size. The sizes for these were defined by the commands:

var $insize  2
var $hidsize 2
var $outsize 1
The command var defines a variable, which may be used anywhere that a numeric or string value is expected. These values act as the sizes of the network's layers: 2, 2, and 1. All variables start with the dollar sign.

The layer command created layers with arbitrary names and a specific number of units per layer. The connect command creates connections (weights) between pairs of layers, so that every unit in the first layer is connected to every unit in the second.

Next, the learning parameters were set:

set epsilon  .5        /* learning rate */
set momentum .9
Epsilon is the network's "learning rate", and momentum is the standard value of the same name used in many backprop algorithms.

The network was then trained to 100% accuracy using this code:

var $report_rate 25
var $current_error 0
var $epoch 1

while [[$current_error < 1] and [$epoch < 1000]]  
/* while not all 4 patterns learned */
  cycle
  if [[$epoch % $report_rate] = 0]         /* report every once in a while */
	var $p [correct / [correct + wrong]]
	var $e error
  	? " Epoch # $epoch percentage $p error $e\n"
  endif
  var $epoch [$epoch + 1]      /* all 4 seen */
  var $current_error [correct / [correct + wrong]]
endwhile
var $p [correct / [correct + wrong]]
var $e error
? "Final  # $epoch percentage $p error $e\n"

First, the variable $epoch was set to 1. correct is an internal value, and begins at 0. It reflects the number of output values that are correct, and is calculated in the following manner:

If the output layer has more than 1 unit and it is in winner_take_all mode, then "correct" reflects the number of patterns where the max activation was found on the targeted winner unit.

If the output layer has a single unit, or if the network is not in winner_take_all mode, then "correct" reflects the number of units correct using the "tolerance" level: if the output is in the range of target +/- tolerance, then that output unit is scored as correct. "error" and "total" are also set in cycle; they reflect the total summed square of error and the total number of outputs for all of the patterns.

Each cycle tests each pattern in the training set (in random order), and computes the value correct. The cycle command issues a forward and backward propagation phase, followed by a change of weights (when batch mode is OFF). If batch mode is ON, then the change of weights only happens once per cycle (rather than 4 times per cycle, as in this example). When correct equals 4 (the number of patterns in the .dat file), then we are done.

In other words, the cycle command is shorthand for propagate and backprop applied to each pattern of the training set in random order.

Cycle also calls test_cycle, which sets test_total, test_correct, test_wrong, and test_error. test_correct and test_wrong are always set using the tolerance method, and test_error is the total summed square of error over the output units. You can change the correctness criterion by issuing a "set tolerance {value}" command.
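For example, a sketch of adjusting these settings and reading the results back (this assumes test_correct can be referenced in expressions the same way correct and error are above):

set tolerance 0.1        /* an output within 0.1 of its target counts as correct */
set batch 1              /* change weights once per cycle instead of after every pattern */
cycle
var $tc test_correct     /* set by test_cycle; stays 0 unless the .cfg reserves test patterns */
? "test correct $tc\n"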

The rest of this code is just a little control management and printing of some information to the screen using the print command (abbreviated ?).

Finally, after training, the network's weights are saved to the file xor.wts with this command:

savewts
Here is what the program looks like when running. Bold statements were typed in by the user:

% cd cx/samples
% ../bin/cx xor1.cx
Loading xor1.cx... 
Done loading.
Executing... 
Skipping 0 sets...Done!
Reading INPUT Patterns 4 sets...... Done!
Reading GENERALIZE Patterns 0 sets...... Done!
 Epoch #   50.0000 percentage    0.0000
 Epoch #  100.0000 percentage    0.7500
 Epoch #  150.0000 percentage    0.7500
Final. Epoch #  173.0000 percentage   1.0000
[0] Con-x> quit
The network was able to learn the XOR problem in 173 epochs (i.e., "training cycles" or "sweeps"). This means that each pattern was seen 173 times (173 * 4 individual backprops). Weight updates were made after each pattern. (Set batch to 1 to do otherwise.)

To run the example again, start the execution of the program using run or r (to re-init) followed by go 1. (1 is the line number; to see the program listing and know where to start, type list.) You could also skip the network creation commands by issuing go 10.

Type Control-d (on Unix), q, quit, or exit to leave Con-x. You may load any Con-x script at the command prompt with load "filename".
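A rough sketch of such an interactive session (the prompt numbering here is just illustrative):

[0] Con-x> load "xor1.cx"
[1] Con-x> list              /* show the numbered program listing */
[2] Con-x> go 1              /* execute, starting at line 1 */
[3] Con-x> run               /* re-initialize before running it again */
[4] Con-x> go 1
[5] Con-x> quit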


Sample #2: xor2.cx

File: xor2.cx

Sample #1 relied on the internal mechanisms of cycle to process each pattern. It is useful to see what it would take to do that without using the cycle command. The code:

/* XOR Con-x Sample File: xor2.cx */
/* Con-x Version 4.0     */
/* D.S. Blank 1998       */
/* dangermouse.brynmawr.edu  */

set session "xor"        /* Sets the .wts, .dat, and .cfg files */

var $insize  2
var $hidsize 2
var $outsize 1

layer input   $insize  /* create the layers; order is meaningful */
layer hidden $hidsize
layer output $outsize

connect input hidden   /* connect them up; order is meaningful */
connect hidden output

set epsilon  0.5        /* learning rate */
set momentum 0.9

var $correct 0.0
var $epoch 1
dim @trained[4]

while [[$correct < 4] and [$epoch < 1000]]  /* while not all 4 patterns learned */
  var @trained[] 0
  var $tcnt 0
  var $correct 0
  while [$tcnt < 4]              /* try all 4 */
        var $next &rand(4) 
        while [@trained[$next] = 1]
                var $next &rand(4)
        endwhile 
        var @trained[ $next ] 1
        load_pattern $next
        propagate
        backprop
        update
        var $act &round(&act("output" 0))
        var $tar &round(&target("output" 0))
        var $correct [[$act = $tar] + $correct]
        var $tcnt [$tcnt + 1]    /* another one has been trained */
  endwhile
  if [[$epoch % 50] = 0]              /* report every once in a while */
  	? " Epoch # $epoch percentage"
  	? [$correct / 4]  
  	? "\n"
  endif
  var $epoch [$epoch + 1]      /* all 4 seen */
endwhile
? "Final. Epoch \# $epoch percentage"
? [$correct / 4]  
? "\n"
/* save the weights! */
savewts                         
Notice that the code has nearly doubled in size just from managing the presentation of each pattern in random order. It does demonstrate the use of arrays and functions, however.

There are two commands for manipulating arrays: dim and var. dim is used to create an array, like:


dim @trained[4]
This creates an array named @trained of size four. To set an element of the array to a specific value, use var (the same var used to set variables). For example, to set the 0th element of @trained to .78, do this:


var @trained[0] .78
Functions take a list of arguments separated by spaces and return a value. The above program takes advantage of functions in a few places. For example:


var $act &round(&act("output" 0))
sets a variable named $act to the activation value of the output layer, unit #0, after rounding to the nearest integer. All functions start with &, and any layer names (or any strings) used as arguments must be surrounded by quotes (double or single).

You may also define your own functions. All functions used in a session must be located in the current file or included from another file with the "include" command.

/* a test to return differing values based on arg */
func &test( $val)
  if [$val == 1]
    return 5
  endif
  return 6
endfunc

/* an add1 function */
func &add1( $num )
  return [$num + 1]
endfunc

/* Handles strings */
func &mystring( $s )
	return ["-" + $s]
endfunc

/* multiple arguments */
func &sum( $a $b $c $d )
   return [$a + [$b + [$c + $d]]]
endfunc

/* no arguments */
func &five( )
   return 5
endfunc

/* recursion and local variables */
func &fact( $arg1 )
   var $local $arg1
   if [$local == 1]
      return 1
   else
      return [ &fact( [$arg1 - 1] ) * $local ]
   endif
endfunc 

/* accumulator-style recursion */
func &fact_accum( $arg1 $accum)
   if [$arg1 == 1]
      return $accum
   endif
   return &fact_accum( [$arg1 - 1] [$arg1 * $accum] )
endfunc 

? &fact( 5 )
Your code can go before or after the functions.
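If you prefer to keep such functions in a separate file, something like the following should work, assuming include takes a quoted filename the same way load does (the file name myfuncs.cx is hypothetical):

include "myfuncs.cx"     /* hypothetical file holding &fact and the other functions above */
? &fact( 5 )
? "\n"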

The above XOR example replaced cycle with a series of load_pattern, propagate, backprop, and update commands. These commands are the heart of the learning algorithm. More about them below.

Sample #3: xor3.cx

File: xor3.cx

This example uses some more powerful Con-x commands.

This program learns the XOR function, but the two inputs are given to the network as a sequence in time using Elman's Simple Recurrent Network (SRN) method.

/* XOR Con-x Sample File: xor3.cx */
/* Con-x Version 4.0     */
/* D.S. Blank 1998       */
/* dangermouse.brynmawr.edu  */

set session "xor"        /* Sets the .wts, .dat, and .cfg files */

var $insize  1
var $hidsize 5
var $outsize 1

layer input   $insize  /* create the layers; order is meaningful */
layer context $hidsize
layer hidden $hidsize
layer output $outsize

connect input hidden   /* connect them up; order is meaningful */
connect context hidden
connect hidden output

set epsilon  0.1        /* learning rate */
set momentum 0.9

var $correct 0.0
var $epoch 1
dim @trained[4]

while [[$correct < 4] and [$epoch < 10000]]  /* while not all 4 patterns learned */
  var @trained[] 0
  var $tcnt 0
  var $correct 0
  while [$tcnt < 4]              /* try all 4 */
        var $next &rand(4) 
        while [@trained[$next] = 1]
                var $next &rand(4)
        endwhile 
        var @trained[ $next ] 1
        load_pattern $next
        poke context 0 A $hidsize 0.5
        propagate
/*      
        shownet
        break 
*/
        copy context 0 A hidden 0 A $hidsize 
        copy input 0 A file $insize I $insize
        propagate
        backprop
        update
/*      
        shownet
        break 
*/
        var $act &round(&act("output" 0))
        var $tar &round(&target("output" 0))
        var $correct [[$act = $tar] + $correct]
        var $tcnt [$tcnt + 1]    /* another one has been trained */
  endwhile
  if [[$epoch % 100] = 0]              /* report every once in a while */
  	? " Epoch # $epoch percentage"
  	? [$correct / 4]  
  	? "\n"
  endif
  var $epoch [$epoch + 1]      /* all 4 seen */
endwhile
? "Final. Epoch \# $epoch percentage"
? [$correct / 4]  
? "\n"
/* save the weights! */
savewts                         
The copy command copies values from one location to another:
copy context 0 A hidden 0 A $hidsize 
Here, the entire hidden layer's activations are copied to the context layer. The letter A is the type of information to copy from or to. Types may be A - activation, T - target, M - memory, I - input values of the current training pattern, and O (oh) - output (target) values of the current training pattern. The above code copies the activation of the hidden layer (starting at unit 0) to the activation of the context layer, starting at position 0, for $hidsize values. This is equivalent to the following:
var $i 0
while [$i < $hidsize]
   poke context $i A 1 &act("hidden" $i)
   var $i [$i + 1]
endwhile
An entire network can be displayed using:
shownet
  or
reportnet
Program execution can be halted via the break command, and rejoined via the cont command:
(encounters a break in a program)
Breaking... (type 'cont' to Continue)  
[#] Con-x> cont
Setting stoppable OFF will disable the break.
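Presumably this is done with the usual set command, for example:

set stoppable OFF        /* break commands in the script are now ignored */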

Single patterns can be loaded into the input layer activation area and output layer target area using:

load_pattern $next
Target, error, and activation values can be set at a unit by using the poke command:
poke context 0 A $hidsize 0.5
In this line of code, 0.5 was poked into the context layer's activation values starting with the 0th unit and proceeding for $hidsize units. This effectively sets all of the context layer's units' activation values to 0.5.

This example was quite complicated in that many things were going on behind the scenes. Let's look at this script in a sample run:

% ../bin/cx xor3.cx
Loading xor3.cx... 
Done loading.
Executing... 
WARNING: data file input number does not match network! 
         Please check the size of the first layer. 
Skipping 0 sets...Done!
Reading INPUT Patterns 4 sets...... Done!
Reading GENERALIZE Patterns 0 sets...... Done!
 Epoch #  100.0000 percentage   0.2500
 Epoch #  200.0000 percentage   0.5000
 Epoch #  300.0000 percentage   0.2500
 Epoch #  400.0000 percentage   0.2500
 Epoch #  500.0000 percentage   0.5000
 Epoch #  600.0000 percentage   0.5000
 Epoch #  700.0000 percentage   0.5000
 Epoch #  800.0000 percentage   0.2500
 Epoch #  900.0000 percentage   0.5000
...
The first thing to notice is that there is a warning about the input number not matching the first layer of the network. Of course, this is right: we have changed the input layer (the first layer defined) to be of size 1, as we are giving the XOR values one at a time. The system will load as many of the input pattern's values as it can into the input layer, and will hold the others in memory.

We can then get access to the extra input value not loaded into the input layer with the following command:

copy input 0 A file $insize I $insize
Notice that instead of copying from a layer, we have listed "file" and "I". This copies into the input layer's activation values the values from the input pattern, starting at position $insize and continuing for $insize values. Here, $insize is 1, so this just copies one value from the 1st position--the second XOR input. (Numbering of layer units and array elements is zero-based, which makes 1 the second position).

This is a hard problem to learn, and takes much longer. Here we rejoin the training:

...
 Epoch # 5100.0000 percentage   0.2500
 Epoch # 5200.0000 percentage   0.5000
 Epoch # 5300.0000 percentage   0.5000
 Epoch # 5400.0000 percentage   0.5000
 Epoch # 5500.0000 percentage   0.5000
 Epoch # 5600.0000 percentage   0.5000
 Epoch # 5700.0000 percentage   0.5000
 Epoch # 5800.0000 percentage   0.7500
 Epoch # 5900.0000 percentage   0.7500
 Epoch # 6000.0000 percentage   0.7500
Final. Epoch # 6085.0000 percentage   1.0000
[0] Con-x> q
So, it took 6085 epochs, or sweeps, to learn XOR sequentially. Also, recall that there are twice as many propagations as backprops in this example.


Sample #4: xor4.cx

File: xor4.cx

So far, we have only constructed 3-layer networks (counting all layers of units, including the input layer). Now, let's take a look at the XOR problem with a slightly different architecture.

In this example, we'll use so-called "short cut" connections. We start out with the same program from Sample #1 above. This time, however, we will make the hidden layer contain a single unit, but also connect the input layer to the output layer, thereby taking a short cut around the hidden unit:

...
var $insize  2
var $hidsize 1
var $outsize 1

layer input   $insize  
layer hidden $hidsize
layer output $outsize

connect input hidden   
connect hidden output
connect input output
...
This architecture tends to learn quite quickly.

One could also create many hidden layers, like so:

...
layer in   2
layer h1   1
layer h2   1
layer h3   1
layer h4   1
layer out  1

connect in h1
connect h1 h2
connect h2 h3
connect h3 h4
connect h4 out
...
Of course, a three-layer network can compute anything that a network with more layers can (given enough hidden units). However, this statement doesn't address how easy or difficult a function is to learn.

For a comparison of different network architectures and how quickly they learn, see the CMU Neural Benchmarks.

Is that the fastest a net can learn the XOR function? No! Check out these tweaks that approach Fahlman's Quickprop method: xor5.cx, xor6.cx and this full-fledged Quickprop.

For examples using the test patterns for cross validation, see xortest.cx and the related dat and cfg files.


Operators, Commands, Settings and Functions

Con-x has many operators (+, -, *, /, etc.), commands (layer, connect, print, etc.), settings (set decimal, set display, etc.), and functions (&act, &target, etc.). These are described in detail here.


Common mistakes

  1. If you want to reference ntrains, ntests, or nskips, make sure you "loaddata" first. That opens the data file, loads the data, and sets those parameters. You can change ntrains, ntests, and nskips after issuing loaddata, but that is not something to do lightly.
  2. If you want to manually set the weights, make sure you run "init" first. Otherwise, the system will think that it needs to initialize the weights and will overwrite your changes.
  3. You need to issue "backprop" and "update" if you do not run "cycle"; otherwise, you won't calculate the error, nor will you actually change the weights. See the sketch below.
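For mistake 3, here is a minimal sketch of the per-pattern sequence that cycle normally performs for you (as in the xor2.cx sample above):

load_pattern 0      /* load one input/target pair into the network */
propagate           /* forward pass */
backprop            /* compute the error terms */
update              /* actually change the weights (when batch is OFF) */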

Bugs, errors, limitations, and other weird thingies

Warning: this code has been hacked by a hungry, crazed graduate student. And then hacked some more by an over-worked, cranky assistant professor. This has been evolving for a while, and the programming language part of the project should probably be fixed some day. I'm pretty sure all of the backprop code is close to perfect. But then again, this is a major rewrite.

Anyway, please send bug reports, or wish list items to dblank@brynmawr.edu.

If you make changes, please send them back to me so I can incorporate them into the code. If you are interested in working on this project, let me know and I'll let you know what needs to be done.

Things known to be wrong: