Who's it for? It was designed to be used by serious backprop researchers, as well as a teaching tool for use in introductory AI courses. There are many simulators out there, and many concentrate on the novice student. Unfortunately, these systems usually sacrifice advanced functions (such as flexibility and ability to process large datasets) for their ease of use and nice graphical interface. Con-x attempts to create a tool which is easy to use, but one that the user will not outgrow.
Con-x works quite well with very large data sets, yet has a simple scripting interface. We have also used it inside a fuzzy logic/artificial neural network robot controller (see papers here and also XRCL). All sources and related files are distributed under an open-source license.
For a complete list of historic changes, see here.
Commands can be given interactively at the Con-x command prompt ([#] Con-x>), or read in from a file. For a quick demo, change to the Con-x demo directory (usually cd cx/samples/), and type the following commands (shown in bold) at the OS prompt (signified by the percent sign):
This runs the Con-x script named xor1.cx which is a 2-2-1 network designed to learn the xor function (see Table 1). To exit the program, type the letter q at the Con-x prompt.
| Input #1 | Input #2 | Output |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
Con-x supports the following commands:
| ! <numeric exp> | backprop | break | connect <args> |
| cont | copy <args> | dim <array element> | exit |
| go <numeric exp> | help | history | init |
| layer <args> | learning (l) | list | load <string> |
| load_pattern <numeric exp> | loaddata | loadwts | pause |
| poke <args> | print (?) <exp> | propagate | quit (q) |
| init (i) | reportnet | savewts | set <args> |
| show | shownet | system (s) <string> | unconnect |
| var <args> | vars |
Abbreviations are shown in parentheses. Each command will be described below.
In general, <exp> can be a string, function, expression, or number. Here is the language's basic BNF:
| <args> | => | <exp> <exp> ... |
| <exp> | => | [ <exp> <op> <exp> ] OR <string> OR <number> OR <variable> OR <function> OR <array element> |
| <string> | => | "text" or 'text' |
| <function> | => | &functionname( <args> ) |
| <variable> | => | $varname |
| <array element> | => | @arrayname[ <numeric exp> ] |
| <op> | => | + - * < <= > >= and or / % != |
| <comment> | => | /* Comments ... */ |
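To make the bracketed form `[ <exp> <op> <exp> ]` concrete, here is a minimal Python sketch of an evaluator for that part of the grammar. It is not Con-x's parser: the names `tokenize`, `parse`, and `evaluate` are hypothetical, only numbers, `$variables`, and the operators listed in the `<op>` row are handled, and representing true/false as 1.0/0.0 is an assumption.

```python
import operator
import re

# Binary operators taken from the <op> row of the grammar table.
OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.truediv, "%": operator.mod,
    "<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge,
    "!=": operator.ne,
    "and": lambda a, b: bool(a) and bool(b),
    "or": lambda a, b: bool(a) or bool(b),
}

# Brackets, $variables, numbers (including ".5"), then operators.
TOKEN = re.compile(r"\[|\]|\$\w+|[0-9]*\.?[0-9]+|<=|>=|!=|[-+*/%<>]|and|or")


def tokenize(text):
    """Split an expression into tokens; whitespace is simply skipped."""
    return TOKEN.findall(text)


def parse(tokens, variables):
    """Evaluate one <exp>: a number, a $variable, or [ <exp> <op> <exp> ]."""
    tok = tokens.pop(0)
    if tok == "[":
        left = parse(tokens, variables)
        op = tokens.pop(0)
        right = parse(tokens, variables)
        if tokens.pop(0) != "]":
            raise SyntaxError("expected ]")
        # Assumption: comparisons yield 1.0 or 0.0.
        return float(OPS[op](left, right))
    if tok.startswith("$"):
        return float(variables[tok])
    return float(tok)


def evaluate(text, variables=None):
    return parse(tokenize(text), variables or {})
```

For example, `evaluate("[[$epoch % 50] < 1]", {"$epoch": 100})` evaluates the inner bracket first and then the comparison, which is why every binary operation in a script needs its own pair of brackets.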
Below is a list of current Con-x functions. These are described in detail here.
| &424( ) | check two numbers to see if in 40-20-40 range |
| &abs( ) | absolute value |
| &act( ) | activation value at unit |
| &argv( ) | access command line arguments |
| &bias( ) | bias value at unit |
| &cos( ) | cosine of value |
| &eof( ) | EOF of a file handle |
| &error( ) | error value at unit |
| &layer_type( ) | layer type |
| &layer_num( ) | layer number |
| &layer_name( ) | layer name |
| &layer_size( ) | layer size |
| &left( ) | left characters of a string |
| &peek( ) | look at a value at a unit (DEPRECATED) |
| &rand( ) | a random integer value between 0 and number |
| &right( ) | right characters of a string |
| &round( ) | rounds off to nearest integer |
| &setweight( ) | set the value of a weight between two units |
| &sin( ) | sine of value |
| &string( ) | convert number to string |
| &target( ) | target value at unit |
| &trim( ) | trim spaces off of string |
| &user( ) | get information from user |
| &value( ) | convert string to number |
| &weight( ) | value of a weight between two units |
Currently, only fully-connected weight matrices are available. The activation function (set actfun) is now alterable; it can be the standard asymmetrical sigmoid centered around 0.5 (ASYM) or symmetrical around 0 (SYM). (Note: now that I have user-definable functions, I should allow for the user to define their own actfun.)
The error function (set errfun) can be set to the difference between desired and actual output (DIFF) or the hyperbolic arctangent of that difference (ATANH).
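The two activation settings and the two error settings can be sketched in Python as follows. This is an illustration, not Con-x's source: the exact form of SYM is an assumption (a logistic shifted down by 0.5), and the clamp inside `atanh_error` is also an assumption, added only because atanh is undefined at +/-1.

```python
import math


def asym_sigmoid(x):
    """Standard logistic sigmoid: outputs in (0, 1), centered on 0.5."""
    return 1.0 / (1.0 + math.exp(-x))


def sym_sigmoid(x):
    """ASSUMPTION: logistic shifted down by 0.5, so outputs lie in (-0.5, 0.5)."""
    return asym_sigmoid(x) - 0.5


def diff_error(target, actual):
    """DIFF: plain difference between desired and actual output."""
    return target - actual


def atanh_error(target, actual, limit=0.9999999):
    """ATANH: hyperbolic arctangent of the difference.

    ASSUMPTION: the difference is clamped to +/-limit to keep the result finite.
    """
    d = max(-limit, min(limit, target - actual))
    return math.atanh(d)
```

The practical effect of ATANH is that large differences are amplified relative to DIFF, while small differences are left almost unchanged.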
Recurrent connections are not permitted, but there are many supporting commands to implement Elman's Simple Recurrent Network (SRN) architecture, or even sequential recursive auto-associative memory (sRAAM).
Con-x was written to handle really large files. For example, input and target patterns are not actually loaded into memory, but are kept on disk and loaded in one at a time as needed. Also, data and configuration data are kept in two files. In this manner, if the configuration needs to be edited, the need for editing the huge data file is reduced.
0 0 0
1 0 1
0 1 1
1 1 0
The main part of the filename can be anything, but the extension should be ".dat".
Newlines in the file are ignored; just make sure that there are spaces between all values, and that each pattern has the correct number of inputs and outputs.
{number of patterns to skip}
{number of patterns to train}
{number of patterns to test}
{number of input values}
{number of output values}
For the xor we would have: 0 4 0 2 1 to not skip any, train on all 4, save none for testing, 2 inputs, and 1 output.
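The .cfg and .dat layout described above can be read with a few lines of Python. This is only a sketch of the file format, not Con-x's loader; the function names `read_cfg` and `iter_patterns` are hypothetical. The generator yields one pattern at a time, mirroring the idea that patterns stay on disk until needed.

```python
def read_cfg(text):
    """Parse the five whitespace-separated integers of a .cfg file."""
    skip, train, test, n_in, n_out = (int(v) for v in text.split())
    return {"skip": skip, "train": train, "test": test,
            "inputs": n_in, "outputs": n_out}


def iter_patterns(lines, cfg):
    """Yield (inputs, targets) one pattern at a time from .dat lines.

    Newlines are ignored: values are consumed as one flat stream and
    cut into patterns of (inputs + outputs) values each.
    """
    width = cfg["inputs"] + cfg["outputs"]
    buffer = []
    seen = 0
    for line in lines:
        for token in line.split():
            buffer.append(float(token))
            if len(buffer) == width:
                if seen >= cfg["skip"]:
                    yield buffer[:cfg["inputs"]], buffer[cfg["inputs"]:]
                seen += 1
                buffer = []


cfg = read_cfg("0 4 0 2 1")
dat = ["0 0 0 1 0 1", "0 1 1 1 1 0"]  # line breaks fall mid-stream on purpose
patterns = list(iter_patterns(dat, cfg))
```

In real use, `lines` would be an open file object, so only one line is held in memory at a time.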
This network demonstrates the basic functions of the backpropagation simulator. First, the code:
/* XOR Con-x Sample File: xor1.cx */
/* Con-x Version 4 */
/* D.S. Blank 1998 */
/* dangermouse.brynmawr.edu */
/* DESCRIPTION: basic xor backprop */
set session "xor" /* Sets the .wts, .dat, and .cfg files */
var $insize 2
var $hidsize 2
var $outsize 1
layer input $insize /* create the layers; order is meaningful */
layer hidden $hidsize
layer output $outsize
connect input hidden /* connect them up; order is meaningful */
connect hidden output
set epsilon .5 /* learning rate */
set momentum .9
var $report_rate 25
var $current_error 0
var $epoch 1
while [[$current_error < 1] and [$epoch < 1000]] /* while not all 4 patterns learned */
cycle
if [[$epoch % $report_rate] = 0] /* report every once in a while */
var $p [correct / [correct + wrong]]
var $e error
? " Epoch # $epoch percentage $p error $e\n"
endif
var $epoch [$epoch + 1] /* all 4 seen */
var $current_error [correct / [correct + wrong]]
endwhile
var $p [correct / [correct + wrong]]
var $e error
? "Final # $epoch percentage $p error $e\n"
/* save the weights! */
savewts
quit
This Con-x script documents the program, defines the data files, creates a fully connected 2-2-1 backprop network, sets the learning parameters, trains the network to 100% accuracy, and saves the learned weights. Not bad for 22 commands!
Let's take a look at each of these steps in detail. First, the script documents the program using the comment symbols /* and */:
/* XOR Con-x Sample File: xor1.cx */
/* Con-x Version 4 */
/* D.S. Blank 1998 */
/* http://dangermouse.brynmawr.edu/ */
set session "xor" /* Sets the .wts, .dat, and .cfg files */
Let's take a quick detour to examine the data files. Here they are: xor.dat, xor.cfg. We will use these data files for all of the XOR examples, showing the separation of data and control.
The .dat file defines the inputs and target values for the XOR problem, and is really just Table 1, given in floating point values. Each input and target output pair is called a "pattern".
The .cfg file defines the .dat file. The format is:
We see for this example that we are skipping zero patterns, training on 4, saving zero for testing, using 2 inputs and 1 output.
Now, back to the program. Next, the network was defined:
layer input $insize /* create the layers; order is meaningful */
layer hidden $hidsize
layer output $outsize
connect input hidden /* connect them up; order is meaningful */
connect hidden output
var $insize 2
var $hidsize 2
var $outsize 1
The layer command creates layers with arbitrary names and a specific number of units per layer. The connect command creates connections (weights) between pairs of layers, so that every unit in the first layer is connected to every unit in the second.
Next, the learning parameters were set:
set epsilon .5 /* learning rate */
set momentum .9
The network was then trained to 100% accuracy using this code:
var $report_rate 25
var $current_error 0
var $epoch 1
while [[$current_error < 1] and [$epoch < 1000]] /* while not all 4 patterns learned */
cycle
if [[$epoch % $report_rate] = 0] /* report every once in a while */
var $p [correct / [correct + wrong]]
var $e error
? " Epoch # $epoch percentage $p error $e\n"
endif
var $epoch [$epoch + 1] /* all 4 seen */
var $current_error [correct / [correct + wrong]]
endwhile
var $p [correct / [correct + wrong]]
var $e error
? "Final # $epoch percentage $p error $e\n"
If the output layer has more than 1 unit and it is in winner_take_all mode, then "correct" reflects the number of patterns where the max activation was found on the targeted winner unit.
If the output layer has a single unit, or if the network is not in winner_take_all mode, then "correct" reflects the number of units correct using the "tolerance" level: if the output is in the range of target +/- tolerance, then that output unit is scored as correct. "error" and "total" are also set in cycle, and reflect the total summed squared error and the total number of outputs for all of the patterns.
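The two scoring rules can be written out as a short Python sketch. The function names are hypothetical and this is not Con-x's code; it only restates the rules given above for a single pattern.

```python
def score_tolerance(outputs, targets, tolerance):
    """Count output units whose activation lies within target +/- tolerance.

    Returns (correct, wrong) for one pattern.
    """
    correct = sum(1 for o, t in zip(outputs, targets) if abs(o - t) <= tolerance)
    return correct, len(outputs) - correct


def score_winner_take_all(outputs, targets):
    """Return 1 if the most active output unit is the targeted winner, else 0."""
    winner = max(range(len(outputs)), key=lambda i: outputs[i])
    target_winner = max(range(len(targets)), key=lambda i: targets[i])
    return 1 if winner == target_winner else 0


def summed_squared_error(outputs, targets):
    """Sum of squared differences over the output units of one pattern."""
    return sum((t - o) ** 2 for o, t in zip(outputs, targets))
```

Note that the tolerance rule counts units while the winner_take_all rule counts patterns, so the two values of "correct" are not directly comparable when the output layer has more than one unit.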
Each cycle tests each pattern in the training set (in random order), and computes the value correct. The cycle command issues a forward and backward propagation phase, followed by a change of weights (when batch mode is OFF). If batch mode is ON, then the change of weights only happens once per cycle (rather than 4 times per cycle, as in this example). When correct equals 4 (the number of patterns in the .dat file), then we are done.
The cycle command is shorthand for propagate and backprop applied to each pattern of the training set in random order.
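As a rough illustration of what one cycle amounts to with batch mode OFF, here is a minimal pure-Python sketch. It is not Con-x's implementation: the class `TinyNet`, the weight initialization range, the plain logistic activation, and the DIFF-style error are all assumptions made for the example, and batch mode is not modeled.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyNet:
    """Fully connected in-hidden-out net; each weight row ends with a bias weight."""

    def __init__(self, n_in, n_hid, n_out, rng):
        self.w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
        # Previous weight changes, kept for the momentum term.
        self.d_hid = [[0.0] * (n_in + 1) for _ in range(n_hid)]
        self.d_out = [[0.0] * (n_hid + 1) for _ in range(n_out)]

    def propagate(self, inputs):
        self.inp = list(inputs) + [1.0]  # trailing 1.0 feeds the bias weight
        self.hid = [sigmoid(sum(w * a for w, a in zip(row, self.inp)))
                    for row in self.w_hid] + [1.0]
        self.out = [sigmoid(sum(w * a for w, a in zip(row, self.hid)))
                    for row in self.w_out]
        return self.out

    def backprop(self, targets):
        # Error signals only; weights are changed in update().
        self.delta_out = [(t - o) * o * (1 - o) for t, o in zip(targets, self.out)]
        self.delta_hid = [
            h * (1 - h) * sum(d * row[j] for d, row in zip(self.delta_out, self.w_out))
            for j, h in enumerate(self.hid[:-1])
        ]

    def update(self, epsilon, momentum):
        for deltas, acts, weights, prev in (
            (self.delta_out, self.hid, self.w_out, self.d_out),
            (self.delta_hid, self.inp, self.w_hid, self.d_hid),
        ):
            for j, d in enumerate(deltas):
                for i, a in enumerate(acts):
                    prev[j][i] = epsilon * d * a + momentum * prev[j][i]
                    weights[j][i] += prev[j][i]


def cycle(net, patterns, epsilon, momentum, tolerance, rng):
    """Visit every pattern once in random order; update weights after each one."""
    order = list(range(len(patterns)))
    rng.shuffle(order)
    correct, error = 0, 0.0
    for index in order:
        inputs, targets = patterns[index]
        outputs = net.propagate(inputs)
        correct += sum(abs(o - t) <= tolerance for o, t in zip(outputs, targets))
        error += sum((t - o) ** 2 for o, t in zip(outputs, targets))
        net.backprop(targets)
        net.update(epsilon, momentum)
    return correct, error, order
```

A training loop would call `cycle` repeatedly until `correct` equals the number of patterns or an epoch limit is reached, as the while loop in xor1.cx does.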
Cycle also calls test_cycle, which sets test_total, test_correct, test_wrong, and test_error. test_correct and test_wrong are always set using the tolerance method, and test_error is the total summed square of error for each output unit. You can change the criteria for error by issuing a "set tolerance {value}" command.
The rest of this code is just a little management of the control, and printing out of some information to the screen using the print command (abbreviated by ?).
Finally, after training, the network's weights are saved to the file xor.wts with this command:
savewts
% cd cx/samples
% ../bin/cx xor1.cx
Loading xor1.cx...
Done loading.
Executing...
Skipping 0 sets...Done!
Reading INPUT Patterns 4 sets...... Done!
Reading GENERALIZE Patterns 0 sets...... Done!
Epoch # 50.0000 percentage 0.0000
Epoch # 100.0000 percentage 0.7500
Epoch # 150.0000 percentage 0.7500
Final. Epoch # 173.0000 percentage 1.0000
[0] Con-x> quit
To run the example again, start the execution of the program using run or r (to re-init) followed by go 1. (1 is the line number; to see the program listing to know where to start, type list.) You could also skip the network creation commands by issuing go 10.
Type Control-d (on Unix), q, quit, or exit to leave Con-x. You may load any Con-x script at the command prompt with load "filename".
Sample #1 relied on the internal mechanisms of cycle to process each pattern. It is useful to see what it would take to do that without using the cycle command. The code:
/* XOR Con-x Sample File: xor2.cx */
/* Con-x Version 4.0 */
/* D.S. Blank 1998 */
/* dangermouse.brynmawr.edu */
set session "xor" /* Sets the .wts, .dat, and .cfg files */
var $insize 2
var $hidsize 2
var $outsize 1
layer input $insize /* create the layers; order is meaningful */
layer hidden $hidsize
layer output $outsize
connect input hidden /* connect them up; order is meaningful */
connect hidden output
set epsilon 0.5 /* learning rate */
set momentum 0.9
var $correct 0.0
var $epoch 1
dim @trained[4]
while [[$correct < 4] and [$epoch < 1000]] /* while not all 4 patterns learned */
var @trained[] 0
var $tcnt 0
var $correct 0
while [$tcnt < 4] /* try all 4 */
var next &rand(4)
while [@trained[$next] = 1]
var next &rand(4)
endwhile
var @trained[ $next ] 1
load_pattern $next
propagate
backprop
update
var $act &round(&act("output" 0))
var $tar &round(&target("output" 0))
var correct [[$act = $tar] + $correct]
var $tcnt [$tcnt + 1] /* another one has been trained */
endwhile
if [[$epoch % 50] = 0] /* report every once in a while */
? " Epoch # $epoch percentage"
? [$correct / 4]
? "\n"
endif
var $epoch [$epoch + 1] /* all 4 seen */
endwhile
? "Final. Epoch \# $epoch percentage"
? [$correct / 4]
? "\n"
/* save the weights! */
savewts
There are two commands for manipulating arrays: dim and var. dim is used to create an array, like:
dim @trained[4]
This creates an array named @trained of size four. To set an element of the array to a specific value, use var (the same var used to set variables). For example, to set the 0th element of @trained to .78, do this:
var @trained[0] .78
Functions take a list of arguments separated by spaces, and return a value. The above program takes advantage of functions in a few places. For example:
var $act &round(&act("output" 0))
sets a variable named $act to be the activation value of the output layer, unit #0, after rounding to the nearest integer. All functions start with &, and any layer names (or any strings) used as arguments must be surrounded by quotes (double or single).
You may also define your own functions. All functions used in a session must be located in the current file or included from another file with the "include" command.
/* a test to return differing values based on arg */
func &test( $val)
if [$val == 1]
return 5
endif
return 6
endfunc
/* an add1 function */
func &add1( $num )
return [$num + 1]
endfunc
/* Handles strings */
func &mystring( $s )
return ["-" + $s]
endfunc
/* multiple arguments */
func &sum( $a $b $c $d )
return [$a + [$b + [$c + $d]]]
endfunc
/* no arguments */
func &five( )
return 5
endfunc
/* recursion and local variables */
func &fact( $arg1 )
var $local $arg1
if [$local == 1]
return 1
else
return [ &fact( [$arg1 - 1] ) * $local ]
endif
endfunc
/* accumulator-style recursion */
func &fact_accum( $arg1 $accum)
if [$arg1 == 1]
return $accum
endif
return &fact_accum( [$arg1 - 1] [$arg1 * $accum] )
endfunc
? &fact( 5 )
Your code can go before or after the functions. The above XOR example replaced cycle with a series of load_pattern, propagate, backprop, and update commands. These commands are the heart of the learning algorithm. More about them below.
The final example uses some more powerful Con-x commands.
This program learns the XOR function, but the two inputs are given to the network as a sequence in time using Elman's Simple Recurrent Network (SRN) method.
/* XOR Con-x Sample File: xor3.cx */
/* Con-x Version 4.0 */
/* D.S. Blank 1998 */
/* dangermouse.brynmawr.edu */
set session "xor" /* Sets the .wts, .dat, and .cfg files */
var $insize 1
var $hidsize 5
var $outsize 1
layer input $insize /* create the layers; order is meaningful */
layer context $hidsize
layer hidden $hidsize
layer output $outsize
connect input hidden /* connect them up; order is meaningful */
connect context hidden
connect hidden output
set epsilon 0.1 /* learning rate */
set momentum 0.9
var $correct 0.0
var $epoch 1
dim @trained[4]
while [[$correct < 4] and [$epoch < 10000]] /* while not all 4 patterns learned */
var @trained[] 0
var $tcnt 0
var $correct 0
while [$tcnt < 4] /* try all 4 */
var next &rand(4)
while [@trained[$next] = 1]
var next &rand(4)
endwhile
var @trained[ $next ] 1
load_pattern $next
poke context 0 A $hidsize 0.5
propagate
/*
shownet
break
*/
copy context 0 A hidden 0 A $hidsize
copy input 0 A file $insize I $insize
propagate
backprop
update
/*
shownet
break
*/
var $act &round(&act("output" 0))
var $tar &round(&target("output" 0))
var correct [[$act = $tar] + $correct]
var $tcnt [$tcnt + 1] /* another one has been trained */
endwhile
if [[$epoch % 100] = 0] /* report every once in a while */
? " Epoch # $epoch percentage"
? [$correct / 4]
? "\n"
endif
var $epoch [$epoch + 1] /* all 4 seen */
endwhile
? "Final. Epoch \# $epoch percentage"
? [$correct / 4]
? "\n"
/* save the weights! */
savewts
copy context 0 A hidden 5 A $hidsize
var $i 0
while [$i < $hidsize]
poke context $i A 1 &act("hidden" [5 + $i])
var $i [$i + 1]
endwhile
shownet or reportnet
(encounters a break in a program)
Breaking... (type 'cont' to Continue)
[#] Con-x> cont
Single patterns can be loaded into the input layer activation area and output layer target area using:
load_pattern $next
poke context 0 A $hidsize 0.5
This example was quite complicated in that many things were going on behind the scenes. Let's look at this script in a sample run:
% ../bin/cx xor3.cx
Loading xor3.cx...
Done loading.
Executing...
WARNING: data file input number does not match network!
Please check the size of the first layer.
Skipping 0 sets...Done!
Reading INPUT Patterns 4 sets...... Done!
Reading GENERALIZE Patterns 0 sets...... Done!
Epoch # 100.0000 percentage 0.2500
Epoch # 200.0000 percentage 0.5000
Epoch # 300.0000 percentage 0.2500
Epoch # 400.0000 percentage 0.2500
Epoch # 500.0000 percentage 0.5000
Epoch # 600.0000 percentage 0.5000
Epoch # 700.0000 percentage 0.5000
Epoch # 800.0000 percentage 0.2500
Epoch # 900.0000 percentage 0.5000
...
We can then get access to the extra input value not loaded into the input layer with the following command:
copy input 0 A file $insize I $insize
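The poke, propagate, copy, propagate sequence in xor3.cx can be pictured with a small Python sketch of the forward pass only. This is an illustration under assumptions, not Con-x's code: the function names are hypothetical, the weights are random, and unlike the script the sketch also copies the hidden layer to the context after the final step, which has no effect on the returned output.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(weights, activations):
    """One row of weights per receiving unit; the last entry of a row is its bias."""
    return [sigmoid(sum(w * a for w, a in zip(row, activations)) + row[-1])
            for row in weights]


def srn_forward(sequence, w_hidden, w_output, hidden_size):
    """Feed a sequence one value per time step through an Elman-style network."""
    context = [0.5] * hidden_size        # like: poke context 0 A $hidsize 0.5
    output = None
    for value in sequence:
        # The hidden layer sees the current input and the previous hidden state.
        hidden = layer_forward(w_hidden, [value] + context)
        output = layer_forward(w_output, hidden)
        context = list(hidden)           # like: copy context 0 A hidden 0 A $hidsize
    return output, context


rng = random.Random(1)
hidden_size = 5
w_hidden = [[rng.uniform(-1, 1) for _ in range(1 + hidden_size + 1)]
            for _ in range(hidden_size)]
w_output = [[rng.uniform(-1, 1) for _ in range(hidden_size + 1)]
            for _ in range(1)]
out_10, ctx_10 = srn_forward([1.0, 0.0], w_hidden, w_output, hidden_size)
out_01, ctx_01 = srn_forward([0.0, 1.0], w_hidden, w_output, hidden_size)
```

Because the context carries the first step forward in time, the network's final output depends on both inputs even though the input layer holds only one value at a time.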
This is a hard problem to learn, and takes much longer. Here we rejoin the training:
...
Epoch # 5100.0000 percentage 0.2500
Epoch # 5200.0000 percentage 0.5000
Epoch # 5300.0000 percentage 0.5000
Epoch # 5400.0000 percentage 0.5000
Epoch # 5500.0000 percentage 0.5000
Epoch # 5600.0000 percentage 0.5000
Epoch # 5700.0000 percentage 0.5000
Epoch # 5800.0000 percentage 0.7500
Epoch # 5900.0000 percentage 0.7500
Epoch # 6000.0000 percentage 0.7500
Final. Epoch # 6085.0000 percentage 1.0000
[0] Con-x> q
So far, we have only constructed 3-layer networks (counting all layers of units, including the input layer). Now, let's take a look at the XOR problem with a slightly different architecture.
In this example, we'll use so-called "short cut" connections. We start out with the same program from Sample #1 above. This time, however, we will make the hidden layer contain a single unit, but also connect the input layer to the output layer, thereby taking a short cut around the hidden unit:
...
var $insize 2
var $hidsize 1
var $outsize 1
layer input $insize
layer hidden $hidsize
layer output $outsize
connect input hidden
connect hidden output
connect input output
...
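To see why a single hidden unit is enough once the inputs also feed the output directly, here is a small Python sketch of the forward pass through such a 2-1-1 network. The weights are hand-picked for illustration (an assumption, not values learned by Con-x), and the function name `shortcut_xor` is hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shortcut_xor(x1, x2):
    # input -> hidden: the single hidden unit fires only when both inputs are on.
    hidden = sigmoid(10.0 * x1 + 10.0 * x2 - 15.0)
    # hidden -> output plus the input -> output short cut:
    # the inputs excite the output directly, and the hidden unit vetoes the 1,1 case.
    return sigmoid(10.0 * x1 + 10.0 * x2 - 20.0 * hidden - 5.0)


results = [round(shortcut_xor(a, b)) for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))]
```

Without the input-to-output connections, the output would see only the hidden unit's single value, which cannot separate the four XOR cases.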
One could also create many hidden layers, like so:
...
layer in 2
layer h1 1
layer h2 1
layer h3 1
layer h4 1
layer out 1
connect in h1
connect h1 h2
connect h2 h3
connect h3 h4
connect h4 out
...
For a comparison of different network architectures and how quickly they learn, see the CMU Neural Benchmarks.
Is that the fastest a net can learn the XOR function? No! Check out these tweaks that approach Fahlman's Quickprop method: xor5.cx, xor6.cx and this full-fledged Quickprop.
For examples using the test patterns for cross validation, see xortest.cx and the related dat and cfg files.
Please send bug reports or wish-list items to dblank@brynmawr.edu.
If you make changes, please send them back to me so I can incorporate them into the code. If you are interested in working on this project, let me know and I'll let you know what needs to be done.
Things known to be wrong: