experiments with complex systems

- This paper is intended as a counterpoint to the strong focus on proper programming practice that conferences mostly attract.
- It is about Perl’s other great strength—“do what I mean”—and quickly.
- While I could go on about Perl’s application to cleaning up real world data, that quickly gets boring.
- Instead I will focus on a world where you start coding without any real idea what kind of results you might produce, then fine tune and extend as results come in.

- One popular attack on the origins of complexity has been through application of simple rules to repeatedly determine new local states based on the states of neighbours amongst an extended population.
- Cellular automata (CA) and Conway’s
*Life*are familiar starting points. - In 1983 Stephen Wolfram found that the behaviour of the 256 one dimensional two colour three neighbour CAs fell into 4 classes:
- Class 1: constant
- Class 2: regular
- Class 3: random
- Class 4: complex

- I developed
*Pattern Breeder,*an extension to Ed Fredkin’s simple self-replicating CA, and it made the Computer Recreations column of*Scientific American*in September 1986. - From then though 2003, my occasional bursts of research focused on adding simple constraints to
*Life,*more recently with the assistance of Andrew Trevorrow’s LifeLab and my own Perl, as introduced in Appendices B and C of the online version of my paper. - In 2002, Wolfram’s monumental
*A New Kind of Science*helped a field often deemed “recreational mathematics” refocus on systematic study of the behaviour of simple programs.

- While many have found these simple systems to be a source of fun, there has also been a feeling that they might provide some useful clues as to how the world works at a deeper level.
- Despite some interesting plays on the idea, few take seriously the notion that the universe is really a CA.
- Wolfram, quantum gravity–theorist Lee Smolin and others hypothesise that an evolving “Planck scale” network underpins the physical universe.
- CA are easier on human perception than evolving networks, but by early 2004 it was time to start investigating evolving networks.

- A simple (undirected) “graph” (in the graph-theoretic sense) is a collection of “vertices” (nodes) linked by “edges”—the simplest kind of network.
- To tackle global time coordination, I identified a simple rule which each tick creates afresh the arrangement of vertices and edges from the arrangement at the previous tick—and called it “Tick Tock”.
- I’m not going to go into detail here on the results of my investigations to date—they are presented in considerable depth at http://www.twistet.com/
- The purpose of this presentation is to look at how Perl made this research practical.

The local *Tick Tock* mechanism can be defined completely in two English sentences:

- Every triangle in the graph at one tick is succeeded by a node at the next tick.
- Those pairs of nodes which succeeded two triangles which shared an edge at one tick are linked by a new edge at the next tick.

Or, in a simple enough graph transformation diagram:

However part of the point of this experiment was to learn to see how networks evolve by looking mostly at quick and dirty output, so there won’t be too many pretty pictures.

- The class of graphs which are maximally connected for a given number of nodes are known as “simplexes”.
- The triangle is the first non-trivial simplex, technically a 2-simplex because it can be represented in 2 dimensions without edges crossing.
- The tetrahedron is thus a 3-simplex.
- The 4-simplex is known as a “pentatope”.
- Beyond that they just get their numbers.
- Simplexes have proven to be of special importance in understanding
*Tick Tock*.

To find out what *Tick Tock* might do, the basic procedure was just to run it, but first it has to be able to be seeded with a range of initial network configurations. For any value of 5:

my $nodes = 5; my @possible; # edge pool for (my $i = 0; $i < $nodes; $i++) { for (my $j = 0; $j < $i; $j++) { push(@possible, "$j:$i"); } }

fills an array with all possible undirected links between 5 nodes—a pentatope.

Then for nearly any value of 7, and getting comfortable with array splicing and arrays of array refs:

my $edges = 7; my @pairs; # select edges, my @graph; # list connected nodes by node while (@pairs < $edges) { my $pick = int(@possible * rand()); my ($picked) = splice(@possible, $pick, 1); push(@pairs, $picked); # select edge my ($p, $q) = split(/:/, $picked); push(@{$graph[$p]}, $q); # list linked nodes by node push(@{$graph[$q]}, $p); }

provides a seed configuation ready to iterate a few ticks.

Emulating a tick takes a couple of steps within an outer loop.

First find the triangles:

(@triples, %edges) = (); foreach my $edge (@pairs) { my ($p, $q) = split(/:/, $edge); # $p should always be < $q my (@plist, @qlist) = (); # find nodes joined to $p and $q foreach (@{$graph[$p]}) {push(@plist, $_) if $_ > $q} foreach (@{$graph[$q]}) {push(@qlist, $_) if $_ > $q} for (my $i = 0; $i < @plist; $i++) { for (my $j = 0; $j < @qlist; $j++) { if ($plist[$i] == $qlist[$j]) { # joined to both my $r = $plist[$i]; push(@triples, "$p:$q:$r"); # triangular face push(@{$edges{"$p:$q"}}, $r); # list third push(@{$edges{"$p:$r"}}, $q); # vertices push(@{$edges{"$q:$r"}}, $p); # by edge } } } }

then generate the next set of nodes and edges:

my @next = sort { # ordering helps spot persistent patterns my @a = split(/:/, $a); my @b = split(/:/, $b); $a[0] == $b[0] ? $a[1] == $b[1] ? $a[2] <=> $b[2] : $a[1] <=> $b[1] : $a[0] <=> $b[0]; } @triples; %index = (); # new node numbers for old triangular faces for (my $i=0; $i < @next; $i++) {$index{$next[$i]} = $i} @pairs = (); # new edges @graph = (); # list connected nodes by node for (my $i = 0; $i < @next; $i++) { my $picked = $next[$i]; my ($p, $q, $r) = split(/:/, $picked); pair($p, $q, $r, $i); # detect adjacent faces (new edges) pair($p, $r, $q, $i); pair($q, $r, $p, $i); }

which needs a subroutine that, in its naive form, detects each pair of adjacent triangles twice, once from each side of their common edge:

sub pair { # detect adjacent faces (new edges) my ($p, $q, $r, $i) = @_; foreach (@{$edges{"$p:$q"}}) { unless ($_ == $r) { # ignore current face my $triple = join(':', sort {$a <=> $b} ($p, $q, $_)); # adjoining face my $j = $index{$triple}; # new node number for face push(@pairs, "$i:$j") if $i < $j; # new edge push(@{$graph[$i]}, $j); # list linked nodes by node } } }

To iterate the basic tick emulation from the preceding slides just add looping, termination and reporting.

- The complete code of an early version of
`ticktock.pl`

forms Appendix A of the online version of my paper. - It was reasonably self-evident that relatively sparse graphs would die, and quickly, as
*Tick Tock*needs three mutually adjacent triangles to form even a new triangle. - But in the world of simple graphs, one of two ways to construct three mutually adjacent triangles requires 4 nodes and 6 edges—forming a tetrahedron.
- From my familiarity with solid geometry I anticipated that a tetrahedral graph would regenerate itself under the
*Tick Tock*rule, but had no preconceptions as to how it might play out from more complex starting graphs.

Starting with `$nodes = 5`

and `$edges = 7`

we get 4 distinct results:

[A] tick 1 (2 nodes, 1 edges): * node edges: 1 x 2 new nodes /|\ edge faces: 1 x 4 old edges, 2 x 1 *—+—* [B] 1 (3, 2): *———*———* \ / node edges: 1 x 2, 2 x 1 \ / \ / * edge faces: 1 x 5, 2 x 2 *———* [C] 1 (3, 3): *——— node edges: 2 x 3 /|\ \ edge faces: 1 x 6, 3 x 1 * | * * 2 (1, 0): \|/ / * node edges: *——— / \ edge faces: 1 x 3 *———* [D] 1 (4, 6): * node edges: 3 x 4 /|\ * edge faces: 2 x 6 *—+—*———* /|\ 2 (4, 6): \|/ *—+—* node edges: 3 x 4 * \|/ edge faces: 2 x 6 *

- For given
`($nodes, $edges)`

results come in random order with varying frequencies, so it is a slow way to investigate all possibilities. - This seemed to justify a detour into well known but tricky-to-implement areas of graph theory to try to create a comprehensive database of small seed graphs.
- This quickly led me back down a familiar path to longer and longer run times as the number of candidate graphs grows exponentially with increasing nodes, giving reason to be work on other tacks while waiting for tasks to run.
- Meanwhile I had discovered that for
`$nodes = 6`

and some`$edges > 10`

,`ticktock.pl`

“ran away” even with the tick limit set conservatively at 10.

- When using Run from BBEdit’s Perl menu, “run aways” were a bad thing as I could not use BBEdit for anything else before the process (was) stopped, so I soon changed to running
`ticktock.pl`

from Terminal—trap 1. - BBEdit happily runs Perl scripts with either Mac or Unix line breaks, but Terminal needs Unix breaks—trap 2.
- When a BBEdit Perl Run prints it changes focus to the output window, so don’t Run again until you shift focus back to the Perl script—trap 3.
- I dropped the “
`edge faces`

” reporting and added a dump of the sorted triangles data in`@next`

when the iteration terminated for reasons other than death, then accumulated data from many runs.

For `$nodes = 5`

and `$edges = 10`

`@next`

stabilises to:

0:3:4 1:12:13 2:23:24 3:4:6 3:4:9 3:5:6 3:6:9 3:8:9 4:6:7 4:6:9 4:9:10 6:9:11 12:13:15 12:13:18 12:14:15 12:15:18 12:17:18 13:15:16 13:15:18 13:18:19 15:18:20 21:36:37 22:46:47 23:24:26 23:24:29 23:25:26 23:26:29 23:28:29 24:26:27 24:26:29 24:29:30 26:29:31 32:36:38 33:46:48 34:36:40 35:46:49 36:37:38 36:37:40 36:38:40 37:38:39 37:38:40 37:40:41 38:40:42 43:47:48 44:47:49 45:48:49 46:47:48 46:47:49 46:48:49 47:48:49

which on careful inspection reveals 5 tetrahedra (`3:4:6:9 12:13:15:18 23:24:26:29 36:37:38:40 46:47:48:49`

) each associated with 6 other “flag” nodes.

Some slightly larger seeds led to run away growth, so as the size of the dump of triangles data grew it became sensible to output `@next`

to a file using a filename provided via `$ARGV[0]`

Conceptually, *Tick Tock* nodes and links do not have spatial coordinates but it is easier for humans if you give them some, say via John N Huffman’s ancient ChemRote applet which uses plain text files to define, e.g. a regular tetrahedron:

CARTESIAN 4 6 MSC00000 vx(1) 1.000 1.000 1.000 1 vx(2) 1.000 -1.000 -1.000 1 vx(3) -1.000 1.000 -1.000 1 vx(4) -1.000 -1.000 1.000 1 ENDATOMS 1 2 1 3 1 4 2 3 2 4 3 4 ENDBONDS

- Having started to reveal the structure of some of the “run away” configurations by visual inspection of
`@next`

the role of tetrahedra became even more conspicuous. - The choice of what data to look at was always informed by what data could be obtained with the least extra code, and over a period I produced Perl to:
- identify tetrahedra within stored
`@next`

data - classify nodes based on their local network topology
- simulate a tick on a graph defined by ChemRote data
- create a frequency distribution of ChemRote “bond” lengths as a sanity check
- simulate a tick on a graph with more versatile n-dimensional coordinates
- identify various other structures within the data

- identify tetrahedra within stored

- A pair of tetrahedra joined at a common edge each persist and their common edge generates a new tetrahdron joined at opposite edges to the first two, thereby powering “linear exponential” growth.
- I still had no unifying description of
*Tick Tock*outcomes so it was still too early to try to define database or other persistent object formats. - Then gradually I determined the hard way that:
- a 5-simplex seed (
`$nodes = 6`

and`$edges = 15`

) evolves into 30 “hub” tetrahedra joined by 90 “spokes” undergoing linear exponential growth; and - a 6-simplex seed (
`$nodes = 7`

and`$edges = 21`

) evolves into 140 “boundary hub” tetrahedra and 210 partially asymmetric and exponentially growing “joining units”.

- a 5-simplex seed (

- Eventually it became clear that the primary engines within
*TickTock*were any edges which belonged to several triangles because they generated simplexes of corresponding order at the next tick. - That meant that my little routine to form seed graphs by taking random subsets of the links of a simplex was no longer relevant, nor was my much longer detour into the combinatorial enumeration of distinct simple graphs.
- Instead, what came to matter was to run complete simplex seeds of various orders for as many ticks as my aging G4 could stand, keeping in mind that they inflate exponentially, and to find insightful ways to represent the results.

- As well as the basic practicality of Perl for this kind of research, this talk is also about dealing with single run times measured in hours or days which you may not know about until they happen.
- Complex systems research is interesting because it is often unpredictable and that very unpredictability often causes virtual memory algorithns to behave pathologically, slowing computation times down by an order of magnitude after a single increment in a key parameter.
- There can be no better time to start to think more laterally about a problem than when you are forced to wait hours or days for the next data point to be computed.

- Long run times increase the value of retaining potentially useful detailed data for subsequent analysis.
- While at one level the complete graph of nodes and edges is generated anew each tick, in emerges that tetrahedra and larger patterns persist across generations.
- Keeping record of nominal node coordinates, edges, triangles, tetrahedra and higher simplexes in a database enables their heritage to be tracked, edge to simplex and tetrahedron to tetrahedron, and their topologies categorised.
- Those heritage relationships were then utilised in a refactored and partially recursive algorithm which works more directly with the emergent structure.

- The simpler emergent mechanisms were easy enough to demonstrate by chosing 3D coordinates for a minimal seed and running
`cartesian_tick.pl`

for a few ticks to produce a series of ChemRote data files. - The larger structures emerging from 5- and 6-simplex seeds needed special purpose Perl scripts, both to isolate substructures and to schematically represent global structure.
- Despite the widening range of coordinate-space edge lengths, it was also likely that projecting n-dimensional coordinates onto a 2-dimensional plane would provide a useful perspective.

Density of the 5- and 6-simplex after 6 ticks, as projected from their 6D and 7D coordinate spaces to 2D.

- Having achieved a sufficient level of understanding of a problem domain like
*Tick Tock,*the incentive to run yet another data point suddenly diminishes. - It would be nice to do one more refactor of the database and code, to find better ways to convey the results, to rationalise 26 Perl scripts and knock them into shape fit for sharing.
- Then we might even start experimenting with variants of the
*Tick Tock*rule. - But first we also need to find a more sustainable approach to scheduling and managing time consuming surveys covering many data points.

- The paper I wrote for the proceedings is intended to complement and extend the story I have presented today.
- The paper exceeded the printing page limit by 4 appendices and 28 footnotes. The complete paper is available online.
- We can use any remaining time to quickly introduce some of those appendices:
- Appendix A: A listing of
`ticktock.pl`

as discussed earlier. - Appendix B: Introduces
*Life in a Tube.* - Appendix C: Introduces
*Trapper.* - Appendix D: Mentions possible interpretations of various experimental results.

- Appendix A: A listing of

Rolling a tube out a few times can improve perception.

- An isolated tube circumference follows Wolfram’s 1D Rule 22 and thus Rule 90 at period 2
^{n} - Andrew Trevorrow’s LifeLab writes and reads Run Length Encoded
*Life*patterns - Perl parses RLEs
- “Blockades” form commonly in tubes of circumference ≤ 10
- A pair of blockades can trap a region of Rule 22

- Traps often stabilise to cycles with periods that are otherwise uncommon in
*Life* - For a trap of width w, the 2
^{w}possible states each have a successor state - It is the train of succession that gets interesting:
- If a trap reaches a state in which the only live cells are on 4th cells it thereafter follows Rule 90
- Rule 90 traps follow short paths to loops with a period which depends on w
- Some near approximations and some complete misses are particularly interesting

At trap width w each trap state can be represented by a binary number < 2^{w} and successor states calculated by some Perl:

w=1: 0→0 1→1 w=2: 0→0 1→3 2→3 3→0 w=3: 0→0 1→3 2→7 3→4 4→6 5→5 6→1 7→0 w=4: 0→0 1→3 2→7 3→4 4→14 5→13 6→9 7→8 ... ... 8→12 9→15 10→11 11→8 12→2 13→1 14→1 15→0

For w=4, three maps cover all succession:

5→13→1(→3→4→14→1) ..... 2 "hair", 4 loop 10→11→8(→12→2→7→8) ...... 2 hair, 4 loop 6→9→15(→0) ............ all eventually die

Some more Perl finds and counts loops, dying and hair tracks:

4: Loops: 4x2 Hair: 2 2 Dies: 1 1 1 1 5: Loops: 4 1 Hair: 2 Dies: 1 2 4 5 3 2 4 4 6: Loops: Hair: Dies: 1 3 3 5 5 7 14 10 8 2 2 2 2 7: Loops: 12 4 1 Hair: 18 12 10 10 6 8 4 2 4 Dies: 1 4 4 1 1 3 5 6 4 2 2 2 2 8: Loops: 12x2 4x3 Hair: 55 30 30 32 16 18 12 8 4 Dies: 1 6 8

- Each increment in w doubles the state space and increases average hair length
- Full listing of successors and final destinations becomes impractical for w>9
- Exhaustive loop, dying and hair stats remain manageable for w≤23
- At w=23, 48607 seconds computation found the longest hair goes through 436 states

- Virtual memory goes pathological
- Can no longer afford to keep a full map of successor states
- Introduce step size which grows slowly with width
- Hit performance barriers at w=55, w=71 due to very large loops
- w=55 barrier conquered by more thoughtful coding
- w=71 barrier required deeper analysis and changes to objectives

- For w≥71 stopped worrying about hair length and focused on loop and fate detection
- “Monster” loop sizes of order 2
^{n+2}for some 4n-1≤w≤4n+1 too big to try to detect by just computing successor states - Discovered short cut allowing precalculation of “milestone” states around monster loops
- Loop size 2
^{32}- 4 = 4294967292 at 119≤w≤121 is the biggest identified - An even larger loop lurks at w=131 so w=130 was a good place to pause

- Sample sizes stepped down from 1,000,000 for w=24 to 10 for w=130
- At the high end of the trap width range for each sample size, computation took of order one day
- Achieved proof of concept for retaining more data for time consuming trials at higher w
- Could never rerun data generation by hand so full refactoring must include auto scheduling

- The main philosophical conclusion from
*Tick Tock*is that such a simple mechanism can provide “inflation” and symmetry breaking for free. - Results from
*Life in a Tube*suggest that there is more to be found near the recently less fashionable “border of order—edge of chaos”. - Trapper’s atomic successor function is irreversible but leads all states to loops that imply a reversible relationship.
- Trapper also reveals an interesting gap:
- almost all loops fall into families governed by Rule 90 with periods that are a multiple of 4
- there are a small portion of loops that completely avoid Rule 90

This presentation uses an adaptation of:

S^{5}: A Simple Standards-Based Slide Show System

by Eric Meyer, available from:

http://www.meyerweb.com/eric/tools/s5/

S^{5}is a slide show format based entirely on XHTML, CSS, and JavaScript. With one file, you can run a complete slide show and have a printer-friendly version as well. The markup used for the slides is very simple, highly semantic, and completely accessible. Anyone with even a smidgen of familiarity with HTML or XHTML can look at the markup and figure out how to adapt it to their particular needs. Anyone familiar with CSS can create their own slide show theme. It's totally simple, and it's totally standards-driven.