Rosio Pavoris a blog

Delivering on other people’s half-decade-old promises

promise

Click it it is a link.

Permalink 3 Comments

ClusterDebian 1.0

So that thing I hinted at earlier is about as done as I care to get it.

ClusterDebian

Officially the point of this project was for the school to have something with which to replace ClusterKnoppix for their Besturingssystemen II (Operating Systems II) class, but really I just wanted to have something nicer to make use of my ever-growing pile of old computers, which is why I finished it. The README explains. (HPC there stands for “high-performance computing” rather than “Hasty Pudding cipher”.)
What I need now is people to test it, ideally by building images and then trying them.

If you’d rather not put that kind of effort in, I’ve also pre-built an image (291 MB). To save space, it doesn’t come with X, but it does come with (what else?) this Open MPI tripcode finder I wrote a while ago. It’s not particularly fast, but it reports its progress if you poke it with SIGUSR1 (as in pkill -USR1 tripfind).
The unprivileged user is called gjs, and the password for both him and root is t. I’ve also included the MPI-patched JtR tarball in the home directory for you to build, if that’s less pointless.

Feedback appreciated, even if you don’t find any problems. If you find this project useful at all, or if you have any suggestions, I’d like to hear about it.

Permalink 8 Comments

Automatic language classification, the slow way

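#!/usr/bin/python2

import sys
import bz2

def classify(text, langs=('english', 'german', 'french')):
    """Rank langs by how few extra bytes text costs when compressed after each corpus."""
    results = {}
    for lang in langs:
        # Each language needs a reference corpus in <lang>.txt in the current directory.
        with open(lang + '.txt') as f:
            corpus = f.read()

        # Text in the same language as the corpus shares its statistics, so
        # appending it grows the compressed size less than foreign text would.
        compressed = len(bz2.compress(corpus))
        results[lang] = len(bz2.compress(corpus + text)) - compressed

    # Smallest growth first, i.e. most likely language first.
    return sorted(results, key=results.__getitem__)

if __name__ == '__main__':
    print "Most likely %s." % classify(sys.stdin.read())[0].capitalize()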

$ wget -qO - http://www.gutenberg.org/ebooks/31469.txt.utf8 | ./classific.py
Most likely English.
$ wget -qO - http://www.gutenberg.org/ebooks/22367.txt.utf8 | ./classific.py
Most likely German.
$ wget -qO - http://www.gutenberg.org/ebooks/4968.txt.utf8 | ./classific.py
Most likely French.

Permalink 4 Comments

Processing constraints is easy

Alright, we’ve covered search trees in some detail, and they work great for problems where we have clear states and rules of production to move from one state to the next. Sometimes that’s not a very convenient way to state a problem, though, and a more natural way to think about things is as a bunch of variables which can take values in a certain domain, and a number of constraints which describe the relationships of these variables to each other.
The canonical example here is the eight queens problem, which Dijkstra famously used to illustrate structured programming. However, that’s been done to death, so let’s instead have two queens and seven knights, and instead of the usual 8×8 chess board, let’s have a 6×6 one.
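
To make the vocabulary of variables, domains, and constraints concrete, here is a minimal backtracking sketch in Python. It deliberately solves the plain (done to death) queens variant on a 6×6 board rather than the queens-and-knights puzzle, because the exact constraints of that puzzle are in the rest of the entry. The names solve, no_attack, rows, and columns are made up for this illustration.

```python
def solve(variables, domains, consistent, assignment=None):
    """Yield every complete assignment accepted by `consistent`.

    variables:  ordered list of variable names
    domains:    dict mapping each variable to its candidate values
    consistent: function(assignment, var, value) -> bool, called before
                `var` is given `value`
    """
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        yield dict(assignment)
        return
    var = variables[len(assignment)]  # next unassigned variable
    for value in domains[var]:
        if consistent(assignment, var, value):
            assignment[var] = value
            yield from solve(variables, domains, consistent, assignment)
            del assignment[var]  # backtrack


N = 6
rows = list(range(N))                    # one variable per row
columns = {row: range(N) for row in rows}  # domain: the column of that row's queen


def no_attack(assignment, row, col):
    """Constraint: the new queen shares no column or diagonal with earlier ones."""
    return all(col != c and abs(col - c) != abs(row - r)
               for r, c in assignment.items())


solutions = list(solve(rows, columns, no_attack))
```

Swapping in a different puzzle only means changing the variables, the domains, and the constraint function; the search itself stays the same.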

not queens problem

Read the rest of this entry »

Permalink 2 Comments

Towards a better BBCode

Everyone knows BBCode is a pain to work with, and while WordPress supports limited HTML in user comments, it should be obvious HTML is no better. The unnecessary repetition of SGML-based languages and the insistence on the proper nesting of tags make them all hideous and unnecessarily error-prone. We can do better.
The discussions of learned societies on the subject have been less than satisfactory, so I decided to just implement my own mark-up language, based on the venerable S-expression:

{b This} is {i {u expert} {o mark-up}}.

This will turn into:

This is expert mark-up.

The immediate effect is that nesting problems and text redundancy disappear. The syntax also lends itself to easy function composition:

{b.i.o.u EXPERT}

EXPERT

Finally, for this first version,0 we also support function iteration:

{sup*3 To the moon}{sub*3 and back.}

To the moonand back.

It goes without saying this can be combined with function composition in arbitrarily complex expressions, with the iteration operator having a higher precedence than the function composition operator.

I’ve elected to use curly braces rather than the more typical parentheses, because curly braces barely see any use in natural language, which is where this mark-up would generally be used. If you do need literal curly braces, you can escape them with a backslash (and if you need a literal \{, you can escape your backslash with a backslash).

As a proof of concept, and because I eat my own dog food, I’ve written (and enabled) a WordPress plugin that enables this SexpCode in blog comments. For sanity, iteration doesn’t go beyond *3. Supported tags are b, i, u, s, o, sub, sup, code, spoiler, quote, blockquote, and m. If you want to use it yourself, adding more tags or changing their definitions should be straightforward.
If you use an unsupported or empty tag, or have unbalanced braces (except for closing braces at the end), the plugin will assume you’re actually trying to post C-like code, and disable SexpCode for your comment.
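
For a feel of how little machinery the rules above need, here is a rough Python sketch of a SexpCode-to-HTML converter. It is not the WordPress plugin and not a reference implementation. The HTML chosen for each function, the names render and render_comment, and the reading of "except for closing braces at the end" as "missing closing braces are supplied automatically" are all assumptions of this sketch, and only a subset of the tags is covered.

```python
import html
import re

# Assumed mapping from SexpCode functions to HTML (a guess, and only a subset).
TAGS = {
    "b": ("<b>", "</b>"),
    "i": ("<i>", "</i>"),
    "u": ("<u>", "</u>"),
    "s": ("<s>", "</s>"),
    "o": ('<span style="text-decoration: overline">', "</span>"),
    "sub": ("<sub>", "</sub>"),
    "sup": ("<sup>", "</sup>"),
}
MAX_ITER = 3  # iteration doesn't go beyond *3
HEAD = re.compile(r"\{([^\s{}\\]+)\s")  # "{", a function expression, one space


class SexpCodeError(ValueError):
    """The input should be treated as plain text instead."""


def expand(spec):
    """Turn 'b.i*2' into ['b', 'i', 'i']; '*' binds tighter than '.'."""
    funcs = []
    for part in spec.split("."):
        name, star, count = part.partition("*")
        if name not in TAGS or (star and not count.isdigit()):
            raise SexpCodeError("bad function: %r" % part)
        n = int(count) if star else 1
        if not 1 <= n <= MAX_ITER:
            raise SexpCodeError("iteration out of range: %r" % part)
        funcs.extend([name] * n)
    return funcs


def render(text):
    out, closers, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and text[i + 1:i + 2] in ("{", "}", "\\"):
            out.append(html.escape(text[i + 1]))  # escaped literal
            i += 2
        elif ch == "{":
            head = HEAD.match(text, i)
            if head is None:
                raise SexpCodeError("empty or malformed tag")
            funcs = expand(head.group(1))
            out.extend(TAGS[f][0] for f in funcs)
            closers.append("".join(TAGS[f][1] for f in reversed(funcs)))
            i = head.end()
        elif ch == "}":
            if not closers:
                raise SexpCodeError("unbalanced closing brace")
            out.append(closers.pop())
            i += 1
        else:
            out.append(html.escape(ch))
            i += 1
    out.extend(reversed(closers))  # assumption: close whatever is still open
    return "".join(out)


def render_comment(text):
    """Fall back to plain (escaped) text when the input isn't valid SexpCode."""
    try:
        return render(text)
    except SexpCodeError:
        return html.escape(text)
```

Calling render_comment on something like int main() { return 0; } trips the "malformed tag" check, so the comment comes back untouched, which is the C-like-code fallback described above.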

Ladies and gentlemen, BBCode was our COBOL. This is our Lisp.

Edit: People who want to implement this themselves should be following this document rather than this post.

Edit again: Play with it!

Edit again again: More implementations:

Know of another implementation (SexpCode+ or SexpCode)? Let me know!


0 Future versions of the language are expected to add support for function arguments (for things like url, img, and colour) and the ability to define aliases (for example, {define exp b.i.o.u}, which would let you use a new exp function as if it were b.i.o.u).

Permalink 48 Comments

Julia settee

I’ve written this, so I might as well share it.

In my post on the Mandelbrot set earlier, I mentioned the Julia sets of the quadratic polynomial f_c(z) = z^2 + c where c is a given (constant) complex number and z are the points of the complex plane. Because I wanted to visualise how those Julia sets changed as c varied, I’ve written a short program to do that for me.
You can find it here. As usual, you’ll need Allegro, and the compilation instruction is on the first line.

What it does is take two complex numbers as parameters, plus the number of steps it should take to go from the first to the second. At each step, it will calculate and display the Julia set of the quadratic polynomial with that complex number as c, and hopefully your computer is fast enough that the successive Julia sets look like an animation.
For example, if you invoke it as:

./julia -0.8 -1 -0.8 1 200

you’ll see the following:

Though probably not at the same speed. I’ve made no effort to maintain a certain frame rate; the whole thing moves as quickly as your CPU can keep up, because I just wanted a visualisation of how Julia sets change, not a screensaver. If it’s moving too slowly for you, you can try reducing the number of steps, or lowering the numbers in the ZOOM or ITERS #defines, though the first one will make the image smaller and the second will make it darker. If you aren’t interested in the window title, you can also remove the snprintf and set_window_title steps for a significant speed-up.
If it’s too fast, you can do the reverse, or you can build in a delay with Allegro’s install_timer and rest, or POSIX’s usleep or nanosleep.

(Once it’s done, it will just pause at the last Julia set. Press any key to close it. If you want to close it before it’s done, you’ll have to kill it manually.)

The interesting points to explore are the ones inside the Mandelbrot set, as anything else will just be Fatou dust (though you’ll probably still be able to see it because of the grey). For those points, the salient area is the one within 2 unit lengths of the origin, which is why the field displayed ranges from (-2, 2) in the top left corner to (2, -2) in the bottom right (or probably (-2, -2) to (2, 2), I don’t remember). If you need a bigger plane, replace all instances of ZOOM * 4 with ZOOM * (bigger number), and all instances of ZOOM * 2 with ZOOM * (half of bigger number) (if you want to keep the origin in the center of the window).
If you actually want to save the animations, man 3alleg save_bitmap and assemble the images yourself in something like the GIMP. I initially started out doing it this way, but animated GIFs get really big really quickly, so I went with this instead.
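
If you don’t have Allegro handy, the idea is easy to reproduce. What follows is not the program linked above, just a small Python sketch of the same steps under a few assumptions of mine: c moves in a straight line between the two parameters, a point counts as escaped once |z| > 2, and each frame is a block of text instead of a window. The names julia_escape_time, interpolate, and frame are made up for the example.

```python
def julia_escape_time(z, c, max_iters=64):
    """Count iterations of z -> z*z + c before |z| exceeds 2.

    Returns max_iters if the point never escapes within that budget.
    """
    for n in range(max_iters):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iters


def interpolate(c_start, c_end, steps):
    """Yield steps + 1 values of c on the straight line from c_start to c_end."""
    for k in range(steps + 1):
        yield c_start + (c_end - c_start) * k / steps


def frame(c, width=64, height=32, max_iters=64):
    """Render one Julia set as text, from (-2, 2) top left to (2, -2) bottom right."""
    rows = []
    for py in range(height):
        y = 2.0 - 4.0 * py / (height - 1)
        row = ""
        for px in range(width):
            x = -2.0 + 4.0 * px / (width - 1)
            escaped_after = julia_escape_time(complex(x, y), c, max_iters)
            row += "#" if escaped_after == max_iters else " "
        rows.append(row)
    return rows
```

Printing "\n".join(frame(c)) for each c in interpolate(complex(-0.8, -1), complex(-0.8, 1), 200) gives a crude text version of the invocation shown earlier. Lowering max_iters or the frame size speeds it up in the same way the ITERS and ZOOM #defines do.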

Enjoy.

Permalink Comments

Playing games is easy

People who take an active interest in AI are quite unlikely to have very many friends, so it should come as no surprise that trying to get computers to play games has always been a popular subfield of AI. Traditionally that game has mostly been chess, but I feel chess has a grinding tedium to it, so we’re going to look at tic-tac-toe instead, because that at least has the benefit of being over quickly.

Read the rest of this entry »

Permalink 4 Comments

Mandelbrots

I was bored, so I made this.

Rorschach test on fire

Basic introduction to the Mandelbrot set and what this image represents follows.

Read the rest of this entry »

Permalink 11 Comments

Optimal search is easy

Last time we looked at how to solve the eight puzzle using the hill climbing algorithm, which gave us a result much more quickly than a blind depth-first search did, but we wondered if the solution we found was the best we could do, and we asked if there was a way to use heuristics to find not just a solution, but the best solution. Today, we’ll see that there is, and it’s actually really straightforward.

Read the rest of this entry »

Permalink 6 Comments

Heuristics are easy

(This post assumes you read the previous one.)

Today we’ll be looking at the hill climbing algorithm, which is just a plain old depth-first search with heuristics added.
“Heuristics” is a fancy word (from the Greek εὑρίσκω, “I discover”) for a very simple concept. In the context of search trees, it simply means that at every node, you’re going to look at each possible branch, and take the one that looks the most promising first, instead of just picking one at random. “Most promising” can be a tricky concept, though.0
Our river-crossing example isn’t necessarily the best one to demonstrate the concept, so let’s go with another classic: the 8 puzzle.
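
As a rough preview of the idea in code, here is a Python sketch of a depth-first search that always tries the most promising branch first, applied to the 8 puzzle. The heuristic used (the number of misplaced tiles) and the goal layout are assumptions made for this illustration and are not necessarily the ones used in the rest of the entry. The names hill_climb, successors, and misplaced are likewise made up.

```python
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)  # 0 is the blank; this goal layout is an assumption


def successors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            target = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            yield tuple(nxt)


def misplaced(state):
    """Heuristic: how many tiles (not counting the blank) are out of place."""
    return sum(1 for tile, want in zip(state, GOAL) if tile and tile != want)


def hill_climb(start, heuristic=misplaced):
    """Depth-first search that expands the best-looking child first.

    Returns a list of states from start to GOAL, or None if there is none.
    The path found is a solution, not necessarily the shortest one.
    """
    stack = [(start, [start])]
    seen = {start}
    while stack:
        state, path = stack.pop()
        if state == GOAL:
            return path
        children = [s for s in successors(state) if s not in seen]
        # Push the worst child first so the most promising one is popped next.
        for child in sorted(children, key=heuristic, reverse=True):
            seen.add(child)
            stack.append((child, path + [child]))
    return None
```

The only difference from a blind depth-first search is the sorted call: without it, children would be explored in whatever order successors happens to produce them.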

Read the rest of this entry »

Permalink 3 Comments

Search trees are easy

A decent proportion of my readers are noobie programmers or people who aren’t in a position to receive a formal CS education, so today I thought I’d cover the basics of a fundamental concept most people cover in their first semester of algorithms or AI: search trees. The fact that my college considers this to be third-year material so advanced they cannot in good faith make the class compulsory is neither here nor there.

Consider the famous problem of the farmer who wants to cross a river with his fox, goose, and grain, though the only boat can only carry himself and one of these three possessions. Ignore for a moment why a farmer would own a fox, and let’s stretch credibility a bit more by assuming that while the fox and goose are well-trained enough not to wander off in the absence of the farmer, they are not trained not to eat the goose or the grain, respectively, in said absence. How can he safely get to the other side without losing his goose or grain?
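
To show how a puzzle like this turns into states and moves, here is a small Python sketch. It uses a breadth-first search, which is my choice for the illustration because it finds a shortest solution; the rest of the entry may well walk the tree differently. The state encoding (a tuple of four 0/1 flags saying which bank the farmer, fox, goose, and grain are on) and the function names are assumptions of this sketch.

```python
from collections import deque

START = (0, 0, 0, 0)  # (farmer, fox, goose, grain), all on the near bank
GOAL = (1, 1, 1, 1)   # everything on the far bank


def safe(state):
    """Nothing gets eaten: no unattended fox+goose or goose+grain pair."""
    farmer, fox, goose, grain = state
    if fox == goose != farmer:
        return False
    if goose == grain != farmer:
        return False
    return True


def successors(state):
    """Yield the safe states reachable with one crossing."""
    farmer = state[0]
    for i in range(4):  # i == 0: farmer crosses alone; otherwise he takes item i
        if state[i] != farmer:
            continue  # he can only take something from his own bank
        nxt = list(state)
        nxt[0] = 1 - farmer
        nxt[i] = 1 - farmer
        nxt = tuple(nxt)
        if safe(nxt):
            yield nxt


def solve(start=START, goal=GOAL):
    """Breadth-first search; returns the list of states along a shortest path."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
    return None
```

Each element of the returned path is one node of the search tree, and consecutive elements differ by exactly one boat trip.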

Oh noes!

Read the rest of this entry »

Permalink 6 Comments

Xlib hates me

Having finished another popsci book on chaos theory recently (Ian Stewart’s Does God Play Dice?), I thought it’d be an interesting exercise to visualise the Lorenz attractor, and since it’s been a while since I’ve done anything new in programming, to take the opportunity to get into Xlib, the X Window System C library. Results aren’t very encouraging.

I mean, I got something to work easily enough, but any attempt at introducing color beyond black and white for clarity fails miserably and in non-deterministic ways. Eventually I gave up and redid it using something I know.
Compare:

Lorenz attractor (X11) Lorenz attractor (Allegro)

(It’s prettier animated, so do compile the code yourself and see.)
In both cases, the screen represents the Cartesian plane (X-axis horizontal, Y-axis vertical, origin right in the center; one unit is ten pixels). In the Xlib version (left) the Z component is ignored entirely (so it’s really a projection of the attractor onto the Cartesian plane); in the Allegro version (right) some attempt at representing it using shades of gray has been made, with z=0 being black and z=55 being white (though because it is drawn with no real care, it will happily scribble dark lines over light ones if it has to).
You can mess with the variables and starting condition to see how it behaves, or swap around some Xs and Ys and Zs to get different angles, and at least in the Allegro version, messing with color is trivial enough.
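
For anyone who wants to play along without either library, here is a Python sketch of the underlying computation only. It is not the code linked above. The parameter values (sigma = 10, rho = 28, beta = 8/3) are the classic ones and are an assumption about what the programs use, as are the simple Euler step, the 640×480 window, and the names lorenz_step, to_screen, and trajectory.

```python
def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz system by one Euler step of size dt."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt


def to_screen(x, y, z, width=640, height=480, scale=10, z_max=55.0):
    """Map a point to (pixel x, pixel y, gray level 0-255).

    The origin lands in the center of the window, one unit is `scale`
    pixels, and z is shown as a shade of gray from black (z = 0) to
    white (z = z_max).
    """
    px = width // 2 + round(x * scale)
    py = height // 2 - round(y * scale)  # screen y grows downwards
    gray = max(0, min(255, round(255 * z / z_max)))
    return px, py, gray


def trajectory(n, start=(1.0, 1.0, 1.0)):
    """Yield n screen points along the orbit starting at `start`."""
    x, y, z = start
    for _ in range(n):
        yield to_screen(x, y, z)
        x, y, z = lorenz_step(x, y, z)
```

Feeding the tuples from trajectory into whatever putpixel-style call your graphics library offers gives the gray-shaded picture; dropping the gray value gives the flat projection.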

Which brings me to my question: does anyone know of decent introductions to Xlib? The Internet is full of tutorials, and as usual, all of them seem to suck. I know Xlib isn’t really supposed to be used directly, but I want to.

Permalink 3 Comments

Cisco sucks at crypto

I’m in a class called Netwerkbeheer (Network Management), which spans two semesters and is a transparent excuse to peddle CCNA certifications. As a result, I spend a lot of time playing with Cisco routers and switches, and one of the many, many things that annoy me about Cisco’s IOS is their cavalier attitude towards security and cryptosystems. A particularly egregious example of this is Cisco’s type 7 encryption.
If you’ve ever configured a Cisco router, you’ve probably encountered it. When the misleadingly named service password-encryption is running, setting a password with the enable password command “encrypts” the password, so that when you issue the show running-config command, you’ll see a line like

enable password 7 08314940000A

instead of the plaintext password, which you’d see if the so-called “password-encryption” was turned off.
Type 7 “encryption” manifests itself in a few other places, including in FTP passwords and various routing protocol authentication passwords.

Type 7 has been known to be broken for a decade and a half now,0 but people continue to use it, almost always for bad reasons.1,2 To drive home just how broken type 7 is, let’s look at it in detail.

The general form of the type 7 “ciphertext” is (0[0-9]|1[0-5])([0-9A-F]{2})+. Some experimenting finds that the length of the “ciphertext” is always twice the length of the plaintext, plus two. Can you guess why?

The “encryption” key is always a number in the range 0-15, which would be easy enough to bruteforce, but that turns out to be unnecessary, since it’s provided (in decimal form) as the first two characters of the “ciphertext”.
That key determines the starting point in a table of twenty-six secondary keys (which, incidentally, is dsfd;kfoA,.iyewrkldJKDHSUB; I don’t know why the table has 26 entries instead of 16), which are XORed in turn with the characters in the plaintext. If the key is, say, 7, the first character in the plaintext is XORed with the character at index 7 in the table (counting from zero, so the eighth character), the second character in the plaintext with the character at index 8, the third with the one at index 9, &c.
Each resulting character is then converted to two hexadecimal digits (the input can only be ASCII, of course) and appended to the ciphertext.

And that’s seriously all there is to it. The result is a “cipher” that’s either slightly less or slightly more secure than writing out your passwords in permanent marker on the outside of the door of the server room, depending on how you manage your configuration files.
Because I know this is going to be an issue at some point, I’ve written a simple utility that encrypts and decrypts passwords using type 7, which you can find here.
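
To show how short the whole scheme is, here is a Python sketch of the steps described above. It is not the utility linked in this post. It uses only the 26-entry table as given here and indexes it from zero, and since the post doesn’t cover what happens beyond those 26 entries, the sketch simply refuses input that would run past the end of the table. The names decrypt, encrypt, and _keystream are made up for the example.

```python
import re

TABLE = "dsfd;kfoA,.iyewrkldJKDHSUB"  # the secondary keys as given in the post
FORMAT = re.compile(r"(0[0-9]|1[0-5])((?:[0-9A-F]{2})+)")


def _keystream(key, length):
    """Return `length` table entries starting at index `key` (zero-based)."""
    if key + length > len(TABLE):
        raise ValueError("input runs past the end of the 26-entry table")
    return [ord(ch) for ch in TABLE[key:key + length]]


def decrypt(ciphertext):
    """Recover the plaintext from a type 7 string such as 08314940000A."""
    match = FORMAT.fullmatch(ciphertext)
    if match is None:
        raise ValueError("not a type 7 string")
    key = int(match.group(1))             # first two characters, in decimal
    data = bytes.fromhex(match.group(2))  # the rest, two hex digits per character
    return "".join(chr(b ^ k) for b, k in zip(data, _keystream(key, len(data))))


def encrypt(plaintext, key):
    """Produce the type 7 string for an ASCII plaintext and a key in 0-15."""
    if not 0 <= key <= 15:
        raise ValueError("key must be in the range 0-15")
    data = plaintext.encode("ascii")
    body = "".join("%02X" % (b ^ k)
                   for b, k in zip(data, _keystream(key, len(data))))
    return "%02d%s" % (key, body)
```

Because XOR is its own inverse, decrypt and encrypt are the same loop run over different input formats, which is also why the output is always twice as long as the password, plus the two digits for the key.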

You’d think this would be a moot point because people should realise their configuration files are sensitive information, but people are, of course, idiots. In that sense, type 7 isn’t just worthless, but actively harmful, because it gives people a false sense of security.


0 http://insecure.org/sploits/cisco.passwords.html

1 The original intent of type 7 was apparently to foil shoulder-surfers, who might see your configuration file as it scrolls by on your screen. Cisco’s official stance (now) is that if security is an issue, the router configuration file itself should be treated as vulnerable data, not just the passwords that may or may not be displayed in it. That would be fair enough, if it wasn’t at odds with Cisco’s default way of saving and loading configuration files, which is through plain TFTP over the regular network, with no options for encryption of either the config or the passwords themselves. But, you know.
(The claim that type 7 is so weak because the router has to be able to reverse it is bullshit, of course. At most it’s true for PAP authentication, but anyone who considers PAP passwords secret information has no business being anywhere near a router.)

2 Cisco themselves now advise against using it, instead suggesting people use type 5, which isn’t encryption, but just hashing with MD5. Which is also broken, of course. The CCNA materials also state that at least type 7 is “better than no encryption”, but I’d argue that it’s worse, because its security is equivalent to plaintext, while also giving idiot network admins the impression that it’s not.
I’m told a type 6 exists now, which is based on AES and supposed to be better. AFAIK our routers don’t support it, and I’m not holding my breath either way.

Permalink 13 Comments

Valid Brainfuck code

                ++++                         +';cloolollllllooooddddoddddoollllcccccc:;;;;;;+          ++k0XXXXXXXXXXXNNNNNNNNNNNNNWWWWWWWWWWWWWWWWWWWWWWWWWWW
               cO0:+                    +~~~~;;;;;::[>c:::::+clooodoodddddooolccccccc;;''''~           ~:++XXXXXXXXXXXNNNNNNNNNNNNNNNNNNNNWWWWWWWWWWWWWWWWWWWW
               ~~             ~~~~+++''~~~~~~~  ~~~+';;;;;;'';;;:ccclloloollllccccccccc;;>~~           ++d0KKKKXKKXXXXXNNNNNNNNNNNNNNNNNNWWWWWWWWWWWWWWWWWWWWW
                     ~~~~~~~~~~~~~~                  ~~          ~~~~~~~~~~~~~~~~~';;;;'++              ;ok00KKKKKXXXXXXNXXXNNNXNNNNNNNNNNNNWWWWWWWWWWWWWWWWWW
                      ~                                                                                 ~;okO0KKKKKKKXXXXXXXNNNXXNNNNNNNNWWWNNWWWWWWWWWWWWWWWW
                                                                                                          ':odk00KKKXXXXXXXXXXNXXXXNNNNNNNNNNNNWWWWMMWWWWWWWWW
                                                                          ~~~~~~~~                         ++cdkO0KKKKKKXXXXXXXXXXXXXNNNNNNNNNNWWWWWWWWWWWWWWW
                                                                    ~~;;;;~                                 ++++O00KKKKKXXXXNXXKKKXXXXXXXNNNNNNWWWWWNNXXKKKKKX
                                                                  ~;::;'~                                    >;okO0KKKKKXXXXXXXXKKKKXXXXXNXXXNNNWWWNKK0OkkkkOO
                                                               ~;::;'~                                          +++0KKKXKKXXXXXXKKKKXXKKKXXXNNNWWWWNKK0OOOOOOO
                                                            ~::cl;~                                   ~~           >d0KKKXKXXXXXXXXXXXXXXXXXXXXNWWNXKK00KKXKKK
                                                          ~d0Okl;~                                   ;~oo~           +++KKXXXXXXXXXKXXXXXXXXXXNNWWNXKKKKXXNXXX
                                                        ~lKNWXk+;~           ~;            ~         :++N+       ~~ ~~+:++KKKKKKXXXKXXKKKKKKXNNNWNNX0KKXNNNNNX
                                                       ~xNWWN0<~ ~~   ~~'<~;co~         'l<o~~   ~   ;<k0Wd~    -':~~c:c;:]KK0KKKKKK>XXXXXXXXNNNWNNXKKKKNNNNNN
                                                      :KWMWX++';;'';';cdc;lOOl          .dK>do;   --;kkl0NWd- .'ckoo;k:Odx;'xK0<<00KKKKKKKKKKKXXNWNNK0OOOKXXXXX
                                                     lNMWNd;;+k+:+dddOK0o[X>>x~          ~OWNKK; ~ '++N++kNOl+ ;c<<x-;oOx]~;x00000>>KKKKKKKKKKXNNNX0OkkxxOOOOO
                                                    oNNN0ccx0KOkO0KO0XW0kWNNMX;;c..~'~ ~'':<<NNX;;~ ;dkk+++0dc :cONk;;:k[x:~:>>0000KKKKK00000KXNNNXK0OOkkkkkkk
                                                   lOOKOox0KK0O0XNKOKWWOKMNWMKcoxkl:cloxxdokXNKXx;:  ;+++ld0dx~;:oN<~~;<ddd-~d]0000KK0000O00KKXXNNNNXXXKKKKKK0
                                                  ;clkkoOKK0kk>NW>kOXXXNWNXNMO:kkOoc::OOOOdkdO0xO;::~.~:ldx0Od~:>cNN'~;lcod++c00000000000OO00KXXNNNNNNNNNNNNNN
                                                 ~;:dkk000kxk0XWWOoO0xkKOOkOXocOoxlo;cOxlocl:;llxl~;:~  ~cdO0c~;.cKX;~~;:ll'~;>O0000000O0OO00KXXNNNWWWNNNNWWNN
                                                ~~:okOkoxOkO0XNWKc;co;xc;;;okcloll~;'dKocc;:o:;;;c~ --   .;d+;++ lOO+  ;;l;+++oO00KKK0OOOOO00KXNNNNNNNNNNNNNNN
                                                ~;oxOx';dkOKXWWNo~~';~l'~~~lc:l:;~  ~xKddo;':l;;~;;  .   ~~;d;  ~:xc   ~:o:~~~c<00<00000OOO00<<NNNNNNNNNNNNNNN
                                                'odxd;~:xk0XWMMX;~~~~~'~ ~~c::::~~  ~xKkOOo'';;'~~;~      ~~c'   ~:    ~ol;~  :x0000OOOOOkkO00KXNNNNNNNNNNNNNN
                                                :oo+kc~lxkKXWWM0'  ~~~~~~~;:;c::~   ~+K0O0[:~;~~~~~'     ~' ;:         ~O;;~  'dOOOOOO>>kkkOO00KXXXXNNNNNNNNNN
                                               ~:o;dkd~;:d00KNNk~  ---~~~~:::olc'   ~<<KxOOd;''-  ~'     ~~ ~'~        ;];~   ~o>OOOO>xxxkkkOO000KKKXXXXXXXNNN
                                               ~;o'ccc~~~~;ldxkd-  .~~~~~~;;l++c;.   c00d>O>:~~~-.-;         --        :;-    .lxkkkkkxxxkkkkkOOOOO0KKKKKKKXXX
                                               ~'lllolcclllc:'';-- ~~~~~''-;cll::;.  'k0xoxd:~~~~ ~;~        ~~       ~c~     ~:<kkkkkxxxxxxkkkkkOO000K0KKKK00
                                              ~~~~~~~'~~;cxXWXxc;~ ~~~~''''~';'';;'~  ;oxolo:~~~~ ~:'         ~       ~;      ~;<xkkkkxxxxxxkkkkOO00000K00KKK0
                                           ~~ol';~~~~       ~:xKk;~~~~';''';'''~~~~    ~';';;+++  +c'          ~               +dxkkkkkkkkkkkkOOO00000O000KKKK
                                        ~;c:'kXlc:c;.          ~<O;';cl::;;cdxxkxxxdo<:'~++++~~   ~[~      ~    ~              ;o>xxxx>kkkkkkk>OOOOO00OOO00000
                                      ~;c;~'::::clo;;;;''co+     ld:xKXk;cddlll:;;;cdk0XWXd;+      '       '                   +<dxxx<kkkkkkkk<OOOOOO000000000
                                       ~  'odcldd:clkKXKO0X0l::;;'cod0Kd'~~~~~~  ~     ~'c0O;-    ~~       ~    ~              ;]oxxxx>>>xkkkkOOOOOOO00000000K
                                         'c;~ :odo0WMMMMMMWWNNWWK:lx00k:~~~~;~  ~.        ~x<~~    ~       ~      ~~           ;<o<+++kk+x[kkkOOOOOOO0O000000K
                                         '~    lxOWMMMMMMMMMWMMWxoxOWWOc';cxkxocod;         c:o:                               ;od>xx>xxx>kkOkkkOOOOOOOOOOOO00
                                                ;OWMMMMMMMMMMMXxx0XNNNxo:cNMMWNNNNXOdc:;;;'';;;;'~                   ---      ~;ld<xx<xxx<xkkkOOOOOOOOOkOOOO0O
                                                 'lk0KKXXXNX0dlxNMMMMMNko;KMMMMWWNWWMWWWNXKk:;:l'                  ~~~~~    -~;:]>>xxxxxxxxk>kOkkkkkOOOOOOOOOO
                                                 dXNNXKKK00OO00NMMMMMMWOollXMMMMMMMMMMMMMMMXk;;;'~~~~~                ~~~   .~;cxO<<<kxdddxkkkkxxxxxxxkkkkkkkO
                                                ~kWMMMMMMMMMWWNWWMMMWWKoll:;kWMMMMMMMMMMMMWOxd;~  ~~~~            ~      ~~+++;[o>>>NX>-----xkkkxxxxxxxxxkkkOO
                                                 dNWMMMMWXXNNkccdkOkdxO0KKo;:lxOKNWMMMMMMWXK0Oc~~  ~                       ~':<<0O<x<-]cclodxxxxxxxkxxxxxxxkOK
                                                 ;KNWMWWNNNWMXxl:;;'~~~~;c;l0X0kdddkO00KKKKOxl;~~                           ~~'>d>>do>ooddddxxxxkkkxxxxxxkkk0N
                                                 ~xXWWNNWMMMMMMMMWX0OkdolldKNNWMMWN0dl:;;;;;'~~                               ~~+.coooddoddd<<<<+++++++xx[kOKW
                                                  :0NWWWWMWWMWWMMMMMMMMWWWWWWMMMMMMMWX0ko:'~~~~                                   ;lllo>o>-xxxxxxxxxxxxxxxk0NM
                                                  ~dOKXXXK0ko:;cdOO0KNWNNXXXWWMMMMWWNK0Okd:'~~                                 ~~-;<<llooddxddxxxxxddddxxxolxX
                                                   'x0KKOd;~     ~~~~';oxOO0XNWWWWNXK0Oxoc:;~~~                                ~~-;c]lloodd>ddddddddddoddd;  ~
                                                    lk0000Ol~           ~';oOKKXXK00Oxdo:;'~~~                                  ~':c>.looooddddddddoooo<<o;+  
                                              ':okO0Kxk0KNNKo;'~           ~lkO0Okxddoc:;~~~~                                  +;:cc[lllooo>>dddoolllllol;--- 
                                        ~':oxXMMWMMWWXkk0KK0xoodl;;;;;';;;lxkkkkkxdoc:;'~~~~~                            --~~~~;<l<llllllooooodolcc:;;::::;'- 
                                     ~oKWWWWWNWMWWMMWNNOk00Okoccc:colcllooxkkkxdoo]:;'~~~~~                         ~;:>>llloc;.:<<lllllllooooolc:;;''++++;:c:
                                    cXNNNWMMMWWMMNWMMWNMOk000ko:;;;'';;;;;cloo[:;;'~~~~~                       ~' ~;>dxxO>NNNX0oxoc;:lclllllollc:;;'+++++~~~''
                                  ~xKXXKXNNNKKNMMWWMMMXMX0OKXNWNX0kdc:;;:cccc:;'~~~~                           d:~:>>00OkOKKX00XNX0o;cccllllllllc;;'++++~~~~~~
                                 ;0NNNN0KNWWNNNMMMMMWMKNWNOOKNWMMMMMNX0Okdol:'~~~~                            'Nclk0XXXK0KKKXKkXOXNOo;:<<llllllll::;;;;<dd<;~-
                                :N0NNMMWWMMXNMWWMMMWWWWXWXx;:xkxdxOKKK0Okol;~~~                             ~~]WoxkOKKKKXXX0>kKN0k00Okl>looooooollccc-.lONWN>O
                              ~lXW0XXWNXKNKoO0xOX0N0dxKX0Xo; ~~~~~~;cll:;~~~                               ;;>OWkWNNWNNXXOdl0KWNkOWNXOd++looooolloool+++dKWMMM
                             ~xXXNKOXKNNNWO:0xl0WNWWNWWMWWNXO'                                          ~++Ox0NNlX0OKXNWX0oO0ONNkxWKxdc;.lloooollooocclc-l0WMM
                            cOXNKKX0kXKKNNx'xo-cWWWMWWNMNNXKNO                                         o-coNKWWO-oocddkxcoK0c0WK:oOOKXx;-.lloooooooc;;;;;;<ONM
                           cNK0XK00K0OXkkk:~:l~~dWWMNNXWWXKkXX~                                       +NK+OWXWW0lOk++O0<<<0+dNWd;ONXKK0l+'+[oooool:~ ~~;;;:l>>
                          dKXNK000OxOOxxkKo~oOo~~koxd;;:o'~~::~                                      :>Xkk0NKNNWWWcxXkd+0O+;0Xk:l0NXOxxOo';loooll+~     ~;<<dO
                         cKXOKN0k0OOooOc~lc ;ol~~;  ~'~~';<lldxl;-                                    ~;'';xccdXNk;O0~c0k'~xNXx:o0K0KK00k;~cooo]>'        ~;>k
                         xNKK0KN0xkOOkOO;;o ;kk:ldolxxokOOkddc~:kd;~                         ~'~~   ~;c:~~ ~   ~'~~ oOxl~;KWWKc;kXNNNXkxo>+;oolll:.           
                        oKk0000KXKkxkxdol'c 'kKd;'::'lxO0Oxk0;~lxkd;~~~~~                ~;:lo:;~  ~0NMWWNXNKKOkc~  ;K~ 'OWNKd;cKNKOkxO0kdl:coooolc;'~        
                       cK00KO00Okk0kdollc;; 'ooclxkdcc0WMWNXkccc:dxd:~ ~~~            ~;lxkxodko'''~:o0xdO0WNNNMO;  ;W:~0XXXKo~:oxO0KKKK0olcllllllllllc'      
                      lXkdkXKOOkkxdxkkl;:dl ;Oxxk0KKWOxNMMNkKXX0l:;coc;~            ~;ooc::coOx:~~ ~lc:;~'cod:;o;   :NO:0KXNk''d00KKKK0Oxdxoolllllllllc'      
                     ;KXKK00KKkkOxxdo::'':~ co;:oxxXNNMMMMMMNdxNOod:';cl:~~         ~~~~';;;;::~~':;':oodxWNMNko'ok~xKWlkkxx:'lkKKKK0O0X00Ol:'cllllcccc~      
                     oK0KXKKK00xdxxdo:'~ ' 'xd:o0K0WNNNWMMMNd~ONk0XOdc~~~~          col;~~~:llo;'oxxl;;dkoK0NM0lccc'l0N;;O0x::oxOO0KKKNXxoool;;cllcccc:~      
                    l0OK00KK0kdlcdool;~~ ~ ;l'~lkdOMNXc;ddl~  dxkNXkkWl~~          oWWWNOdllc;;~~;;;c' :o:o0kWl  ~~cxOX~~kOo;cx0K00kxxOkxOOOx;'ccccccc;       
                  ~kWWO0OXX0Okxl:;;;;;'~~''o;~co''xM0kKc~;~  ~NO;oxkXXlok;~~      :NW0xox0XNKxc:o0olkNO0NMNN0o ~:~~kNX0 ~doccxKKkkxoxOK00OxOOo:;cccccc~       
                 ~xWMWNKXKOOOxdoc:;~     ~~~  ;;~'OWdoXO;~;oloOxxllcW0lcx;:Xc;'oc~:dk;;;kl;:xXoxxWxlkWl;ok0:~~;xkl;XXKx  o:~;xKKK0xok0K00xdOXo~~c:ccc:~       
                 lNWWWNN0O0Okdll::~           ~  ~d0:lXd;~oOKkl:;lNc;;;:O:cK:''ko~~;d~ 'kl;:kWoddWk:cd~   :xcXKkdKdNXXO~ ~~'kNWN0xclkO0KKklxd' ~c::::;        
                 dXXNNXK00xoodol:;'~              'o':O;';xkO0Oxo:ol'~ ;x~;Kl;;Oc~ 'l~ ~ko;:kNoOxMK;'~    ~''NXkdWKNO:;   'xXNKOdccOXNNXk:;;~~';';::;~        
                :xkKXK0Okxdo:;~~~~~~~  ~~          ~ ;d;~;0XXWW0dl:'~~  ;''c~~~d;~'cd~~'0x;'xWxkdW0;:l       oxoc0;~'~   ~xKK0Oo~~d0xl;'~~~~~~~  ;::;~        
               ;xOOkxxxdo:;;'~~~      ~:ol;~          ;~~ck0Oxx0Oc''~~~;:c;;~  :'~kWK~~'Ol~~xXlcc0c;xWl      ~~ ~d~      ~;ldl;':kxxkkdllodxdddc~;::;~        
              ~lxkOOxc'~~~             ~'ckOx:~       ~~:OXXOxxdc'~~~~~ckko;   ~~ ;0K~ ~Ol'~dO;~~dlOOWk;        ~;         ~;;:OWNXOdcldOKKKK0Oo~;::;~        
              ;oddol:;;;:::::;;'~~       ~ck00Oxxxdlcl0WWWOXKxkXKl~  ~cO0Od:~  ~;;oxxc;;xl;'ox;~;Oxolkdd:'       ~      ~;lkOOOxl:;:ok0KXXXXNX0x;;c:;~        
              ;colldxO0KXXKXKK0Okxl:'~    ~';ldOKXNMMMMMMW0NKdd0MNNOOXXKkdc;~~:lo~cod:~~'';;:'~~~~~~  ~~ ~~             ~~'''';;loxOKK0KXNNNNXKx;'::;'        
              ~:loxkO0KXXXXNXXXK00XNNKKXdlKKkdl;:ldNMMNKMWWW0kKNMMMMMMMWO:~ ~o00xodxKWWWMMMMMN0ko'       ~~~            ~':coxkOO0OOOOOO0KKK0xc~~;;;;;~       
              ~;coxxkO0KXKKXXXXKNWMMMMWXWMMMMMMKc~ :XMMMM0k0OkXXWMMMMMMMNK;:dloXMW0xNMMMMMMMMMMMMWKl'' ;doc:::c;;~~    ~;clddddddddxkkkxoc;~     ;;;''~~      
               ~;coddkO000K000KoNMMWWMMWMMMMMMMMMKcoxkXWMWMNOkWWMMMMMMMMMMNxxOdXMWkOWMMMMMMMMMMMN0WMXxkXkXMMMMMMMMW0o;~~;:clooooooc:;;;''''~~    loolc:;'~    
                ~;:coxOd;dOKWWMMMMMMNNMMMMMMMMMMMMWKKXWWMMMNkxNMMMMMMMMMMWxc0NXWMNx0MMMMMMMMMMMMMMMMOcoK0XMMMMMMMMMMMMWOc~~~~~;;;:ldkOkxdol;~~  ~OKKK0ko:~    
Kkkdlc;;'~'~~~   ~~;OWMX:dXWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMXxxNMMMMMMNNNNOox00XWMKdKMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNx;~~~cxOOO00Okxoc;~   ~XNNNNKxc~    
MMMMMMMWWNNNXXK0k:~dNWMMX::NMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMKxkWMMMXo;:;';cxKO0XWWOdXMMMMMMMMMMNMMMMMMMMMMMMMMMMMMMMMMMMMMMMN0o;;:okkOkkxoc;~~   ;XNNNN0o;     
MMMMMMMMMMMMMMMMNKKc~;xNMWKWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM0x0MMMK;~:NN0d':XXXWMNkkNMMMMMMMMMMMMMMMMMMMMMMNNMMMMMMMMMMMMMMMMWX0kdddxxddo:;~~    ;KK0OOd;~     
MMMMMMMMMMMMMMMMMWMWNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMXKWMMWKOXMMMK;~:NMMWdOMMMMMWXNWMMMMMWXKWMMMMMMMMMMMMMWWMMMMMMMMMMMMMMMMMXOOkkxolc:;'~~~    ;ol:::;~      
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMN' lMMMWWWMMMMKxOWMMMMMMMMMMMMMMMMMMMW0OWMMMMMMMMMMMMMMMMMMWMMMMWMMMMMMMMMOdllcc;'~~~~~     ~~~~~~        
WMMMWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWl  'x0MMMMMMMMMMMMMMMMMMMMMMMMMMMMMWOoKWXWMMMMMMMMMMMMMMMWXXl;dlcKWMMMMMMMXc''~~~~~        ~;~~~ ~        
WWMMMWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNWMMMW0:  ~oKNMMMMMMMMMMMMMMMMMMMMMMMMMMMMM0' ;c~dMMMMMMMMMMMMNc:'      ~kNOxNMMMMMl~~           ~:ol:;'~'~       
MMMMMMMMMMWKWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMXo;;;l:'~;dXMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNWMMMMMMMMMMMMMMMX;~;:~   ~cOKOOWMMMM0;~           ~okkkxolc:'       
MMMWWMMMMMXOWMMMMMMMMMMMMMMMWWMMMMMMMMMMMMMMMXo:loc~~~lKWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNNMNx~ ~kkdOKMMMMMMWXd~           ~;okO0OOxo;'~~   
MMMMMMMMMMMWWWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNkcdd:;;oWMMMMMMMMMMMMMMMMMMMMMMMMXXWMMMMMMMMMMMMMMMMMMMMMMWKKXNo;:''';xNMMMMMNd:    ~~';:lok0XWMMMMMMMMMMWWNXK
MMMMMMMMMMMNx:;;;;cNMMMMMMMMMMMMMMMMMMMMMMMMMMMMWl'lxOXWMMMMMMMMMMMMMMMMMMMMMMMNxdKWNNMMMMMMMMMMMMMMMMMMMMMMMWd;0W0l~ lWMMMMMx  :KXXNWMMMMMMMMMMMMMMMMMMMMMMMM
WMMMMMMMMMXl:c;   ~0MMMMMMMMMMMMMMMMMMMMMMMMMMMMMNkXMMMMMMMMMMMMMMMMMMMMMMMMMM0'~~dNWMMMMMMMMMMMMMMMMMMMMMMNWMOOWWMMWNNWMMMMMo 'KMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMK~ ~;;;lKMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMOlKMMMMMMMMMMMMMMMMMMMMMMMMMNWWkdXMMMMMMMMMMMW0oxWMMMMMWWMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMX;   ~;:lkXWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNWM0o0WMWXXWMWNOxoclx0OKNMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMN;       ~'dNMNKXNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMNc;lxOx;~:olcc'~';cl;o0NWWWMWWMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMWO'       ~cKWNKkxxxxOXWMMMMMMNWMMMNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMX;      '''ckk:~   ;;~:cldkOXMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMK~        ~~;x0KXKOxxO0OkOxooOKXWNXWWWNNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMk         ~:c~     ;;;'';:cxXMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMN'             ~cdxl:dkl~~~~;cdxdl;::;;ckWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNKd~                 ;kc:~~;;:ONMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMWo                          ~cko~~     'xXWMWWWMMMMMMMMMMMMMMMMMMMMWMMMMMMMWWWMMWMMMMMMMWXOd;~                   ~K0''~~~~~'lKWMMMMMMMMMMMMMMMMMMM
MMMMMMWNMMMMMW:                                     ~;XMMMMMMMMMMMMMMMMMMMMMMW0ddOWMXd;';coc;:c;:d0Ko~                       oMx~~~~;;;':dKWMMMMMMMMMMMMMMMMMM
MMMMMMWNWMMMMMx~                                      ;dO000XWMMMMMMMMMMMMMWXo~  ~;;~             'XN'                  ~;odkXWl~~~~';;;cokXWMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMM0c;~                                          ~'~':clllc:::;'~                      o0;               ~:kXNWWMWX;  ~~~'';:clkXWMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMX;                                                                               :WMW0~          ~;kXWMMMMMMM0~   ~~~';;:lkKWMMMMMMMMMMMMMMMM
MMMMMMMMMMMXXXMMMWKl~                                                                             :kdc~        :k0XWMMMMMMMMMWk~    ~~;;coxXWMMMMMMMMMMMMMMMMM
MMMMMMMMMMWOooXN00WM0;~          ~;:cxO:                                                                      dWMMMMMMMMMMMMMNl~    ~~';lkXWMMMMMMMMMMMMMMMMMM
MMMMMMMMMMWkc;;~ ;XMMMXo:~    ~lKWMMMMMXkkko~        ~;c;~~                           ~~~                   ~dWMMMMMMMMMMMMMWo ~~~ ~~';lkXWMMMMMMMMMMMMMWWMMMM
MMMMMMMMMMWkc'~   'oxOONMX;':d0WMMMMMMMMMMMMNOdl;~~;:l0WMWNX0Okkkkkdl;~~';'~  ~';:coOXWWNX0kxlc:;~        ~;0WMMMMMMMMMMMWN0c  ~;~~~;;okNMMMMMMMMMMMMMMMMWMMMM
MMMMMMMMMMNd:'~        'x00NMMMMMMMMMMMMMMMMMMMMMMWMMMMMMMMMMMMMMMMMMMMWMMMNXXWMMMMMMMMMMMMMMMWW0l;':c:cdOXWMMMMMMMMMMMMM0'~   ~:~~~;dONMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMK;~~           ~dXWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNWMMMMMMMMMMMMMMMMXxxOk:    'oo~;o0NWMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMK:~              ~;:oXMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWKc~       ~;00';XMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMWO'                 ~oXWNXXNMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWWNOkkkkkkxlc;~       ~c0NK;:XMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMWO;                  ~   ~cxxddx0XWWW0cd0NWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWMMW0d:;'~                 ;dXWNx;lNMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMWKd;~                           ~~~    ~~'';lkO0NNNNWMMMMMMMMMMMWNKkdkKOo:;'';;~                    ;l0WMW0ll0WMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMWXXKx;                                         ~~ ~';;clooolcc:'~                               ;oKWMMNOodKMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMWNXXKKOdl;~                                                                                 'cdKWMWXOdoxKWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMWOlxO0K0ko:~                                                                     ~~;cx0XWMNKOxdokKWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMNK0OO0XNNX0ko:;~~                                                     ~~';;l0NWMMWX0xc'~:ONMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWXXK0Okk0KNNNX0Oxdolc:;;'~~~~~~~~~~          ~~~~~~~~';;:::lodk0XNWWMWN0xxdolcclodOXWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWXKOOkO0KKXNWWMMMMWMWWWWWWWWNNXXXXXXXXXNNWWWWWWMMMMMMMWWXX0kdlc:;;:ldOKNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWNXK000KK0OkO00KXXXNNNNNNNNNNXXKK0000OOkkxdxxkxoccloxk0KNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMWWNNNNNWNXXXKK0OkxxxkkkxxxxxxxxO0KXNWMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM
MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM

(Spoilers.)

Permalink 2 Comments

Controversy!

If you follow these things at all, you’ve probably heard by now: creationists are once again inventing a controversy where there is none, this time by pretending that it matters at all whether Dawkins’ venerable Weasel program uses locking or not, and claiming that his “unwillingness” to dig up code written in the ’80s and release it means… well, something significant.
In case you’ve forgotten what the Weasel program is, here’s a video (using a different phrase, but the same concept):



If you’re a long-time reader and that looks familiar, it’s because I’ve talked about it twice before. The experiment is simple enough that any idiot can repeat it, but of course creationists are a very special kind of idiot.
So this time, let’s walk through writing our own Weasel program and settle this once again.

I’ll be doing this in a kind of “literate Python”, because everyone understands Python and it doesn’t require compilation, so even the most technologically inept don’t have any excuse not to follow along.
If you don’t have Python installed (or aren’t sure if you do), get it here. Get the 2.6.2 one (if you aren’t sure which you need, you’ll need this one). If that’s too hard already, you shouldn’t be on the Internet in the first place.

I’ll be preceding lines of Python code with > signs, façon literate Haskell. The Python interpreter doesn’t understand this style, so I’ll also be providing a link to the final script at the end.

To recap, we’ll start with a random string composed of symbols chosen from a specific alphabet, and a target string which we’re hoping to achieve.
For randomness, we’ll be using Python’s inbuilt random module, so let’s import it.

> import random

The genetic alphabet is just “CGTA” for DNA, but ours will be a bit longer:

> alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "

Note that that includes the space at the end. We could make it even longer by including minuscules and punctuation, but it really makes no difference to the principle of the thing.
And here’s our target string:

> target = "METHINKS IT IS LIKE A WEASEL"

Of course it’s important that all symbols in the target string are also in the genetic alphabet, or we’ll never find it.

At the heart of every genetic algorithm, there’s a fitness function. In our case, this is just something that compares our string of “DNA” with the target string symbol by symbol, and just says how many symbols match. Strings with more matching symbols are obviously more like the target string, and will be selected to breed for the next generation.

> def fitness(child):
>     fit = 0
>
>     for a, b in zip(child, target):
>         if a == b:
>             fit += 1
>
>     return fit

The built-in zip function pairs up the items of the sequences it’s given, creating a pair of the first characters of the “organism” and the target string, then the second, then the third, and so on. For each pair, the loop then checks if both members of the pair are the same symbol. If they are, the fitness value is incremented by one. At the end, it’s returned.
This should be obvious.
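
If it isn’t, here’s a small sketch you can paste into an interpreter to see what zip hands the loop. The name fitness_short and the sample string "MXTHINKS" are mine and not part of the script; it should count the same thing as the fitness function above, just more compactly.

```python
target = "METHINKS IT IS LIKE A WEASEL"

def fitness_short(child):
    # zip stops at the end of the shorter sequence, so a child that is
    # shorter than the target is only compared over its own length.
    return sum(1 for a, b in zip(child, target) if a == b)

# The first three pairs zip produces for a sample child:
pairs = list(zip("MXTHINKS", target))[:3]

# Seven of the eight symbols in the sample line up with the target.
sample_fitness = fitness_short("MXTHINKS")
```

A string that matches the target completely scores 28, which is the value the main loop waits for.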

Equally important, of course, is the reproduction function. Dawkins’ Weasel strings reproduce asexually, so there’s only one parent for each child. The parent copies his entire “DNA”, and each locus has a small chance of mutating.
How high the mutation rate is isn’t that important, because the point of the Weasel program isn’t to simulate real-life life perfectly, but just to demonstrate that descent with modification is more effective than a random search. Let’s give our strings a mutation rate of 1 chance in 50 at each locus. If that seems too high, remember that our genome will be 28 loci in length, so there’ll already be a lot of generations where no mutation will happen at all.

> def reproduce(parent):
>     child = ""
>     for gene in parent:
>         child += random.choice(alphabet) if random.randint(1, 50) == 1 \
>                                          else gene
>     return child

Did you follow that? Our child starts off as an empty string, and then we iterate over the genes of the parent. There’s 1 chance in 50 that a mutation occurs, in which case we randomly select a gene from the alphabet to add to the genome. Otherwise, we use the parent’s gene.
At the end, the constructed child is returned, ready to have its fitness judged.
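
To put a number on “a lot of generations where no mutation will happen at all”, here’s a quick back-of-the-envelope sketch. The variable names are mine; the only inputs are the 28 loci, the 1-in-50 rate and the 27-symbol alphabet from above.

```python
loci = 28
rate = 1.0 / 50            # chance that a locus gets re-rolled
alphabet_size = 27

# Chance that not a single locus is re-rolled in one generation.
p_untouched = (1 - rate) ** loci

# A re-roll can land on the same symbol again (1 time in 27), so the chance
# that the child ends up identical to its parent is slightly higher.
p_identical = (1 - rate * (alphabet_size - 1) / alphabet_size) ** loci
```

Both come out a bit above one half, so more than half of all children are straight copies of their parent.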

Now that we have those two important functions, the rest of the program is straightforward. First, we construct a completely random starting point:

> parent = "".join(random.choice(alphabet) for _ in target)

Our Adam. Let’s put him and his fitness on display:

> generation = 1
> print "%d %s (%d)" % (generation, parent, fitness(parent))

The generation variable will keep track of how many generations have passed.
We’re just about ready to enter our reproduction cycle now. Every generation, our parent will have one child, and if this child is fitter than his parent, this child will be the next parent. Otherwise it is mercilessly discarded and the parent will parent the next generation as well.

> while True:
>     child = reproduce(parent)
>     generation += 1
> 
>     child_fit, parent_fit = fitness(child), fitness(parent)
> 
>     if child_fit > parent_fit:
>         parent, parent_fit = child, child_fit
> 
>     print "%d %s (%d)" % (generation, parent, parent_fit)
> 
>     if parent_fit == len(target):
>         break

As you can undoubtedly tell, this loop will exit once the fitness of the mutating string equals the length of the target; that is, when both strings are equal.
The string "METHINKS IT IS LIKE A WEASEL" is 28 symbols long. With a 27-symbol alphabet, a random search would take on average 27^28 / 2 generations to match it. That’s 5,986,257,591,281,009,894,301,370,013,358,523,552,840 generations. Even if we’re doing a trillion¹ generations a second, that would take 189 million trillion² years to finish.
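
If you’d rather check that arithmetic than take my word for it, this sketch redoes it. The variable names and the choice of a 365.25-day year are my own assumptions; everything else comes from the paragraph above.

```python
alphabet_size = 27
length = 28

# Half of all possible strings, as an exact integer (rounded down).
expected_tries = alphabet_size ** length // 2

tries_per_second = 10 ** 12                  # a short-scale trillion
seconds_per_year = 365.25 * 24 * 60 * 60     # assumption: Julian year

years = expected_tries / float(tries_per_second) / seconds_per_year

# Expressed in units of 10**18 years ("million trillion").
years_in_units_of_1e18 = years / 1e18
```

expected_tries is a 40-digit number, and years_in_units_of_1e18 lands between 189 and 190.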

Our little program will be doing a bit better than that. The final generation number will be displayed before the final generation genome, of course, but let's rub it in again just to be sure:

> print "Finished! Target reached in %d generations!" % generation

Running it a thousand times, I got an average of 8012.58 generations. Suck it, creationists. Descent with modification and selection really is faster. Just like the last time. And every other fucking time.
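
If you want to repeat that measurement yourself, something along these lines should do. It’s a sketch rather than the script I used: it repeats the same fitness and reproduce logic as above without the printing, and the names generations_needed and average_generations are made up for the occasion. Your average will differ from mine, since every run is random.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
TARGET = "METHINKS IT IS LIKE A WEASEL"

def fitness(child):
    # Number of positions where child and TARGET agree.
    return sum(1 for a, b in zip(child, TARGET) if a == b)

def reproduce(parent):
    # Each locus has 1 chance in 50 of being replaced by a random symbol.
    return "".join(
        random.choice(ALPHABET) if random.randint(1, 50) == 1 else gene
        for gene in parent
    )

def generations_needed():
    """Run one Weasel lineage quietly and return its final generation number."""
    parent = "".join(random.choice(ALPHABET) for _ in TARGET)
    parent_fit = fitness(parent)
    generation = 1
    while parent_fit < len(TARGET):
        child = reproduce(parent)
        generation += 1
        child_fit = fitness(child)
        # Only a strictly fitter child replaces its parent.
        if child_fit > parent_fit:
            parent, parent_fit = child, child_fit
    return generation

def average_generations(runs):
    """Average the final generation number over the given number of lineages."""
    return sum(generations_needed() for _ in range(runs)) / float(runs)
```

Calling average_generations(1000) repeats the experiment; expect it to take a while in pure Python.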

And as promised, the full code is here. Just save that somewhere and double-click it to run it (you'll need to chmod +x on sane platforms, but you know that).

And that really should be that. Dembski can wave giant-sleeve-clad arms about free lunches all he likes, but in the real world, not everyone is innumerate. It’s just sad that decent people have to waste time on his bullshit.
I doubt this will actually do much good (of course it won’t; even the creationists themselves (exceptionally dense specimens excepted, as usual) realise this time that there’s no controversy here, just a giant heap of time-wasting nonsense), but if nothing else, I hope I’ve demonstrated that even an elementary school student could do this. If the original code is of any interest at all, it’s for archaeological reasons, because old code is usually interesting, not because the algorithm is that fascinating or complicated.


1 Short scale. 10^12.
2 10^18. Yes, I know the proper name for that is a quintillion in the short scale and a trillion in the long scale.

Permalink 3 Comments

Lol Github

I’ve never had much use for source control, but after that one thread broke /prog/scrape I realised it might be useful to have somewhere centralised to put the source code and recent diffs, instead of just having a possibly-up-to-date file hidden here somewhere and having people download that whenever their shit breaks without really knowing if I fixed it yet.
Github was an obvious choice, not least because bandwagons are awesome, but then I lost interest and forgot about it until now.

Incidentally, when you create a new repository on Github, it automatically generates a URL for you to push to, of the form git@github.com:username/projectname.git. The project name can’t have any characters outside [a-zA-Z0-9-], though, so when it does they’re replaced with dashes.
Whoever wrote that piece of code, though, forgot that project names apparently also can’t start with a dash, so when you try to push to it, it just gives you an unhelpful error message:

Invalid repository url. Make sure you include the .git, e.g. git@github.com:defunkt/ambition.git

And then it breaks the connection, which is particularly nice when your first experience with git involves pushing a project named -prog-scrape.
Anyway, easy enough to fix. I do like that it uses ssh, so I don’t have to type my password every time. I’ll probably be creating additional repos for my many, many other projects that people find useful.
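
For the curious, the renaming behaviour described above can be mimicked in a couple of lines. I have no idea what Github’s actual code looks like, so the function name guess_repo_name and the regular expression are purely my guess at something that behaves the same way.

```python
import re

def guess_repo_name(project_name):
    # Replace every character outside [a-zA-Z0-9-] with a dash.
    return re.sub(r"[^a-zA-Z0-9-]", "-", project_name)

name = guess_repo_name("/prog/scrape")

# The leading slash has become a leading dash, which is what the push chokes on.
starts_with_dash = name.startswith("-")
```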

tl;dr: http://github.com/Cairnarvon/progscrape/tree/master

Permalink 4 Comments

Langton’s Ant

As programming exercises go, this ranks somewhere between hello world and Conway’s Game of Life, but since most implementations I’ve seen on the internets use the basic form, I thought I’d write one capable of the generalised ant.

Recall: in the basic form, Langton’s ant moves forward on a two-dimensional field, flips the tile it’s currently on (which can be black or white), and turns left or right based on the color of said tile. This gives you a surprisingly chaotic pattern which degenerates into a repetitive “highway”.
In the generalised form, the tile can be any number of colors, and each color has a turning behavior. Instead of flipping a binary tile, it cycles through the list of colors in order.
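
In case the prose is too terse, here is the generalised rule as a minimal Python sketch. It is not the Allegro program: the name run_ant, the unbounded dictionary grid, the starting heading and the exact order of turning, recolouring and moving are all assumptions of mine.

```python
# Headings in clockwise order, with y growing downward as on a screen.
DIRECTIONS = [(0, -1), (1, 0), (0, 1), (-1, 0)]   # up, right, down, left

def run_ant(pattern, steps):
    """Run one ant for a number of steps; return (grid, final position)."""
    grid = {}            # (x, y) -> colour index; missing tiles are colour 0
    x = y = 0
    heading = 0          # assumption: the ant starts out facing up
    for _ in range(steps):
        colour = grid.get((x, y), 0)
        # Turn according to the colour of the tile the ant is standing on.
        turn = 1 if pattern[colour] == "R" else -1
        heading = (heading + turn) % 4
        # Cycle the tile to the next colour in the list.
        grid[(x, y)] = (colour + 1) % len(pattern)
        # Move forward one tile.
        dx, dy = DIRECTIONS[heading]
        x, y = x + dx, y + dy
    return grid, (x, y)

grid, position = run_ant("LR", 4)
```

With the pattern "LR" the ant turns left four times in a row on a blank field, so after four steps it is back where it started with four recoloured tiles behind it. Longer patterns such as "LRRL" only change the pattern string.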

You use this by compiling it (again, you’ll need Allegro; the compile command is on the first line of the file again, and you’ll probably want to change the WIDTH and HEIGHT #defines again) and then invoking it like so:

ant LR

Where the “LR” is the pattern. LR is the basic ant, saying that it’ll turn left on the first color and right on the second. If you want more than one ant on the field at a time, specify that using the -n flag:

ant -n 3 LR

If you don’t want to start off with a black field, you can load a bitmapped image (.bmp, .pcx, .lbm, or .tga) with the -f flag. In principle you can determine the placement of the ant(s) with red pixels, but I can’t be bothered to figure out how Allegro works with palettes, so that’s not likely to work.

ant -n 3 -f penis.pcx LR

You can save a screenshot by hitting the S key while it’s running (it will always save to the same file, specified in a #define; unless you change it, that’s ant.pcx), and exit by hitting any other key.

Interesting patterns include LRRL (pictured), and patterns can be from one to fifteen tokens long (more if you add more colors yourself).
Enjoy.

Permalink Comments

Quadratic spline interpolation

You’ve had this problem before: you have a bunch of data points, and you want to interpolate between them.
For various reasons, higher order polynomial interpolation (where you try to find an nth-degree polynomial through n + 1 of your data points) can be a bad idea, so you decide that rather than using a simple equation, you’ll use a series of them to connect your data points. These equations are splines, and the simplest form of spline interpolation is just, well, connecting your data points directly:

That’s pretty ugly, though. Is there a way to achieve something like this:

instead?
Yes, obviously, and one of those ways is to use quadratic splines instead. Let’s use a simpler example, though. Suppose we only have four data points, (x0, y0) through (x3, y3):

linear spline interpolation

The black dots are our actual data points, the red lines are our linear splines. What we’d actually like, though, is this:

quadratic spline interpolation

Turns out that’s not that hard to do. As you can see, every spline is a quadratic equation, which obviously is of the form f(x) = ax² + bx + c. So each spline equation has three unknowns (a, b, and c), and there are three splines, for a total of nine unknowns (let’s call them a1 through a3 and so on).

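Since two points are known for each spline equation, that gives us the following six equations:

a1·x0² + b1·x0 + c1 = y0
a1·x1² + b1·x1 + c1 = y1
a2·x1² + b2·x1 + c2 = y1
a2·x2² + b2·x2 + c2 = y2
a3·x2² + b3·x2 + c3 = y2
a3·x3² + b3·x3 + c3 = y3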

To solve for nine unknowns, obviously we need nine equations. So what else do we know?

Well, the reason the linear spline interpolation looks like crap is because of the sharp breaks at the spline edges, so we would like our neighboring quadratic splines to have the same slope in the point that they share. In other words, if our spline equations are f, g, and h, we want the derivative of f to equal the derivative of g in the point (x1, y1), and we want the derivative of g to equal the derivative of h in the point (x2, y2).
The derivative is easy enough to find:

f′(x) = 2ax + b

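Filling in, this gives us two more equations:

2·a1·x1 + b1 = 2·a2·x1 + b2
2·a2·x2 + b2 = 2·a3·x2 + b3

Or equivalently:

2·a1·x1 + b1 − 2·a2·x1 − b2 = 0
2·a2·x2 + b2 − 2·a3·x2 − b3 = 0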

Which brings our total to eight equations. We aren’t going to squeeze another legitimate equation out of this, so let’s just fill in one of the unknowns ourselves. If we set one of the a coefficients to 0, the corresponding quadratic spline becomes a linear spline, which is fine. Let’s take a1 for simplicity’s sake.
This enables us to construct the following matrix:

The first three columns hold the a coefficients, the next three the b coefficients, the next three the c coefficients, and the final column will hold the solutions after reduction.
Filled in and solved for our particular dataset:

Which gives us the following equations for our splines:

f(x) = 1.5x + 1.5
g(x) = −1.25x² + 9x − 9.75
h(x) = 1.125x² − 14.75x + 49.625

Obviously this is a lot of work, but it’s mechanical work that doesn’t require a lot of judgement. Which is why I’ve written this Python script to do it for you. Feed it data points and it’ll produce gnuplot code to plot your splines:

$ python qsi.py < data.txt 
plot 1.000000 <= x && x <= 3.000000 ? 0.000000 * x * x + 1.500000 * x + 1.500000 :\
     3.000000 <= x && x <= 5.000000 ? -1.250000 * x * x + 9.000000 * x + -9.750000 :\
     5.000000 <= x && x <= 9.000000 ? 1.125000 * x * x + -14.750000 * x + 49.625000 : 0/0 notitle

As you can tell, it’s not necessarily gorgeous, but it (probably) works, and it’s not like anyone has to see the code itself.
Format for the input file is as you’d expect: two numbers per line, first being x and second y, sorted. If gnuplot’s output is jagged, increase the sampling (set sample 1000).
And if it doesn’t work, fix it.
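
For reference, the matrix approach described above can be sketched in a few dozen lines. This is not qsi.py: the function name quadratic_splines, the use of Fraction for exact arithmetic and the plain Gauss-Jordan reduction are my own choices for illustration (Python 3 assumed), and the four sample points are back-computed from the gnuplot output above rather than read from data.txt.

```python
from fractions import Fraction

def quadratic_splines(points):
    """Return one (a, b, c) triple per interval between sorted, distinct points."""
    m = len(points) - 1          # number of splines
    n = 3 * m                    # number of unknowns
    rows = []

    def add_row(entries, rhs):
        # Columns: a's first, then b's, then c's; last column is the right-hand side.
        row = [Fraction(0)] * (n + 1)
        for col, value in entries:
            row[col] = Fraction(value)
        row[n] = Fraction(rhs)
        rows.append(row)

    # Each spline passes through both of its end points.
    for k in range(m):
        for x, y in (points[k], points[k + 1]):
            add_row([(k, x * x), (m + k, x), (2 * m + k, 1)], y)
    # Neighbouring splines have the same slope in the point they share.
    for k in range(1, m):
        x = points[k][0]
        add_row([(k - 1, 2 * x), (m + k - 1, 1), (k, -2 * x), (m + k, -1)], 0)
    # The free choice: the first spline is a straight line.
    add_row([(0, 1)], 0)

    # Gauss-Jordan reduction of the augmented matrix.
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        for i in range(n):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [v - factor * w for v, w in zip(rows[i], rows[col])]

    solution = [row[n] for row in rows]
    return [(solution[k], solution[m + k], solution[2 * m + k]) for k in range(m)]

splines = quadratic_splines([(1, 3), (3, 6), (5, 4), (9, 8)])
```

For those four points, splines should hold the same coefficients as the gnuplot output above, as exact fractions.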

Edit: In light of overwhelming demand, this is a script that interpolates using a higher-order polynomial, as mentioned above. Here’s how the approaches compare for our sample dataset:

This script will fail if you only have one datapoint and its x value is 0, but everything else should work.
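
If you just want to see what a single higher-order polynomial through all the points does, the Lagrange form is about the shortest way to evaluate one. This is a sketch of that idea and not the script linked above; the function name lagrange is mine, and the sample points are again back-computed from the gnuplot output rather than taken from the original data file.

```python
def lagrange(points, x):
    """Evaluate at x the polynomial of degree len(points) - 1 through all points."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Basis polynomial i is 1 at xi and 0 at every other data point.
        term = float(yi)
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / float(xi - xj)
        total += term
    return total

points = [(1, 3), (3, 6), (5, 4), (9, 8)]
value_between = lagrange(points, 7)
```

The x values have to be distinct, or the division blows up.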

Permalink 9 Comments

Bézier curves are pretty

If you’ve ever used a vector drawing program you’ve probably come across them. It turns out they’re conceptually a lot simpler than I expected them to be.
Wikipedia has great pictures that should be self-explanatory:

This is a linear “curve”, with only two guide points:

A quadratic curve has three guide points:

A cubic curve has four:

And a quartic curve has five:

Most vector graphic applications only go up to cubic, and represent more complicated curves as grafts of simpler ones.
My last exam was yesterday, though, so I had some free time, and I’ve been playing with Allegro recently, so I thought I’d write something that draws Bézier curves of arbitrary complexity. Behold.

First line is how you compile it on a typical system. You’ll need Allegro, obviously, which for Debian/Ubanto users is the liballegro-dev package. Others can get it here.

The guide points it uses are passed as command line arguments, with the first argument being the x coordinate of the first point, the second being the y coordinate of the first point, the third being the x coordinate of the second point, &c.
The origin (0, 0) is at the top left of the screen.

./bezier 10 10 50 320 310 230 200 10

This should give you something like:

The program will pause after drawing your curve, until you press a key. If that key is s, it will save the screen to a .pcx file, the name of which you can change in the #defines (default bezier.pcx; if you don’t like .pcx, Allegro also supports .bmp and .tga, and will determine file type based on the extension).

Other things you can customise should mostly be obvious. If LINES is 0 (or undefined), it will just draw pixels instead of trying to connect points with lines. If GUIDES is 1 it will mark the guide points in the color specified by GUIDECOL. GUIDECOL, FOREGROUND, and BACKGROUND are just RGB values in the range 0-255.
WIDTH and HEIGHT are the dimensions of the drawing field. This doesn’t have to be your resolution, but probably shouldn’t be higher. If you want a 100×100 image, 100 and 100 are perfectly legal values.
GRAIN is how often divide() recurses while trying to divide lines into sections. Higher values should give more accurate representations, but usually aren’t needed. STEPS is how many points this will actually give you. Don’t touch STEPS.
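
If you’d rather poke at the idea without installing Allegro, the repeated line division shown in the Wikipedia animations (de Casteljau’s construction) fits in a few lines of Python. This is a sketch of the general technique, not a port of the C program: bezier_point and bezier_curve are names I made up, and the program’s divide(), GRAIN and STEPS may well work differently.

```python
def bezier_point(guides, t):
    """Point at parameter t (0 to 1) on the Bezier curve with these guide points."""
    points = [(float(x), float(y)) for x, y in guides]
    while len(points) > 1:
        # Replace each neighbouring pair of points with the point a
        # fraction t of the way along the line between them.
        points = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(points, points[1:])
        ]
    return points[0]

def bezier_curve(guides, steps):
    """Sample the curve at steps + 1 evenly spaced parameter values."""
    return [bezier_point(guides, i / float(steps)) for i in range(steps + 1)]

# The guide points from the example invocation above.
guides = [(10, 10), (50, 320), (310, 230), (200, 10)]
curve = bezier_curve(guides, 100)
```

The curve starts at the first guide point and ends at the last one; the points in between only pull it towards them. Any number of guide points works, which is all “arbitrary complexity” means here.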

This isn’t hugely interesting, but it’s a nice enough toy that I thought I’d share it.

Permalink 3 Comments

$ php -r "echo 0.15-0.05;"
0.0:

The actual result is 0.09999999999999999167332731531132594, courtesy of IEEE, but since PHP is user-friendly, it rounds before display.
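
You can check the IEEE part yourself in Python, which uses the same double-precision floats. This is only a sketch of the arithmetic and says nothing about PHP’s internals; the one thing I’ll note about the colon is that it is the character right after '9' in ASCII.

```python
from decimal import Decimal

result = 0.15 - 0.05

# Decimal(float) shows the exact binary value that is actually stored.
exact = Decimal(result)

# The stored value is not the same double as 0.1.
is_a_tenth = (result == 0.1)

# In ASCII, the character after '9' is ':'.
after_nine = chr(ord("9") + 1)
```

str(exact) starts with the digits quoted above and then carries on for a while.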

ASCII round

Permalink 11 Comments