Friday, January 20, 2012

Math-ive attack

Warning: this post is part of my quest to find increasingly ridiculous excuses to talk about math.  But first, some music.



When I listen to Massive Attack's song "Future Proof", what stands out to me is the beeping that continues through the entire song.  Most of the time, the beeps cycle quasi-randomly among three pitches in some kind of inscrutable arpeggio.

So the question is, does the sequence of pitches actually come from a random number generator, or does it just sound random because it goes by so quickly?

I wouldn't put it past artists to use randomness in their music.  I swear, some of the other bands I listen to must compose lyrics by pulling words out of a hat.  And if Wikipedia is to be believed, there is a long history of using chance in musical compositions, and the practice is known as aleatoric music.

On the other hand, humans are terrible at generating and recognizing randomness.  Maybe the composer wrote a sequence of pitches that seemed random to them, but which does not resemble a typical randomly generated sequence.  Or maybe it's not even meant to sound random, but sounds random anyway because humans are so terrible at recognizing it.

Perhaps this can be cleared up if I transcribe a small segment of the music.  The following represents the sequence of pitches starting at 3:35, until the chord change at 3:41:

3223 2132 1232 1132 1232 1323 2123 2132

(Note: If a note is of double length, I'm transcribing it as a repeated note.  I can't guarantee that my transcription is free of errors.)

Now that we have a transcription, it's an open-and-shut case.  This sequence was probably not created by a random number generator.  If it were randomly generated, you would expect about a third of the numbers (give or take a few) to be followed by a duplicate.  In other words, there should be a lot more double-length notes, and probably even some triple-length notes.  Here is a randomly generated sequence for comparison:

1131 2131 3133 3121 3133 1112 3311 1213

Observe that there are three doubles and three triples.  Massive Attack's sequence, on the other hand, only has two doubles.
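If you'd rather not count by eye, here is a minimal Python sketch that tallies the runs in both sequences.  The names (MASSIVE_ATTACK, RANDOM_SAMPLE, run_lengths, summarize) are just labels I made up for illustration, and the first sequence is only as reliable as my transcription.

```python
from itertools import groupby

# Both sequences copied from the post, with the spaces removed.
MASSIVE_ATTACK = "3223 2132 1232 1132 1232 1323 2123 2132".replace(" ", "")
RANDOM_SAMPLE = "1131 2131 3133 3121 3133 1112 3311 1213".replace(" ", "")


def run_lengths(seq):
    """Return the length of each maximal run of identical symbols."""
    return [len(list(group)) for _, group in groupby(seq)]


def summarize(seq):
    """Count doubles, triples, and notes that repeat the previous note."""
    runs = run_lengths(seq)
    return {
        "doubles": sum(1 for r in runs if r == 2),
        "triples": sum(1 for r in runs if r == 3),
        # Positions where a note equals the one immediately before it.
        "adjacent_repeats": sum(a == b for a, b in zip(seq, seq[1:])),
    }


print("Massive Attack:", summarize(MASSIVE_ATTACK))
print("Random sample: ", summarize(RANDOM_SAMPLE))
```

The "adjacent_repeats" entry is the number to compare against the "about a third" rule of thumb: with 32 notes there are 31 places where a repeat could happen.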

You might also observe that the randomly generated sequence has sixteen 1s, four 2s, and twelve 3s.  This may seem strange, since you would expect each digit to appear about ten or eleven times.  But that's just the way randomness goes sometimes.  Then again, if we accept that randomness sometimes produces outliers, shouldn't we also accept the possibility that randomness produced Massive Attack's sequence even though it includes only two doubles?

The answer is that we are using trickier reasoning here than it may first appear.  A true random number generator is equally likely to produce any sequence of 32 digits.  A random number generator is no more likely to produce the sequence I showed than it is to produce Massive Attack's sequence.  Our reasoning really has to do with what sequences are most likely to be produced by a human.

And the thing is, we don't really know the probability that a human will generate any given sequence.  There are 3^32 possible sequences (roughly 1.85 quadrillion), and it's just not possible to collect statistics on that many.  So the first thing we do is classify those sequences by some simple property.*  For example, I chose to classify the sequences by the number of doubles and triples.  The idea is that instead of 3^32 different sequences, we just keep track of the set of sequences with one double, the set of sequences with two doubles, and so forth.  It's (somewhat) well-known that when humans try to imitate random number generators, they tend to underestimate the typical frequency of doubles and triples.  So if a sequence has relatively few doubles and triples, that tends to support the hypothesis that it was a human imitating randomness.

*This is much like the way we group microstates together into macrostates in order to define entropy. [/physics]
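To make the grouping idea concrete, here is a rough Python sketch that counts how many sequences land in each class.  One simplification is mine, not part of the argument above: instead of tracking doubles and triples separately, it classifies by the number of adjacent repeats (a double counts as one repeat, a triple as two).  The function names are hypothetical.

```python
from itertools import product
from math import comb


def count_with_repeats(n, k, symbols=3):
    """Number of length-n sequences with exactly k adjacent repeats."""
    # Pick the first symbol freely, choose which k of the n-1 steps repeat,
    # and let every other step move to one of the remaining symbols.
    return symbols * comb(n - 1, k) * (symbols - 1) ** (n - 1 - k)


def brute_force(n, k, symbols=3):
    """Same count by listing every sequence (only feasible for small n)."""
    return sum(
        1
        for seq in product(range(symbols), repeat=n)
        if sum(a == b for a, b in zip(seq, seq[1:])) == k
    )


# Sanity check of the formula against direct enumeration for short sequences.
assert all(count_with_repeats(6, k) == brute_force(6, k) for k in range(6))

total = 3 ** 32
at_most_two = sum(count_with_repeats(32, k) for k in range(3))
print(f"total sequences: {total}")
print(f"fraction with at most 2 repeats: {at_most_two / total:.2e}")
```

The fraction comes out well under one percent, so the class "two or fewer repeats" is a very small slice of all possible sequences, even though every individual sequence in it is exactly as likely as any other.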

Note that we can come up with more hypotheses to explain the sequence.  For example, perhaps they used a random number generator, but ignored most repeats.  Or they could have used a random number generator in some other way.  This would be difficult to disprove.  However, I believe in a fourth hypothesis, which is that it's meant to sound wandering and mysterious, but is not meant to imitate randomness.  I observe that the transcribed sequence has three copies of the sequence 3212321, which is the kind of pattern that seems very unlikely to be produced by a random number generator, but much less unlikely to appear in deliberately composed music.
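For anyone who wants to check the repeated pattern, here is a small Python sketch that finds every occurrence of 3212321 in my transcription, overlaps included.  The helper find_all is a made-up name.  The last line is a rough yardstick only: it gives the expected number of times one fixed seven-note pattern shows up in a uniformly random 32-note sequence, and it ignores the fact that I picked the pattern after looking at the data.

```python
def find_all(seq, pattern):
    """Return the 0-based start index of every (possibly overlapping) match."""
    positions = []
    start = seq.find(pattern)
    while start != -1:
        positions.append(start)
        start = seq.find(pattern, start + 1)
    return positions


transcription = "3223 2132 1232 1132 1232 1323 2123 2132".replace(" ", "")
pattern = "3212321"

matches = find_all(transcription, pattern)
print("start positions:", matches)
print("number of copies:", len(matches))

# Expected count of one fixed pattern in a uniformly random sequence:
# (number of possible start positions) * (chance of a match at each one).
windows = len(transcription) - len(pattern) + 1
expected = windows / 3 ** len(pattern)
print(f"expected copies in a random sequence: {expected:.4f}")
```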