Thoughts about Memex
Shlomo Dubnov, with afterword by F. Richard Moore
Abstract
These notes describe the ideas and the algorithms that were used to create
“Memex”, a computer composition for violin. The work
consists of a recombination of phrases by
Bach, Mozart and Beethoven using ideas from universal coding and machine
learning, as explained in the notes. It also represents an approach to music
modeling as an information source, which opens new possibilities for style
learning, mixing and experimenting with various music-listener relations based
on memories, expectations and surprises.
The Music: Click here (mp3)
Memex is a computer artifact, a composition resulting from mathematical
operations on a database of musical works that was designed and created by the
author. The name of the piece comes from a 1945 article by Vannevar Bush [1],
in which he described a futuristic device “in
which an individual stores all his books, records, and communications, and
which is mechanized so that it may be consulted with exceeding speed and
flexibility... When the user is building a trail, he names it, inserts the name
in his code book, and taps it out on his keyboard. Before him are the two items
to be joined, projected onto adjacent viewing positions. At the bottom of each
there are a number of blank code spaces, and a pointer is set to indicate one
of these on each item. The user taps a single key, and the items are
permanently joined”.
The idea of building memory trails, joining information and deriving new
meanings is designed into the composition Memex in a very formal and
algorithmically precise manner. What we hear is new music, where every note
belongs to one of the great masters: Bach, Mozart or Beethoven. Works
by these composers were analyzed using the information processing algorithms
described below, creating an automaton that can travel across the web of
musical associations, leaving a trail of memories, expectations and surprises.
The work is provocative, intended to leave the listener perplexed conceptually,
aesthetically, and maybe emotionally.
Experiments with music models using information-theoretic (IT) methods (usually
called musical style learning) are now almost a decade old. In many respects,
these works build upon a long musical tradition of statistical modeling that
began in the 1950s with Hiller and Isaacson's "Illiac Suite" [2] and with the
French composer / mathematician / architect Xenakis [3], who used Markov chains
and stochastic processes. My experiments with machine learning of musical style
began with a simple mutual-source algorithm suggested by El-Yaniv et al. [4],
which was made to jump between different musical sources looking for the
longest matching suffix, effectively creating a new source that is closest in
terms of cross-entropy to the original musical sequences. The next step, taken
with Assayag et al. [6], was to learn musical works using compression
algorithms, specifically the Lempel-Ziv [7] incremental parsing (IP) algorithm
for building a context dictionary, with probability assignment as suggested by
the universal prediction of Feder and Merhav [8]. New music was generated by
performing a random walk on the phrase dictionary with appropriate
probabilities for the continuations.
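The incremental parsing idea can be illustrated with a minimal Python sketch.
It is not the code used in [6]; the function names, the `max_context`
parameter and the use of plain continuation counts as probabilities are
assumptions made only for this illustration. The input is parsed into
Lempel-Ziv phrases (each phrase is the shortest prefix not yet in the
dictionary), the symbols that followed each context are counted, and a random
walk then samples continuations of the longest known recent context.

```python
import random
from collections import defaultdict


def incremental_parse(seq):
    """LZ78-style parse: return the phrase list and continuation counts."""
    phrases, seen = [], set()
    counts = defaultdict(lambda: defaultdict(int))  # context -> symbol -> count
    phrase = ()
    for sym in seq:
        counts[phrase][sym] += 1          # sym continued the current context
        phrase = phrase + (sym,)
        if phrase not in seen:            # shortest unseen prefix: new phrase
            seen.add(phrase)
            phrases.append(phrase)
            phrase = ()                   # restart from the empty context
    return phrases, {ctx: dict(d) for ctx, d in counts.items()}


def generate(counts, length, max_context=8, seed=0):
    """Random walk: sample the next symbol from the longest known context."""
    if not counts:
        raise ValueError("cannot generate from an empty model")
    rng = random.Random(seed)
    out = []
    while len(out) < length:
        # Try the longest recent suffix first; the empty context always exists.
        for k in range(min(max_context, len(out)), -1, -1):
            ctx = tuple(out[len(out) - k:])
            if ctx in counts:
                break
        symbols = list(counts[ctx])
        weights = [counts[ctx][s] for s in symbols]
        out.append(rng.choices(symbols, weights=weights)[0])
    return out


phrases, counts = incremental_parse("abababa")
print(phrases)                 # the parsed phrase dictionary
print(generate(counts, 12))    # a new sequence over the same symbols
```

In a musical setting the symbols would be note or chord events instead of
letters; the letters here only keep the example short.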
These works achieved surprisingly credible musical results in
terms of style imitation. Some informal testing suggested that people could not
distinguish between real and computer improvisation for about 30-40 seconds.
This was important for showing that major aspects of music can be captured
without explicit coding of musical rules or knowledge. Additional experiments
were done using the Probabilistic Suffix Tree (PST) machine learning method of
Ron et al. [9], trying to improve on the "generalization"
capabilities of the statistical models at the cost of some extra "false
notes" resulting from "lossy compression".
Memex presents a new approach [10] that uses the Factor Oracle (FO) of
Allauzen et al. [11] to generate new music from examples. FO is an automaton
that is functionally equivalent to a suffix tree, but with far fewer nodes.
Compared with IP and PST trees, which discard substrings, FO is preferred
because it can be built quickly and, like the suffix tree, it encodes all
possible substrings. One of the main properties of FO is that it indexes the
sequence in such a way that, at every point along the data, it keeps pointers
to future continuations of the most recent suffixes that appeared at that
place. By "recent suffixes" we mean suffixes that occur for the first time
when a new symbol is observed. Since FO is constructed online, all
"previously seen" suffixes have already been detected earlier in the sequence.
So, at every point along the sequence, FO provides pointers to continuations
of the most recent suffixes and a pointer back to the longest repeating
suffix. This way, we can either jump into the “future” based on the most
recent past, or go back to an earlier past to look for continuations of
previously encountered suffixes (i.e. suffixes of shorter prefixes), and so
on. So, instead of considering the best context with log-loss "gambling" on
the next note, the new method operates by "forgetting" and by selectively
choosing historical precedents for deciding about the future.
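The online construction described above can be sketched in Python as follows.
This follows the standard incremental algorithm of Allauzen et al. [11] as I
understand it. The names `build_factor_oracle`, `trans`, `sfx` and `accepts`
are chosen for this illustration and do not come from the original software.
Each state keeps its forward links (pointers to continuations) and a single
suffix link (the pointer back to the longest repeating suffix).

```python
def build_factor_oracle(seq):
    """Build a factor oracle online, one symbol at a time.

    Returns (trans, sfx): trans[i] maps a symbol to the state it leads to
    (forward links), and sfx[i] is the suffix link of state i (-1 for the
    initial state 0).
    """
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]
    sfx = [-1] * (n + 1)
    for i, sym in enumerate(seq):
        new = i + 1
        trans[i][sym] = new                 # link along the original sequence
        k = sfx[i]
        # Walk back along suffix links, adding forward links for contexts
        # that have not yet been followed by this symbol.
        while k > -1 and sym not in trans[k]:
            trans[k][sym] = new
            k = sfx[k]
        # The suffix link points to where the longest repeated suffix ended.
        sfx[new] = 0 if k == -1 else trans[k][sym]
    return trans, sfx


def accepts(trans, word):
    """Return True if the oracle can read `word` starting from state 0."""
    state = 0
    for sym in word:
        if sym not in trans[state]:
            return False
        state = trans[state][sym]
    return True


trans, sfx = build_factor_oracle("abab")
print(sfx)        # suffix link of every state
print(trans[0])   # forward links leaving the initial state
```

The oracle has exactly one state per symbol plus an initial state, and it
reads every substring of the input. It may also read a few sequences that are
not substrings, which is part of what lets a random walk recombine material.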
The piece Memex for violin is created by such a "random
walk" over an FO that was constructed from a collection of works by Bach,
Mozart and Beethoven. Prior to construction of the FO, the music material was
analyzed over short time spans to construct a set of events (individual notes,
or simultaneous notes and chords, become symbols in a new sequence). This is
needed to represent polyphony (to account for simultaneous notes) and to deal
with invariance and possible symmetries. At each generation step the algorithm
randomly chooses either to continue to the next state, advancing along the
original sequence (in this piece with probability 0.87), or to jump back along
the suffix link (with probability 0.13) and follow any forward link from
there. As explained above, this procedure
effectively uses the longest repeating suffix of the sequence to perform
transitions to a new place where continuation of this suffix can be found.
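The generation step can be sketched in Python as below. This is a hedged
illustration, not the program that produced the piece: the function names, the
`seed` parameter and the toy letter sequence are assumptions, and uniform
choice among forward links is one possible reading of "any forward link". Only
the probabilities 0.87 and 0.13 come from the text. The sketch rebuilds a small
factor oracle so that it runs on its own.

```python
import random


def build_factor_oracle(seq):
    """Forward links (trans) and suffix links (sfx) of a factor oracle."""
    trans = [dict() for _ in range(len(seq) + 1)]
    sfx = [-1] * (len(seq) + 1)
    for i, sym in enumerate(seq):
        trans[i][sym] = i + 1
        k = sfx[i]
        while k > -1 and sym not in trans[k]:
            trans[k][sym] = i + 1
            k = sfx[k]
        sfx[i + 1] = 0 if k == -1 else trans[k][sym]
    return trans, sfx


def generate(seq, length, p_continue=0.87, seed=0):
    """Random walk over the oracle built from `seq`.

    With probability p_continue, copy the next event of the source; otherwise
    jump back along the suffix link and leave by one of the forward links
    found there (chosen uniformly, an assumption of this sketch).
    """
    if not seq:
        raise ValueError("source sequence must not be empty")
    trans, sfx = build_factor_oracle(seq)
    rng = random.Random(seed)
    state, out = 0, []
    while len(out) < length:
        can_advance = state < len(seq)      # not yet at the end of the source
        can_jump = sfx[state] >= 0          # state 0 has no suffix link
        if can_advance and (not can_jump or rng.random() < p_continue):
            out.append(seq[state])          # advance along the original
            state += 1
        else:
            k = sfx[state]                  # recall the longest repeated suffix
            sym = rng.choice(list(trans[k]))
            out.append(sym)
            state = trans[k][sym]           # continue from a different place
    return out


events = list("abcabdabcabe")   # stand-in for a sequence of musical events
print("".join(generate(events, 30)))
```

Raising `p_continue` makes the output quote longer stretches of the source,
while lowering it produces more frequent recombination. Every emitted symbol
is still taken from the source sequence.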
Music, in its pure form, is devoid of symbolism, denotation or concrete
meanings, which makes it a powerful “probe” into higher functions of our mind.
In terms of information-theoretic modeling, this research goes beyond modeling
and recreation of the source entropy. Considering music-listener relations as
an information channel opens new ways of defining musical anticipation and
memory and their relation to human cognitive responses.
In this sense, Memex can be used as a tool for gaining new insights into
music theory and musical perception, raising some interesting questions
about what composing and listening actually mean: What is the style of the
piece? What is its form, its story, its meaning? If “controlling” the automaton
amounts to varying anticipations and memories, does this lead to new insights
about the play of cognition, creativity, or new avenues for art making? How is
listener experience related to previous training on related musical examples?
Where does the free will of the composer / artist / creator end and the
self-reproduction of culture begin?
Richard Moore, a computer music professor at UCSD, wrote about the piece:
“Have you ever had a
lucid dream? While not exactly common, lucid dreams are ones in which the
dreamer somehow becomes aware that the experience-in-progress is a dream. Once
you know you’re dreaming (I have occasionally had this experience), you can
relax. Sometimes, lucid dreamers just wake up. However, they can sometimes
elect to continue the dream, exercising various levels of influence over what
is going on. One can elect to fly, to fulfill sexual fantasies, to explore
death, or life in other dimensions. Fantasy becomes the ruler of experience.
Exactly what many people want out of life.
If one were to elect to hear
music in a lucid dream, what would it sound like? Clearly, any such music would
not be constrained by rules, such as those of radio stations, music theory,
gravity, or social convention. Whatever such fantastic music might be based on,
it is hard to imagine any sense in which it would not be based on memory. If
necessity is the mother of invention, then memory is its father, for how could
anything appear in the mind that is not the product of (possibly rearranged)
memory?
Besides memory, there is an
additional source of creativity, described by many people, perhaps most
famously by Leonardo da Vinci. He is reputed to have used a technique of
staring at stains on walls, or patterns in mud, or splatters of paint, to see
what they might suggest. Any child who has found rabbits or ships in the sky
while staring at clouds has done the same thing. Japanese artists suggest
tigers and rivers and billowing drapes with but a few brushstrokes. The human
mind has a powerful penchant for inference. Mostly, this capacity is used to
make “sense” of the sensorial world: we see, hear, feel, taste, or smell, and
almost immediately interpret. Once we’ve inferred the rabbit in the
cloud, it becomes difficult not to see it there, even when we remember
that it’s “just a cloud.” Such inference is very fast, faster than the speed
of thought, especially logical thought. It is not hard to imagine how those of
our predecessors who quickly inferred the saber-toothed tiger behind the bush
from a few flashes of light would have more likely survived to become our
ancestors.
No one yet knows what sleep
is, nor why we do it, nor why we dream, but I have a theory about the last,
which others have corroborated. Whatever else happens during sleep, the body
shuts down in certain ways. In particular, sensory input seems to be greatly
attenuated, though not entirely shut off (thus, we can still be awakened by a
sudden crash of thunder). The brain, freed from most sensory input, doggedly
continues to interpret what is going on. That which is interpreted is somewhat
unclear, but it seems that it is chaotic (that is, greatly affected in
unpredictable ways by tiny changes in both external and internal stimuli). The
information that comes into the brain during sleep seems both random and
complex, which allows it to be characterized stochastically, as with the
heat-dance of molecules in a warm fluid (from which Einstein established the
existence of atoms). Even random information is subject to the brain’s
“interpreter,” which apparently never sleeps. The result is dreams, which
(according to my theory) are the brain’s interpretations of chaotically
appearing snippets of memories combined with nearly nonexistent, random sensory
inputs. Technically, a random signal is noise. Thus, the food of dreams is
memory spiced with noise.
Could we explore the world of
music that might be intentfully invoked in lucid dreams? One way would be to
enhance our ability to dream lucidly. Some people “practice” lucid dreaming by
various methods, and report varying degrees of success. Others, apparently,
never dream lucidly. Your mileage may vary.
A computer scientist might use
another method. Compared with brains, computers are fairly primitive devices.
Even to the limited extents that we understand them, the memory and processing
capacities of computers and brains still differ by many orders of magnitude
(though some researchers have pointed out that computers are growing in
capacity at a rate much greater than human brains). The most capable current
supercomputers have capacities measured in impressive units like teraflops and
petabytes. Might it be possible to explore musical memory in a way similar to
lucid dreaming on a computer by assembling fleeting “snippets” taken from one
or more sections of the vast domain of musical literature according to
stochastic (i.e., random) methods?
The answer is yes. Without going into technical details, this is the essence of
a method used in, what? assembling? composing? extricating? snippetizing?
dreaming? music for violin solo by Shlomo Dubnov, a music professor with a
background in computer science at UCSD. Dubnov’s recent composition Memex,
performed recently by UCSD violinist János Négyesy, is based on
recollections of detailed musical moments taken from the violin literature of
Bach, Mozart and Beethoven (and presumably—by extension—anyone). The music
retains a familiar quality, even though it is obviously previously unheard. It
is not like music composed by a student attempting to imitate the style of one
of these composers. It is the original music, presented in a way completely
unheard-as-yet. It would never be mistaken for Bach, or Mozart, or Beethoven,
yet every note was, in some ultimate sense, written by these composers.
A related technique has been
used by another music professor with a background in computer science: David
Cope at UCSC has produced a CD entitled Bach by Design, in which Bach’s
music is used as a database for “deriving” additional music “by” Bach (even
though Bach never wrote it). Cope also based other “derived” music on the works
of other composers, with varying degrees of verisimilitude. His stated
motivation for such work is the desire to hear more music from composers of the
past that he has known and loved—more, even than they wrote!
Cope is clearly attempting to
capture the essence of the musical style of various composers of the past,
while Dubnov is attempting something different. Dubnov’s “lucid music” touches
on something essential about the musical nature of mind, of intelligence, of
consciousness itself. It is not about producing more violin pieces by past
composers. It is a musical tool for the exploration of mind, and its boundless
ability to fail to interpret.”
Acknowledgment
The piece is written for and dedicated to János Négyesy, whose enthusiasm for
experimental art is unceasing and
whose intellectual curiosity inspired this work.
References
[1] Bush, V., "As We May Think", in the Atlantic Monthly, July 1945.
[2] Hiller, L. A. and L. M. Isaacson. “Experimental Music: Composition With An Electronic Computer”, New York: McGraw Hill, 1959
[3] Xenakis, I. “Formalized Music: Thought and Mathematics in Composition”, Indiana University Press, 1971
[4] El-Yaniv, R., S. Fine and N. Tishby, "Agnostic Classification of Markovian Sequences", in Advances in Neural Information Processing Systems, Vol. 10, 1998
[5] Dubnov, S., “Stylistic Randomness: About Composing NTrope Suite”, Organised Sound: Vol. 4, no. 2. Cambridge: Cambridge University Press: 87-92, 1999
[6] Dubnov, S., G. Assayag, O. Lartillot, and G. Bejerano, "Using Machine-Learning Methods for Musical Style Modeling", IEEE Computer, 36 (10), pp. 73-80, Oct. 2003.
[7] Ziv J., and A. Lempel, “Compression of Individual Sequences via Variable Rate Coding,” IEEE Trans. Information Theory, vol. 24, no. 5, 1978, pp. 530-536.
[8] Feder, M., N. Merhav, and M. Gutman, “Universal Prediction of Individual Sequences,” IEEE Trans. Information Theory, vol. 38, 1992, pp. 1258-1270.
[9] Ron, D., Y. Singer, and N. Tishby, “The Power of Amnesia: Learning Probabilistic Automata with Variable Memory Length,” Machine Learning, vol. 25, 1996, pp. 117-149.
[10] Assayag, G. and S. Dubnov, “Using Factor Oracles for Machine Improvisation”, Soft Computing 8, ISSN 1432-7643, September 2004
[11] Allauzen C, Crochemore M, Raffinot M, “Factor oracle: a new structure for pattern matching”, in Proceedings of SOFSEM’99, Theory and Practice of Informatics, J. Pavelka, G. Tel and M. Bartosek ed., Milovy, Czech Republic, Lecture Notes in Computer Science pp. 291–306, Springer-Verlag, Berlin, 1999.