Overview

Overview (of the Overview)

This page attempts to provide a bird's-eye view of what Monolith is and what it does. Things start with a very high-level description, then build back up to it from the lowest-level principles covered within the scope of Monolith.

The Tippy Top

Monolith is the name of the software that I (Paul) use to create computer-generated sound and music.

Monolith is a live-coding environment for sound that is typically spawned inside of the Emacs text editor. In this setup, I write code (a dialect of Scheme), which is sent to Monolith to tell it to create sound.

In addition to code, I also use a number of hardware interfaces with my setup: most notably the monome grid and arc, and sometimes the Griffin PowerMate USB, which I refer to as the 'griffin'. These are tightly integrated with Monolith, and are an important aspect of Monolith's design.

In some words: music, computers, real-time, live-coding, interfaces.

The Bottom of the Monolith: Soundpipe

The bottom layer of Monolith is a library called Soundpipe.

Right before it reaches your speakers and comes out as sound waves, sound is a bunch of discrete numbers. These numbers are produced using something known as digital signal processing, or DSP.

The majority of DSP is handled with Soundpipe, a library written in portable C code.

Because it's written in C, it's very hard to be creative in Soundpipe. For every musical thought, several lines of C code need to be written. This takes time to do, too much time.
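
To give a sense of scale, here is a minimal sketch of what that looks like: rendering one second of a 440Hz sine tone with Soundpipe. It follows the sp_* conventions used throughout the Soundpipe codebase, but treat the exact signatures here as an approximation rather than a reference.

    #include <stdlib.h>
    #include "soundpipe.h"

    typedef struct {
        sp_osc *osc;
        sp_ftbl *ft;
    } sine_ud;

    /* called once per sample by sp_process */
    static void compute(sp_data *sp, void *ud) {
        sine_ud *d = ud;
        SPFLOAT out = 0;
        sp_osc_compute(sp, d->osc, NULL, &out);
        sp->out[0] = out;
    }

    int main(void) {
        sp_data *sp;
        sine_ud d;

        sp_create(&sp);
        sp_ftbl_create(sp, &d.ft, 2048);
        sp_gen_sine(sp, d.ft);           /* one cycle of a sine in a table */
        sp_osc_create(&d.osc);
        sp_osc_init(sp, d.osc, d.ft, 0);
        d.osc->freq = 440;

        sp->len = sp->sr;                /* one second of samples */
        sp_process(sp, &d, compute);     /* renders to a WAV file on disk */

        sp_osc_destroy(&d.osc);
        sp_ftbl_destroy(&d.ft);
        sp_destroy(&sp);
        return 0;
    }

That is a lot of ceremony for "play me a sine tone", which is exactly the problem the layers above Soundpipe exist to solve.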

To fix this, a few constructs are built on top of Soundpipe. Historically, this was a language called Sporth. Spiritually, Sporth lives on in two components: Patchwerk and Runt.

Making structures with Patchwerk

Patchwerk is another C library used on top of Soundpipe. It solves one problem: how to connect self-contained DSP algorithms (like the ones found in Soundpipe).

Patchwerk constructs what electronic musicians would call a patch, and computer scientists would call a directed audio graph. A patch is made up of small building blocks of sound called nodes, and nodes talk to one another using virtual cables. There's a bit more, but that's the gist.
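
The node-and-cable idea can be pictured with a toy sketch. The names below are made up for illustration and are not the actual Patchwerk C API; the real thing also deals with block-based processing, memory pools, and stack-based patch construction.

    #include <math.h>
    #include <stdio.h>

    /* Hypothetical illustration of a patch: nodes connected by cables.
     * This is NOT the Patchwerk API, just the underlying idea. */

    typedef struct {
        float value;             /* one sample flowing through the cable */
    } cable;

    typedef struct node node;
    struct node {
        void (*compute)(node *); /* DSP kernel for this node */
        cable *in;
        cable *out;
        float state;             /* e.g. oscillator phase */
    };

    static void sine_compute(node *n) {
        n->out->value = sinf(n->state);
        n->state += 6.283185f * 440.f / 44100.f;
    }

    static void gain_compute(node *n) {
        n->out->value = n->in->value * 0.5f;
    }

    int main(void) {
        cable c1 = {0}, c2 = {0};
        node sine = {sine_compute, NULL, &c1, 0};
        node gain = {gain_compute, &c1, &c2, 0};
        node *patch[] = {&sine, &gain}; /* nodes sorted in dependency order */

        /* render a few samples by walking the graph in order */
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 2; j++) patch[j]->compute(patch[j]);
            printf("%f\n", c2.value);
        }
        return 0;
    }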

Patchwerk gives Soundpipe a generalized way to talk to itself, and to other DSP code converted to Patchwerk. Every Soundpipe DSP module gets wrapped up in a Patchwerk node. It's very useful, but it's still very tedious to write Patchwerk C code by hand. Sometimes, it's even more tedious!

Luckily, things in Patchwerk are tedious and predictable. It is trivial (and encouraged!) to build things on top of Patchwerk to make things a bit more manageable.

The next layer immediately following Patchwerk is Runt.

Higher-level control with Runt

Runt is a funny little stack-based language written in C. It is designed to be very portable, very easy to embed, and very easy to extend.

Like most stack-based languages and Forths, Runt maintains a list of keywords, known as words, in a dictionary. The Runt C API allows programs to add new words to the dictionary at run time. The Patchwerk API adds a number of words to Runt this way. Every Patchwerk node gets wrapped in a Runt word.
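
A word dictionary like this is simple to picture in C. The sketch below is a hypothetical, stripped-down illustration of the concept, not Runt's actual API: a table mapping word names to C functions that operate on a shared stack, which a host program can extend at run time.

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical sketch of a Forth-style dictionary: not the Runt API. */

    typedef struct {
        float stk[32];
        int pos;
    } vm;

    typedef void (*word_fn)(vm *);

    typedef struct {
        const char *name;
        word_fn fn;
    } word;

    static void push(vm *v, float x) { v->stk[v->pos++] = x; }
    static float pop(vm *v) { return v->stk[--v->pos]; }

    /* two built-in words */
    static void w_add(vm *v) { push(v, pop(v) + pop(v)); }
    static void w_print(vm *v) { printf("%g\n", pop(v)); }
    /* a word a host program might add later */
    static void w_dup(vm *v) { float x = pop(v); push(v, x); push(v, x); }

    /* the dictionary: new words are appended at run time */
    static word dict[64] = {{"+", w_add}, {".", w_print}};
    static int nwords = 2;

    static void define(const char *name, word_fn fn) {
        dict[nwords].name = name;
        dict[nwords].fn = fn;
        nwords++;
    }

    /* look up a word by name and execute it */
    static void exec(vm *v, const char *name) {
        for (int i = 0; i < nwords; i++) {
            if (strcmp(dict[i].name, name) == 0) { dict[i].fn(v); return; }
        }
    }

    int main(void) {
        vm v = {{0}, 0};
        define("dup", w_dup);       /* extend the dictionary at run time */
        push(&v, 2); push(&v, 3);
        exec(&v, "+");              /* 2 3 +  ->  5 */
        exec(&v, ".");              /* prints 5 */
        return 0;
    }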

Runt allows patches in patchwerk to be described in a very terse way. Patchwerk is also quite stack-oriented like Runt, so the flow really makes sense.

Up to this point, only DSP code has been discussed. Actually hearing what this DSP code produces is another matter. In Monolith, sound needs to be heard in realtime. Also, it wants to be adjusted in realtime. This is where the core monolith layer comes in.

There's also the issue of Runt code itself: Runt is a very terse language, and after about a page, things get very hard to keep track of. Also, Runt doesn't have the kind of REPL that live coding needs. Both of these problems get solved when Scheme is used to evaluate Runt code.

Realtime audio and interfaces with Monolith

The core monolith application layer is written in C, and is the heartbeat of the monolith system. It does many things, but the two biggest things it does are handling real-time audio and managing hardware interfaces like the monome grid and arc.

For sound, Monolith uses JACK. It creates a JACK client that connects to an existing server, and then a callback that sits on top of that. This callback renders audio generated from a Patchwerk patch. Monolith also implements a thing called hot-swapping in Patchwerk, which makes it possible to throw out an old Patchwerk patch and replace it with a new one. Hot-swapping is the key element that makes live-coding possible.
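
Below is a hedged sketch of that shape: a JACK process callback that renders whichever "patch" is currently installed, plus a pending pointer that lets a new patch be swapped in between renders. The JACK calls are real; the patch type and the swap logic are simplified stand-ins for what Monolith and Patchwerk actually do (real hot-swapping has to coordinate with the audio thread much more carefully).

    #include <math.h>
    #include <unistd.h>
    #include <jack/jack.h>

    /* stand-in for a compiled patchwerk patch: here, just a sine voice */
    typedef struct { float phase; float freq; } patch;

    static jack_port_t *out_port;
    static patch *active;   /* patch currently being rendered */
    static patch *pending;  /* new patch waiting to be swapped in */

    static int process(jack_nframes_t nframes, void *arg) {
        jack_client_t *client = arg;

        /* hot-swap: if a new patch has been staged, start rendering it */
        if (pending != NULL) { active = pending; pending = NULL; }

        jack_default_audio_sample_t *out = jack_port_get_buffer(out_port, nframes);
        float sr = (float)jack_get_sample_rate(client);
        for (jack_nframes_t i = 0; i < nframes; i++) {
            out[i] = active ? 0.2f * sinf(active->phase) : 0.f;
            if (active) active->phase += 6.283185f * active->freq / sr;
        }
        return 0;
    }

    int main(void) {
        jack_client_t *client = jack_client_open("sketch", JackNullOption, NULL);
        if (client == NULL) return 1;

        out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
                                      JackPortIsOutput, 0);
        jack_set_process_callback(client, process, client);
        jack_activate(client);

        patch a = {0, 440}, b = {0, 220};
        pending = &a;       /* install the first patch */
        sleep(2);
        pending = &b;       /* live-coding moment: swap in a new patch */
        sleep(2);

        jack_client_close(client);
        return 0;
    }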

Monolith also implements a hardware listener that listens for events from plugged-in interfaces. libmonome is used to poll events from the arc and grid, and libusb and an internal fork of libhidapi are used to manage everything else. Monome users should note that I'm not using the OSC layer, but the API abstractions underneath that OSC layer. OSC and liblo caused more trouble than they were worth, and removing them ended up making things way more responsive and predictable.
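
For the grid, the libmonome side of this looks roughly like the sketch below: open the device, register a handler for button-down events, and let the event loop dispatch them. The calls follow libmonome's public API as I understand it, but the device path is an assumption and the exact signatures should be checked against the libmonome headers.

    #include <stdio.h>
    #include <monome.h>

    /* light up whatever button was pressed */
    static void handle_press(const monome_event_t *e, void *data) {
        monome_led_on(e->monome, e->grid.x, e->grid.y);
        printf("press: %u %u\n", e->grid.x, e->grid.y);
    }

    int main(void) {
        /* the device path varies per machine; this one is an assumption */
        monome_t *m = monome_open("/dev/ttyUSB0");
        if (m == NULL) return 1;

        monome_register_handler(m, MONOME_BUTTON_DOWN, handle_press, NULL);
        monome_event_loop(m);   /* block and dispatch events forever */

        monome_close(m);
        return 0;
    }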

There's a thin layer between the actual hardware and monolith. Someday, this may make it easier to build virtual versions of the hardware interfaces. Not much has been done here yet, but it's there.

Special monolith applications are written for the arc and grid using what is called a page interface, which allows hardware interfaces to be used in different configurations within the same patch. When a page is selected, it gains control of the hardware. Only one page can be selected at a time. Pages also have the ability to save and load state data using a combination of SQLite and MSGPack blobs.
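
The save/load idea can be sketched like this: pack some page state into a MSGPack buffer, then store it as a blob in an SQLite table keyed by page name. The table layout and the state contents below are made up for illustration; the actual monolith schema is different.

    #include <msgpack.h>
    #include <sqlite3.h>

    int main(void) {
        /* pack some hypothetical page state: three fader values */
        msgpack_sbuffer buf;
        msgpack_sbuffer_init(&buf);
        msgpack_packer pk;
        msgpack_packer_init(&pk, &buf, msgpack_sbuffer_write);
        msgpack_pack_array(&pk, 3);
        msgpack_pack_float(&pk, 0.25f);
        msgpack_pack_float(&pk, 0.5f);
        msgpack_pack_float(&pk, 0.75f);

        /* store the packed bytes as a blob keyed by page name */
        sqlite3 *db;
        sqlite3_open("state.db", &db);
        sqlite3_exec(db,
            "CREATE TABLE IF NOT EXISTS pages (name TEXT PRIMARY KEY, state BLOB);",
            NULL, NULL, NULL);

        sqlite3_stmt *stmt;
        sqlite3_prepare_v2(db,
            "INSERT OR REPLACE INTO pages (name, state) VALUES (?, ?);",
            -1, &stmt, NULL);
        sqlite3_bind_text(stmt, 1, "mixer", -1, SQLITE_STATIC);
        sqlite3_bind_blob(stmt, 2, buf.data, (int)buf.size, SQLITE_STATIC);
        sqlite3_step(stmt);
        sqlite3_finalize(stmt);

        sqlite3_close(db);
        msgpack_sbuffer_destroy(&buf);
        return 0;
    }

Loading is the reverse: select the blob for a page and unpack it back into state.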

This conceptually encompasses most of what Monolith is: a mother hen written in C that takes care of all her various little chicks: realtime audio, hot-swapping, pages. The kernel of it all.

But Monolith is not usually controlled directly from C. Instead, it is controlled using scripting languages like Scheme, and sometimes Janet.

Abstraction and expressiveness with Scheme/Janet

Monolith is mostly controlled using the Scheme programming language. Specifically, it uses a fork of an implementation known as s9 scheme, or s9fes, or Scheme 9 from Empty Space.

Monolith commands in C are bound to s9 scheme functions. The s9 scheme interpreter is most often run as a REPL, which can be run inside of a process controlled by emacs.

From emacs, blocks of scheme code can be sent to the REPL to be evaluated and processed.

Scheme is a great language that lends itself well to live-coding. It's also a very elegant language for building abstractions on top of the fundamental monolith API functions from C.

Not only does Scheme control the Monolith API, it can also evaluate Runt code as if it were inline code. In this regard, one can think of Scheme as a very powerful macro language for Runt. Scheme makes it possible to build reusable synths and sounds. Scheme also enables more complex patches to be built than could be written in Runt alone.

Oh, there's also graphics too

Monolith has a very limited graphics interface. It implements a small framebuffer with RGB color and a maximum resolution of 320x200. In addition, monolith also has a few basic drawing operations, and some less basic ones.
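
At its core, a framebuffer like this is just a block of pixels plus a handful of routines that write into it. The sketch below is a hypothetical illustration of that idea; the struct and function names are not Monolith's.

    #include <stdint.h>
    #include <stdlib.h>

    #define FB_WIDTH  320
    #define FB_HEIGHT 200

    /* hypothetical RGB framebuffer: 3 bytes per pixel */
    typedef struct {
        uint8_t pix[FB_WIDTH * FB_HEIGHT * 3];
    } framebuffer;

    /* basic drawing op: set one pixel to an RGB color */
    static void fb_set(framebuffer *fb, int x, int y,
                       uint8_t r, uint8_t g, uint8_t b) {
        if (x < 0 || x >= FB_WIDTH || y < 0 || y >= FB_HEIGHT) return;
        size_t pos = 3 * (size_t)(y * FB_WIDTH + x);
        fb->pix[pos] = r;
        fb->pix[pos + 1] = g;
        fb->pix[pos + 2] = b;
    }

    /* a "less basic" op built from the basic one: filled rectangle */
    static void fb_rect(framebuffer *fb, int x, int y, int w, int h,
                        uint8_t r, uint8_t g, uint8_t b) {
        for (int j = y; j < y + h; j++)
            for (int i = x; i < x + w; i++)
                fb_set(fb, i, j, r, g, b);
    }

    int main(void) {
        framebuffer *fb = calloc(1, sizeof *fb);
        fb_rect(fb, 10, 10, 64, 32, 255, 128, 0);  /* orange rectangle */
        free(fb);
        return 0;
    }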

Instead of using Scheme, drawing operations use the Janet programming language. (This was done because I wanted to try out Janet).

Using libx264, framebuffer data can be encoded into h264 video. Monolith sound buffers can also be rendered offline at the same time. The ffmpeg application is then used to stitch the two together into a finished video.

For audio-visual works, the Janet VM gets controlled from inside Scheme.

In conclusion

Monolith roughly works out to be:

music->computers->real-time->live-coding->hardware interfaces.

The software layers roughly work out to be:

emacs->scheme->runt->patchwerk->soundpipe->C->JACK.