Sporthdraw: Convert audio to picture, and vice versa

2015-09-02

A few days ago, I wrote a little utility called Sporthdraw. Sporthdraw takes the audio output of Sporth code and writes it to a png file. The png file can then be converted back into a wav file.

Here's one such picture, displaying a simple dialtone generated by Sporth:

Sporthdraw is a 40-line shell script utilizing the amazing ImageMagick command-line utility and SoX. Thanks to those two, I barely had to do anything ;)

Why?

I write a lot of "sketches" of sounds and musical ideas, often no more than a minute in length. When they end up sounding mildly entertaining, I post them online, with the hope that around 2.5 people will take 30 seconds out of their day to listen to them.

Sharing any kind of short audio-only content over the web has always felt less than ideal. At the moment, I use SoundCloud, which I dislike for many reasons. It always feels like overkill for audio shorter than thirty seconds. With Sporthdraw, I can share audio wherever pictures can be shared on the web.

It's also a neat compositional challenge. A 500x500 png image yields about 4.2 seconds of lossless 32-bit floating-point PCM audio at 44.1kHz. How creative can I be in that window of time?

Usage

First, create some Sporth code (in this case, a North American dialtone):

echo "440 0.3 sine 350 0.3 sine +" > dial.sp

Next, use Sporthdraw to generate the png file:

./sporthdraw sp2png dial.sp dial.png

To convert the png file to an audio file, run:

./sporthdraw png2wav dial.png dial.wav

How it works

Sporth, my musical programming language, can write raw audio data to standard output. ImageMagick processes the output of Sporth, encoding every byte as a single r, g, or b value in a pixel. Since every audio sample is four bytes in size and every pixel holds three bytes, every 4 pixels yield three samples.
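
To make the byte-to-pixel mapping concrete, here is a rough Python sketch of the same idea. It is not the Sporthdraw script itself, which leaves this work to ImageMagick and SoX. The function names are made up for illustration, and both the little-endian byte order and the zero-padding of the last pixel are assumptions.

```python
import struct

def samples_to_pixels(samples):
    """Pack samples as 32-bit floats, then group the bytes into (r, g, b) tuples."""
    # Assumption: little-endian float32, four bytes per sample.
    raw = struct.pack("<%df" % len(samples), *samples)
    # Assumption: pad with zero bytes so the final pixel has all three channels.
    raw += b"\x00" * (-len(raw) % 3)
    return [tuple(raw[i:i + 3]) for i in range(0, len(raw), 3)]

def pixels_to_samples(pixels):
    """Flatten (r, g, b) tuples back into bytes and decode them as float32 samples."""
    raw = bytes(channel for pixel in pixels for channel in pixel)
    # Drop trailing bytes that do not form a complete four-byte sample.
    usable = len(raw) - len(raw) % 4
    return list(struct.unpack("<%df" % (usable // 4), raw[:usable]))

# Three samples are 12 bytes, which fill exactly four pixels.
pixels = samples_to_pixels([0.0, 0.5, -0.25])
restored = pixels_to_samples(pixels)
```

Every byte survives the trip unchanged, so the decoded samples match the original 32-bit float values exactly. This is why the png can act as a lossless container.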

In a 500x500 picture:

500 * 500 * 3 = 750000 bytes

750000 / 4 = 187500 32-bit floating-point samples

187500 / 44100 = ~4.2 seconds of audio at 44.1kHz samplerate
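
The same arithmetic can be wrapped in a small helper to estimate how much audio fits in other image sizes. This is only a sketch: the function name and its parameters are hypothetical, and it assumes mono audio with three one-byte channels per pixel, as in the calculation above.

```python
def png_audio_seconds(width, height, sample_rate=44100,
                      bytes_per_sample=4, bytes_per_pixel=3):
    """Estimate the seconds of mono audio that fit in a width x height RGB image."""
    total_bytes = width * height * bytes_per_pixel
    # Only whole samples count; leftover bytes are ignored.
    samples = total_bytes // bytes_per_sample
    return samples / sample_rate

print(round(png_audio_seconds(500, 500), 2))  # prints 4.25
```

Capacity scales with the pixel count, so doubling both dimensions quadruples the available time.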