Weekly Update 1

2023-06-19

This week, I found myself implementing glottal pulsed noise (a thing used in my singing synthesizer), completing my Proto-Gestling (a proof of concept for my Gestlings project), and thinking about a concept in game development that I'm calling "asset-driven development".

Making my Computer Sing Better

To begin, here's a little melody I wrote, being performed by a singer I also built. I'll talk about this some more.

One of my ongoing fascinations over many years has been building singing synthesizers like this one. There is a strange (and sometimes unsettling) familiarity that begins to happen when computers start to imitate human vocalizations. As a computer music composer, I find this adds a great dimension of accessibility to works. You don't have to understand all the jargon in the liner notes when a talky computer music piece is framed as a computer on the phone with his mother. This idea of giving inanimate objects human-like qualities, known as anthropomorphization, is a big area of focus for me.

There are many little components that go into making a computer sing like the sample above, from building the singer, to telling the singer what to perform and how to perform it. This week, I focused on improving the "singer" by upgrading a component using something called "pitch-synchronous pulsed noise" (link1, link2, link3). The singers I make are built up using math, and I found this to be slightly better-sounding math.

Before I break down what "pitch-synchronous pulsed noise" is, some orientation about where it goes is needed. Building a virtual singer inside a computer typically involves two parts: a vocal tract, which is essentially a tube, and some kind of air flow that goes through the tract to produce vocal sound at the other end. This "air flow", produced by something called a glottis, is what gives us the initial pitched sound. When it goes through the tract, this sound is shaped, and what comes out the other end sounds vocal. We'll be focusing on this sound-making component, the glottis.
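The two-part split described above can be sketched in a few lines of Python. This is only an illustration of the "source goes into tract" idea, not the actual singer: the glottis here is a plain sawtooth, and the tube is replaced by a single two-pole resonator, which is far cruder than a real tract model. The function names and the 700 Hz / 100 Hz values are made up for the example.

```python
import math

def glottis(freq_hz, num_samples, sample_rate=44100):
    """Stand-in source: a falling sawtooth at the given pitch."""
    phase = 0.0
    out = []
    for _ in range(num_samples):
        out.append(1.0 - 2.0 * phase)
        phase = (phase + freq_hz / sample_rate) % 1.0
    return out

def tract(signal, center_hz, bandwidth_hz, sample_rate=44100):
    """Stand-in tract: one two-pole resonator (a single rough formant)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2  # feed the source through the resonance
        out.append(y)
        y2, y1 = y1, y
    return out

# One second of a 110 Hz "voice": the source is shaped by the tract.
voice = tract(glottis(110.0, 44100), center_hz=700.0, bandwidth_hz=100.0)
```

The point is the shape of the program rather than the sound: anything that improves the glottis side can be swapped in without touching the tract side.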

The thing I'm calling "pitch-synchronous pulsed noise" adds a little detail to the signal that goes into our digitally constructed vocal tract. "Pitch-synchronous" is a fancy way of saying "aligned with the pitch of the voice". When the pitch goes up or down, the noise changes with it. In fact, it is "glued" to the rest of this component, totally locked in step with the underlying sound waveform. The "pulsed noise" bit just means "small noise bursts". In 1991, Perry Cook observed these small noise bursts when looking at recordings of actual singers. This component attempts to approximate those noise bursts. The end result is a singing sound that is a bit more "breathy" and naturalistic.
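Here is a minimal Python sketch of the "noise bursts glued to the pitch" idea. It is an approximation under assumptions, not Cook's formulation and not the actual singer code: the pitched waveform is a plain sine standing in for a real glottal shape, and the burst position and width (start=0.5, width=0.25 of a cycle) are hypothetical values picked for illustration. What matters is that both the waveform and the noise envelope read from the same phase ramp, so they cannot drift apart.

```python
import math
import random

def burst_envelope(phase, start=0.5, width=0.25):
    """Hann-shaped window over part of one pitch cycle; zero elsewhere."""
    pos = (phase - start) / width
    if 0.0 <= pos < 1.0:
        return 0.5 * (1.0 - math.cos(2.0 * math.pi * pos))
    return 0.0

def phasor(freq_hz, num_samples, sample_rate=44100):
    """Ramp from 0 up to 1 once per pitch cycle."""
    phase = 0.0
    out = []
    for _ in range(num_samples):
        out.append(phase)
        phase = (phase + freq_hz / sample_rate) % 1.0
    return out

def breathy_source(freq_hz, num_samples, sample_rate=44100, gain=0.1, seed=0):
    """Pitched waveform plus noise bursts locked to the same phase."""
    rng = random.Random(seed)
    out = []
    for phase in phasor(freq_hz, num_samples, sample_rate):
        voiced = math.sin(2.0 * math.pi * phase)  # stand-in glottal waveform
        noise = gain * burst_envelope(phase) * rng.uniform(-1.0, 1.0)
        out.append(voiced + noise)
    return out
```

Calling breathy_source(220.0, 44100) gives one second of source signal at 220 Hz. Raising gain makes the result breathier, and setting it to zero gives back the plain pitched waveform.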

Building a "Proto-Gestling"

The Proto-Gestling is the Gestling before the Gestling, for an ongoing project of mine called "Gestlings". The core goal of this project is to build sounds that talk about themselves. To get even close to this place, there was a lot of tooling that needed to be built. In order for a Gestling to talk about itself, it needs a mouth. A mouth is part of a physiology that makes sound. The sound being produced is structured to be some kind of pseudo-language performed on this invented creature physiology. This pseudo-language gibberish is composed of smaller bits that are glued together. And oh yeah, the mouth is part of a face, which should be expressive. And you need to have some kind of captioning system if you want an audience to understand this organized gibberish.

All of this had to be made from scratch, which includes a sophisticated audio engine to construct the "speech" (what I sometimes call "speeched music"), as well as a very crude graphics pipeline capable of producing video. Both the sound and video had to be coordinated in order to get face movements to work.

The end result looks like this:

It's a lot of work for one short video. A very silly one at that. But what it conveys, the sliver of a sliver of a wisp of a critter with a personality, is far greater than the sum of its parts. This, to me, is worth all the code flung at it. I think.

Asset-Driven Development

When I say asset-driven development, I am referring to the way games are usually designed. There is a clear division between the "art" of the video game, known as assets (textures, sound, etc.), and the game logic, sometimes known as the engine.

As someone who has attempted to approach "code" as art for my entire adult life, I find myself conflicted by this approach. The division feels arbitrary. A good use of the computer medium for creative pursuits should blur the line between program logic and "art", as they are one and the same. But recently, in working on Gestlings, I've been seeing the benefits. A lot of stuff does end up being essentially dumb data, which is what an asset is. Assets are much easier to manage than code. They don't change as much. You can write programs to generate assets, and programs to take those assets and make more assets. You can write new code to handle those assets. Your assets will outlive your code.

Anyways, enough of all that. Until next time.