Sample Curation

The Sample Curation Problem

An ongoing problem of mine: how does one efficiently manage or curate a large database of samples?

My compositions up to this point have been largely devoid of samples for this reason. I do have sample collections, but I always find myself overwhelmed when it comes to choosing a sample. How do I know when I have found the "right" sample? And what happens if I find a good sample, but I don't have a use for it? I don't have a good system in place, so sample selection for me comes down to random selection, panicky trial and error, or whatever is most convenient (the first one selected, a selection from an arbitrarily limited subset, etc.).

A Potential Solution

A solution is beginning to take shape in a few existing tools of mine: weewiki, zet, and crate are the notable ones. These provide a means for collecting, annotating, tagging, and connecting samples.

The premise is to think of a sample library like a zettelkasten.

Sample Curation Actions

Outlined below are the actions needed for a successful sample curation system, with some implemented solutions.

Listen

You gotta hear it! And you gotta hear lots of them.

I currently use monolith for my interactive audio needs. Being able to build something on top of that could save me from re-implementing mundane stuff, and it could also be an interesting way to hook into my composition environment.

There's still the issue of interface, which I don't think monolith addresses right now. noice has been my go-to interface, with a few modifications. noice reduces the task of listening to files down to a few keystrokes and is quite ergonomic to use.

Browse

Browse refers to the ability to effortlessly navigate, listen to, and discover samples in a collection. This will probably be one of the last actions I address.

The easiest way to accomplish this is with a filesystem and a good file browser. In my aging fork of noice, an ncurses-based file browser, one is able to navigate a file tree and play wav files using only a few keystrokes. For what it is, it works pretty well. Static file tree structures are rigid, but it is a good start.

A bitmap interface could be more helpful for browsing. Even a 1-bit graphics buffer could be useful for things like visualizing waveforms. As it just so happens, I have btprnt, which could probably be a good start. An interactive backend would still need to be written, which is where bitwrite may come in. This project is a ways off though.
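To make the 1-bit waveform idea concrete, here is a minimal sketch in Python. It does not use btprnt or bitwrite, whose APIs are not described on this page. The function names `render_waveform_1bit` and `to_text` are made up for illustration, and a synthetic sine burst stands in for a real sample. Each column of the buffer covers a chunk of the signal and lights the pixels between that chunk's minimum and maximum.

```python
import math

def render_waveform_1bit(samples, width, height):
    """Draw per-column min/max peaks of `samples` into a 1-bit buffer.

    `samples` are floats nominally in [-1, 1]. The result is a list of
    `height` rows, each a list of `width` pixels that are 0 or 1.
    """
    rows = [[0] * width for _ in range(height)]
    if not samples:
        return rows
    per_col = max(1, math.ceil(len(samples) / width))
    for x in range(width):
        chunk = samples[x * per_col:(x + 1) * per_col]
        if not chunk:
            break  # signal is shorter than the buffer is wide
        # Clamp so out-of-range samples cannot index outside the buffer.
        lo = max(-1.0, min(1.0, min(chunk)))
        hi = max(-1.0, min(1.0, max(chunk)))
        # Amplitude +1 maps to the top row, -1 to the bottom row.
        y_top = round((1 - hi) / 2 * (height - 1))
        y_bot = round((1 - lo) / 2 * (height - 1))
        for y in range(y_top, y_bot + 1):
            rows[y][x] = 1
    return rows

def to_text(rows):
    """Render the buffer as text, one character per pixel."""
    return "\n".join("".join("#" if p else "." for p in row) for row in rows)

# Hypothetical input: three cycles of a sine wave instead of a wav file.
sine = [math.sin(2 * math.pi * 3 * n / 256) for n in range(256)]
bitmap = render_waveform_1bit(sine, width=32, height=9)
print(to_text(bitmap))
```

The same min/max reduction should carry over to a real 1-bit framebuffer; only the final pixel-setting step would change.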

Annotate

Annotation is about writing stuff down. Both long and short-form writing is needed.

Short-form annotations need to be quick and painless, with the same low transactional cost as tooting on Mastodon.

Long-form writing should be done in a format that makes it easy to do edits and rewrites. A text file with some kind of markup should do the trick.

For long-form writing, there's weewiki. weewiki's strengths include the org markup syntax and the links. High-level ideas can be encapsulated in this format, which can lead to thoughtful ways to organize and curate sounds.

For short-form writing, I've created a zettelkasten in weewiki called the zet. The zet interface is loosely based on twtxt, and allows not only logs and messages, but links to files as well.
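As an illustration of what a short-form entry carries, the sketch below parses one line in the shape the Updates section of this page displays: a timestamp, a colon, then a message with page references in parentheses. This is an assumption based only on that rendered output. The zet's actual storage format and commands are not shown here, and `parse_entry` is a hypothetical helper.

```python
import re
from datetime import datetime

# Assumed display format: "YYYY-MM-DD HH:MM:SS: message text"
ENTRY_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): (.*)$")
# Assumed reference syntax: a page name in parentheses, e.g. "(zet)"
REF_RE = re.compile(r"\(([a-z0-9_]+)\)")

def parse_entry(line):
    """Split a rendered zet-style line into time, message, and references."""
    m = ENTRY_RE.match(line)
    if m is None:
        raise ValueError("not a timestamped entry: %r" % line)
    stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
    message = m.group(2)
    return {"time": stamp, "message": message, "refs": REF_RE.findall(message)}

entry = parse_entry(
    "2020-12-19 11:21:01: just connected the (zet) and (crate) pages "
    "to (sample_curation)."
)
print(entry["time"].isoformat(), entry["refs"])
```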

Store

Store refers to having a meaningful way to store and organize samples and metadata.

To do this task, I've created crate. crate is built on top of the weewiki zet as a means to connect to sqlar.

The sqlar library has always been attractive to me because the SQLite format makes it easy to build structures on top of it. Also, I personally like having everything self-contained. There are performance trade-offs, but it's a cost I'm willing to pay at this personal scale.

Programs like monolith already have the ability to read directly from SQLar files, and more recently, also crate files! See the sqlar monolith wiki page for more details.
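For readers unfamiliar with the format, here is a minimal sketch of storing and retrieving a file in a SQLite Archive from Python. The `sqlar` table layout follows the published SQLite Archive format, where content is zlib-compressed only when that makes it smaller. The helper names `sqlar_put` and `sqlar_get`, the file name, and the placeholder bytes are made up for illustration. This is not how crate or monolith access the archive, which is not detailed on this page.

```python
import sqlite3
import zlib

SQLAR_SCHEMA = """
CREATE TABLE IF NOT EXISTS sqlar(
  name TEXT PRIMARY KEY,  -- file name
  mode INT,               -- access permissions
  mtime INT,              -- last modification time
  sz INT,                 -- original (uncompressed) size
  data BLOB               -- content, compressed when that is smaller
);
"""

def sqlar_put(db, name, content, mode=0o644, mtime=0):
    """Store `content` (bytes) under `name`, compressing if it helps."""
    packed = zlib.compress(content)
    if len(packed) >= len(content):
        packed = content  # keep raw bytes when compression does not shrink
    db.execute(
        "INSERT OR REPLACE INTO sqlar(name, mode, mtime, sz, data) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, mode, mtime, len(content), packed),
    )

def sqlar_get(db, name):
    """Return the original bytes stored under `name`."""
    row = db.execute(
        "SELECT sz, data FROM sqlar WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    sz, data = row
    # Equal sizes mean the blob was stored uncompressed.
    return data if len(data) == sz else zlib.decompress(data)

db = sqlite3.connect(":memory:")
db.executescript(SQLAR_SCHEMA)
fake_wav = b"RIFF" + bytes(1000)  # placeholder bytes, not a valid wav file
sqlar_put(db, "kicks/kick01.wav", fake_wav)
restored = sqlar_get(db, "kicks/kick01.wav")
print(len(restored))
```

Because the archive is an ordinary SQLite database, extra tables for tags or notes can live in the same file alongside the audio.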

Query

Query refers to being able to find samples given some sort of parametric constraints.

SQLite comes in handy yet again. Weewiki, Zet, and Crate are all built on top of SQLite, so they can all be queried with SQL, which has proven to be quite powerful even in the initial stages I'm currently at.
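A small sketch of what a parametric query might look like follows. The `samples` and `tags` tables, their columns, and the file names are hypothetical and are not the actual weewiki, zet, or crate schemas. The point is only that constraints such as "tagged percussive and shorter than one second" reduce to a plain SQL query, and that a random pick from a matching subset is a one-line variation.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE samples(name TEXT PRIMARY KEY, seconds REAL);  -- hypothetical
CREATE TABLE tags(name TEXT, tag TEXT);                     -- hypothetical
""")
db.executemany("INSERT INTO samples VALUES (?, ?)", [
    ("drums/kick01.wav", 0.4),
    ("drums/snare01.wav", 0.3),
    ("field/door.wav", 1.2),
    ("field/rain.wav", 62.0),
])
db.executemany("INSERT INTO tags VALUES (?, ?)", [
    ("drums/kick01.wav", "percussive"),
    ("drums/snare01.wav", "percussive"),
    ("field/door.wav", "percussive"),
    ("field/rain.wav", "texture"),
])

def find_samples(db, tag, max_seconds):
    """Names of samples carrying `tag` that last at most `max_seconds`."""
    rows = db.execute(
        """SELECT s.name FROM samples AS s
           JOIN tags AS t ON t.name = s.name
           WHERE t.tag = ? AND s.seconds <= ?
           ORDER BY s.name""",
        (tag, max_seconds),
    ).fetchall()
    return [r[0] for r in rows]

def pick_random(db, pattern, n):
    """Up to `n` randomly chosen sample names matching a LIKE pattern."""
    rows = db.execute(
        "SELECT name FROM samples WHERE name LIKE ? ORDER BY RANDOM() LIMIT ?",
        (pattern, n),
    ).fetchall()
    return [r[0] for r in rows]

short_hits = find_samples(db, "percussive", 1.0)
print(short_hits)
print(pick_random(db, "drums/%", 1))
```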

Updates

Updates about this page from the zet will be dynamically generated below:

2021-01-09 14:38:19: some rewriting of (sample_curation) done.

2021-01-09 13:19:06: I really need to rewrite (sample_curation), now that I've built out the (zet) and (crate) interfaces.

2021-01-09 13:14:18: getting the hang of managing external harddrives in a workflow. All this time and I just avoided the problem by not using samples and synthesizing everything.

2020-12-23 10:08:23: the (sqlar) loadwav utility has been reworked slightly so that it reads from a sqlite handle rather than a filename. It's necessary to deal with a runt quirk, but I also think of it as a smarter step forward.

2020-12-20 17:26:42: picking a sample at random from a library is a totally valid approach to sample curation. So, I added a shuffle feature to the (zet) which picks N random elements that match a pattern. This general thing can then be used with (crate) to choose random samples from a folder.

2020-12-19 11:21:01: just connected the (zet) and (crate) pages to (sample_curation). That's a neat zet trick.

2020-12-19 10:23:04: some good stuff is happening with the weewiki zet page wrt sample curation. will have to update the (sample_curation) page soon.

