Literate Programming

Literate programming is a programming paradigm invented by Donald Knuth.

The core concept in literate programming is to use natural language to make a program more readable.

Example Literate Programs

The following are works of the author that explore literate programming.

Worgle

Worgle is a tangler for literate programs written in Org markup. Worgle is itself written in a literate style using Org, and is tangled into code using itself.

Patchwerk

Patchwerk is a C library for creating reasonably efficient audio graphs. It is written in CWEB.

Monolith

Monolith is a compositional environment with an emphasis on realtime audio and live coding. The core program is written in a literate style using Org markup and Worgle. It is perhaps the single largest example of a program written in Org markup.

Libline

libline is a library for creating audio-rate line segments, written as a literate program in CWEB.

Voc

Voc is a C port of Pink Trombone, a physical model of the vocal tract. Voc is written using CWEB.

Notable Literate Programs

Good literate programs that you should know about, if you don't know them already.

TeX

The original TeX typesetting system was written by Donald Knuth using WEB, his literate programming tool.

PBRT

Physically Based Rendering: From Theory to Implementation, or PBRT for short, is both a textbook on physically based raytracing techniques and a program that compiles to a very capable raytracer. It is written using a homegrown version of noweb.

s9fes

Scheme 9 from Empty Space, or s9fes, is a public domain dialect of Scheme, implemented in ANSI C. The author has developed it in a semi-literate style using EDOC, a documentation tool of their own creation. The resulting work is a book available for purchase. (It is a very well written book that goes into the entire implementation.)

Tools for Literate Programming

WEB/CWEB

WEB was the original literate programming system developed by Donald Knuth. It was most famously used to develop the TeX typesetting system. The original WEB was built to tangle Pascal code. A version for the C programming language, called CWEB, was developed later.

Noweb

noweb is a literate programming system that aims to be simpler and more flexible than the WEB system. noweb can target both HTML and TeX for its documentation output, and can work with any programming language.

OrgMode

Org Mode is a highly configurable outlining system commonly used for note taking and task management. Org Mode is capable of noweb-style literate programming via Babel.

Leo

Leo is an outline-based IDE that provides a very powerful environment for literate programming. Reading A brief summary of leo is highly recommended.

Eve

Eve was an experimental IDE heavily inspired by literate programming.

"True" Literate Programming

A common misconception about literate programming is that it is a form of enhanced documentation. It is true that literate programming tools can be used in this way, but this approach does not represent what it means to write things in a "literate" style. In a regular program, documentation follows code. In a literate program, code follows documentation.

A true literate program gets the programmer in question to think of program structure in terms of natural language instead of machine language. For this to happen, a literate programming environment must do two important things:

1. Code needs to be explained out of order. Both noweb and WEB do this using a sort of macro expansion system. It is important that definitions don't need to happen before they are used in a code block.

2. The code generation bit (referred to as "tangling") is used to abstract away the code structure from the user. The only things reading the tangled files should be compilers/interpreters/REPLs, not humans.

If an LP tool fails to do both of these things, it is not a tool that embraces literate programming as Knuth intended.
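To make the two requirements above a bit more concrete, here is a minimal sketch of a tangler in Python. It is not how noweb, WEB, or Worgle are actually implemented. It assumes a simplified noweb-like notation where "<<name>>=" opens a named chunk, "@" closes it, and "<<name>>" on a line by itself refers to another chunk. The names SOURCE, parse_chunks, and tangle are made up for this illustration.

```python
import re

# Hypothetical literate source in a simplified noweb-like notation.
# The top-level chunk is explained first, even though the chunk it
# refers to is only defined further down.
SOURCE = """\
The program prints a greeting. The overall shape comes first.

<<main>>=
<<define greeting>>
print(greeting)
@

The greeting itself is explained afterwards.

<<define greeting>>=
greeting = "hello"
@
"""

DEFINITION = re.compile(r"^<<(.+)>>=$")    # a line that opens a chunk
REFERENCE = re.compile(r"^(\s*)<<(.+)>>$")  # a line that refers to a chunk


def parse_chunks(text):
    """Collect named code chunks, ignoring the prose between them."""
    chunks = {}
    current = None
    for line in text.splitlines():
        if current is None:
            match = DEFINITION.match(line)
            if match:
                current = match.group(1)
                # Repeated definitions of a name append to the same chunk.
                chunks.setdefault(current, [])
        elif line == "@":
            current = None
        else:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Recursively expand chunk references into a flat list of code lines."""
    output = []
    for line in chunks[name]:
        match = REFERENCE.match(line)
        if match:
            # Carry the indentation of the reference into the expansion.
            nested_indent = indent + match.group(1)
            output.extend(tangle(chunks, match.group(2), nested_indent))
        else:
            output.append(indent + line)
    return output


if __name__ == "__main__":
    print("\n".join(tangle(parse_chunks(SOURCE), "main")))
```

The tangled output puts the definition of greeting before the print call, which is the order the interpreter needs, while the source keeps the order that reads best. The sketch does no cycle detection and raises KeyError on an undefined chunk; a real tool would handle both.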

Problems with literate programming

Gosh. There are a few. Half the time I wonder why I even bother. The other half of the time, I'm so enamored with the discipline that I can't help but try it.

A bad literate program is worse than no literate program at all. It is very easy for your code to turn into an incomprehensible mess of spaghetti code and jargon (in many ways, Voc is a prime example of such a failure). A literate program can be a good barometer for code quality. If things read like a rushed first draft, the code probably is too. If it is a thoughtful read, the code probably is as well. This can be a metric not only for entire codebases, but also for sections of a codebase.

Collaboration seems to be tough. I've had little to no experience collaborating on a literate project, but the paradigm seems mostly apt for single-brain projects. Uncoincidentally, so is writing. I suspect a solution could be found if one treats writing a program like writing a book, rather than building a codebase. This is just a hunch, though.

Literate programming takes more time because writing takes time. Programs take longer to write, but more often than not they end up being better programs. The LP paradigm encourages thoughtful programming, which leads to better software in the long term.