Browsing Program Structure with Worgle and SQLite
Literate Programs as Structured Data
A compelling aspect of writing literate programs is that they can be represented as structured data. In Worgle, there are two main tree representations of the overall program structure. The woven tree represents the document structure as a collection of headers found in the Org markup language. The tangled tree represents the generated code structure as a series of named codeblocks created using the noweb syntax.
I always thought it would be a very powerful thing to be able to explore literate programs as trees. As of now, this is beginning to be possible in Worgle. Worgle has the ability to write data to a SQLite database, which is then queried using a program called worgmap.
Extracting Data from a Worgle Literate Program
Before a literate program made using Worgle can be queried, data must first be written to an intermediate format. I have chosen to use SQLite, as it is a robust and mature data format that is trivial for other programs to parse.
A database is generated using the "-d" flag in Worgle. The code below generates a database from the main Worgle org file.
worgle -d a.db worgle.org
The name "a.db" is the default name of the database that Worgmap opens to query information.
Database write times are reasonable. My largest program written in Worgle to date, Monolith, writes a database in under half a second on my 2015 MacBook Pro. My GPD laptop running Alpine Linux takes a few seconds. This performance difference feels larger than I expected, even when considering the hardware difference. Even so, it still feels manageable.
Some Querying Via Worgmap
Once a database is generated, it can be queried using "get" utilities found in a program called Worgmap. The database is a pure SQLite database, so it is possible to just do raw SQL queries using the sqlite3 CLI. The worgmap get interface saves a few keystrokes.
When worgmap is run, it assumes that the database is in the current working directory and that it is named "a.db". In the future, this will be more customizable.
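Because the database is plain SQLite, it can also be inspected from any language with SQLite bindings. The sketch below uses Python's built-in sqlite3 module to list the tables and columns in a database without assuming anything about the schema Worgle writes. The helper names list_tables and table_columns are made up for this example and are not part of Worgle or worgmap.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in an open SQLite connection."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def table_columns(conn, table):
    """Return the column names of one table, using PRAGMA table_info."""
    # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    # Each row is (cid, name, type, notnull, dflt_value, pk); keep the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
```

Calling list_tables(sqlite3.connect("a.db")) from the directory that holds the generated database should show which tables are available to query. Note that sqlite3.connect creates an empty file if "a.db" does not exist yet.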
To get a list of files from the database, run
worgmap get filelist
This will return the list of files tangled by Worgle:
worgle.c worgle.h worgle_private.h
ffile can be used to get metadata on the file "worgle.c":
worgmap get ffile worgle.c
This returns the following:
id = 2
filename = worgle.c
top = 1
next_file = 29
id is the UUID associated with this resource.
filename is the stored filename (duh).
top is a reference to the top-level code block represented.
next_file is the UUID of the next file in the list.
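The next_file field suggests that files are chained together as a linked list. As a rough sketch of how such a chain could be walked, the example below builds a small in-memory table and follows the links. Everything about the schema here is an assumption: the table name "files", the column layout, and the use of NULL to mark the end of the list are invented for illustration, as are all ids other than the ones shown for "worgle.c" above. The real schema may well differ.

```python
import sqlite3


def walk_files(conn, first_id):
    """Follow next_file links from first_id and return filenames in order."""
    names = []
    seen = set()  # guards against a malformed chain that loops back
    current = first_id
    while current is not None and current not in seen:
        seen.add(current)
        row = conn.execute(
            "SELECT filename, next_file FROM files WHERE id = ?", (current,)
        ).fetchone()
        if row is None:  # dangling link: stop quietly
            break
        names.append(row[0])
        current = row[1]
    return names


# Hypothetical demo data. Only the first row mirrors the output shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files "
    "(id INTEGER PRIMARY KEY, filename TEXT, top INTEGER, next_file INTEGER)"
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [
        (2, "worgle.c", 1, 29),
        (29, "worgle.h", 30, 40),
        (40, "worgle_private.h", 41, None),
    ],
)
print(walk_files(conn, 2))
```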
To get more information on the top level block:
worgmap get blk 1
1 3 worgle-top
This displays in order: the UUID (1), the UUID of the top level segment, and the name of the block (worgle-top). "worgle-top" is the block that contains the entire structure of the tangled C file "worgle.c". A tree view of this block can be printed using:
worgmap get tree worgle-top
global_variables enums parse_modes static_function_declarations functions loadfile_localvars loadfile parser_local_variables parser_initialization getline parse_mode_org parse_mode_code parse_mode_begincode begin_the_code worgle_block_set_id worgle_file_set_id worgle_segment_string_set_id worgle_segment_reference_set_id worgle_init worgle_free worgle_string_init worgle_segment_init worgle_block_init hashmap_hasher worgle_file_init_id local_variables initialization parse_cli_args append_filename turn_on_debug_macros turn_on_warnings map_source_code generate_database check_filename loading parsing generation mapping database cleanup
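To give a feel for what a tree view involves, here is a minimal sketch of walking nested block references and printing them with indentation. It is not how worgmap does it. The blocks dictionary is a hypothetical stand-in for data that would come from the database, and the nesting shown is invented, even though a few block names are borrowed from the listing above.

```python
def render_tree(blocks, name, depth=0, ancestors=()):
    """Return indented lines for every block referenced from `name`."""
    lines = []
    path = ancestors + (name,)
    for child in blocks.get(name, []):
        lines.append("  " * depth + child)
        if child in path:
            # A block that refers back to one of its ancestors would
            # recurse forever, so list it once and stop descending.
            continue
        lines.extend(render_tree(blocks, child, depth + 1, path))
    return lines


# Hypothetical data: each block name maps to the blocks it references.
blocks = {
    "worgle-top": ["global_variables", "functions"],
    "functions": ["loadfile", "worgle_init"],
    "loadfile": ["loadfile_localvars"],
}

for line in render_tree(blocks, "worgle-top"):
    print(line)
```

A depth-first walk like this keeps the order in which references appear inside each block, which is also the order the code would be tangled in.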
Lots of things to be done here, really. Using something like SQLite allows me to dump way more metadata than I know what to do with right now.
For starters, I'd like to parse and save org structure in addition to tangled code structure. I'm hoping to build more utilities that generate interesting representations of the document. I'm also hoping to build a better static HTML generator than the simple one I have currently written. I also want to build a simple HTTP server that dynamically generates HTML content. Maybe throw in a few dot graph generators for good measure?
Being able to write multiple worgle programs into one database is important to me as well, as this would allow more incremental (hopefully faster) development to happen.