bork: Shell to Ruby

27 July 2012

bork is my tiny file-tagging utility that essentially maps files to hashes and hashes to tags and so on using the filesystem. It could also probably use a regular db like sqlite, but it doesn’t. This post is mostly discussing one of the important changes that happened while building bork: the transition from shell script to Ruby.

bork started out as a series of shell scripts with a fairly simple structure inspired by git (whether git is actually structured this way, I do not know; I’m simply going off appearances): a main jumppad script that calls into other scripts that do the actual work. This ultimately led to a structure that looks like this:

|++ bin
    |- bork
    |++ bork-exec
        |- bork-add
        |- bork-relative-path-to
        |- bork-rm
        |- bork-station
        |- bork-update
        |- ... and so on

This sort of structure actually lends itself fairly well to the particular idea I was going for, where you can work with bork in two different ways:

  1. By using the jumppad script: bork add ...
  2. By using the scripts directly: bork-add ...
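
The jumppad's job can be sketched roughly as follows. This is a minimal illustration rather than bork's actual code: the helper names `command_script` and `dispatch` are hypothetical, and the sketch only assumes the `bork-exec` layout shown above.

```ruby
# Hypothetical jumppad sketch: resolve `bork <command>` to a `bork-<command>`
# script and hand the process over to it. Helper names are assumptions.

# Map a command name such as "add" to the path of its script.
def command_script(name, exec_dir)
  File.join(exec_dir, "bork-#{name}")
end

# Replace the current process with the command script, passing along
# the remaining arguments.
def dispatch(argv, exec_dir)
  name, *args = argv
  abort "usage: bork <command> [args...]" if name.nil?

  script = command_script(name, exec_dir)
  abort "bork: unknown command '#{name}'" unless File.executable?(script)

  # The [path, argv0] form makes Kernel#exec skip the shell entirely,
  # even when there are no extra arguments.
  exec([script, script], *args)
end

# In the jumppad itself, something like:
#   dispatch(ARGV, File.join(__dir__, "bork-exec"))
```

Because each command is just an executable on disk, running `bork-add ...` directly and running `bork add ...` end up in the same script.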

The reason this worked well is that each bork command depended on other bork commands. Most bork commands depended on bork-station to get an absolute path to the nearest bork station, the directory containing all hashed files and tags and so on that allow bork to work. It works in a fashion similar to git as well (and likely Subversion and other tools like them) in that it searches the current working directory (CWD) and all directories above the CWD for a station. Almost all commands required a bork station, so they’d run bork-station to get a path (if there was one) and do their thing. Some others depended on bork-update, such as bork-rm and bork-find, both of which modified the station and therefore depended on it being up-to-date.
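
The upward search for a station might look something like this in Ruby. It is a sketch under assumptions: the post never names the station directory, so `.bork` here is a placeholder, and `find_station` is a hypothetical helper, not something taken from bork.

```ruby
require "pathname"

# Assumed name of the station directory; the real name is not given in the post.
STATION_DIR = ".bork"

# Walk from start_dir up to the filesystem root and return the first
# station directory found as a Pathname, or nil if there is none.
def find_station(start_dir = Dir.pwd)
  Pathname.new(start_dir).expand_path.ascend do |dir|
    candidate = dir.join(STATION_DIR)
    return candidate if candidate.directory?
  end
  nil
end
```

A command would call `find_station` first and bail out if it returns `nil`, which mirrors how the shell commands ran bork-station before doing anything else.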

A second upside is that this potentially allows me to replace various commands down the road with new implementations. Ideally, I should be able to pull out an old shell script, replace it with something written in C, and call it good. Shell script isn’t particularly fast, so I was expecting this to become an issue fairly quickly, and I was right.

What I was wrong about was how easily I could replace commands with new implementations. A new implementation that doesn’t simply spawn a new process to do something and return the results to the calling command means rewriting most of bork in the new language. Having two separate implementations of a particular aspect of bork isn’t really the best idea, so the way I saw it, the choice was fairly simple: rewrite the whole thing and ditch the original structure, or put up with the less than pleasant idea of reading from and writing to pipes to other processes. The latter isn’t too difficult, but it’s also not fun, and it was unpleasant enough to justify a complete rewrite, so rewrite I did.[1]

I was more or less settled on using Ruby for the project, so that was the target language. bork’s predecessor, a series of Ruby scripts to do roughly the same thing,[2] had proven that Ruby was good for the job, so that choice wasn’t about to change. Next step: gemification (this was the branch name). bork was originally a tool to generate Makefiles, so I had a gemspec sitting around for it and recycled that. The directory structure was mostly there already, though at the time you wouldn’t have known it by looking at the git repo.

I kept the jumppad script, which had always been written in Ruby for convenience (its Kernel#exec provided an easy way to jump from bork to the command script). It was heavily refactored, but for the most part you can see a lot of the same code in the same places. I moved command scripts into individual Ruby files under a Commands module, each registering itself with the default Bork::Hub instance, which collected command classes for the jumppad and allowed commands to run other commands (provided they’d been loaded).
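
A rough sketch of that registration scheme is below. `Bork::Hub` and the `Commands` module come from the description above; everything else (the `default`, `register`, and `run` methods, and the two toy commands) is an assumption made for illustration and may not match bork's real interface.

```ruby
module Bork
  # Collects command classes by name so the jumppad, and other commands,
  # can look them up and run them.
  class Hub
    # The shared default instance that commands register with.
    def self.default
      @default ||= new
    end

    def initialize
      @commands = {}
    end

    def register(name, command_class)
      @commands[name.to_s] = command_class
    end

    # Instantiate the named command and run it with the given arguments.
    def run(name, *args)
      command_class = @commands.fetch(name.to_s) do
        raise ArgumentError, "unknown command: #{name}"
      end
      command_class.new(self).run(*args)
    end
  end

  module Commands
    # Toy stand-in for an update command.
    class Update
      def initialize(hub)
        @hub = hub
      end

      def run(*_args)
        :updated
      end
    end
    Hub.default.register(:update, Update)

    # Toy stand-in for rm; it runs update through the hub first.
    class Rm
      def initialize(hub)
        @hub = hub
      end

      def run(*paths)
        @hub.run(:update)
        paths.map { |path| "removed #{path}" }
      end
    end
    Hub.default.register(:rm, Rm)
  end
end
```

With this shape, the jumppad only needs `Bork::Hub.default.run(ARGV.first, *ARGV.drop(1))`, and a command that needs another command goes through the same hub instead of spawning a process.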

The Bork::Station class ended up becoming the most important class, containing all methods for accessing the data contained in any given station and for manipulating those stations. I’d ordinarily feel bad about this, but stations are so small in their functionality that the class never becomes too big. If anything, some of the methods could use refactoring into smaller methods, but otherwise the conversion went fairly well. The important point, however, is that the Bork::Station class is effectively bork: everything that makes bork bork resides in it. The hub is only there for the jumppad’s sake, and the command classes simply get to a station and do stuff with it.

This means there is an opportunity to build other projects around bork’s stations. One could subclass a station and add new functionality to it, new metadata, and so on. There’s a lot of potential there, provided I refactor the hell out of some of the functions. Eventually, it would be nice to split the station into an interface and a backend. Either that or simply define a base class for stations to implement and let them go from there. Who knows what maddening things I could do there.
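
To make the interface/backend idea a little more concrete, here is one way such a split could look. None of this is bork's code: the `MemoryBackend` class, the method names, the choice of SHA-256, and hashing a string instead of a file on disk are all assumptions chosen to keep the sketch self-contained.

```ruby
require "digest"

module Bork
  # Hypothetical backend that keeps the hash -> tags mapping in memory.
  # A filesystem or SQLite backend would expose the same two methods.
  class MemoryBackend
    def initialize
      @tags = {}
    end

    def add_tag(hash, tag)
      @tags[hash] = (@tags[hash] || []) | [tag]
    end

    def tags_for(hash)
      @tags.fetch(hash, []).dup
    end
  end

  # The station is the interface: it hashes content and delegates
  # storage to whichever backend it was given.
  class Station
    def initialize(backend)
      @backend = backend
    end

    def tag(contents, tag)
      @backend.add_tag(digest(contents), tag)
    end

    def tags(contents)
      @backend.tags_for(digest(contents))
    end

    private

    # SHA-256 is an assumption; the post does not say which hash bork uses.
    def digest(contents)
      Digest::SHA256.hexdigest(contents)
    end
  end
end
```

A subclass could then add metadata or new queries without caring where the tags are actually stored.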

So, that’s bork’s shell to Ruby conversion in short. I have to say, it was a lot of fun writing bork, so I’m glad Ruby makes this stuff so pleasant.

  1. There were probably other options I didn’t consider or ignored, but these were the biggest ones on the table. I treat the shell script version of bork as a prototype that got the idea working and the station structure set up, while the Ruby implementation is the current end goal (until Ruby is too slow).

  2. Long lost to accidental deletion. This was when I didn’t use version control (git wasn’t terribly accessible yet and Subversion was a huge pain in the ass to set up) and kept all my neat toys in some project directory. These days, I’m a little more careful, though I admit to sometimes using rm -rf to wipe out entire directories without thinking.