
August 2010


I do a lot of programming, in a lot of different languages. Each language puts me in a particular mindset: Python and Objective-C are for games, Ruby and C are for research, PHP is for webdev. Within each mindset, though, there’s a fair amount of blending, and I can’t tell you how often I wish I were writing in Ruby when I find myself writing in C.

My usual workflow for research-oriented coding involves prototyping a tool or model in Ruby (fast to write, easy to understand), then porting the final, working version into C for use with real data (my models are usually trained on massive statistical corpora, something Ruby is, unfortunately, ill-equipped to handle). This is all well and good, but I always miss the usability that I can so easily sprinkle into the Ruby version. OptionParser is one such gem; Ruby/ProgressBar is another.

In a moment of frustration, I re-implemented Ruby/ProgressBar in pure, unadulterated C99. If you’re interested in that sort of thing, you can grab a copy from my GitHub repository:

git clone git@github.com:doches/progressbar.git

For comparison, here is how you use a progressbar in Ruby:

require 'rubygems'
require 'progressbar'

progress = ProgressBar.new("Loading",100)
(0..99).each do |i|
  # Do some stuff
  progress.inc
end
progress.finish

…and in C:

#include "progressbar.h"

progressbar *progress = progressbar_new("Loading", 100);
for (int i = 0; i < 100; i++) {
  // Do some stuff
  progressbar_inc(progress);
}
// Mark the bar as complete
progressbar_finish(progress);

More examples, including custom formatting and indeterminate progress, are in test/progressbar_demo.c.
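
If you are curious what a bar like this does under the hood, the core trick is small: build one line of text and redraw it in place with a carriage return. What follows is a minimal sketch of that idea, not the code from the repository. The names render_bar and draw_bar are made up for illustration, and the layout (label, percentage, a run of '#' and '-') is an assumption rather than the library's actual format.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of progressbar.h): format one frame of a
 * text progress bar, e.g. "Loading:  50% |#####-----|", into `out`.
 * Returns the number of characters written, or -1 if `out` is too small. */
int render_bar(char *out, size_t out_size, const char *label,
               unsigned long value, unsigned long max, int width)
{
    if (max == 0) max = 1;            /* avoid dividing by zero */
    if (value > max) value = max;     /* clamp overshoot to 100% */
    if (width < 0) width = 0;

    int percent = (int)(value * 100 / max);
    int filled = (int)(value * (unsigned long)width / max);

    int n = snprintf(out, out_size, "%s: %3d%% |", label, percent);
    /* Need room for the prefix, `width` cells, the closing '|' and '\0'. */
    if (n < 0 || (size_t)n + (size_t)width + 2 > out_size) return -1;

    memset(out + n, '#', (size_t)filled);
    memset(out + n + filled, '-', (size_t)(width - filled));
    out[n + width] = '|';
    out[n + width + 1] = '\0';
    return n + width + 1;
}

/* Hypothetical helper: redraw the bar on the current terminal line.
 * The leading '\r' returns the cursor to column 0 so that each call
 * overwrites the previous frame instead of printing a new line. */
void draw_bar(FILE *stream, const char *label,
              unsigned long value, unsigned long max)
{
    char line[128];
    if (render_bar(line, sizeof line, label, value, max, 20) > 0) {
        fprintf(stream, "\r%s", line);
        fflush(stream);               /* no newline, so flush by hand */
    }
}
```

Calling draw_bar(stderr, "Loading", i + 1, 100) at the end of each loop iteration, followed by a single newline once the loop is done, gives roughly the same effect as the snippets above. Keeping the formatting in render_bar, separate from the printing, makes the layout easy to check without a terminal.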

CogSci 2010

I just got back from CogSci 2010, to which I had successfully submitted a paper, “Meaning Representation in Natural Language Categorization.” Unfortunately, I wasn’t invited to present it as a talk — but on the upside, I was invited to present it as a poster. As a consequence, I may now write the following sentence, of which I am more proud than practically anything I have done in my life to date:

Fountain, T. & Lapata, M. (2010). Meaning Representation in Natural Language Categorization. In S. Ohlsson & R. Catrambone (Eds.), Proceedings of the 32nd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.


The work I presented deals with whether corpus co-occurrence can be used as a stand-in for norming data, at least in the context of a categorization task. As part of that work I collected a rather large amount of data, which I am now making publicly available. The dataset extends the McRae et al. feature norms by grouping the words into forty-one categories, and includes norming data, integrated into the standard McRae features, for each of the newly added category labels.