Eric Warmenhoven

Monday, March 31, 2008

Rewriting as a learning tool

5:26 pm

A year and a half ago, I mentioned I was trying to learn Haskell, and I still am. It’s been going much more slowly than I’d hoped, since I haven’t really spent any time on it. I never write anything new anymore! So I decided to rewrite something I had already written, need be damned, and I chose harsh, the RSS aggregator I wrote.

harsh hasn’t changed in eleven months (and that change was just about making the location of its configuration file configurable), but I still remembered pretty well how it worked; at least well enough that I didn’t have to go searching through its code too much. That’s probably also a function of how small harsh is; in terms of lines of code (excluding blank lines):

  config.c: 139
  cookie.c: 221
 display.c: 709
    feed.c: 577
    list.c: 96
    list.h: 16
    main.c: 96
    main.h: 84
     md5.c: 360
     md5.h: 76
     rss.c: 120
     xml.c: 276
     xml.h: 17
     total: 2787

It uses expat to parse the feed XML, libnbio for socket management (which is available in Debian), and ncurses for the UI. It doesn’t have any sort of threading (libnbio does a good job of making sure that, other than DNS lookups, there’s never anything going on long enough to prevent responsiveness). About its only feature is that it will use my cookies.txt file, so that I can see my LiveJournal friends’ protected entries.
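To make the cookies.txt feature concrete, here is a minimal Haskell sketch of the idea. It is not harsh's actual code. It assumes the usual Netscape cookies.txt layout of seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry, name, value), treats lines starting with `#` as comments, and ignores path and expiry when matching. The names `Cookie`, `parseCookieLine`, and `cookieHeaderFor` are made up for this example.

```haskell
import Data.List (intercalate, isSuffixOf)

-- | One cookie from a Netscape-format cookies.txt file (assumed layout).
data Cookie = Cookie
  { cookieDomain :: String
  , cookiePath   :: String
  , cookieSecure :: Bool
  , cookieName   :: String
  , cookieValue  :: String
  } deriving (Show, Eq)

-- | Split a string on a delimiter, keeping empty fields.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOn d rest

-- | Parse one line. Comments, blank lines, and malformed lines give Nothing.
parseCookieLine :: String -> Maybe Cookie
parseCookieLine line =
  case splitOn '\t' (filter (/= '\r') line) of
    [domain, _subdomains, path, secure, _expiry, name, value]
      | take 1 domain /= "#" ->
          Just (Cookie domain path (secure == "TRUE") name value)
    _ -> Nothing

-- | Build the value of a Cookie request header for a host.
-- Simplification: only the domain is matched; path and expiry are ignored.
cookieHeaderFor :: String -> [Cookie] -> String
cookieHeaderFor host cookies =
  intercalate "; "
    [ cookieName c ++ "=" ++ cookieValue c
    | c <- cookies
    , cookieDomain c `matches` host
    ]
  where
    -- A leading dot means "this domain and its subdomains".
    matches dom h =
      dom == h || (take 1 dom == "." && (dom `isSuffixOf` h || drop 1 dom == h))
```

With `Data.Maybe.mapMaybe parseCookieLine (lines contents)` over the file contents, the resulting list can be passed to `cookieHeaderFor` for each feed's host before sending the request.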

I originally wrote harsh in about a day or two. It was really easy because, of the 2800 lines, about 850 had already been written for some of my other projects (list.c/h and xml.c/h) or are standard (md5.c/h). I was also really familiar with the three helper libraries from writing grim (my IM client) and stark (a tool for viewing GnuCash data files).

As a learning exercise, rewriting harsh in Haskell was excellent. It’s incredibly small, does a lot of standard things (like networking and console UI), and doesn’t do a lot of non-standard things (like an AIM client does). I got to play with the Haskell light-weight threads and STM; I learned how to create a Debian Haskell package; I learned how to use ghci as a debugger with breakpoints; and I’m much more comfortable with monads and with the language in general.
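For readers who haven't seen Haskell's light-weight threads and STM together, the following is a small sketch of the general pattern an aggregator could use: one thread per feed, with results collected in a shared `TVar`. It is an illustration under assumptions, not how the Haskell harsh is actually structured. `fetchAll` and its `fetch` argument are hypothetical names, and the sketch uses only `forkIO` from base and the stm package that ships with GHC.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (atomically, check, modifyTVar', newTVarIO, readTVar)
import Control.Monad (forM_)
import qualified Data.Map as Map
import qualified Data.Set as Set

-- | Run one action per key, each in its own light-weight thread, and
-- block until every thread has stored its result.
-- Limitation of this sketch: if a fetch throws, its thread dies without
-- storing anything and the final wait never completes.
fetchAll :: Ord k => (k -> IO v) -> [k] -> IO (Map.Map k v)
fetchAll fetch keys = do
  results <- newTVarIO Map.empty
  let distinct = Set.toList (Set.fromList keys)
  forM_ distinct $ \k -> forkIO $ do
    v <- fetch k
    -- Each update is one atomic transaction, so no explicit lock is needed.
    atomically $ modifyTVar' results (Map.insert k v)
  atomically $ do
    m <- readTVar results
    -- 'check' retries the transaction until the condition holds.
    check (Map.size m == length distinct)
    return m
```

In an interactive program the UI thread would more likely read the `TVar` whenever it redraws rather than block for all results; the blocking wait here just keeps the example short.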

It did take me significantly longer to write than the C version, though that’s more due to my having to learn not just the language but also some libraries along the way (like HTTP, Vty, and HaXml). I still think that it would take me just as long to write the Haskell version as the C version, but for now I’m much more comfortable with C, and with imperative programming in general.

From a code size perspective, the Haskell version is about one-sixth the size (excluding comments):

Config.lhs: 53
  Feed.lhs: 144
 Harsh.lhs: 272
  Util.lhs: 26
     total: 495

However, that’s not a very fair comparison. In the C version, md5.c/h are included in the total count, when really they should be considered a standard library (and on the Haskell side, I used Data.Hash.MD5 from MissingH). On the C side, I did all of the HTTP request and response processing myself (which is what more than half of feed.c is about), while on the Haskell side I left that to an HTTP client library. Excluding all those things though, the Haskell version is still about one-fourth the size.

Anyway, I put the Haskell version of harsh up here. If any Haskell hackers out there could take a look at it and let me know what I’m doing wrong or oddly, I’d appreciate it.

3 Responses to “Rewriting as a learning tool”

  1. Rob Flynn Says:

    EAW -

    Where you livin’ these days?

    - RMF

  2. Lance Rocker Says:

    +1, and I’ll even add:

    What ‘chew been up to? (besides not learning Haskell very quickly)

    -LDR
