A good friend and colleague, Mark Rich, restated a question I’ve wondered about (and heard) a number of times w.r.t. blogging about something like a thesis on the WWW. He said:

I really like your blog idea for a research journal; I’d just be wary of having strangers read my ideas and scoop them, but I’d like to start one up this summer, maybe just viewable from here at UW.

OK. I personally think a weblog is a fundamentally public enterprise; it’s the only way people can possibly read what you have to say (now or 10 years from now… I’m generally scared by Google and the Wayback Machine). Like any kind of communication, if it isn’t two (or n) way, it just isn’t communication. It’s … writing on a webpage; so what?

Now, if the ideas are out there for people to talk about, that’s good. If someone steals those ideas, it’s bad. But, if they steal the ideas, then the weblog itself is proof of the time you’ve spent in a given space. Furthermore, consider the average PhD lifespan, and what takes place during it:

  1. LOST: You start out lost. You have no ideas worth stealing.
  2. REVIEW: You’re reading and extrapolating. The ideas you’re working with aren’t yours anyway.
  3. FORMATION: You’re playing with ideas that are on the edge; they’re standing on the toes of other people’s work.
  4. DEVELOPMENT: You’ve got an idea, and you’re working on it.
  5. IMPLEMENTATION: Your idea is being made concrete through research and implementation as software.
  6. PRESENTATION: Things are written up, and you’re done.

Of course, there’s an important point in there: you’re publishing all the way through steps 2, 3, and 4.

  • During the formative stages, they’re work-in-progress papers and technical reports. Perhaps you get one off to a bigger conference by piggybacking on your supervisor, or a more advanced (aged, wizened, encrusted) student in the group.
  • During your development stages, you’re doing prototype work and pilot studies; here, you’re definitely in the WIP category, and perhaps even publishing some small findings.
  • During implementation, you might not actually publish a lot, but it’s during this time that you’re doing the hard work for the dissertation. And, during the final stage (again) probably not a lot of publication, but you are writing up The Document, trying to get done.

If you’re doing it the way most students do, you’re laying down markers in the field all throughout your PhD. During the work itself, someone else would have to engage in the same hard work to scoop you; not incredibly likely. And likewise with writing up; what are they going to do, take an idea and do the dissertation faster? Unlikely.

I think Mark’s concerns raise at least two questions: one regarding the nature of publication, and one regarding the ramifications of plagiarism of ideas where weblogs are involved.

My weblog is a place where I put ideas down in writing. In the future, I may put small, short articles on the blog that I will later cite in papers; why not? By referencing it, if the paper is peer-reviewed and accepted in a conference, then the reference is likewise ratified. (Perhaps not, but it’s an interesting notion.) Furthermore, if someone else comes along and plunks down an exact copy of my ideas at a conference, then I can ask to see the history of these ideas. If they don’t have a timestamped logbook like mine (see the right-hand side of this page), then I’d say there are problems for them, professionally.

This is where the issue of plagiarism comes in: I just don’t see it happening. Or, I’m willing to assume that things like tech reports and conference papers are just as fair-game for this kind of “scooping” as my weblog. There really isn’t a difference: if someone is going to be so low as to attempt to base their academic career on the theft of someone else’s ideas, then their profound lack of respect for the system will come back to haunt them. We’ve seen it happen plenty of times.

That’s my 2p. Then again, I’m in the pool splashing around, so perhaps it’s expected that I’d take that view.

Alright. Enough thinking-poo. I’m off to see X-men 2 tonight. :)

Two important discoveries today. Well, three. Maybe four.

  1. There is a racquetball court near me. Well, a 40 minute drive. Mote Squash and Racketball Club is probably going to be my best option. Damn. And my racket is in storage in Ohio.
  2. Aquatomic. This is a sweet little game. Originally written for the Amiga (Atomix by Thalion Games), it was rewritten for Linux, and then ported to OSX.
  3. The levels for Aquatomic are little text files. So, I can create my own! This is cool.
  4. bloghosts.com looks like a very nice little hosting provider. I’m currently hosting some domains with digitalspace.net, and am running out of disk space. The prices are better for more storage with bloghosts, and my interactions via email so far are very encouraging. This is a Good Thing™. Perhaps I’ll move things around? Who knows.

The first discovery is especially cool because it would be fun to show people the sport. As much as I enjoy squash (it is a hard game, and seriously rewards good technique, more so, I think, than racquetball), I miss the “oh-my-god-I-wish-I-could-switch-into-bullet-time” fast-paced play of a racquetball match.

Everything else is just what I stumbled on during a day when I simply cannot get motivated.

My thesis has changed. Actually, it really changed a few weeks ago, and I’ve been thinking and talking to people about the new topic. The umbrella question is simple:

Why do we compile?

No, it isn’t because we want to produce linkable object code. It’s because we want to know if our most recent additions to the code are syntactically correct. It’s because we want to see if the new changes pass the unit tests. It’s because we think it’s cool to see our creation come to life.

Of course, those are just ideas. Hypotheses, if you will. And the question, as it stands, is over-broad. More specifically, I think I want to answer a question along the lines of:

Can the systematic study and analysis of a novice programmer’s interaction with the compilation environment be used to inform programming instruction?

And (some of?) a novice programmer’s reasons for compiling must be very different from mine. How they interact with their tools, the process they go through… these things are the observable parts of the programming practice, and something we can use to get a foothold on what’s going on in the novice’s mind when they are learning to program.

That’s the background. In thinking about the problem, I’ve often come back to a statement I would make before I started working with the Mindstorms in the classroom: “Kids these days just hit compile without thinking.” Of all places, I found it again in the literature review I’m doing regarding the use of LEGO Mindstorms in the classroom. Right now I’m reading the paper “Teaching Design and Project Management with LEGO RCX Robots” by Ursula Wolz. The first paragraph of the intro is interesting for two reasons:


Old programmers (those with more than 10 years experience in the field) will complain that young programmers are far too dependent upon technology to complete their task. Modern Integrated Programming Environments (IDEs) provide so much support that students often act before they think. A common complaint voiced by students at our institution is "but I spent HOURS trying to debug it." The notion that systematic problem solving strategies might offset wasted effort falls on deaf ears when students can mindlessly compile and re-run code.

(Ursula has not placed a version of this paper on her homepage, hence the ACM DL link.)

I’ve emphasized two chunks of this paragraph, each for different (but related) reasons:

[ So much support... act before they think. ] I’ve never heard of a programming environment that provides too much support (the implication here). Besides, it is a massive claim to state that IDEs have somehow caused novice programmers to act before they think; perhaps it is that they’re now able to act and think (explore, if you prefer) in ways they never could before.

[ ... mindlessly compile and re-run code. ] This is incredibly insulting to the students. To claim that any of them do something mindlessly is demeaning. How would one of the author’s students feel if they read that paragraph? I try hard not to be dismissive of people working hard to learn hard things. To use it as a premise for using LEGO in the classroom upsets me in rather fundamental ways.

Computers used to be big, slow, and very very expensive. To program them required special equipment, and you would typically use punch cards or punched tape to feed your program or data into the computer. Ursula is probably on the trailing edge of this group–but her teachers absolutely were punch-card programmers. I can only begin to imagine how different the practice of programming was on a shared machine where print actually meant print something on a paper tape. $1000 buys… hell, $500 today buys a 2.8 GHz system with 256 MB of RAM and a 20 GB drive (sans monitor). Put another way, I can casually put five hundred times more computational power on my desktop than a CRAY-1, the world’s first “supercomputer,” built in 1976.

This kind of power cannot help but change the way we work with the machine at some level. With respect to travel, the car freed the individual: distances shrank. The phone changed the way we communicate. Likewise, access to powerful computation necessarily changes the way we work in subtle and (sometimes) obvious ways. Why shouldn’t the way we program change when we can compile 100,000 lines of code in mere seconds?

It is in that comment about mindless compilation and re-running that there is a massive lack of understanding regarding student programming behavior. It is an assumption that reaching early and often for the “compile button” is mindless, and an assumption that it is bad. (Yes, I know many of them really are thrashing, but I’m overstating this to make a point.) The simple fact is that we don’t really know what is going on in a novice’s mind while they are programming, what happens between compiles, and why they choose to hit the compile button when they do.

That hole in our understanding is where I’ll park my thesis, and my dissertation will come out of a systematic empirical investigation into student compilation behavior. Barring, of course, someone else having already done it… :|
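
To make “what happens between compiles” a little more concrete, here is a minimal sketch of one thing such an investigation might measure: the time that passes between successive compile events in a single session. Everything in it is an assumption made for illustration. The function name gaps_between_compiles, the idea of logging one timestamp per press of the compile button, and the sample session are all made up, not taken from any real tool or dataset.

```python
from datetime import datetime
from statistics import median


def gaps_between_compiles(timestamps):
    """Return the seconds elapsed between consecutive compile events.

    `timestamps` is a list of ISO 8601 strings, one per compile
    (a hypothetical log format, assumed for this sketch).
    """
    times = sorted(datetime.fromisoformat(t) for t in timestamps)
    return [
        (later - earlier).total_seconds()
        for earlier, later in zip(times, times[1:])
    ]


# A made-up session: three compiles in quick succession, then a long pause.
session = [
    "2003-05-02T14:00:00",
    "2003-05-02T14:00:20",
    "2003-05-02T14:00:35",
    "2003-05-02T14:06:35",
]

gaps = gaps_between_compiles(session)
print(gaps)          # [20.0, 15.0, 360.0]
print(median(gaps))  # 20.0
```

A run of short gaps might be thrashing, or it might be rapid and perfectly sensible exploration; the numbers alone can’t say which. That is rather the point: the timing is observable, the reason behind it is not, and connecting the two is the hard part.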