Weeks before leaving Bloomington, Carrie and I had the good fortune to meet Sian James and her husband, who were traveling through the US on tour. Her performance—solo harp and voice—held a theatre of people spellbound for an hour. She was wonderful.

Some months ago, I wrote to her and her husband, pointing them at the CD Baby electronic distribution service. CD Baby provides a good deal to artists, getting their work distributed digitally while letting them retain the majority of the profit. As of today, they report these numbers:

  • 5,578 artists sell their CD at CD Baby.
  • 788,199 CDs sold online to customers.
  • $6,374,309.38 paid to artists.

They’ve got a whoopie chart that warms my heart; some serious cash continues to be paid out to artists. I was particularly amazed that everyone who signed up got their $40 back, because CD Baby couldn’t fully deliver on its distribution promises in November. By all accounts, a first-class service.

Anyway. I was happy to find the Sian James CD Baby storefront. If you like good music, and you like Celtic and Welsh music, by all means, go pick up Sian’s CD Pur.

(This came up because of a completely unrelated brainstorm based on browsing the Arts Council UK website. So I took it as an opportunity to promote good digital distribution channels and a fine musician.)

This term, I’ve picked up some teaching on CO137/138, a 16-week Introduction to Java that is part of our certificate course in IT for non-traditional students.

It’s not easy.

Or, perhaps it shouldn’t ever be easy.

I’ve never actually taught an introduction to Java; I’ve taught introductions to Scheme, and I’ve tutored more than one student rather extensively in Java. In working with people coming to computers later in their careers (as opposed to being raised with them), it’s amazing how many concepts are foreign or otherwise new to them. This relates, in some ways, to what I was saying in the past two posts (one and two) about needing a conceptual roadmap for teaching introductory programming; I walked into an example of this last week.

One of my students asked me whether it mattered what order their code was in. However, their question was hesitantly asked, and they asked it more than one way when I probed what they were trying to get at. They were, in some ways, asking two questions: does the order the code is organized in matter, and does the order the code is executed in matter?

We took fifteen minutes to explore what happens when they right-click a class in BlueJ and select “new Foo()” from the popup menu. We developed a flowchart as a class to depict a simple execution model within BlueJ. And then we talked through an example where a Student can depositMoney() or giveAllMyMoneyToCharity(); depending on what order these events take place in, two different Students can have different amounts of money at the end of the day.
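The Student example can be sketched in a few lines of Java. This is only an illustration, not the code from class: the `money` field, the `getMoney()` accessor, the `int` parameter on `depositMoney`, the amount 50, and the `OrderDemo` class are all assumptions added so the sketch runs on its own.

```java
// Hypothetical Student class; field, accessor, and amounts are assumptions for illustration.
class Student {
    private int money; // whole currency units

    Student() {
        money = 0; // the constructor runs first and sets the initial state
    }

    void depositMoney(int amount) {
        money = money + amount;
    }

    void giveAllMyMoneyToCharity() {
        money = 0;
    }

    int getMoney() {
        return money;
    }
}

class OrderDemo {
    public static void main(String[] args) {
        // Same two method calls, opposite order.
        Student alice = new Student();
        alice.depositMoney(50);
        alice.giveAllMyMoneyToCharity();

        Student bob = new Student();
        bob.giveAllMyMoneyToCharity();
        bob.depositMoney(50);

        System.out.println("alice: " + alice.getMoney()); // prints "alice: 0"
        System.out.println("bob: " + bob.getMoney());     // prints "bob: 50"
    }
}
```

Swapping the order of the two method definitions inside `Student` changes nothing; swapping the order of the two calls changes the result. Those are the two questions the student was asking.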

[Figure: task-analysis-040210. HTA, ver. 001.]

So, it seems like students need to understand some sort of execution model before they understand what their objects are doing. That model does not need to encompass all the gory details of registers, CPUs, L1 through L3 cache, etc. It does need to encompass where execution starts and how the environment their code is executed in gets initialized. Certainly, I would want my students to have a fuller picture of what is happening in the computer someday. But not now. For the moment, I want them to know that the constructor is called first, it initializes the state of the object, and then methods are called in some sequence, and the order the methods are called matters a great deal.

Seem obvious? If it’s so obvious, why didn’t I start the term going over things like this? If it’s necessary, why was it left out of the course text—because it is presumed to be prerequisite knowledge? Perhaps it’s assumed that I’ll fill in the details? Regardless, it’s the kind of prerequisite conceptual knowledge that a hierarchical task analysis would have helped me to identify.

Education is a complex system, and improving it involves a systematic approach. Assumptions won’t help. I want a roadmap, even if it starts out as a sketch on a napkin—even if I build it slowly, one mistake at a time.

Papers to return to, just because. CiteSeer really is an excellent resource.

@misc{ upchurch99reflective,
  author = "R. Upchurch and J. Sims-Knight",
  title = "Reflective Essays in Software Engineering",
  text = "Upchurch, R., & Sims-Knight, J.
    Reflective Essays in Software Engineering,
    Frontiers in Education Conference, San Juan, Puerto Rico, November 10-13,
    1999.",
  year = "1999",
  url = "citeseer.nj.nec.com/upchurch99reflective.html" }

@misc{ sims-knight-acquisition,
  author = "Judith E. Sims-Knight and Richard L. Upchurch",
  title = "The Acquisition of Expertise in Software Engineering Education",
  url = "citeseer.nj.nec.com/270482.html" }

@techreport{ pirolli94learning,
    author = "Margaret Recker and Peter Pirolli",
    title = "Learning Strategies and Transfer in the Domain of Programming",
    year = "1994",
    url = "citeseer.nj.nec.com/pirolli94learning.html" }

I aimed to keep this piece short. However, in doing so, I didn’t hit important details in the Grand Challenges in Computer Science Education call. In particular,

The submission should focus on a precise description of the problem, and of the perceived benefits that a solution would bring to the community.

I didn’t go into great depth describing the problem, nor did I really address the benefits of solving this problem. My primary goal was to keep my expression of this challenge short, but in doing so I think I assumed too much on the part of my readers. I suspect I fell into a trap others will as well: I didn’t clearly differentiate between the challenge of teaching programming and the challenge of teaching computer science. This, however, may be forgivable.

I claimed we needed to carry out a hierarchical task analysis of introductory programming. Such an analysis would provide a road map through the concepts required to write programs at the novice level. To make this a bit more concrete, we might wonder what we would gain from a hierarchical task analysis of, say, assignment. From the hyperdictionary, assignment is

Storing the value of an expression in a variable. This is commonly written in the form “v = e”. In Algol the assignment operator was “:=” (pronounced “becomes”) to avoid mathematicians’ qualms about writing statements like x = x+1.

A hierarchical task analysis of assignment would give a conceptual road map that takes us from basic building blocks (things we assume all students know) through to the ability to write code like

x = 3
or
x = x + 1

Right from the start, I can say that variables are troublesome things; Jorma Sajaniemi has done some excellent work exploring variable roles in novice programs (PPIG 2002 paper, PDF). Sajaniemi identifies common usages of variables in novice (procedural) programs: temporaries, followers, one-way flags, gatherers… all very different ways to use variables in a program. They all involve assignment of the values of expressions to variables, and use the results of that assignment in very different ways.
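To make four of those roles concrete, here is a small Java sketch in which every labeled variable is updated by plain assignment but does a different job. The `VariableRolesDemo` class, the `summarize` method, and the sample readings are hypothetical, and the role labels in the comments are my reading of Sajaniemi's categories rather than an example taken from the paper. Variables playing roles outside the four named above are left unlabeled.

```java
class VariableRolesDemo {
    // Hypothetical helper: summarizes a non-empty array of readings.
    static String summarize(int[] readings) {
        int total = 0;               // gatherer: accumulates every value seen
        int totalChange = 0;         // gatherer: accumulates the step-to-step changes
        boolean sawNegative = false; // one-way flag: once true, never reset
        int previous = readings[0];  // follower: takes the value `current` held one step ago

        for (int current : readings) {
            int change = Math.abs(current - previous); // temporary: needed only for the next line
            totalChange = totalChange + change;
            total = total + current;
            if (current < 0) {
                sawNegative = true;
            }
            previous = current;
        }
        return "total=" + total + " totalChange=" + totalChange + " sawNegative=" + sawNegative;
    }

    public static void main(String[] args) {
        // prints "total=10 totalChange=15 sawNegative=true"
        System.out.println(summarize(new int[] { 3, 5, -2, 4 }));
    }
}
```

Each of these lines has the same "v = e" shape, which is part of why assignment alone says so little about what a novice needs to understand.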

So what does it take to understand assignment? We need to understand … a notion of containment? Side effects? State? Do we need to understand the underlying machine model? Is it more important to understand the uses to which assignment may be put? Where does it end? Where do we begin?

In Physics, I like to think I was taught a well-structured series of lies—fibs, maybe—where the more subtle truths were revealed to me as I progressed from Intro, to Waves, to Classical Mechanics, E & M, and then Quantum. At each stage, I learned more of the model, and I can see where abstractions and simplifications were made for pedagogical purposes.

So what series of abstractions is appropriate for programming instruction and, more broadly, computing? How should we structure programming instruction so that our students, when they get done, can clearly see why they learned what they did, when they did? Should we be looking to functional languages? Object-oriented languages? Should we be looking to event-based, massively parallel, or media-driven paradigms? There is lots of good work by lots of good people out there, and there may be more than one ordering that gets us to where we want our students to be. But from conversations with people in these communities, I know there are a lot of closed minds; too many people think they’re right and everyone else is wrong.

Even one clear, pedagogic ordering of concepts required for teaching any kind of introductory programming would provide an excellent starting point for analyzing other instructional approaches. It would eliminate the need to argue about functional-first or objects-first, and instead let us wonder how these different paradigms relate to or differ from each other in meaningful ways. I’m sure the holy war of first language choice would persist (there’s a lot of money tied up in that holy war), but the side effects of finding a starting place would be amazing. We could lay the foundations for computing at a much earlier age, much like we lay the foundations for trigonometry and the algebraic manipulation of mathematical structures long before we actually teach children trig or algebra. Pedagogical programming environments for pre-programmers could be built: environments that would introduce core concepts necessary for learning to program, design DFAs, prove language equivalences, write operating systems, structure queries, and all the other things we do as computer scientists. We could better structure instruction for physicists, chemists, biologists, and other scientists who are eager to learn to use the computer to further their own science, while maintaining some semblance of good instructional practice.