Exploratory vs production quality software

Over at RealClimate I was prompted to add a comment in the discussion on software and data archiving:

I’m finding the discussion here reminiscent of my own career - 5 years as a postdoc mostly running computational codes of one sort or another, added to my graduate degree in physics, left me with somewhere around 50,000 lines of code I had either written or heavily modified for my purposes (mostly C, some Fortran, some Perl - this was 15 years ago). A few bits and pieces were original and I put some effort in to make them shareable - graphics and PostScript creation, a multi-dimensional function integrator, etc. A few were done as part of much larger projects and at least ended up under proper revision control as a contribution to that project (that was my intro to CVS). But most were one-off things that tested some hypothesis, interpreted some data file, or were some sort of attempt at analysis. 90% of the time they weren’t much use afterwards, and spending extra time documenting them would have seemed pretty worthless - I used “grep” a lot to find things later. Sure they could have been made public, but nobody would have any idea what command-line arguments I’d used or the processing steps I’d taken, except in those rare instances where I anticipated my own reuse and created an explanatory “README”. It would probably be simpler for another scientist to just do it over from scratch than to try to figure out what I’d done from looking at the code.

And now I’m a professional software developer in a group where we have quite rigorous test and development procedures, everything is checked into a version control system and regularly built and run against regression tests to keep things robust. Nevertheless, I still have a directory with hundreds of one-off scripts that fit in that same category of being easier to rewrite than to generalize, and there’s little purpose in making them publicly available or putting them under version control since at most I’ll use them as starting points for other scripts rather than re-using as they are in any significant way.

I’m not sure whether it was Fred Brooks or somebody else, but the expression I recall reading long ago was that turning a prototype into an internal software product took roughly a factor of 3 more effort, and turning an internal product into something you could publicly distribute (or sell) took roughly another factor of 3 beyond that. Software always falls somewhere along this spectrum, and most of what scientists use tends to be at the “prototype” level, simply because of the exploratory nature of science. Theoretically it would be nice to have the resources to keep everything clean and nicely polished, but if 90% of it is code you’re never going to re-use, what’s the point?

As a specific example of exploratory prototype-level software I worked on as a postdoc (in Indiana!), I remember my preliminary work on this paper I published in the Journal of Mathematical Physics on one asymptotic form for Laguerre polynomials. As I recall, I started by examining the zeros, trying to find an expression for the location of the zeros of the polynomials in the limit when all three parameters are large. That involved an iterated series of short C programs, each run just a few times, with output to data files of differences, which I then graphed and looked at trying to spot patterns. At some point I made a guess that was extremely close - and then I had to backtrack mathematically and figure out why my guess worked. Nowhere in the paper is there any mention, or dependence on, the software I wrote, yet it was critical in formulating my intuition about the problem, and leading me to the accurate (and rather complex) approximation I ended up publishing.
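The sort of throwaway exploration described above can be sketched briefly. This is a hypothetical reconstruction, not the actual code from that work (the originals were short C programs, and the asymptotic formula from the paper is not reproduced here): it evaluates generalized Laguerre polynomials via the standard three-term recurrence, brackets zeros by scanning a grid for sign changes, refines them by bisection, and dumps them for graphing. The degrees and parameters chosen are arbitrary illustrative values.

```python
# Exploratory sketch: locate the zeros of generalized Laguerre polynomials
# L_n^(alpha)(x) by scanning for sign changes, then refining by bisection.
# The (n, alpha) pairs and grid resolution are arbitrary illustrative choices.

def laguerre(n, alpha, x):
    """Evaluate L_n^(alpha)(x) via the standard three-term recurrence:
    (k+1) L_{k+1} = (2k+1+alpha-x) L_k - (k+alpha) L_{k-1}."""
    if n == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for k in range(1, n):
        prev, cur = cur, ((2*k + 1 + alpha - x) * cur - (k + alpha) * prev) / (k + 1)
    return cur

def zeros(n, alpha, lo=0.0, hi=None, steps=10000, tol=1e-12):
    """Bracket sign changes on a uniform grid, then bisect each bracket."""
    if hi is None:
        hi = 4*n + 2*alpha + 2   # classical upper bound on the largest zero
    roots = []
    h = (hi - lo) / steps
    a, fa = lo, laguerre(n, alpha, lo)
    for i in range(1, steps + 1):
        b = lo + i*h
        fb = laguerre(n, alpha, b)
        if fa * fb < 0:          # sign change: a zero lies in (a, b)
            x0, x1 = a, b
            while x1 - x0 > tol:
                m = 0.5 * (x0 + x1)
                if laguerre(n, alpha, x0) * laguerre(n, alpha, m) <= 0:
                    x1 = m
                else:
                    x0 = m
            roots.append(0.5 * (x0 + x1))
        a, fa = b, fb
    return roots

if __name__ == "__main__":
    # Dump zeros for a few (n, alpha) pairs to eyeball patterns -
    # the data-file-and-graph loop described above.
    for n, alpha in [(2, 0.0), (5, 0.0), (5, 3.0)]:
        print(n, alpha, ["%.6f" % z for z in zeros(n, alpha)])
```

This is exactly the kind of program that gets run a handful of times, its output graphed and squinted at, and then abandoned once a pattern (or a guess worth backtracking on) emerges.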

The process in this and many similar examples is very far from that of writing a program from detailed specifications, validating it in some fashion, and then running it and trusting the results. It is rather an iterative process of building confidence and fitting pieces together to get a coherent picture. In some ways it is a bit like the more iterative agile methods that software gurus advocate these days, except the final product is not a software product in itself, but rather scientific understanding about the behavior of whatever system it is you are modeling.

And once you have that scientific understanding, doing anything further with the software you used to build it often seems quite beside the point.

Comments


John Mashey added a note on the RealClimate thread pointing out that it was Fred Brooks, in Figure 1.1 (the first figure of his book). The commentary accompanying the figure is quite worthwhile for clarifying what he meant (the terminology is old, but you can get the drift pretty well - and the relative-effort issues haven't changed even with modern test-first or version-control methods):

From Fred Brooks, "The Mythical Man-Month", 20th anniversary edition, chapter 1, p. 4-6, "The Tar Pit":

One occasionally reads newspaper accounts of how two programmers in a remodeled garage have built an important program that surpasses the best efforts of large teams. And every programmer is prepared to believe such tales, for he knows that he could build any program much faster than the 1000 statements/year reported for industrial teams.

Why then have not all industrial programming teams been replaced by dedicated garage duos? One must look at what is being produced.

In the upper left of Fig. 1.1 is a program. It is complete in itself, ready to be run by the author on the system on which it was developed. That is the thing commonly produced in garages, and that is the object the individual programmer uses in estimating productivity.

There are two ways a program can be converted into a more useful, but more costly, object. These two ways are represented by the boundaries in the diagram.

Moving down across the horizontal boundary, a program becomes a programming product. This is a program that can be run, tested, repaired, and extended by anybody. It is usable in many operating environments, for many sets of data. To become a generally usable programming product, a program must be written in a generalized fashion. In particular the range and form of inputs must be generalized as much as the basic algorithm will reasonably allow. Then the program must be thoroughly tested, so that it can be depended upon. This means that a substantial bank of test cases, exploring the input range and probing the boundaries, must be prepared, run, and recorded. Finally, promotion of a program to a programming product requires its thorough documentation, so that anyone may use it, fix it, and extend it. As a rule of thumb, I estimate that a programming product costs at least three times as much as a debugged program with the same function.

Moving across the vertical boundary, a program becomes a component in a programming system. This is a collection of interacting programs, coordinated in function and disciplined in format, so that the assemblage constitutes an entire facility for large tasks. To become a programming system component, a program must be written so that every input and output conforms in syntax and semantics with precisely defined interfaces. The program must also be designed so that it uses only a prescribed budget of resources - memory space, input-output devices, computer time. Finally, the program must be tested with other system components, in all expected combinations. This testing must be extensive, for the number of cases grows combinatorially. It is time-consuming, for subtle bugs arise from unexpected interactions of debugged components. A programming system component costs at least three times as much as a stand-alone program of the same function. The cost may be greater if the system has many components.

In the lower right-hand corner of Fig. 1.1 stands the programming systems product. This differs from the simple program in all of the above ways. It costs nine times as much. But it is the truly useful object, the intended product of most system programming efforts.