Bioinformatics Project Management

So far bioinformaticians have, publicly at least, focused mainly on science. Any given conference or journal is full of papers about algorithms and newly available software tools. Conspicuously absent, at least to me, is any discussion of the software development process used to turn those algorithms into those working software tools. One might argue that such discussions already occur in other software communities, so reproducing them within the bioinformatics community has little utility. Also, algorithm innovation clearly deserves plenty of attention, because it has made possible this boom era of high throughput biology.

However, I think more is going on here. Other scientific software communities show the same lack of interest in discussing software methods. From personal experience I know how helpful even simple self-reflection can be for a software project, bioinformatics or otherwise, to say nothing of applying well-known, common-sense software development practices. The day-to-day work of building software involves many of the same elements regardless of the application domain, and how you approach the work has a tremendous effect on the quality of the product and on the team’s overall sense of satisfaction. Surely, then, any team that has taken on the development of a sufficiently complex bioinformatics tool must appreciate the importance of software methods and processes. And yet no one seems to be interested in talking about them.

I’m currently writing an editorial that explores my thoughts in greater detail, but semi-briefly, my guesses are these. First, there is a disincentive, especially in the academic community, for bioinformatics scientists to learn more about project management and software methods. Scientists mainly earn their reputations by presenting novel and important results at conferences and in journals, and time spent learning other skills detracts from this prime directive.

Second, many scientists simply don’t find project management very interesting. Their interest is chiefly in making new things possible, at least in theory, through innovation. Once they’ve shown that something can be done, they move on to find the next thing. How to actually manage a team to turn these innovations into production quality applications is not an “interesting question”, to use the cliché. Innovative research is certainly deeply compelling, and the kind of thing most scientists signed up for when they went to graduate school. Actually managing the day-to-day activities of the software development lifecycle can seem uninteresting, even trivial, to an outsider.

Third, some scientists believe that project management really is trivial. The traditional approach that many researchers resort to when they need a piece of scientific software is to get a graduate student to write the program. If it’s a more complex piece of software, get two graduate students to do it. If it’s a really, really big project, then maybe they add a postdoc. This approach works well for some projects, but fails miserably for others. Usually, failure is blamed on the people involved (who sometimes do fairly bear part of the blame); the approach itself, however, rarely receives much examination. I would argue that a wildly unsuitable approach is a fairly good guarantee of failure. This oversight on the part of PIs results partly from ignorance of software development issues and partly from an assumption that mastery of their own discipline extends to mastery of others, when in fact it does not (something we are all guilty of at one time or another).

The thing is, thus far, bioinformatics has been driven by innovation, because people have concentrated on developing the algorithms that make high throughput biology possible. However, I believe we are moving into an age in which it is just as important to integrate existing algorithms into production-quality applications that can serve larger groups of biologists for years on end. Getting there will require project management know-how that has so far been largely ignored.

Finally, while it is true that the software development issues that face bioinformatics have much in common with the issues that face other types of software development, we cannot leave all discussion of bioinformatics software development issues to traditional software forums. There are things about bioinformatics software development that are unique, and we ought to provide space within the communal discourse to think them through.

But I’m not just going to talk about the problem. I’m looking for others interested in joining me to establish a conference on biomedical software development, either standalone or as part of another meeting. There would be tutorials on good software practices, papers presenting project case studies, workshops on scientific software patterns and anti-patterns, keynotes from people in bioinformatics and the traditional software industry, etc. I think I could get some good people from the traditional software industry interested, but I’ll need an interested group of bioinformatics folks to make this work. Please contact me at mmhohman@northwestern.edu if you’re interested in getting involved.
