e.printStackTrace() is not for you

March 21st, 2006

While reading through another team’s Java codebase recently, I came across a disturbing proclivity for code like this:

public SomeType aMethod() {
  SomeType result = null;
  try {
    anObject.thatDoesSomething();
    result = anObject.getSomethingElse();
  } catch (SomeTypeOfException e) {
    e.printStackTrace();
    // or, sometimes, log.error(e);
  }
  return result;
}

This is called swallowing the exception. The only way to know that an exception occurred is to have access to the stderr stream of the process (or, if logging was used, the logs). Since the software product referred to above runs inside an application server, clients never get to see this information. The method just returns null. Sometimes it’s obvious that null indicates a problem. Other times, however, the client may wrongly interpret null as the absence of information.

This brings me to what I think should be a rule for Java code: almost always, printStackTrace() is not for us to use. Its only utility is for toy programs and logging libraries. When presented with exceptions that your code cannot handle itself, you should rethrow those exceptions up the call stack, so that the server container can report problems to any client and to the application logs.
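Here is a minimal sketch of the rethrowing alternative, not a definitive implementation. It assumes the checked exception makes sense in the method's signature. The Source interface and the Propagating class are hypothetical names of mine, and IOException stands in for SomeTypeOfException.

```java
import java.io.IOException;

// Hypothetical collaborator standing in for "anObject" in the snippet above.
interface Source {
    void thatDoesSomething() throws IOException;
    String getSomethingElse();
}

class Propagating {
    // No try/catch here: the exception is declared and travels up the call
    // stack, where the container (or any other caller) can report it.
    static String aMethod(Source anObject) throws IOException {
        anObject.thatDoesSomething();
        return anObject.getSomethingElse();
    }
}
```

A caller now either gets a real result or sees the failure. There is no ambiguous null in between.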

Sometimes you have to wrap exceptions in another type of exception (e.g. a ServletException) or a custom runtime exception (we usually create a class SystemException for this), because the callbacks your server gives you do not declare that they throw the right exception type. This is one of the reasons I like using Spring MVC; the controller callbacks all declare that they throw Exception (although I’m sure most other modern Java web frameworks share this property these days).
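To illustrate the wrapping case, here is a rough sketch under a few assumptions. Runnable.run() stands in for a server callback that declares no checked exceptions. The SystemException shown is my guess at a minimal version of such a class, and ReportJob and loadReport() are hypothetical.

```java
import java.io.IOException;

// Hypothetical minimal version of a custom runtime exception.
class SystemException extends RuntimeException {
    SystemException(String message, Throwable cause) {
        super(message, cause);
    }
}

class ReportJob implements Runnable {
    // run() cannot declare IOException, so the checked exception is wrapped.
    @Override
    public void run() {
        try {
            loadReport();
        } catch (IOException e) {
            // Pass the original as the cause so its stack trace is preserved.
            throw new SystemException("Could not load report", e);
        }
    }

    // Hypothetical operation that always fails, for illustration only.
    void loadReport() throws IOException {
        throw new IOException("report store unreachable");
    }
}
```

Because the original exception travels along as the cause, whoever finally logs the SystemException still gets the full trace.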

So, remember (though most of you probably know this already): if you see e.printStackTrace() in your code, it likely means your code has a problem you need to fix. Even throwing new RuntimeException(e) is better than swallowing the exception.

Update: Reading this later, I realized I left out an important point. The above assumes you actually need the try/catch block. If SomeTypeOfException were an unchecked (i.e. runtime) exception, you could just let it bubble up the call stack. If it’s a checked exception, you have to declare it in the method signature in order to let it bubble up. This is inadvisable if the exception makes no sense for the method (e.g. declaring that an Employee’s getPhoneNumber() method throws SQLException). In this case it’s preferable to wrap the exception either in a checked exception that makes sense in the method signature or in an unchecked exception that does not need to be declared.
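For the Employee example, one possible shape looks like this. It is only a sketch: EmployeeDataException and PhoneNumberStore are hypothetical names I made up for illustration, and a real store would presumably sit on top of JDBC.

```java
import java.sql.SQLException;

// Hypothetical checked exception that makes sense in Employee's API.
class EmployeeDataException extends Exception {
    EmployeeDataException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical persistence interface that leaks SQLException.
interface PhoneNumberStore {
    String findPhoneNumber(int employeeId) throws SQLException;
}

class Employee {
    private final int id;
    private final PhoneNumberStore store;

    Employee(int id, PhoneNumberStore store) {
        this.id = id;
        this.store = store;
    }

    // The signature speaks the domain's language, not the database's.
    String getPhoneNumber() throws EmployeeDataException {
        try {
            return store.findPhoneNumber(id);
        } catch (SQLException e) {
            // Wrap rather than swallow; the SQLException stays as the cause.
            throw new EmployeeDataException(
                    "Could not load phone number for employee " + id, e);
        }
    }
}
```

If callers cannot do anything useful with the failure, extending RuntimeException instead would remove the throws clause entirely.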

A few more thoughts on communication, tech posts to come

March 21st, 2006

I discovered recently on postgenomic.com that mine is one of the wordiest life science blogs around, so I’m going to try to be a little pithier. We’ll see if I can constrain myself.

In my last post I argued for the central importance of effective collaboration and communication in biomedical informatics. I wanted to list a few things that have worked for my teams in those areas. At Northwestern we worked on two projects. Neuromice.org is a phenotype database and virtual storefront for the mutant lines produced by three neurologically-focused whole genome mutagenesis efforts at Northwestern, the Jackson Laboratory and the Tennessee Mouse Genome Consortium. The other application, MouseDB, is an intranet (i.e. you can’t see it) colony and phenotyping management system for the mice under study at Northwestern (10,000 mice/year when we were in full swing). Each project had different challenges, but here are a few things I learned from those experiences. Some are pretty standard agile ideas, others less so.

  • Each distinct customer/user subgroup should appoint a representative who speaks for that subgroup in all discussions of feature definition and priority. Keep the number of subgroups as small as possible (ideally, one). This greatly reduces the uncertainty and difficulty of scope decisions.
  • Some users in the group might have no reason to use your software. Make this fact explicit, and don’t factor their interests into the product.
  • Be completely open with your user community. Give them the opportunity to know everything you’re working on, and the reasons for (and the opportunity to contribute to) any decisions made about features going into the product.
  • A development team should avoid making any decisions about scope or feature priority. Emphasize to users that it is in their power to steer the software toward the greatest possible utility. Technical improvements are a sticking point here, but we’ve found if you make a good argument for them, users understand their value and will prioritize them appropriately.
  • If you let academics’ busy schedules eat away at your face time with them, you will eventually suffer for it. Be creative.

I think I’ve reached my word limit. Over the next couple of weeks I’m going to let loose a flurry of technical posts on various topics that have been on my mind lately.

The promise of bioinformatics

March 3rd, 2006

Now and again, you hear the concern that bioinformatics will fail to “fulfill its promise”. I find this statement to be both a bit scary and a little preposterous. Scary because the success of the field will have an effect on my own personal success. Preposterous because, well, the advantages of high throughput computation, structured biological databases, etc. are so abundantly clear, how could bioinformatics possibly fail?

There are certainly success stories. Important approaches to biological analysis in use today were not available ten years ago. I think some of the frustration arises because, in spite of these successes, some users feel that much of the biomedical software being churned out today just isn’t quite useful enough to justify the cost (in money and time) of using it. Assuming this is the case, we must then ask, why? Is biological analysis too difficult to capture in a set of machine instructions? Are bioinformaticians just a bunch of good-for-naughts?

The latter response, though intended to be humorous, is actually probably more common among biologists than we bioinformaticians might like. This answer also suggests a dichotomy too often encountered in organizations undertaking software development, that of users vs. developers, or, in the domain-specific vernacular, scientists/clinicians vs. bioinformaticians. Users, angered by the failure of software, blame developers for not working hard enough, for not listening, for being idiots (you know they think this sometimes), etc. Developers, for their part, are no more charitable. Developers blame users for not knowing what they want, for using software inconsistently, for not being able to work around seemingly trivial problems, and of course for being idiots. Much of the naturally occurring tension churned up in the process of building software finds its release in similar fits of whinging by one camp or the other. When we’re more reasonable, we are still honestly perplexed by the question, why isn’t this working out better?

In the past I’ve heard people say that bioinformaticians just need to be trained very well in both biology and computer science, that this would alleviate a lot of the problem of getting them to build biologically relevant and valuable software. This may work in some cases. A couple weeks ago I was having lunch with a biologist colleague, and he told me that I needed to learn the biology better, otherwise I would always be beholden to biologists to come up with interesting problems to work on. I see what he was getting at, but I don’t think that is the solution. The truth is both biology and software development are so complex that I don’t think it’s possible to gather into one person’s head all the expertise necessary to produce all the products that bioinformatics promises. Rather, I think the answer is better communication between biology experts and software experts.

Rather than focusing solely on algorithms and technologies, we must focus more on the people side of building biomedical software. You read this very comment in the bio-IT business literature sometimes, taken from the mouths of venture capitalists, in the form of something like “companies can no longer expect to get funding simply for having cool technology”. Their software has to solve a biologically relevant problem, i.e. it has to be useful. I am reminded of something I heard during a talk at an agile conference in New Orleans in 2003. Josh Kerievsky said, and I paraphrase, “Some think we’re in the technology business, but we’re not. We’re in the communication business”. Communicating effectively with users is surprisingly difficult to do, and requires wisdom and dedication to get right. Effective communication is a much bigger challenge than the algorithms and the technology usually are. What’s more, it’s a two-way street, and both developers and scientists have to be committed.

I think biologists and bioinformaticians want to communicate better. I think part of the problem is organizational. For instance, at Northwestern, like at any university, it’s very difficult to get good office space. We were stuck in a converted greenhouse for the three years I was in Evanston, on the top floor of the Hogan building. At first we were in the same building as many of the biologists with whom we worked, but not all of them. Over the course of the project, a nice new building opened up, and a number of them moved into the new building. We typically saw these people once a week, if that, at weekly user meetings. We tried to keep contact with them by going to see them individually on a periodic basis (although I think we probably could have done a better job at that). But these people are very busy, and it’s difficult to fit into their schedules. As my biologist colleague pointed out at lunch, it would have been ideal if we could have had shared office space, so that spontaneous discussions would have been more frequent. I think the software we built would have become more useful as a result.

There are many other organizational problems (the difficulty of funding ongoing bioinformatics groups on grants; the lack of a history of operational management positions within academic groups). I imagine some of these get easier in industry. But these are not the only problems. I also think that we don’t yet focus enough attention on communication, on doing it right, and on getting help from outside to do it right. Most of us assume that, hey, we’re smart people, we should be able to communicate. Part of the problem is we’re too smart. We’re trying to communicate complex information, information we’re used to communicating with peers within our field who usually understand us even if we’re not clear. The level of tacit knowledge in biology, like in software, is very high, and it often takes people outside the field quite a while to get a feel for a problem.

This post is long enough. We talk about some of these communication issues in the paper on agile software methods in bioinformatics that David Kane, several others, and I are trying to get published. I think agile principles help, at the very least because they get software engineers to focus less on impressing people with their bulletproof processes and more on people and on communication that works (i.e. “Individuals and interactions over processes and tools”). Although our application domain is more technical, the general software industry and its customers have been grappling with these issues for years, and there are a number of very smart and capable people out there who could really help transfect bioinformatics groups with good approaches for tackling these problems.