DTrace predicate hack

July 6th, 2008

One of the things I keep wanting in DTrace’s D language but isn’t there (right?) is a richer set of string comparison functions. Ideally I want full regular expression functionality, so that I can predicate actions on, say, regex matches of a class and/or method name. For instance, a while back, while profiling some Java, I wanted to only count time spent in methods of classes that belonged to a particular package (org/apache/solr) or to its subpackages. There is no “starts with” string operator in D. However, the following did the trick:

hotspot$target:::method-entry
/(self->class = copyinstr(arg1)) != NULL && self->class >= "org/apache/solr/" && self->class < "org/apache/sols"/
{
  /* action goes here */
}

A little ugly, but it worked. The choice of “org/apache/sols” as the upper bound was somewhat arbitrary.
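To see why the range comparison behaves like a “starts with” test, here is a minimal sketch in Ruby rather than D, assuming plain byte-wise string ordering. The helper name in_prefix_range? is made up for illustration. It computes the tightest upper bound by bumping the last byte of the prefix, which for "org/apache/solr/" gives "org/apache/solr0" (the character after '/' in ASCII is '0').

```ruby
# Hypothetical helper: emulate "starts with" using only >= and <,
# the same two comparisons the D predicate relies on.
def in_prefix_range?(str, prefix)
  # Tightest upper bound: the prefix with its last byte incremented.
  # Assumes a non-empty prefix whose last byte is not 0xFF.
  last = prefix.bytesize - 1
  upper = prefix.dup
  upper.setbyte(last, prefix.getbyte(last) + 1)
  str >= prefix && str < upper
end

puts in_prefix_range?("org/apache/solr/core/SolrCore", "org/apache/solr/")  # true
puts in_prefix_range?("org/apache/lucene/index/Term", "org/apache/solr/")   # false
puts in_prefix_range?("org/apache/solrj/SolrQuery", "org/apache/solr/")     # false
```

With the looser “org/apache/sols” bound, a name like "org/apache/solrj/SolrQuery" would also fall inside the range, which is harmless as long as no such sibling package exists in the code being profiled.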

Too many mock objects == ruby refactoring death

July 6th, 2008

It’s a question we face as test-driven Ruby programmers: Should we use mock objects or real objects in our tests?

Both approaches have trade-offs, and their biggest downsides both have to do with wasting programmer time. If you test with real objects, then your tests run slowly (especially if you use an ORM that binds your domain objects tightly to the database like ActiveRecord). Your tests hit the database, and this is slow. There are other sources of slowness, but nothing has anywhere near as great an effect as hitting the DB.

If you test with mock objects, once your app has any kind of complexity, your refactoring and test writing processes become slow. This is not immediately apparent when you start using mock objects. But as you start writing more and more code, eventually you start having to come up with a crazy number of mock expectations just to test some of your methods. It is true that this is good feedback that the class you’re testing presents too complex an interface to other collaborating objects, or that it collaborates with too many objects, etc. What starts simple will eventually become too complex, and at some point you’re going to need to refactor.

More insidious than this, however, is the effect this web of mocks you’ve wrapped yourself in has on refactoring. You don’t notice how thoroughly you’ve painted yourself into a corner until you want to refactor some ugly aspect of a core class that collaborates with many objects in your system. Suddenly all of those collaborating objects’ tests break because they expect certain method calls from this core object. These tests break because you’ve changed a method signature, a method name, or, even worse, just an implementation. No matter what those BDDers tell you, if you test with mocks you are, to too great a degree, testing implementation, not behavior (or is that behaviour?).

So, now you have to go through all of those mock-based tests and “correct” them, i.e. change their expectations so that they fit the new method name/signature/implementation. This is horrible. The whole point of tests during refactoring is to verify that your refactoring hasn’t changed the behavior of the system (that being half of the definition of refactoring). The tests should pass before you refactor, and they should pass after you refactor. Not only does this break the fundamental refactoring process, it also can take a lot of time, because you have to remember the context of each of those test cases that you have to fix.
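Here is a small sketch of the problem in plain Ruby. Everything in it is made up for illustration: StrictMock is a hand-rolled stand-in for a mocking library, and Account and Report are hypothetical classes. Account#balance_in_cents has just been renamed to balance_cents, and Report has been updated to match.

```ruby
# Hypothetical hand-rolled strict mock: it answers only the calls it was
# told to expect and raises on anything else.
class StrictMock
  def initialize(expected)
    @expected = expected # { method_name => return_value }
  end

  def method_missing(name, *args)
    return @expected[name] if @expected.key?(name)
    raise NoMethodError, "unexpected call: #{name}"
  end

  def respond_to_missing?(name, include_private = false)
    @expected.key?(name) || super
  end
end

class Account
  def initialize(cents)
    @cents = cents
  end

  # Renamed from balance_in_cents during the refactoring.
  def balance_cents
    @cents
  end
end

class Report
  def initialize(account)
    @account = account
  end

  def summary
    format("Balance: %.2f", @account.balance_cents / 100.0)
  end
end

# Test against a real object: behavior is unchanged, so nothing to fix.
real_result = Report.new(Account.new(1250)).summary

# Test written before the rename: the expectation names the old method,
# so it breaks even though Report's behavior is identical.
stale_mock = StrictMock.new(balance_in_cents: 1250)
mock_result =
  begin
    Report.new(stale_mock).summary
  rescue NoMethodError => e
    "broken: #{e.message}"
  end

puts real_result # Balance: 12.50
puts mock_result # broken: unexpected call: balance_cents
```

Multiply the stale expectation by every collaborator that mocks Account and you get the afternoon of squinting described above.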

You can do something about slow running tests that hit a database (in extreme cases you can use a faster in-memory database, or even parallelize your tests). Of course they won’t run as fast as they would if they didn’t hit a database, but in my opinion it’s something you can live with. Dav Yaginuma has a good suggestion for what to do with this time: Go write that email you need to write, go take the bathroom break, go walk around the office and stretch your legs. It’s not like that’s really wasted time. It is wasted time, however, if you’re sitting there squinting at the screen fixing all of your mock expectations. You can’t do anything else with that time.

I’m kind of half-convinced now that people who advocate the heavy use of mocking either have really nice IDEs that make fixing the expectations a breeze, or they don’t refactor. And if they’re using Ruby that means they don’t refactor. Ok, tongue out of cheek. Seriously, I’d love to hear from folks who have used and continue to use mocks heavily on long running projects, to hear how they handle the refactoring issue. I have pretty much sworn off mocks except in old-school traditional cases (“mocking out an external dependency too expensive to call directly”) because of it.

MacPorts Ruby, now with DTrace

February 21st, 2008

We are gearing up to do some profiling/performance improvement at work, and we use MacPorts (mostly at my stubborn insistence) to install Ruby on our OS X dev boxes. Unfortunately, the MacPorts version of Ruby is not DTrace-enabled, so we were faced with the decision to either go with the Apple-installed Ruby or not use DTrace.

Fortunately, there was a third option. I spent some time massaging Joyent’s Ruby DTrace patch so that it would compile with Apple’s version of DTrace (subtly different from Solaris’s), and so it would play nice with the other patches in the official Ruby MacPort distribution. Anyway, long story short: you can get it via my newest RubyForge project, rubyport-dtrace. You can install either from the tarfile or by checking out from Subversion; see the instructions in the distribution.

Why I like MacPorts: I like being able to cleanly remove software I install. I also like that I can compile Ruby with DTrace and other patches that I might want, such as the Railsbench GC patch, which I’m also working into the rubyport-dtrace (dare I call it) code. It might already work, but I haven’t tested it.

Z factor refactored

November 11th, 2007

I recently reread the original Z factor paper (Zhang et al). The Z factor is a measure of assay reliability and comes in two flavors: the Z’ factor, based entirely on controls (those with and without the desired effect); and the Z factor, based on experimental data compared with the controls that should have the desired effect.

Rereading a paper months later often makes you wonder whether you read the paper at all the first time. This reading really clarified for me what the Z factor is, that it is not just for high-throughput screening, and raised a number of questions (especially after discussion with colleagues) not addressed in the paper.

The Z factor is the ratio of the “separation band” of the data to the assay dynamic range. A picture helps:

[Figure: separation band]

where μ+ is the mean of the positive controls (in this case the controls with desired effect), μs is the mean of the data, σ+ is the standard deviation of the positive controls, and σs is the standard deviation of the data. The assay dynamic range in this diagram is μ+ - μs. The separation band is then (μ+ - μs) - (3σ+ + 3σs), and the ratio of this to the dynamic range is the Z factor = 1 - (3σ+ + 3σs)/(μ+ - μs).
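As a quick sanity check of the formula, here is a minimal Ruby sketch. The function names are hypothetical, the numbers are made up purely for illustration, and the use of the sample standard deviation (n - 1 denominator) is an assumption. It also assumes μ+ > μs, as in the diagram.

```ruby
def mean(xs)
  xs.sum.to_f / xs.size
end

# Sample standard deviation (n - 1 denominator); this choice is an assumption.
def std_dev(xs)
  m = mean(xs)
  Math.sqrt(xs.sum { |x| (x - m)**2 } / (xs.size - 1))
end

# Z factor = 1 - (3σ+ + 3σs) / (μ+ - μs)
def z_factor(positive_controls, samples)
  spread = 3 * std_dev(positive_controls) + 3 * std_dev(samples)
  dynamic_range = mean(positive_controls) - mean(samples)
  1 - spread / dynamic_range
end

# Made-up readings: μ+ = 100, μs = 10, σ+ = σs = 2.
positive_controls = [98, 100, 102]
samples = [8, 10, 12]

z = z_factor(positive_controls, samples)
puts z.round(3) # 0.867, i.e. 1 - 12/90
```

Here the dynamic range is 90 and the two 3σ tails eat up 12 of it, leaving a separation band of 78.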


Chained Selenium RSpec examples

August 8th, 2007

From the RSpec documentation:

It is very tempting to use before(:all) and after(:all) for situations in which it is not appropriate. before(:all) shares some (not all) state across multiple examples. This means that the examples become bound together, which is an absolute no-no in testing. You should really only ever use before(:all) to set up things that are global collaborators but not the things that you are describing in the examples.

Well-known conventional wisdom says that different test cases (in spec-speak, “examples”) should not depend on one another for state, should be runnable in any order, etc. I certainly agree with this wisdom in general, but I think there’s one case where this rule should be broken. We’ve been writing a fair number of Selenium RC tests lately for our app, using RSpec to drive Selenium RC. When writing integration tests like this, each example (in test speak, “test method”) is often a very long script with lots of shoulds/asserts in it. We lose the nice descriptive power of small examples with specific, descriptive text, and instead are faced with a choice between vague and high-level example descriptions and really long example descriptions using ugly here documents that can easily fall out of synch with the example code.

Instead, we want to be able to do something like this:

describe "A user customizing a car" do
  use_chained_examples

  before(:all) do
    @model = models(:spiffy)
    log_in
    go_to_car_customization_start_page
  end

  it "should first be required to select car model" do
    page_title.should == "Select a model"
    droplist("model_id").should be_present
    droplist("model_id").options.should == [
      "(Please select a model)",
      models(:spiffy).name,
      models(:sporty).name
    ]
    droplist("model_id").selected_value.should == ""

    droplist("model_id").select(@model.id)
    click_next_button
  end

  it "should then be required to select paint color" do
    page_title.should == "Select #{@model.name} color"
    droplist("color_id").should be_present
    # etc.
  end
end

A before(:each) block can be used to reset the page between each example, which we’ve found useful when testing a bunch of validations or something similar. Note too that using chained examples is not the default behavior, and must be explicitly specified by the developer, who should “know what they are doing” if they do this.

Anyway, obviously we did figure out how to make this happen, and after we’ve refined it a bit if people are interested we’ll open source our selenium rspec stuff as a plugin. Note that our selenium specs use a different spec_helper.rb than the rest of our normal specs, so we’re keeping the ability to chain examples out of our standard specs, as conventional wisdom would recommend. Let me know what you think.
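The mechanics aren’t shown here. Purely as an illustration of the semantics, the following is a toy runner in plain Ruby. It is not RSpec and not our plugin code: the ChainedExamples class and its methods are made up, and skipping the rest of the chain after a failure is an assumption about what seems sensible.

```ruby
# Toy chained-example runner (hypothetical; not RSpec, not the plugin).
# Examples run in definition order and share one context object, so state
# left behind by one example is visible to the next.
class ChainedExamples
  def initialize
    @setup = nil
    @examples = []
  end

  def before_all(&block)
    @setup = block
  end

  def it(description, &block)
    @examples << [description, block]
  end

  def run
    context = Object.new
    context.instance_eval(&@setup) if @setup
    results = {}
    broken = false
    @examples.each do |description, block|
      if broken
        # Later links depend on earlier ones, so skip them (an assumption).
        results[description] = :skipped
        next
      end
      begin
        context.instance_eval(&block)
        results[description] = :passed
      rescue StandardError
        results[description] = :failed
        broken = true
      end
    end
    results
  end
end

chain = ChainedExamples.new
chain.before_all { @page = "Select a model" }
chain.it("should first be required to select car model") do
  raise "wrong page" unless @page == "Select a model"
  @page = "Select color" # stands in for clicking the next button
end
chain.it("should then be required to select paint color") do
  raise "wrong page" unless @page == "Select color"
end

results = chain.run
results.each { |description, status| puts "#{status}: #{description}" }
```

The second example only passes because the first one left the “page” in the right state, which is exactly the coupling the conventional wisdom warns about and exactly what makes long Selenium scripts readable when split up this way.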