Too many mock objects == ruby refactoring death

July 6th, 2008

It’s a question we face as test-driven Ruby programmers: should we use mock objects or real objects in our tests?

Both approaches have trade-offs, and the biggest downside of each comes down to wasted programmer time. If you test with real objects, your tests run slowly, especially if you use an ORM like ActiveRecord that binds your domain objects tightly to the database. There are other sources of slowness, but nothing has anywhere near as great an effect as hitting the DB.

If you test with mock objects, once your app has any kind of complexity, your refactoring and test writing processes become slow. This is not immediately apparent when you start using mock objects. But as you start writing more and more code, eventually you start having to come up with a crazy number of mock expectations just to test some of your methods. It is true that this is good feedback that the class you’re testing presents too complex an interface to other collaborating objects, or that it collaborates with too many objects, etc. What starts simple will eventually become too complex, and at some point you’re going to need to refactor.

More insidious than this, however, is the effect this web of mocks you’ve wrapped yourself in has on refactoring. You don’t notice how thoroughly you’ve painted yourself into a corner until you want to refactor some ugly aspect of a core class that collaborates with many objects in your system. Suddenly all of those collaborating objects’ tests break because they expect certain method calls from this core object. They break because you’ve changed a method signature, a method name, or, even worse, just an implementation. No matter what those BDDers tell you, if you test heavily with mocks you are to a large degree testing implementation, not behavior (or is that behaviour?).

So, now you have to go through all of those mock-based tests and “correct” them, i.e. change their expectations so that they fit the new method name/signature/implementation. This is horrible. The whole point of tests during refactoring is to verify that your refactoring hasn’t changed the behavior of the system (that being half of the definition of refactoring). The tests should pass before you refactor, and they should pass after you refactor. Rewriting expectations not only breaks this fundamental refactoring process, it can also take a lot of time, because you have to remember the context of each test case you have to fix.
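Here is a minimal sketch of that failure mode. Everything in it is hypothetical and invented for illustration: the `Account` and `Checkout` classes, the rename from `debit` to `withdraw`, and the hand-rolled `StrictMock`, which stands in for a real mocking library. The behavior of `Checkout` is unchanged by the rename, yet only the state-based check keeps passing.

```ruby
# Hypothetical domain classes, invented for illustration only.
class Account
  attr_reader :balance

  def initialize(balance)
    @balance = balance
  end

  # Imagine this method was named `debit` before a rename refactoring.
  def withdraw(amount)
    @balance -= amount
  end
end

class Checkout
  def initialize(account)
    @account = account
  end

  def purchase(price)
    @account.withdraw(price)
  end
end

# A tiny hand-rolled mock (an assumption, not a real library): it only
# answers the messages it was told to expect and records what it received.
class StrictMock
  attr_reader :received

  def initialize(*expected)
    @expected = expected
    @received = []
  end

  def method_missing(name, *args)
    raise NoMethodError, "unexpected message: #{name}" unless @expected.include?(name)
    @received << [name, args]
    nil
  end

  def respond_to_missing?(name, include_private = false)
    @expected.include?(name) || super
  end
end

# State-based check: survives the rename because it only looks at the outcome.
account = Account.new(100)
Checkout.new(account).purchase(30)
state_test_passes = (account.balance == 70)

# Mock-based check written before the rename: it still expects `debit`.
stale_mock = StrictMock.new(:debit)
mock_test_passes = begin
  Checkout.new(stale_mock).purchase(30)
  stale_mock.received == [[:debit, [30]]]
rescue NoMethodError
  false
end

puts "state-based test passes: #{state_test_passes}"
puts "mock-based test passes:  #{mock_test_passes}"
```

Multiply the stale expectation by every test that mocks the renamed method and you get the cleanup described above.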

You can do something about slow running tests that hit a database (in extreme cases you can use a faster in-memory database, or even parallelize your tests). Of course they won’t run as fast as they would if they didn’t hit a database, but in my opinion it’s something you can live with. Dav Yaginuma has a good suggestion for what to do with this time: Go write that email you need to write, go take the bathroom break, go walk around the office and stretch your legs. It’s not like that’s really wasted time. It is wasted time, however, if you’re sitting there squinting at the screen fixing all of your mock expectations. You can’t do anything else with that time.

I’m kind of half-convinced now that people who advocate the heavy use of mocking either have really nice IDEs that make fixing the expectations a breeze, or they don’t refactor. And if they’re using Ruby that means they don’t refactor. Ok, tongue out of cheek. Seriously, I’d love to hear from folks who have used and continue to use mocks heavily on long-running projects, to hear how they handle the refactoring issue. I have pretty much sworn off mocks except in old-school traditional cases (“mocking out an external dependency too expensive to call directly”) because of it.

MacPorts Ruby, now with DTrace

February 21st, 2008

We are gearing up to do some profiling/performance improvement at work, and we use MacPorts (mostly at my stubborn insistence) to install Ruby on our OS X dev boxes. Unfortunately, the MacPorts version of Ruby is not DTrace-enabled, so we were faced with the decision to either go with the Apple-installed Ruby or not use DTrace.

Fortunately, there was a third option. I spent some time massaging Joyent’s Ruby DTrace patch so that it would compile with Apple’s version of DTrace (subtly different from Solaris’s), and so it would play nice with the other patches in the official Ruby MacPort distribution. Anyway, long story short: you can get it via my newest RubyForge project, rubyport-dtrace. You can install either from the tarfile or by checking out from Subversion; see the instructions in the distribution.

Why I like MacPorts: I like being able to cleanly remove software I install. I also like that I can compile Ruby with DTrace and other patches that I might want, such as the Railsbench GC patch, which I’m also working into the rubyport-dtrace (dare I call it) code. It might already work, but I haven’t tested it.

Chained Selenium RSpec examples

August 8th, 2007

From the RSpec documentation:

It is very tempting to use before(:all) and after(:all) for situations in which it is not appropriate. before(:all) shares some (not all) state across multiple examples. This means that the examples become bound together, which is an absolute no-no in testing. You should really only ever use before(:all) to set up things that are global collaborators but not the things that you are describing in the examples.

Well-known conventional wisdom says that different test cases (in spec-speak, “examples”) should not depend on one another for state, should be runnable in any order, etc. I certainly agree with this wisdom in general, but I think there’s one case where this rule should be broken. We’ve been writing a fair number of Selenium RC tests lately for our app, using RSpec to drive Selenium RC. When writing integration tests like this, each example (in test-speak, “test method”) is often a very long script with lots of shoulds/asserts in it. We lose the nice descriptive power of small examples with specific, descriptive text, and instead are faced with a choice between vague, high-level example descriptions and really long example descriptions using ugly here documents that can easily fall out of sync with the example code.

Instead, we want to be able to do something like this:

describe "A user customizing a car" do
  use_chained_examples

  before(:all) do
    @model = models(:spiffy)
    log_in
    go_to_car_customization_start_page
  end

  it "should first be required to select car model" do
    page_title.should == "Select a model"
    droplist("model_id").should be_present
    droplist("model_id").options.should == [
      "(Please select a model)",
      models(:spiffy).name,
      models(:sporty).name
    ]
    droplist("model_id").selected_value.should == ""

    droplist("model_id").select(@model.id)
    click_next_button
  end

  it "should then be required to select paint color" do
    page_title.should == "Select #{@model.name} color"
    droplist("color_id").should be_present
    # etc.
  end
end

A before(:each) block can be used to reset the page between each example, which we’ve found useful when testing a bunch of validations or something similar. Note too that using chained examples is not the default behavior, and must be explicitly specified by the developer, who should “know what they are doing” if they do this.
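To make the idea of chaining concrete, here is a toy runner in plain Ruby. It is not RSpec’s API and not the plugin code mentioned in this post. `ChainedGroup`, `before_all`, and the `@page` variable are all hypothetical names, and skipping the remaining examples after a failure is an assumption of the sketch. It shows only the core mechanism: examples run in declaration order against one shared context, so state left by one example is visible to the next.

```ruby
# Toy runner, NOT RSpec: every example is evaluated against the same
# context object, in the order it was declared.
class ChainedGroup
  Example = Struct.new(:description, :block)

  def initialize(description)
    @description = description
    @before_all = nil
    @examples = []
  end

  def before_all(&block)
    @before_all = block
  end

  def it(description, &block)
    @examples << Example.new(description, block)
  end

  # Returns [description, status] pairs. Once an example fails, later ones
  # are marked :skipped, since they depend on the earlier steps (an
  # assumption of this sketch).
  def run
    context = Object.new
    context.instance_eval(&@before_all) if @before_all
    failed = false
    @examples.map do |example|
      next [example.description, :skipped] if failed
      begin
        context.instance_eval(&example.block)
        [example.description, :passed]
      rescue StandardError
        failed = true
        [example.description, :failed]
      end
    end
  end
end

group = ChainedGroup.new("A user customizing a car")
group.before_all { @page = "Select a model" }
group.it("should first be required to select car model") do
  raise "wrong page" unless @page == "Select a model"
  @page = "Select color" # stands in for click_next_button
end
group.it("should then be required to select paint color") do
  raise "wrong page" unless @page == "Select color"
end

results = group.run
results.each { |description, status| puts "#{status}: #{description}" }
```

The second example passes only because the first one advanced the shared state, which is exactly the coupling that conventional wisdom warns about and that chained examples opt into on purpose.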

Anyway, obviously we did figure out how to make this happen, and after we’ve refined it a bit, if people are interested, we’ll open source our Selenium RSpec stuff as a plugin. Note that our Selenium specs use a different spec_helper.rb than the rest of our normal specs, so we’re keeping the ability to chain examples out of our standard specs, as conventional wisdom would recommend. Let me know what you think.

Keeping Rails migrations happy

May 9th, 2007

Two quick things we’ve learned about migrations at CDD:

  • Avoid using your model objects in your migrations, e.g. stuff like Group.create!(:name => "Watson Lab"). The problem with this is that later you might add a required field to your model, and then this migration will throw an exception. Occasionally you need some logic from a model in a migration, but if at all possible I’d suggest exposing that logic in a way that doesn’t require creating or loading model objects in your migration itself. The migration should just know about SQL, nothing else.
  • Say you branch your code base for a release, and you anticipate needing to support that branch for any length of time. Sometimes you’ll need to address an issue in the production code that requires another migration. What we’ve found works best with ActiveRecord migrations is:
    1. In the trunk, delete all existing migrations when you branch.
    2. Dump a version of the branch schema, and make that migration #1 (001_production_branch_schema.rb) in your trunk.
    3. Start your next trunk migration several numbers higher than your last migration on the production release branch. So, if your last migration on the branch was 40, start at 40+N, where N gives you enough cushion to accommodate any additional migrations needed for the branch until your next release.
    4. Any time you add another migration to the branch, in the trunk replace 001_production_branch_schema.rb with a new dump of your branch schema.
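
To illustrate the first point above, here is a sketch of the Group.create! example rewritten as plain SQL. It makes several assumptions: the migration name AddWatsonLabGroup is made up, the table is assumed to be called groups (the usual Rails convention for a Group model), and the class-method style (self.up / self.down) is the older migration style, which newer Rails versions replace with instance methods. The small stand-in at the top only records SQL so the sketch can run outside a Rails app; in a real project the framework supplies ActiveRecord::Migration and execute.

```ruby
# Stand-in so this sketch runs outside a Rails app. In a real project,
# drop this block and rely on the framework's ActiveRecord::Migration.
unless defined?(ActiveRecord::Migration)
  module ActiveRecord
    class Migration
      # Collects SQL strings instead of sending them to a database.
      EXECUTED_SQL = []

      def self.execute(sql)
        EXECUTED_SQL << sql
      end
    end
  end
end

# Hypothetical migration: it touches the table with SQL only, so adding a
# required field or validation to the Group model later cannot break it.
class AddWatsonLabGroup < ActiveRecord::Migration
  def self.up
    execute "INSERT INTO groups (name) VALUES ('Watson Lab')"
  end

  def self.down
    execute "DELETE FROM groups WHERE name = 'Watson Lab'"
  end
end
```

Because the migration never loads the Group class, it keeps working no matter how the model changes afterwards.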

Kind of a hack, but it works better than anything else we’ve come up with. My former colleague, Rhett Sutphin, took a different approach to this problem when he wrote a Java/Groovy port of migrations called bering (to which I minimally contributed in its early stages). In bering, migrations are specific to a particular release. Each release is numbered and gets its own separate migration directory, and migrations start at one again for each new release.