Long Lost Allocation

There's a great little method in Ruby that you might not be familiar with. It's used all the time but it's hardly ever called directly. The method is Class\#allocate.

Even if you're not familiar with Class\#allocate, I'm sure you've seen Class\#new:

me = Person.new("Steve")

If you've ever been confused about how calling Person.new ends up calling Person\#initialize, here's what's really happening behind the scenes:

class Class
  def new(*args, &block)
    instance = allocate
    instance.initialize(*args, &block)
  end
end

The allocate method just builds a new instance of the class, completely skipping the call to initialize.

So what?

While you probably shouldn't call allocate directly in your production code, I think it can be incredibly useful in testing. You are unit testing… right?

Consider the following class:

class Animal
  def initialize(phylum, class_, order, family, genus, species)
    @phylum  = phylum
    @class   = class_
    @order   = order
    @family  = family
    @genus   = genus
    @species = species
  end

  def name
    "#{@genus} #{@species}"
  end
end

To test the Animal\#name method, I need an instance of Animal. But for this method, I only care about the animal's genus and species.

describe Animal do
  subject(:animal) { Animal.allocate }

  describe "#name" do
    it "combines the genus and species" do      
      animal.instance_variable_set(:@genus, "Homo")
      animal.instance_variable_set(:@species, "sapiens")

      expect(animal.name).to eql("Homo sapiens")
    end
  end
end

Rather than initializing a full animal instance to test only two of its internal values, I simply set those values directly.

For me, using allocate has been a great way to keep my unit tests short, clear and isolated. It's especially useful when a class's initialize method requires many arguments and/or preprocesses those arguments.

What else?

I find that developers often don't test their own initialize methods. And if they do, it might look like this:

describe Animal do
  describe "#initialize" do
    it "sets the phylum, class, order, family, genus and species" do
      animal = Animal.new("Chordata", "Mammalia", "Primates", "Hominidae", "Homo", "sapiens")

      expect(animal.instance_varaible_get(:@phylum)).to eql("Chordata")
      # …
    end
  end
end

What stands out to me is that while we're testing initialize, we never call it directly. Using allocate I can test initialize like I would any other private method:

describe Animal do
  subject(:animal) { Animal.allocate }

  describe "#initialize" do
    it "sets the phylum, class, order, family, genus and species" do
      animal.send(:initialize, "Chordata", "Mammalia", "Primates", "Hominidae", "Homo", "sapiens")

      expect(animal.instance_varaible_get(:@phylum)).to eql("Chordata")
      # …
    end
  end
end

Allocating in RSpec

Currently, calling subject within an RSpec unit test will give you the result of calling new on the class you're describing. I'd love to see this default change to use the allocate method so that unit tests can be properly isolated.

In addition, calling allocate will always work, while calling new only works if your class's initialize method accepts no arguments, which I don't find to be the case very often.

While my hopes aren't high that the default will change, at the least I'd like the ability to configure RSpec to use allocate rather than new when calling subject.

What do you think?

steve@collectiveidea.com

Comments

  1. January 23, 2013 at 21:44 PM

    The initialize method is always marked as private by Ruby, so you can’t really just call it. You actually have to use Object#send, or the future-proof Object#send.

    So, if you try to call the initialize method, you end up getting this: `Class.new.allocate.initialize
    NoMethodError: private method `initialize’ called for [object]`

  2. January 23, 2013 at 21:52 PM

    The future-proof method was supposed to be this: http://ruby-doc.org/core-1.9.3/Object.html#send-method :P

  3. January 23, 2013 at 22:04 PM

    While not entirely convinced to use `allocate`, I have a side note for you. Using explicit `subject` in specs is a code smell, unless you name them, like this:

    subject(:animal) { Animal.allocate }
    animal.instance_variable_set(:genus, "Homo") animal.instance_variable_set(:species, “sapiens”)
    expect(animal.name).to eql(“Homo sapiens”)

    This reads a whole lot better.

  4. January 23, 2013 at 22:06 PM

    Pawel: Thanks! I don’t use subjects very often and didn’t know you could do that. Maybe I’ll start now!

  5. January 24, 2013 at 0:49 AM

    Nando: You’re absolutely right. I updated the post to reflect the fact that initialize is a private method. I still like testing initialization this way, just like I would for other private methods.

  6. January 24, 2013 at 1:15 AM

    I appreciate the thought and effort that went into this post, but I think calling allocate (and using things like `instance_variable_get`/`instance_variable_set`) is a code smell: there are cases where they are useful, and I’m glad ruby has them, but their use suggests a deeper problem with your design.  One of the most important aspects of TDD is using your objects’ public interface to interface to interact with it.  Using allocate and `instance_variable_set`/`instance_variable_get` bypasses the public API of your objects entirely.  You get zero design feedback on your public API if you don’t use it.  If you’re using these implementation details to interact with your class due to something painful about using your public API, it’s giving you feedback that you shouldn’t work around in your test.  If it’s not giving you pain to use the public API, why not use the public API?

    As for changing the default implicit subject in RSpec to use `allocate`…this would break tons of specs in the wild (and as I’ve stated here, this is a practice I do not want to encourage).  Besides, if you want to use a different subject…then you can declare a different subject.  It would not be difficult to create a shared context that gets added to all example groups in your project that changes `subject` to use allocate on your project if you want to continue with this approach.

  7. January 24, 2013 at 1:48 AM

    Myron: Thank you for your thoughtful response!

    While I think that the instance_variable_get and instance_variable_set methods are ugly, I don’t agree that they’re a smell. And I don’t think there’s anything wrong with the implementation of the Animal class. I think it’s quite clean.

    When I’m testing just the Animal#name method, I’m not interested in feedback on my API design as a whole. I’m only testing that particular bit of the public API.

    Would you be willing to open the conversation up to RSpec users if I submit a pull request adding a new (non-default) configuration?

  8. January 24, 2013 at 3:01 AM

    Presumedly the values for those instances methods come from the public interface at some point. Asserting that name outputs those instances variables doesn’t really add value to the specs.

    Pretend the public interface changes and those variables are no longer used. That test would still pass because you’re not using the public interface. 

  9. January 24, 2013 at 3:01 AM

    > While I think that the instance_variable_get and instance_variable_set methods are ugly, I don’t agree that they’re a smell.

    Just to make sure we’re understanding words to mean the same thing, here’s what I mean by code smell

    ”..any symptom in the source code of a program that possibly indicates a deeper problem” [1]

    That doesn’t mean “don’t do it” or that it’s wrong to do.  It’s a symptom that possibly indicates a deeper problem.  But it may not.

    Accessing instance variables from outside an object using `instance_variable_get`/`instance_variable_set` is a code smell because it possibly indicates that the object may be    missing something from its public API.  (Otherwise, why would you feel the need to reach into its internals?)

    > And I don’t think there’s anything wrong with the implementation of the Animal class. I think it’s quite clean.

    I didn’t say there’s anything wrong with the implementation of the Animal class.  I think the implementation is fine, too.

    I think that needing to use allocate and ivars suggests a method you’re missing, though.  Here’s the thought process that goes through my head for a case like this:

    • I’m specifying the behavior of `#name`.
    • The behavior of the #name method is only concerned with the animal’s genus and species.  None of the other attributes matter to this method.
    • In my example, I could create the instance using `Animal.new(…)`, but that would require me to pass in 5 arguments, which is both cumbersome and introduces noise into my     example – 3 of those 5 arguments don’t matter to the example at all since #name doesn’t use them.* This suggests that it would be useful to write a factory method that would allow me to create an Animal instance using only a subset of the attributes.  Maybe it could be `Animal.from_hash(genus: “Homo”, species: “sapien”)`.  Or maybe it’s a `build_animal(hash)` helper method that’s only available in my test environment.
    • This improves the design of my system in two ways:1. It gives me a simple way to build animal instances in other tests that similarly only care about a subset of animal attributes.2. It centralizes the knowledge for how to create a animal instances from a subset of attributes in one place…which will come in handy as soon as it needs to change.

    There’s another issue with the approach you’ve used here: you’re tightly coupling your tests to your implementation code. One of the biggest values I get out of tests is that they enable me to refactor (and in fact, that’s the 3rd step in the TDD cycle: red, green, refactor). When you couple your tests to the implementation of the code, it makes refactoring difficult or impossible. You can’t do a simple instance variable rename refactoring without breaking your tests. If you limit your test to only use the public API of your objects, you can refactor the implementation as much as you want, as long as that API doesn’t change.

    > When I’m testing just the Animal#name method, I’m not interested in feedback on my API design as a whole. I’m only testing that particular bit of the public API.

    I’m a big believer in isolated unit testing, but there’s nothing wrong with usign other parts of the object’s public API to get it into the state needed to verify a particular   method. The fact that you’re not interested in feedback on other parts of your API (such as how Animals objects are created) means you miss out on potential improvements to your system.

    > Would you be willing to open the conversation up to RSpec users if I submit a pull request adding a new (non-default) configuration?

    I’ll be honest…I’m torn here: I love the fact that you want to contribute to RSpec, and I want to encourage as much community participation in RSpec’s development as  possible.  But as you can see from my comments here, I think adding an explicit config option to RSpec to change the implicit subject implementation to use `allocate` encourages a way of testing that I think should be generally avoided.

    Besides, there’s already a way to configure this[2]. `shared_context` is crazy powerful and flexible, and you can use it to do many, many things…including changing the implicit subject implementation! I think it’s better to have a more general multi-purpose configuration system (such as shared_context) than to add a specific config option for  `allocate`, anyway – it’s more flexible, and less code for us to maintain :).

    All that said: if you want to open a pull request and gather community feedback, please do. I’d be curious to hear if others spec things using `allocate` as well. Just know that I’m unlikely to merge such a PR, given my thoughts on the technique and the fact that RSpec already has a way of doing this.

    I hope this doesn’t discourage you from contributing in the future, though :).

    Myron

    [1] http://en.wikipedia.org/wiki/Code_smell
    [2] https://gist.github.com/4617276

  10. mwest67@gmail.com
    Mike West
    January 24, 2013 at 9:04 AM

    As Myron said above, bypassing you public API is a bad idea and testing like this only makes your tests brittle. What happens if you want to change the names of the variables or changes the source of the values, your tests will break

  11. January 24, 2013 at 14:56 PM

    Myron & Justin: Thanks again for the awesome feedback.

    So I’m seeing now that the difference in our approaches is where we draw the lines . Instead of treating the class or instance like a black box, I’ve been treating the methods themselves as a black box. So while instance variables are internal to the instance, I was looking at them as external to the method.

    Recently, I’ve been trying to learn and refine my testing methods, especially my unit testing, and I’ve been assuming that blackboxing the method is the thing to do so I’ve made a conscious effort to avoid other methods in the public API.

    Thanks to a lot of honest feedback like yours, I’m reconsidering my approach and will try my hand at utilizing my public API in unit tests over the implementation.

    Thanks again, guys!