Extracting Features into a Gem

Kyle Ries

Since the first gem was born, gem authors have had to walk the line between configurability and supportability. This post covers one facet of this conundrum - when and how to extract certain functionality from an app into a gem, and how to support low level configurability. For those interested in a higher level discussion of gem configuration patterns, Brandon Hilkert does a good job covering this landscape in his blog post Ruby Gem Configuration Patterns.

So, what does "low level" configurability mean? Probably many things to many people. In the context of this post, it means the ability to direct the behavior of included gem modules on a class by class basis, after app initialization.

As with all solid abstractions, we'll start with a real-world use case. TeamSnap maintains a gem called emque-producing. It's a relatively young gem that exists to "Define and send messages to a variety of message brokers". Emque-producing provides a great interface for defining consistent messages that we use to unify several message producing apps (legacy and errr, soon to be legacy?). Using this common gem, we can send messages for equivalent events (e.g. game created) such that our message consuming apps can operate on them without concern about where the message originated.

Having used emque-producing for a couple months, we quickly found ourselves desiring the ability to include, in certain types of messages, a consistent representation of the "Original" version of an object, the "Updated" version of an object, and - to round out the picture - the "Delta". We generally call this a changeset. The code to accomplish this is relatively straightforward, but it does add some degree of overhead that is not necessary for all of our messages. If you're using a message broker, you are presumably operating at some considerable degree of scale, so it pays to be smart about using resources efficiently. Our solution was to write a module that can be mixed into specific messages based on need. Caveat - this performance impact is a hunch at this stage. Perhaps including the module is more expensive than calculating the diffs. Part 2 of this post will put this to the test so I can make assertions with, ya know - data! But for now, the remainder of the post will focus on the extraction.

Show me the code!

This is what the MessageChangeset module looks like in one of the message producing apps:

module MessageChangeset
  def self.included(base)
    base.send(
      :attribute, :change_set, Hash, :default => :build_change_set, :required => true
    )
    base.send(:private_attribute, :updated)
    base.send(:private_attribute, :original)
  end

  def build_change_set
    ChangesPayloadGenerator
      .new(:original => original, :updated => updated)
      .execute
  end

  def build_id
    if updated
      updated.fetch("id") { updated[:id] }
    else
      original.fetch("id") { original[:id] }
    end
  end
end

In order to use MessageChangeset, you would include this module in a message class:

class EventUpdatedMessage
  include Emque::Producing::Message
  include MessageChangeset

  topic "events"

  message_type "events.updated"

  attribute :notify_team_as_member_id, Integer, :required => true
  attribute :notify_team, Boolean, :required => true
  attribute :event_id, Integer, :required => true, :default => :build_id
end

Essentially, what's happening here is that an EventUpdatedMessage is being created with the attributes it has been initialized with. Its interface is largely powered by Emque::Producing::Message, which makes your message palatable to your message broker of choice. Including MessageChangeset adds a fourth attribute to the message, called change_set, which defaults to the changeset generated by the ChangesPayloadGenerator (we won't talk about that in detail right now, but for the curious, it's basically just a PORO that serializes your object to a hash and sends back the delta as the diff of the updated and original hashes).
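Since the ChangesPayloadGenerator itself isn't shown in this post, here is a minimal sketch of what such a PORO might look like. Everything in it is an assumption made for illustration: the class name ChangesPayloadSketch, the :original / :updated / :delta keys in the result, and the shape of each delta entry are all hypothetical and may not match the real implementation.

```ruby
# Hypothetical stand-in for ChangesPayloadGenerator; the real class is not
# shown in this post and may differ in every detail.
class ChangesPayloadSketch
  def initialize(original: nil, updated: nil)
    @original = to_plain_hash(original)
    @updated = to_plain_hash(updated)
  end

  # Returns the original hash, the updated hash, and the keys whose
  # values differ between the two.
  def execute
    {
      :original => @original,
      :updated => @updated,
      :delta => delta
    }
  end

  private

  # Assumes the object responds to to_hash; nil becomes an empty hash.
  def to_plain_hash(object)
    object.nil? ? {} : object.to_hash
  end

  # Collects every key whose value changed, was added, or was removed.
  def delta
    keys = @original.keys | @updated.keys
    keys.each_with_object({}) do |key, changes|
      before = @original[key]
      after = @updated[key]
      next if before == after
      changes[key] = { :original => before, :updated => after }
    end
  end
end

payload = ChangesPayloadSketch.new(
  :original => { "id" => 1, "name" => "Practice" },
  :updated => { "id" => 1, "name" => "Scrimmage" }
).execute

payload[:delta]
# => {"name"=>{:original=>"Practice", :updated=>"Scrimmage"}}
```

Only the name changed between the two versions, so it is the only key in the delta; the unchanged id is left out.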

And this is great! It works swell, can be included in the specific messages we want, and eliminates the need for this boilerplate code across messages. Abstraction achieved! Boilerplate reduced!

Defining the Problem

But then we started needing MessageChangesets and ChangesPayloadGenerators in our other producing apps. We could "copy pasta" the code to the other apps and risk the wrath of all that comes with duplicated code across systems. Or, we could push this abstraction down into Emque::Producing for the benefit of all clients producing these messages.

As a relatively green gem author, I was interested in polling the other devs on the team about how to evaluate the tradeoffs of this idea. In favor of moving it into the gem:

Other client apps needn't implement this same functionality
The functionality seems reasonably common enough to warrant inclusion for the community
The idea of this changeset is directly related to an Emque::Producing::Message and belongs there moreso than any given producer

On the other hand:

There are certain implementation details baked into this message changeset that may not be universally shared (presence of an id attribute on an object and the object's ability to respond to to_hash)
General gem complexity concerns (can this be maintained? tested? documented? configured?)

Refining the Scope

Given the outcome of that discussion, we decided it'd be worth extracting this functionality into emque-producing, with a few modifications to address some of the negative tradeoffs.

Having never extracted functionality out of an app and into a gem, I reached out for a pair. I lucked out and had two volunteers, both senior Rubyists on the team. This turned out to be a huge benefit - the implementation of the MessageChangeset used a few metaprogramming tricks I had seen before, but had never genuinely understood or used. Tackling this problem with the motivation of a self-serving need, combined with the know-how of a senior dev to lead the way, turned this from a weekend Ruby science project into a productive (i.e. nearly shippable) mid-day code session.

To recap, here were the goals of this pairing session:

Extract MessageChangeset out of the client app and into emque-producing
Provide other clients a way to override some of our default implementation details
Allow for configurability of whether or not to include the changesets in specific messages
Bonus: Teach Kyle the ways of meta-programming

Using Vim and screensharing via Skype, I drove and we jumped right into working on the MessageChangeset module. The snippet below is the original file in the client app, followed immediately by the result of our work. The remainder of the post will detail each bit of the change, line by line.

Original

module MessageChangeset
  def self.included(base)
    base.send(
      :attribute, :change_set, Hash, :default => :build_change_set,
      :required => true
    )
    base.send(:private_attribute, :updated)
    base.send(:private_attribute, :original)
  end

  def build_change_set
    ChangesPayloadGenerator
      .new(:original => original, :updated => updated)
      .execute
  end

  def build_id
    if updated
      updated.fetch("id") { updated[:id] }
    else
      original.fetch("id") { original[:id] }
    end
  end
end

# And this is how a message would implement the original
class EventUpdatedMessage
  include Emque::Producing::Message
  include MessageChangeset

  topic "events"

  message_type "events.updated"

  attribute :notify_team_as_member_id, Integer, :required => true
  attribute :notify_team, Boolean, :required => true
  attribute :event_id, Integer, :required => true, :default => :build_id
end

Final

module Emque::Producing
  def self.message(changeset: false)
    Module.new do
      define_singleton_method(:included) do |descendant|
        if changeset
          descendant.send(:include, ::Emque::Producing::MessageWithChangeset)
        else
          descendant.send(:include, ::Emque::Producing::Message)
        end
      end
    end
  end

  module MessageWithChangeset
    include Emque::Producing::Message

    attribute :change_set, Hash, :default => :build_change_set, :required => true
    private_attribute :updated
    private_attribute :original

    def build_change_set
      ChangesPayloadGenerator
        .new(:original => original, :updated => updated)
        .execute
    end

    def build_id
      if updated
        updated.fetch("id") { updated[:id] }
      else
        original.fetch("id") { original[:id] }
      end
    end
  end
end

# And this is how a message would implement the updated version
class EventUpdatedMessage
  include Emque::Producing.message(changeset: true)

  topic "events"

  message_type "events.updated"

  attribute :notify_team_as_member_id, Integer, :required => true
  attribute :notify_team, Boolean, :required => true
  attribute :event_id, Integer, :required => true, :default => :build_id
end

Solving

The first thing I wanted to clarify when we were getting started was, well - what the heck is going on?! So let's start there. The snippet below illustrates some fundamental Ruby meta-programming techniques. When the MessageChangeset module is included in a class, Ruby calls the module's included hook, passing the including class itself (not an instance of it) as the sole argument, base. Our hook then uses base.send to call the class's attribute method with the arguments :change_set, Hash, :default => :build_change_set, and :required => true. I find the best way to think about this is to imagine how the file would look if you did this manually. In this case, you would just write a new line, attribute :change_set, Hash, :default => :build_change_set, :required => true, in the class. But since that's boilerplate and subject to error, automating it helps avoid mistakes and encourages consistency. The second and third lines of the method do effectively the same thing: they call private_attribute twice, passing :updated and then :original.

module MessageChangeset
  def self.included(base)
    base.send(
      :attribute, :change_set, Hash, :default => :build_change_set,
      :required => true
    )
    base.send(:private_attribute, :updated)
    base.send(:private_attribute, :original)
  end

# ...rest of module omitted
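To see the included hook in isolation, here is a small self-contained sketch. It assumes nothing about the internals of Emque::Producing::Message: ToyMessage, ToyChangeset, and ToyEventMessage are hypothetical names, and the attribute macro here only records what it was given.

```ruby
# Toy stand-in for a message library; all names here are hypothetical.
module ToyMessage
  # Give the including class an `attribute` class-level macro.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Records the attribute definition on the class.
    def attribute(name, type, options = {})
      attributes[name] = { :type => type }.merge(options)
    end

    def attributes
      @attributes ||= {}
    end
  end
end

module ToyChangeset
  # Ruby calls this hook with the including class (not an instance of it).
  # base.send(:attribute, ...) has the same effect as writing
  # `attribute :change_set, Hash, :required => true` in the class body.
  def self.included(base)
    base.send(:attribute, :change_set, Hash, :required => true)
  end
end

class ToyEventMessage
  include ToyMessage    # must come first so `attribute` exists
  include ToyChangeset  # its hook registers :change_set right here

  attribute :event_id, Integer, :required => true
end

ToyEventMessage.attributes.keys
# => [:change_set, :event_id]
```

Note that the order of the two include lines matters in this sketch: ToyChangeset's hook can only call attribute because ToyMessage has already added it to the class.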

And so, in our client app, this is how changesets are added to a message. Now let's look at how the code in our final version differs, why, and what it affords us.

The first thing that should jump out is that we're now working in the Emque::Producing namespace. It turned out that the easiest approach to flesh out our ideas and get a feel for the implementation was simply to monkey patch the existing gem in the client app and see how things looked and felt. I've always been leery of monkey patching in Ruby and, as a result, have turned a blind eye towards it. However, this was a novel use of it that I found exceptionally helpful for quick prototyping. I would never commit this, but it definitely warrants a stash. But, back to the program at hand.

module Emque::Producing
  def self.message(changeset: false)
    Module.new do
      define_singleton_method(:included) do |descendant|
        if changeset
          descendant.send(:include, ::Emque::Producing::MessageWithChangeset)
        else
          descendant.send(:include, ::Emque::Producing::Message)
        end
      end
    end
  end

  module MessageWithChangeset
    include Emque::Producing::Message

    attribute :change_set, Hash, :default => :build_change_set, :required => true
    private_attribute :updated
    private_attribute :original

# ...rest of module omitted

The line def self.message(changeset: false) defines a message method on Emque::Producing that accepts a keyword argument, changeset. This is actually quite a significant change: it gives us the tremendous advantage of being able to parameterize the way we include a module. In this case, it allows clients to do include Emque::Producing.message(changeset: true) when they are working with a message that requires a changeset, or simply include Emque::Producing.message when they don't.

The contents of this method:

Module.new do
  define_singleton_method(:included) do |descendant|
    if changeset
      descendant.send(:include, ::Emque::Producing::MessageWithChangeset)
    else
      descendant.send(:include, ::Emque::Producing::Message)
    end
  end
end

...is where the parameter is honored. If the client passes changeset: true, the Emque::Producing::MessageWithChangeset module will be included. If not, they'll get Emque::Producing::Message. The other important thing to notice here is the use of Module.new in combination with define_singleton_method. What we're saying with this combination is that Emque::Producing.message should return an anonymous module, and it is this anonymous module that the message class actually includes. So, following the steps:

Your message class says include Emque::Producing.message
Emque::Producing.message returns an anonymous module
That anonymous module is what is passed to include in the class
The anonymous module has defined its own included method, which Ruby will now call as it's being included in a class
When the anonymous module's included method is called by Ruby, our anonymous module tells our message class to include the Emque::Producing module that we actually want to provide (Message or MessageWithChangeset)
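The steps above can be reproduced in a self-contained sketch that runs without the gem. The Toy namespace and its Plain and WithChangeset modules are hypothetical stand-ins for Emque::Producing, Message, and MessageWithChangeset; only the Module.new and define_singleton_method mechanics mirror the real code.

```ruby
module Toy
  # Stand-in for the plain message module.
  module Plain
    def kind
      "plain"
    end
  end

  # Stand-in for the enhanced module, layered on top of Plain.
  module WithChangeset
    include Plain

    def kind
      "with changeset"
    end
  end

  # Returns a fresh anonymous module on every call.
  def self.message(changeset: false)
    Module.new do
      # Blocks are closures, so `changeset` is still visible inside the hook.
      define_singleton_method(:included) do |descendant|
        target = changeset ? ::Toy::WithChangeset : ::Toy::Plain
        # The anonymous module tells the including class to include the
        # module we actually want it to have.
        descendant.send(:include, target)
      end
    end
  end
end

class PlainMessage
  include Toy.message
end

class ChangesetMessage
  include Toy.message(changeset: true)
end

PlainMessage.new.kind      # => "plain"
ChangesetMessage.new.kind  # => "with changeset"
PlainMessage.include?(Toy::WithChangeset)  # => false
```

The anonymous module contributes no methods of its own. It exists only to carry the parameter from the include line to the included hook.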

Ok - bear with me. My mind nearly exploded as I tried to follow this the first time, but breaking it down step by step makes sense. I will say, I would never wish to leave this code in a production app for the benefit of another programmer to have to come in, parse, and understand. However, I do think this style seems helpful in gems, and indeed, affords a clean client-side implementation.

This leads us to the next significant change. In the client implementation, we were required to use base.send(:attribute, :change_set, Hash, ....) in order to define the attributes on the given message class. This was a matter of ordering: the class first had to be defined and mix in Emque::Producing::Message, and only then mix in MessageChangeset - we needed to wait for the class in order to call those methods on it. However, since we are now defining the extension of a Message within Emque::Producing, we can actually just include Emque::Producing::Message as the foundation for our enhanced MessageWithChangeset while tacking on the additional attributes just as though we were defining a message.

  module MessageWithChangeset
    include Emque::Producing::Message

    attribute :change_set, Hash, :default => :build_change_set, :required => true
    private_attribute :updated
    private_attribute :original

In our opinion, it is much clearer that a MessageWithChangeset is exactly that - a standard Message with a few additional bits baked in. Gotta love it when a name matches so well with the idea it is representing!

Conclusion

And that more or less concludes part 1 of how to extract a feature from an app and integrate it into an existing gem. To recap, we recognized a problem and weighed the tradeoffs of moving some functionality into a common gem. We paired up to define the problem and set some boundaries for what we wanted to achieve. We used an in-app monkey-patching style to flesh out the interface we desired. We then used a few meta-programming tricks to implement the functionality in a way that allows for class level configuration.

In part 2, I will finish up this project with a post about how to test this functionality that will soon belong in the gem along with the aforementioned benchmarking of this implementation. Assuming all goes well, part 2 will conclude with a pull request in emque-producing and I'll be happily on my way implementing messages in other apps that will benefit from this effort!
