Daniel Doubrovkine

aka dB., @awscloud, former CTO @artsy, +@vestris, NYC

I was working on a map/reduce job that rolls up daily, weekly and yearly statistics in MongoDB and discovered, to my surprise, that the JavaScript Date object doesn’t have a getWeek method. Worse, the piece of code on About.com turned out to be buggy (it has issues with weeks 1 and 52). Total Internet #fail. In this post I’ll show you how to add getWeek(date) to MongoDB/Mongoid and how to use it from a map/reduce.

Server-side JavaScript in MongoDB works much like a stored procedure and is documented here. Let’s use this implementation with a slight change in parameters and save it as lib/javascripts/getWeek.js. We can then store the JavaScript server-side from any Mongoid model. In our case we’ll be counting Widgets, so add this to widget.rb.

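def self.install_javascript
  # The path must match where getWeek.js was saved (lib/javascripts).
  get_week_js = Rails.root.join("lib/javascripts/getWeek.js")
  # Store the function in system.js only if it isn't there already.
  if collection.master['system.js'].find_one({'_id' => "getWeek"}).nil?
    collection.master.db.add_stored_function("getWeek", File.read(get_week_js))
  end
end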

The add_stored_function method comes from the Ruby MongoDB driver. Call Widget.install_javascript somewhere in a Rake task or inside your map/reduce code.
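For reference, here is a minimal sketch of what a getWeek function could look like. It is not the implementation linked above. It assumes ISO 8601 week numbering (weeks start on Monday, and week 1 is the week containing the first Thursday of the year), and it is written as a named function so it can be run standalone; the file you actually store may need a different form.

```javascript
// Hypothetical getWeek: returns the ISO 8601 week number (1..53) of a Date.
// Assumption: ISO numbering is what you want; other conventions exist.
function getWeek(date) {
  // Copy the calendar date into UTC so DST shifts can't affect the day count.
  var d = new Date(Date.UTC(date.getFullYear(), date.getMonth(), date.getDate()));
  // ISO weekday: Monday = 1 ... Sunday = 7.
  var day = d.getUTCDay() || 7;
  // Move to the Thursday of the same week; its year owns the week.
  d.setUTCDate(d.getUTCDate() + 4 - day);
  var yearStart = new Date(Date.UTC(d.getUTCFullYear(), 0, 1));
  // Days since January 1 of that year (plus one), divided into weeks.
  return Math.ceil(((d - yearStart) / 86400000 + 1) / 7);
}

console.log(getWeek(new Date(2011, 0, 1)));   // 52 (Saturday, still in the last week of 2010)
console.log(getWeek(new Date(2011, 0, 3)));   // 1  (Monday)
console.log(getWeek(new Date(2009, 11, 31))); // 53 (2009 has 53 ISO weeks)
```

Note the first case: under ISO numbering, the first days of January can belong to week 52 or 53 of the previous year, which is exactly the kind of week 1/52 edge that naive implementations get wrong. If you combine the week number with getFullYear() to build a key, those days end up under the current year with the previous year's week number.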

Let’s now map/reduce our widgets into widgets_weekly using the created_at timestamp. Notice the call to getWeek.

def self.rollup_weekly
  # Emit one count per widget, keyed by "year-week" of its created_at.
  map = <<-EOS
    function() {
      emit({'ts': this.created_at.getFullYear() + '-' + getWeek(this.created_at) }, {count: 1})
    }
  EOS
  # Sum the counts for a key; the result has the same shape as the emitted value.
  reduce = <<-EOS
    function(key, values) {
      var count = 0;
      values.forEach(function(value) {
        count += value['count'];
      });
      return({ count: count });
    }
  EOS
  collection.map_reduce(map, reduce, :out => "widgets_weekly", :query => {})
end

This yields the following collection in widgets_weekly.

{ "_id" : { "ts" : "2011-1" }, "value" : { "count" : 73 } }
{ "_id" : { "ts" : "2011-2" }, "value" : { "count" : 60 } }
{ "_id" : { "ts" : "2011-3" }, "value" : { "count" : 31 } }
{ "_id" : { "ts" : "2011-4" }, "value" : { "count" : 73 } }
{ "_id" : { "ts" : "2011-5" }, "value" : { "count" : 32 } }
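The map and reduce logic can also be sanity-checked outside MongoDB. The sketch below is plain JavaScript over an in-memory array. The rollup helper, the sample widgets and the getWeek stand-in (ISO 8601 weeks) are all assumptions made for illustration, and the key is a plain string rather than the {'ts': ...} object used above, to keep the grouping simple.

```javascript
// Stand-in for the stored getWeek function (assumes ISO 8601 week numbers).
function getWeek(date) {
  var d = new Date(Date.UTC(date.getFullYear(), date.getMonth(), date.getDate()));
  d.setUTCDate(d.getUTCDate() + 4 - (d.getUTCDay() || 7));
  var yearStart = new Date(Date.UTC(d.getUTCFullYear(), 0, 1));
  return Math.ceil(((d - yearStart) / 86400000 + 1) / 7);
}

// Same logic as the map function, but returning the pair instead of calling emit().
function map(doc) {
  return {
    key: doc.created_at.getFullYear() + '-' + getWeek(doc.created_at),
    value: { count: 1 }
  };
}

// Same logic as the reduce function in the post.
function reduce(key, values) {
  var count = 0;
  values.forEach(function (value) { count += value.count; });
  return { count: count };
}

// Hypothetical helper: group mapped values by key, then reduce each group.
function rollup(docs) {
  var groups = {};
  docs.forEach(function (doc) {
    var pair = map(doc);
    (groups[pair.key] = groups[pair.key] || []).push(pair.value);
  });
  var out = {};
  Object.keys(groups).forEach(function (key) {
    out[key] = reduce(key, groups[key]);
  });
  return out;
}

// Made-up sample data: two widgets in week 1 of 2011, one in week 2.
var widgets = [
  { created_at: new Date(2011, 0, 3) },
  { created_at: new Date(2011, 0, 5) },
  { created_at: new Date(2011, 0, 10) }
];
var result = rollup(widgets);
console.log(result['2011-1'].count, result['2011-2'].count); // 2 1
```

One property worth checking this way: MongoDB may call reduce more than once for the same key, feeding earlier results back in, so the reduced value has to have the same shape as the emitted one. Here both are { count: n }, so reduce('k', [reduce('k', [{count: 1}, {count: 1}]), {count: 1}]) still gives a count of 3.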

If anyone knows of a library that does this kind of rollup, OLAP cubes or any other data transformation for reporting purposes with MongoDB/Mongoid, please speak up!