You may have noticed some Ruby on Rails posts on my blog lately. That’s because I am working on something. More about it in a few weeks. In the meantime, all you need to know is that it’s an application with a lot of meaningful and interesting data. Without a representative data set the application is useless, and with a fake set of data it would produce confusing results. Imagine that we are building a website for drug manufacturers to search for the chemical ingredients of specific drugs – we’d like to be able to find Aspirin in the development environment and see that it’s made of acetylsalicylic acid, carnauba wax, corn starch, hypromellose, powdered cellulose and triacetin. Finding real answers validates our software and helps developers catch bugs early. We want real data, just not all of it.
I tried mongodump and mongorestore. Those are straightforward tools that let you export and import Mongo data (the Mongo people did their job very well there, much less hassle than with a traditional RDBMS where you have to back up the database, deal with the transaction log, bla bla bla). All is well when working with local machines. Remotely, you need to take the extra step of figuring out the database address, username and password. This gets messier with Heroku and eventually starts smelling bad.
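For reference, a manual round trip with these tools looks something like this (the database name and dump directory are made up for illustration):

mongodump --db myapp_development --out tmp/dump
mongorestore --db myapp_development tmp/dump/myapp_development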
I want to do this the “Rails Way” by invoking a single rake command that imports and exports Mongo data in any of my environments. The following is based on this post, but we’re doing this with MongoDB and will take it a little further. We’ll put our tasks in lib/tasks/db_import_export.rake.
Exporting Data
Given a set of objects, we can serialize them to a file as JSON. The task takes the model name and a filename as parameters, fetches all objects and writes them out, one JSON document per line. We’ll nest it in a db:model namespace so we can reuse it from higher-level tasks later.
namespace :db do
  namespace :model do
    desc "Export all documents of a model to a JSON file, one document per line."
    task :export, [:model, :filename] => :environment do |t, args|
      model = args[:model].constantize
      filename = args[:filename]
      objects = model.all
      File.open(File.join(Rails.root, filename), "w") do |f|
        objects.each do |object|
          f.write(object.to_json)
          f.write("\r\n")
        end
      end
    end
  end
end
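The task can also be invoked on its own; the model and filename below are just examples.

rake "db:model:export[Drug,db/seed/development/drugs.json]"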
Importing Data
Importing data is the inverse operation. We have to clear the model data first – I couldn’t figure out how to instantiate an object that already exists and re-save it with changes [thread].
namespace :db do
  namespace :model do
    desc "Import documents of a model from a JSON file, replacing any existing data."
    task :import, [:model, :filename] => :environment do |t, args|
      model = args[:model].constantize
      model.destroy_all
      filename = args[:filename]
      File.foreach(File.join(Rails.root, filename)) do |line|
        next if line.blank?
        object = model.new.from_json line.strip
        object.save!
      end
    end
  end
end
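The standalone invocation mirrors the export; again, the model and filename are just examples.

rake "db:model:import[Drug,db/seed/development/drugs.json]"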
Putting it Together
We can call our tasks on several well-known collections. Of course, feel free to extend this to iterate through all Mongo collections and post your code as a comment here.
namespace :db do
  def collections
    [
      { model: "Drug", table: "drugs" },
      { model: "Ingredient", table: "ingredients" },
      { model: "User", table: "users" }
    ]
  end

  desc "Export the well-known collections to db/seed/<environment>."
  task :export => :environment do
    collections.each do |collection|
      table = collection[:table]
      model = collection[:model]
      dir = "db/seed/#{Rails.env}"
      filename = "#{dir}/#{table}.json"
      FileUtils.mkdir_p dir
      Rake::Task["db:model:export"].execute({ model: model, filename: filename })
    end
  end

  desc "Import the well-known collections from db/seed/<environment>."
  task :import => :environment do
    collections.each do |collection|
      table = collection[:table]
      model = collection[:model]
      dir = "db/seed/#{Rails.env}"
      filename = "#{dir}/#{table}.json"
      Rake::Task["db:model:import"].execute({ model: model, filename: filename })
    end
  end
end
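The workflow I have in mind looks roughly like this; the environment name and commit message are just placeholders.

RAILS_ENV=development bundle exec rake db:export
git add db/seed/development
git commit -m "seed data snapshot"
# on a freshly cloned machine
bundle exec rake db:import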
Faking Data
There are two other interesting implementation details worth mentioning.
The first is that we have a User model with a username and password. Of course we use the awesome devise, and importing and exporting password data doesn’t accomplish anything – the system stores an encrypted password hash and we’d be missing the salt anyway. Whenever we encounter a “password” field during import, we simply replace it with a fixed value.
object = model.new.from_json line.strip
object.password = "password" if object.respond_to?(:password=)
The second feature is that we don’t want real user data to be exported, but we’d like to preserve the relationships in the existing database. We use faker to replace all names, e-mails and websites. This can be further applied to all kinds of properties.
object.name = Faker::Name.name
object.email = Faker::Internet.email
object.website = Faker::Internet.domain_name
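Where exactly these substitutions go is up to you. Here is a sketch of one option, inside the export loop, so that the JSON written to disk never contains real values; it assumes the User model has name, email and website fields:

objects.each do |object|
  if object.is_a?(User)
    object.name = Faker::Name.name
    object.email = Faker::Internet.email
    object.website = Faker::Internet.domain_name
  end
  f.write(object.to_json)
  f.write("\r\n")
end

The objects are never saved, so the real database is untouched – only the exported JSON gets the fake values.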
Famous Last Thoughts
Finally, all we have to do is tweak the files generated by rake db:export to our liking and commit them with git. New developers can simply run rake db:import to get started, instead of rake db:seed.
I think this could serve as a good start for a collection of tasks that ship with Mongoid. Thoughts? Comments?