Rails + S3 + Cloudfront + Jammit + Heroku + 100 грам

cloudfront, s3, rake, heroku, rails, ruby, open source | 5/10/2011

When someone says that you need “100 grams” to figure something out, they mean that it’s far from obvious, complicated, and best approached with a few shots of vodka. Vodka in Russia is measured in grams, and 100g is something you would casually drink for breakfast. Alright, today we’re going to figure out how to make Amazon CloudFront actually work with static assets, Rails, Jammit and Heroku. Get your vodka glass; you’re going to need it.

First some background.

Syntactically Awesome Stylesheets (SASS) to CSS

SASS is a way to author CSS. We write all our stylesheets in SASS and place them in app/stylesheets. These are compiled with Compass and written to public/stylesheets. Note that stylesheets often reference images in properties such as background. In our system those images live in public/assets/images. There are a few good articles that delve into SASS itself, including this one.

CoffeeScript to JavaScript

CoffeeScript is a language that compiles into JavaScript. We use Backbone.js heavily and write all our JavaScript in CoffeeScript. Our files live in app/coffeescripts. CoffeeScript is compiled with Barista and the output is placed in public/javascripts.

Asset Packaging

CSS files generated by SASS and JS files generated by CoffeeScript are packaged together by Jammit, which uses a configuration file to specify what is packaged and how. Here’s a simplified version of our config/assets.yml.

    package_assets: on

    javascripts:

      vendor:
        - public/javascripts/vendor/plugins/**/*.js

      common:
        - public/javascripts/models/**/*.js
        - public/javascripts/views/**/*.js
        - public/javascripts/controllers/**/*.js

    stylesheets:

      common:
        - public/stylesheets/common/global.css
        - public/stylesheets/common/forms.css
        - public/stylesheets/plugins/jquery-ui.css

      client:
        - public/stylesheets/common/client.css
        - public/stylesheets/plugins/jquery.autocomplete.css
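To make the glob patterns above more concrete, here is a minimal sketch of how one package could be expanded into a file list. It is not Jammit's implementation: expand_package, ASSETS_YML and demo_expand_package are hypothetical names, the sorting is an assumption of this sketch, and Dir.glob with base: needs Ruby 2.5 or later.

```ruby
require 'yaml'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of Jammit): expands the glob patterns of one
# package from an assets.yml-style config into file paths relative to +root+.
# Files are sorted within each pattern and duplicates are dropped.
def expand_package(config, kind, name, root)
  patterns = config.fetch(kind).fetch(name)
  patterns.flat_map { |pattern| Dir.glob(pattern, base: root).sort }.uniq
end

# A cut-down config in the same shape as config/assets.yml.
ASSETS_YML = <<~YAML
  javascripts:
    common:
      - public/javascripts/models/**/*.js
      - public/javascripts/views/**/*.js
YAML

# Builds a throwaway directory tree and expands the "common" package in it.
def demo_expand_package
  Dir.mktmpdir do |root|
    %w[models/user.js models/admin/role.js views/home.js views/readme.txt].each do |file|
      path = File.join(root, 'public/javascripts', file)
      FileUtils.mkdir_p(File.dirname(path))
      File.write(path, '')
    end
    expand_package(YAML.safe_load(ASSETS_YML), 'javascripts', 'common', root)
  end
end
```

Note that `**/*.js` picks up files in nested directories as well as files directly under models/ or views/, and that non-JS files are ignored.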

Generating Assets

We use a simple Rake task, heavily inspired by this gist, to generate assets (bonus clean task included). I can run rake assets and get output in public/assets that is a mix of checked-in files (e.g. public/assets/images) and generated files (e.g. public/assets/common.js and common.js.gz).

    desc "Compiles CoffeeScript using Barista (but only if the files changed)"
    task 'coffee:compile' => :environment do
      abort "'#{Barista::Compiler.bin_path}' is unavailable." unless Barista::Compiler.available?
      Barista.compile_all! false, false
    end

    desc "Compiles SASS using Compass"
    task 'sass:compile' do
      system 'compass compile'
    end

    namespace :assets do
      desc "Compiles all assets (CSS/JS)"
      task :compile => ['coffee:compile', 'sass:compile']

      desc "Bundles all assets with Jammit"
      task :bundle => :environment do
        system "cd #{Rails.root} && jammit"
      end

      desc "Removes all compiled and bundled assets"
      task :clean => :environment do
        # Directories under public/ that hold generated files.
        files = []
        files << ['assets']
        files << ['javascripts', 'compiled']
        files << ['stylesheets', 'compiled']
        files = files.map { |path| Dir[Rails.root.join('public', *path, '*.*')] }.flatten

        puts "Removing:"
        files.each do |file|
          puts "  #{file.gsub(Rails.root.to_s + '/', '')}"
        end

        File.delete(*files)
      end

    end

    desc "Compiles and bundles all assets"
    task :assets => ['assets:compile', 'assets:bundle']

Amazon Simple Storage Service (S3) + CloudFront

Amazon S3 offers virtually infinite storage, and CloudFront offers a way to distribute that content worldwide. You upload a file to an S3 bucket and it gets distributed to region-based endpoints. This way, if someone from Japan hits your site, static files are served from a data center near them. In addition, unlike S3, CloudFront can negotiate content: we can package assets into .gz files and serve those when the browser is capable of receiving compressed content. This makes the mobile experience dramatically better.

We’ve created a CloudFront distribution for our bucket in the Amazon dashboard and pointed static.example.com in our own DNS at somemagicnumber.cloudfront.net. This way I can go to http://static.example.com/assets/common.css and see the assets/common.css file that I uploaded to S3.

Distributing through CloudFront has one major caveat: data is cached for at least 24 hours, so you cannot use regular cache-busting techniques, such as appending a timestamp to every URL. This is where things get complicated. In the meantime, let’s tell our application to use the CLOUDFRONT_URL environment variable where available, or S3_BUCKET otherwise. This can be done in config/environment.rb before Application.initialize!.

    Example::Application.configure do

      cloudfront_url = ENV["CLOUDFRONT_URL"]
      s3_bucket = ENV["S3_BUCKET"]

      if cloudfront_url
        # Serve assets through the CloudFront distribution
        config.action_controller.asset_host = cloudfront_url
      elsif s3_bucket
        # Serve assets from Amazon S3
        config.action_controller.asset_host = "http://#{s3_bucket}.s3.amazonaws.com"
      end

    end
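The selection logic is easy to check in isolation. The following is a small sketch, not the application's actual code: asset_host_for is a hypothetical helper that takes any ENV-like hash, and treating empty strings as unset is an extra assumption that the configuration block does not make.

```ruby
# Hypothetical helper mirroring the configuration block: picks the asset host
# from an ENV-like hash. CLOUDFRONT_URL wins over S3_BUCKET. Returns nil when
# neither is set, which leaves Rails serving assets from the app itself.
# Assumption of this sketch: empty strings count as "not set".
def asset_host_for(env)
  cloudfront_url = env['CLOUDFRONT_URL'].to_s
  s3_bucket = env['S3_BUCKET'].to_s

  if !cloudfront_url.empty?
    cloudfront_url
  elsif !s3_bucket.empty?
    "http://#{s3_bucket}.s3.amazonaws.com"
  end
end
```

In the real configuration this would be called as asset_host_for(ENV), which makes it possible to keep CloudFront in production while a staging app falls back to its S3 bucket.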

We can now configure Heroku applications with either a CLOUDFRONT_URL or an S3_BUCKET depending on the environment.

    heroku config:add CLOUDFRONT_URL=http://static.example.com

Examining the page source we’ll see that http://static.example.com has been inserted for all references to .css files.

Versioning and Cache Busting

If I update the assets/common.css file in S3, changes won’t appear on static.example.com for at least 24 hours. That’s not going to work for continuous deployment.

The solution is to version our asset files. We chose to use the git hash.

    $ git rev-parse HEAD
    50fd3fcfa592eaac16cce6b3c508cf2487749bb0

Instead of uploading assets/common.css we’ll upload assets/50fd3fcfa592eaac16cce6b3c508cf2487749bb0/common.css. The rake task that performs the upload also deletes existing files left over from a previous hash. We typically run something like rake assets:push:to_staging. Note that this task needs some prerequisite functions; read this post for getting started with S3 and Rake.

    # Uploads assets to S3 under assets/githash and deletes stale assets.
    # These tasks are invoked as assets:uploadToS3 and assets:push:*, so they
    # live inside namespace :assets; logger and s3i are the prerequisite helpers.
    task :uploadToS3, [ :to ] => :environment do |t, args|
      from = File.join(Rails.root, 'public/assets')
      to = args[:to]
      hash = `git rev-parse HEAD`.chomp

      # Collect every existing key under assets/ so stale ones can be removed.
      logger.info("[#{Time.now}] fetching keys from #{to}")
      existing_objects_hash = {}
      s3i.incrementally_list_bucket(to) do |response|
        response[:contents].each do |existing_object|
          next unless existing_object[:key].start_with?("assets/")
          existing_objects_hash[existing_object[:key]] = existing_object
        end
      end

      logger.info("[#{Time.now}] copying from #{from} to s3:#{to} @ #{hash}")
      Dir.glob(from + "/**/*").each do |entry|
        next if File.directory?(entry)
        key = 'assets/'
        # Backticks return an empty string (never nil) when git fails.
        key += "#{hash}/" unless hash.empty?
        key += entry[(from.length + 1)..-1]
        # Anything uploaded in this run is not stale.
        existing_objects_hash.delete(key)
        logger.info("[#{Time.now}] uploading #{key}")
        File.open(entry, 'rb') do |file|
          s3i.put(to, key, file, { 'x-amz-acl' => 'public-read' })
        end
      end

      # Whatever is left belongs to a previous hash.
      existing_objects_hash.keys.each do |key|
        logger.info("[#{Time.now}] deleting #{key}")
        s3i.delete(to, key)
      end
    end

    namespace :push do
      task :to_staging => [ :environment, :assets ] do
        Rake::Task["assets:uploadToS3"].execute({ to: 'example-staging' })
      end
      task :to_production => [ :environment, :assets ] do
        Rake::Task["assets:uploadToS3"].execute({ to: 'example-production' })
      end
    end
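The two pieces of logic that matter in the upload task are how a key is built and how stale keys are found. Here they are pulled out as a minimal sketch that runs without S3. versioned_key and stale_keys are hypothetical helper names, not part of any S3 library, and the short hash in the usage note is made up.

```ruby
# Hypothetical helper: builds the S3 key for a local file found under +from+,
# for example public/assets/common.css -> assets/<hash>/common.css.
# Falls back to an unversioned key when no hash is available.
def versioned_key(from, entry, hash)
  relative = entry[(from.length + 1)..-1]
  prefix = hash.to_s.empty? ? 'assets/' : "assets/#{hash}/"
  prefix + relative
end

# Hypothetical helper: keys under assets/ that exist in the bucket but were
# not part of this upload, which means they belong to an older hash.
def stale_keys(existing_keys, uploaded_keys)
  existing_keys.select { |key| key.start_with?('assets/') } - uploaded_keys
end
```

For example, with hash abc123 the file public/assets/images/logo.png ends up at assets/abc123/images/logo.png, and a leftover assets/old/common.css is reported as stale while keys outside assets/ are never touched.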

We now need to tell our Rails application about this hash. We’ll adjust our config/environment.rb as follows.

    asset_hash = ENV["ASSET_HASH"]
    if asset_hash
      config.action_controller.asset_path = proc { |asset_path|
        # sub (not gsub) so that only the first "assets/" segment is rewritten
        asset_path.sub("assets/", "assets/#{asset_hash}/")
      }
    end
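To see what this rewrite does to individual paths, here is a small standalone sketch. versioned_asset_path is a hypothetical name, and the sketch uses sub rather than gsub so that a path containing a second assets/ segment is only rewritten once, which is a choice made here for safety.

```ruby
# Hypothetical helper: inserts the deployment hash after the first "assets/"
# segment of a path. Paths without "assets/" and calls without a hash are
# returned unchanged.
def versioned_asset_path(asset_path, asset_hash)
  return asset_path if asset_hash.nil? || asset_hash.empty?
  asset_path.sub('assets/', "assets/#{asset_hash}/")
end
```

With a (made up) short hash 50fd3fc, /assets/common.css becomes /assets/50fd3fc/common.css, while /images/photo.png is left alone because it does not live under assets/.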

Let’s set ASSET_HASH on Heroku to the hash value before every deployment.

    namespace :heroku do

      namespace :hash do
        task :to_production => [ :environment, :assets ] do
          Rake::Task["heroku:hash:addHashToEnvironment"].execute({ app: 'example-production' })
        end
        task :to_staging => [ :environment ] do
          Rake::Task["heroku:hash:addHashToEnvironment"].execute({ app: 'example-staging' })
        end
        task :addHashToEnvironment, [ :app ] => [ :environment ] do |t, args|
          hash = `git rev-parse HEAD`.chomp
          Rake::Task["heroku:config:add"].execute({ app: args[:app], value: "ASSET_HASH=#{hash}" })
        end
      end

      namespace :config do
        desc "Set a configuration parameter on Heroku"
        task :add, [ :app, :value ] => :environment do |t, args|
          app = "--app #{args[:app]}" if args[:app]
          value = args[:value]
          logger.debug("[#{Time.now}] running 'heroku config:add #{app} #{value}'")
          `heroku config:add #{app} #{value}`
        end
      end

    end
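One possible variation, sketched below, is to build the heroku command as an argument list instead of a string, so that values containing spaces or shell characters are passed through untouched. heroku_config_add_args is a hypothetical helper, and the only CLI features it relies on are the config:add command and --app flag already used above.

```ruby
# Hypothetical helper: builds the argument list for `heroku config:add`.
# Passing the result to system(*args) bypasses the shell, so the value
# does not need quoting.
def heroku_config_add_args(key, value, app = nil)
  args = ['heroku', 'config:add', "#{key}=#{value}"]
  args += ['--app', app] if app
  args
end

# Usage (not executed here):
#   system(*heroku_config_add_args('ASSET_HASH', hash, 'example-production'))
```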

Assets are now served from a new directory with every deployment, effectively working around the CloudFront cache limitations.

Unpleasant Surprise with Static Images

After implementing the hashing solution I had an unpleasant surprise: CSS files were referencing static images with absolute paths, such as /assets/images/logo.png. This doesn’t work because the browser requests http://static.example.com/assets/images/logo.png, which lacks the hash and so no longer matches anything we upload. There’s no way to insert the hash into the CSS at compile time (a chicken-and-egg problem). No big deal, we can just make this path relative, right? Unfortunately, Jammit rewrites relative paths (#167), transforming assets/images/logo.png into ../stylesheets/images/logo.png. Fortunately it’s open source, so I added a new rewrite_relative_paths option on my fork.

Setting rewrite_relative_paths: off in config/assets.yml causes Jammit to leave relative URLs alone.
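The difference between the two kinds of reference can be checked with Ruby's standard URI library, which resolves a reference against a base URL the same way a browser does. This is only an illustration: resolve_css_url is a hypothetical helper, the short hash is made up, and images/logo.png as the relative form is an assumption based on images being uploaded next to the stylesheet under the same hash.

```ruby
require 'uri'

# Hypothetical helper: resolves a url(...) reference found in a stylesheet
# against the URL the stylesheet itself was loaded from.
def resolve_css_url(stylesheet_url, reference)
  URI.join(stylesheet_url, reference).to_s
end

STYLESHEET_URL = 'http://static.example.com/assets/50fd3fc/common.css'

# An absolute path drops the hash directory and misses the uploaded file.
ABSOLUTE_RESULT = resolve_css_url(STYLESHEET_URL, '/assets/images/logo.png')

# A relative path stays inside the hash directory of the stylesheet.
RELATIVE_RESULT = resolve_css_url(STYLESHEET_URL, 'images/logo.png')
```

The absolute reference resolves to http://static.example.com/assets/images/logo.png, while the relative one resolves to http://static.example.com/assets/50fd3fc/images/logo.png, which is where the upload task puts the file.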

Heroku Predeploy

Let’s summarize what happens for us with every deployment.

  1. Copy our production database to the target environment unless we’re deploying to production [blog post].
  2. Synchronize image data (we have lots of images) between the production and the target S3 bucket unless we’re deploying to production [blog post].
  3. Push assets to S3 under the current git hash (see above).
  4. Set ASSET_HASH on the target Heroku app (see above).

We wrap this up in a heroku:predeploy task and use Heroku-Bartender to deploy.
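The ordering and the production exception in the list above can be sketched as a plain function. This is not our actual heroku:predeploy task: predeploy_steps and the step names are hypothetical, and a real task would invoke the corresponding Rake tasks instead of returning symbols.

```ruby
# Hypothetical sketch of the predeploy sequence. The database copy and image
# sync are skipped when the target is production, because production is
# their source.
def predeploy_steps(target)
  steps = []
  unless target == :production
    steps << :copy_production_database
    steps << :sync_images_from_production
  end
  steps << :push_assets_to_s3   # upload under the current git hash
  steps << :set_asset_hash      # set ASSET_HASH on the target Heroku app
  steps
end
```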

Suggestions for improvements are always welcome!