Daniel Doubrovkine

aka dB., @awscloud, former CTO @artsy, +@vestris, NYC


When someone says that you need “100 grams” to figure something out, it means that it’s completely unobvious and complicated and therefore you need a few vodka shots. Vodka in Russia is measured in grams – 100g being something you would casually drink for breakfast. Alright, today we’re going to figure out how to make Amazon CloudFront actually work with static assets, Rails, Jammit and Heroku. Get your vodka glass, you’re going to need one.

First some background.

Syntactically Awesome Stylesheets (SASS) to CSS

SASS is a way to author CSS. We write all stylesheets using SASS and place them into app/stylesheets. These are compiled with compass and placed into public/stylesheets. Note that stylesheets often reference images in various properties, such as background. Those images in our system are added to public/assets/images. There are a few good articles that delve into SASS itself, including this one.

CoffeeScript to JavaScript

CoffeeScript is a language that compiles into JavaScript. We use Backbone.js heavily and write all javascript in coffee. Our files live in app/coffeescripts. Coffeescript is compiled with barista and the output is placed into public/javascripts.

Asset Packaging

CSS files generated by SASS and JS files generated by coffee are packaged together by Jammit. The latter uses a configuration file to specify what is to be packaged and how. Here’s a simplified version of our config/assets.yml.

package_assets: on

javascripts:

  vendor:
    - public/javascripts/vendor/plugins/**/*.js

  common:
    - public/javascripts/models/**/*.js
    - public/javascripts/views/**/*.js
    - public/javascripts/controllers/**/*.js

stylesheets:

  common:
    - public/stylesheets/common/global.css
    - public/stylesheets/common/forms.css
    - public/stylesheets/plugins/jquery-ui.css

  client:
    - public/stylesheets/common/client.css
    - public/stylesheets/plugins/jquery.autocomplete.css

Generating Assets

We use a simple Rake task to generate assets, heavily inspired by this gist (bonus clean task included). I can run rake assets and get output in public/assets that is a mix of checked-in (e.g. public/assets/images) and generated (e.g. public/assets/common.js and common.js.gz) files.

desc "Compiles CoffeeScript using Barista (but only if they changed)"
task 'coffee:compile' => :environment do
  abort "'#{Barista::Compiler.bin_path}' is unavailable." unless Barista::Compiler.available?
  Barista.compile_all! false, false
end

desc "Compiles SASS using Compass"
task 'sass:compile' do
  system 'compass compile'
end

namespace :assets do
  desc "Compiles all assets (CSS/JS)"
  task :compile => ['coffee:compile', 'sass:compile']

  desc "Bundles all assets with Jammit"
  task :bundle => :environment do
    system "cd #{Rails.root} && jammit"
  end

  desc "Removes all compiled and bundled assets"
  task :clean => :environment do
    files = []
    files << ['assets']
    files << ['javascripts', 'compiled']
    files << ['stylesheets', 'compiled']
    files = files.map { |path| Dir[Rails.root.join('public', *path, '*.*')] }.flatten

    puts "Removing:"
    files.each do |file|
      puts "  #{file.gsub(Rails.root.to_s + '/', '')}"
    end

    File.delete *files
  end

end

desc "Compiles and bundles all assets"
task :assets => ['assets:compile', 'assets:bundle']

Amazon Simple Storage Service (S3) + CloudFront

Amazon S3 offers virtually infinite storage and a way to distribute content worldwide with CloudFront. You upload a file to an S3 bucket and it gets distributed worldwide to region-based endpoints. This way if someone from Japan hits your server, you can serve static files from a local data center in Japan. In addition, unlike S3, CloudFront can negotiate content: we can package assets into .gz files and serve those when the browser is capable of receiving compressed content. The latter makes the mobile experience an order of magnitude better.

We’ve distributed our bucket on the CloudFront dashboard on Amazon and pointed our own DNS server for static.example.com to somemagicnumber.cloudfront.net. This way I can go to https://static.example.com/assets/common.css and see the assets/common.css file that I have uploaded to S3.

Distributing to CloudFront has one major caveat: data is cached for at least 24 hours, so you cannot use regular cache-busting techniques, such as appending a timestamp to every URL. This is where things will get complicated. In the meantime, let’s tell our application to use an environment variable CLOUDFRONT_URL where available or S3_BUCKET otherwise. This can be done in config/environment.rb before Application.initialize!.

Example::Application.configure do

  cloudfront_url = ENV["CLOUDFRONT_URL"]
  s3_bucket = ENV["S3_BUCKET"]

  if cloudfront_url
    config.action_controller.asset_host = cloudfront_url
  elsif s3_bucket
    # Serve assets from Amazon S3
    config.action_controller.asset_host = "https://" + s3_bucket + ".s3.amazonaws.com"
  end

end

We can now configure Heroku applications with either a CLOUDFRONT_URL or an S3_BUCKET depending on the environment.

heroku config:add CLOUDFRONT_URL=https://static.example.com

Examining the page source we’ll see that https://static.example.com has been inserted for all references to .css files.

Versioning and Cache Busting

If I update the assets/common.css file in S3, changes won’t appear on static.example.com for at least 24 hours. That’s not going to work for continuous deployment.

The solution is to version our asset files. We chose to use the git hash.

$ git rev-parse HEAD
50fd3fcfa592eaac16cce6b3c508cf2487749bb0

Instead of uploading assets/common.css we’ll upload assets/50fd3fcfa592eaac16cce6b3c508cf2487749bb0/common.css. The rake task that performs the upload will also delete the existing files that are being replaced by a new hash. We typically run something like rake assets:push:to_staging. Note that this task needs some prerequisite functions; read this post for getting started with S3 and Rake.

# uploads assets to s3 under assets/githash, deletes stale assets
task :uploadToS3, [:to] => :environment do |t, args|
  from = File.join(Rails.root, 'public/assets')
  to = args[:to]
  hash = (`git rev-parse HEAD` || "").chomp

  logger.info("[#{Time.now}] fetching keys from #{to}")
  existing_objects_hash = {}
  s3i.incrementally_list_bucket(to) do |response|
    response[:contents].each do |existing_object|
      next unless existing_object[:key].start_with?("assets/")
      existing_objects_hash[existing_object[:key]] = existing_object
    end
  end

  logger.info("[#{Time.now}] copying from #{from} to s3:#{to} @ #{hash}")
  Dir.glob(from + "/**/*").each do |entry|
    next if File::directory?(entry)
    key = 'assets/'
    key += (hash + '/') unless hash.empty? # an empty string is truthy in Ruby, so test for emptiness
    key += entry.slice(from.length + 1, entry.length - from.length - 1)
    existing_objects_hash.delete(key)
    logger.info("[#{Time.now}] uploading #{key}")
    s3i.put(to, key, File.open(entry), { 'x-amz-acl' => 'public-read' })
  end

  existing_objects_hash.keys.each do |key|
    puts "deleting #{key}"
    s3i.delete(to, key)
  end
end

namespace :push do
  task :to_staging => [:environment, :assets] do
    Rake::Task["assets:uploadToS3"].execute({ to: 'example-staging' })
  end
  task :to_production => [:environment, :assets] do
    Rake::Task["assets:uploadToS3"].execute({ to: 'example-production' })
  end
end
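
The key arithmetic in the upload task is easy to get wrong, so here is a minimal plain-Ruby sketch of the same idea that can be run without S3. The helper names versioned_key and stale_keys are hypothetical (they are not part of the task above or of any S3 library), and the short hash is just a stand-in for a full git hash.

```ruby
# Hypothetical helper: builds the S3 key for a local file found under `from`.
def versioned_key(from, entry, hash)
  # Path of the file relative to the local assets directory.
  relative = entry.delete_prefix(from).delete_prefix('/')
  parts = ['assets']
  # Only add the hash segment when a hash is actually available.
  parts << hash unless hash.nil? || hash.empty?
  parts << relative
  parts.join('/')
end

# Hypothetical helper: keys under assets/ that the new upload does not cover.
def stale_keys(existing_keys, uploaded_keys)
  existing_keys.select { |key| key.start_with?('assets/') } - uploaded_keys
end

from = '/app/public/assets'
new_key = versioned_key(from, '/app/public/assets/images/logo.png', '50fd3fc')
puts new_key

existing = ['assets/0ld/common.css', 'assets/50fd3fc/images/logo.png', 'uploads/a.png']
puts stale_keys(existing, [new_key]).inspect
```

Anything outside assets/ is left alone, which mirrors the start_with? guard in the task.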

We now need to tell our Rails application about this hash. We’ll adjust our config/environment.rb as follows.

asset_hash = ENV["ASSET_HASH"]
if asset_hash
  config.action_controller.asset_path = proc { |asset_path|
    asset_path.gsub("assets/", "assets/#{asset_hash}/")
  }
end
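
To see what the rewrite does to a path, here is a small plain-Ruby stand-in for that proc that runs outside Rails. The function name versioned_asset_path is hypothetical. It is also a slight variation rather than the exact code above: it uses sub with an anchored pattern, on the assumption that only the leading assets/ segment should receive the hash, whereas gsub would also rewrite any later assets/ in the path.

```ruby
# Hypothetical stand-in for the asset_path proc; not part of Rails itself.
def versioned_asset_path(asset_path, asset_hash)
  # Without a hash, leave the path untouched (mirrors the `if asset_hash` guard).
  return asset_path if asset_hash.nil? || asset_hash.empty?
  # Rewrite only the leading "assets/" segment, with or without a leading slash.
  asset_path.sub(%r{\A/?assets/}) { |prefix| "#{prefix}#{asset_hash}/" }
end

hash = '50fd3fcfa592eaac16cce6b3c508cf2487749bb0'
puts versioned_asset_path('/assets/common.css', hash)
puts versioned_asset_path('/assets/images/assets/icon.png', hash)
```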

Let’s set ASSET_HASH on Heroku with the hash value before every deployment.

namespace :heroku do

  namespace :hash do
    task :to_production => [:environment, :assets] do
      Rake::Task["heroku:hash:addHashToEnvironment"].execute({ app: 'example-production' })
    end
    task :to_staging => [:environment] do
      Rake::Task["heroku:hash:addHashToEnvironment"].execute({ app: 'example-staging' })
    end
    task :addHashToEnvironment, [:app]  => [:environment] do |t, args|
      hash = (`git rev-parse HEAD` || "").chomp
      Rake::Task["heroku:config:add"].execute({ app: args[:app], value: "ASSET_HASH=#{hash}" })
    end
  end

  namespace :config do
    desc "Set a configuration parameter on Heroku"
    task :add, [:app, :value] => :environment do |t, args|
      app = "--app #{args[:app]}" if args[:app]
      value = args[:value]
      logger.debug("[#{Time.now}] running 'heroku config:add #{app} #{value}'")
      `heroku config:add #{app} #{value}`
    end
  end

end

Assets are now served from a new directory with every deployment, effectively working around the CloudFront cache limitations.

Unpleasant Surprise with Static Images

After implementing the hashing solution I had an unpleasant surprise: CSS files were referencing static images with absolute paths, such as /assets/images/logo.png. This doesn’t work because the browser resolves it to https://static.example.com/assets/images/logo.png, a URL without the hash. There’s no way to insert the hash into this at compile time (chicken-and-egg problem). No big deal, we can just make this path relative, right? Unfortunately Jammit rewrites relative paths (#167), which transforms assets/images/logo.png into ../stylesheets/images/logo.png. Fortunately it’s open-source, so I added a new rewrite_relative_paths option on my fork.

Adding rewrite_relative_paths: off to config/assets.yml causes Jammit to leave the relative URLs alone.
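
Why a relative URL fixes this can be checked with Ruby’s standard URI library, which resolves references the same way a browser resolves url(...) inside a stylesheet. This is only a sketch: the helper name resolve_css_url, the shortened hash and the relative reference images/logo.png are assumptions for illustration.

```ruby
require 'uri'

# Hypothetical helper: resolves a CSS url(...) reference against the stylesheet's own URL.
def resolve_css_url(stylesheet_url, reference)
  URI.join(stylesheet_url, reference).to_s
end

stylesheet = 'https://static.example.com/assets/50fd3fc/common.css'

# An absolute path ignores the stylesheet's directory, so the hash is lost.
puts resolve_css_url(stylesheet, '/assets/images/logo.png')

# A relative path is resolved next to the stylesheet, so the hash is kept.
puts resolve_css_url(stylesheet, 'images/logo.png')
```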

Heroku Predeploy

Let’s summarize what happens for us with every deployment.

  1. Copy our production database to the target environment unless we’re deploying to production. [blog post]
  2. Synchronize image data (we have lots of images) between the production and the target S3 bucket unless we’re deploying to production. [blog post]
  3. Push assets to S3 under the current git hash (see above).
  4. Set ASSET_HASH on the target Heroku app (see above).

We wrap this up in a heroku:predeploy task and use Heroku-Bartender to deploy.
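
The ordering and the production exception in the list above can be sketched in plain Ruby. The step labels and the predeploy_steps function are hypothetical placeholders, not the actual task names behind heroku:predeploy.

```ruby
# Sketch only: step labels are hypothetical placeholders, not real task names.
def predeploy_steps(target)
  steps = []
  # Steps 1 and 2 are skipped when the target is production itself.
  unless target == 'production'
    steps << 'copy production database to target'
    steps << 'sync S3 images from production to target'
  end
  # Steps 3 and 4 run for every deployment.
  steps << 'push assets to S3 under git hash'
  steps << 'set ASSET_HASH on Heroku app'
  steps
end

predeploy_steps('staging').each_with_index { |step, i| puts "#{i + 1}. #{step}" }
```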

Suggestions for improvements always welcome!