Octopress: Setting up a Blog and Contributing to an Existing One

Posted Tuesday, 17 January 2012

Octopress documentation can be quite confusing. It took me a while to understand what the heck Octopress is doing to branches and remote origins. It’s actually pretty simple, so I am going to try to un-confuse you. I will also show you a better way to contribute to an existing blog and explain what’s happening in those Rake tasks.

We are going to deploy a blog to Github pages, so we need a project, such as username.github.com. Go to Github to create one. Use your username instead of “username”.

Next, fetch Octopress and install it locally. This gets the files from its main repository and applies a default theme.

    $ git clone git://github.com/imathis/octopress.git octopress

    Cloning into octopress...
    remote: Counting objects: 6046, done.
    remote: Compressing objects: 100% (2420/2420), done.
    remote: Total 6046 (delta 3448), reused 5549 (delta 3097)
    Receiving objects: 100% (6046/6046), 1.26 MiB | 426 KiB/s, done.
    Resolving deltas: 100% (3448/3448), done.

    $ cd octopress
    Using /home/dblock/.rvm/gems/ruby-1.9.2-p290

    octopress$ bundle install
    Fetching source index for http://rubygems.org/
    ...
    Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed.

    octopress$ rake install
    ## Copying classic theme into ./source and ./sass

Octopress comes with some handy Rake tasks to get you started. To deploy to Github pages, run rake setup_github_pages. When prompted, enter the Git URL of your new repository, such as git@github.com:username/username.github.com.git.

    octopress$ rake setup_github_pages
    Enter the read/write url for your repository: git@github.com:username/username.github.com.git

    Added remote git@github.com:username/username.github.com.git as origin
    Set origin as default remote
    Master branch renamed to 'source' for committing your blog source files
    Initialized empty Git repository in /home/username/source/octopress/_deploy/.git/
    [master (root-commit) 2a4e9e7] Octopress init
    1 files changed, 1 insertions(+), 0 deletions(-)
    create mode 100644 index.html

    ---
    ## Now you can deploy to http://username.github.com with `rake deploy` ##

So what the heck happened here? It pointed our clone to our new repository. It also created a _deploy directory with another git repository that is going to contain everything that is being deployed. The remote in that directory is the same as the one in our octopress directory, but the checked out branch is master. Btw, we’re now on the source branch.

    octopress$ git remote -v
    octopress    git://github.com/imathis/octopress.git (fetch)
    octopress    git://github.com/imathis/octopress.git (push)
    origin    git@github.com:username/username.github.com.git (fetch)
    origin    git@github.com:username/username.github.com.git (push)

    octopress$ git branch
    * source

    octopress$ cd _deploy/
    octopress/_deploy$ git remote -v
    origin    git@github.com:username/username.github.com.git (fetch)
    origin    git@github.com:username/username.github.com.git (push)

    octopress/_deploy$ git branch
    * master

    octopress/_deploy$ cd ..

    octopress$

Now is a good time to read Blogging Basics. You should edit _config.yml with your blog name, etc. Let’s create an article and deploy it.

    octopress$ rake new_post["New Post"]
    Creating new post: source/_posts/2012-01-17-new-post.markdown

Edit the generated file and add some text at the bottom.

Generate the blog.

    octopress$ rake generate
    ## Generating Site with Jekyll
    directory source/stylesheets/
       create source/stylesheets/screen.css
    Configuration from /home/dblock/source/o/octopress/_config.yml
    Building site: source -> public
    Successfully generated site: source -> public

You can also preview it with rake preview.

Before we deploy the blog, save the source and push it to Github. Note that we’re pushing our source branch.

    octopress$ git add .

    octopress$ git commit -m "Initial blog post."
    ...

    octopress$ git push origin source
    Counting objects: 3927, done.
    Compressing objects: 100% (1412/1412), done.
    Writing objects: 100% (3927/3927), 910.08 KiB, done.
    Total 3927 (delta 2257), reused 3848 (delta 2203)
    To git@github.com:username/username.github.com.git
    * [new branch]      source -> source

You’ll have to repeat the above every time you make changes, to save them.

Deploy the blog. What this does is take everything inside _deploy and push it onto the master branch.

    octopress$ rake deploy

    ## Pushing generated _deploy website
    Counting objects: 84, done.
    Compressing objects: 100% (74/74), done.
    Writing objects: 100% (84/84), 180.40 KiB, done.
    Total 84 (delta 2), reused 0 (delta 0)
    To git@github.com:username/username.github.com.git
    * [new branch]      master -> master

If you go to http://username.github.com you should see your blog with the blog post once Github has regenerated the pages – usually a minute or two. And on https://github.com/username/username.github.com you should be able to see the generated files on master along with a source branch with the blog source.

You’ll have to do this every time you want to deploy your changes.

So how does one start contributing to an existing Octopress blog (or to one’s own blog from a new computer)? What we want is the same setup as above, but not from scratch.

    $ git clone git@github.com:username/username.github.com.git
    $ cd username.github.com
    username.github.com$ git checkout source
    username.github.com$ mkdir _deploy
    username.github.com$ cd _deploy
    username.github.com/_deploy$ git init
    username.github.com/_deploy$ git remote add origin git@github.com:username/username.github.com.git
    username.github.com/_deploy$ git pull origin master
    username.github.com/_deploy$ cd ..
    username.github.com$

You’re all set. Create posts and stuff. Happy blogging with Octopress.

 

Octopress: Blogging Evolution

Posted Tuesday, 17 January 2012

In ’93 I wrote a Guestbook CGI in C++. Then in ’94 I made my first blogging system. I honestly don’t remember what technology it used, but I think it was some hairy PHP with a ton of issues. Data must have been stored in text files or something like that. The technology has since evolved and has had hours of work poured into it. This blog’s code is here. But don’t use it. It all works, but it’s proprietary and riddled with legacy. I would have done something very different today. So …

I wanted something modern, fresh, nerdy and open-source for the new Art.sy Engineering blog. Last week I played with Octopress. It was confusing at first, but now feels completely natural and clean. Octopress is fully integrated with Git and Github pages, so we love it. Check it out. Bookmark it. Subscribe to its RSS.

image

There’s also an awesome first post by @mmcnierney14 on responsive layout with CSS3 and a collection of our open-source projects.

 

Paginating w/ Mongoid 2.4.0, MongoMapper and Kaminari

Posted Sunday, 15 January 2012

I’ve upgraded our project from Mongoid 2.0.2 to 2.4.0. It took me a few days since our specs raised a couple of real issues. If you’re doing the same, take Mongoid from the tip of 2.4.0-Stable.

If you remember, 2.0.2 dropped pagination support and a helpful Kaminari gem took over (details here). Once again the upgrade had a surprise: the number of items on the current page was wrong, displaying the entire count of a collection. I thought this was a bug in Mongoid and created #1584. Turns out that the behavior of count on a Mongoid::Criteria is now aligned with the Ruby driver, which takes a curious boolean skip_and_limit parameter that basically says whether to take limit and skip options into account (doc here). So calling Foo.limit(1).count may return 10 if there’re 10 Foos. The fix is to call Foo.limit(1).count(true). I am going to guess this was a bug in the Mongo driver and the addition of a boolean was a clever fix or hack?
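To make the semantics concrete, here is a toy stand-in written in plain Ruby. It is not Mongoid or the Mongo driver: ToyCriteria and its internals are made up for illustration, and only the shape of count(skip_and_limit) mirrors the behavior described above.

```ruby
# A hypothetical, in-memory imitation of a criteria object.
# Only the count(skip_and_limit) contract is modeled here.
class ToyCriteria
  def initialize(docs, limit = nil)
    @docs = docs
    @limit = limit
  end

  # Returns a new criteria that remembers the limit.
  def limit(n)
    ToyCriteria.new(@docs, n)
  end

  # Without the flag, the limit is ignored and the whole collection is counted.
  # With the flag, the limit is taken into account.
  def count(skip_and_limit = false)
    return @docs.size unless skip_and_limit && @limit
    [@docs.size, @limit].min
  end
end

foos = ToyCriteria.new((1..10).to_a)
puts foos.limit(1).count       # counts all ten documents
puts foos.limit(1).count(true) # counts only the limited page
```

The default of false is what makes the upgrade surprising: code that never passed an argument keeps running, it just returns a different number.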

Kaminari needed to pass the boolean, which meant adding a current_page_count to the Kaminari collection wrapper, pulled in #194. Next version (probably 0.14.0) will have the fix. In the meantime, I am not super happy with my implementation:

  • It’s not possible to know whether count takes a parameter: method(:count).arity doesn’t provide enough indication for optional parameters, so the code relies on an ArgumentError.
  • MongoMapper needed the same fix, but is lacking a way to pass count to the driver. The current implementation calls to_a, which can’t be good when you just want a count.
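The arity problem from the first bullet is easy to reproduce in plain Ruby. This is a minimal sketch, not the Kaminari code: FixedCounter, FlexibleCounter and page_count are hypothetical names used only to show why arity is ambiguous and what the ArgumentError fallback looks like.

```ruby
# count without parameters
class FixedCounter
  def count
    3
  end
end

# count with an optional boolean, in the style of skip_and_limit
class FlexibleCounter
  def count(skip_and_limit = false)
    skip_and_limit ? 1 : 3
  end
end

p FixedCounter.new.method(:count).arity    # => 0
p FlexibleCounter.new.method(:count).arity # => -1
# Array#count also reports -1, but its optional argument means something
# completely different (count the elements equal to the argument).
p [].method(:count).arity                  # => -1

# Fallback: try the boolean form first, retry without it on ArgumentError.
def page_count(collection)
  collection.count(true)
rescue ArgumentError
  collection.count
end

p page_count(FixedCounter.new)    # => 3
p page_count(FlexibleCounter.new) # => 1
```

The downside of the rescue is that it also swallows an ArgumentError raised for some unrelated reason inside count, which is one more reason not to be super happy with it.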
 

Warning: Toplevel Constant XYZ Referenced Admin:XYZ

Posted Saturday, 07 January 2012

I posted this to a Ruby forum a while ago.

I have controllers in a namespace and controllers outside of the namespace. For example, I have a PagesController and an Admin::PagesController. When I run rspec from the top, tests pass and I get the following warning: spec/controllers/admin/pages_controller_spec.rb:4: warning: toplevel constant PagesController referenced by Admin::PagesController. This makes no sense. I do have a PagesController and an Admin::PagesController and specs for both that are declared properly.

This was only happening under Spork, so I posted a similar question to the sporkgem list.

I also found a workaround, to require the Admin controllers first in spec/spec_helper.rb.

    Dir[File.expand_path("app/controllers/admin/*.rb")].each do |file|
      require file
    end

Finally, @tilsammans figured it out. It’s the same problem as what I have: an Admin namespace and an Admin class.

It was because I also had a class Admin, as well as a namespace Admin. Since Admin was a class (a model) it inherited from Object which made the top-level ApplicationController available inside the Admin namespace. The reply by Andrew White on http://groups.google.com/group/rubyonrails-core/browse_thread/thread/bab5e87ee10d2ecb led me to find the right answer. In the end I renamed Admin to AdminUser and everything fell into place.

This is rather counterintuitive and one would think Ruby should somehow handle this situation, but it at least makes technical sense.

 

Fabricating Spec Failures

Posted Friday, 06 January 2012

I love fabricators. We use the awesome fabrication gem that lets you do some pretty neat things in Rails specs.

For example, we have a page that lists users in alphabetical order. The retrieval is implemented in a controller.

    def index
      @users = User.asc(:name)
    end

The fabricator for a User generates a global sequence to give each user a unique name.

    Fabricator(:user) do
      name { Fabricate.sequence(:name) { |i| "Joe #{i}" } }
    end

To test the above-mentioned controller, we would fabricate a couple users and ensure that they are returned in the correct order.

    it "returns users in alphabetical order" do
      user1 = Fabricate :user
      user2 = Fabricate :user
      get :index
      assigns(:users).should eq [ user1, user2 ]
    end

What could possibly go wrong here?

I made a beginner mistake that just looks like someone else’s fault (specs fail depending on which order you run them). To get a failed test we fabricate 8 users before this spec is run. The next two users that are fabricated are Joe 9 and Joe 10. When sorted alphabetically Joe 10 comes before Joe 9, duh. It’s a good lesson in not relying on external behavior for tests – in this case we should not rely on knowing how names are generated in a fabricator to test that the users are sorted by name. Instead, we should assign names explicitly.

    it "returns users in alphabetical order" do
      # explicit names make the expected order independent of the fabricator sequence
      user1 = Fabricate :user, name: "A"
      user2 = Fabricate :user, name: "B"
      get :index
      assigns(:users).should eq [ user1, user2 ]
    end
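The ordering trap itself needs no Rails or fabricators to reproduce. A minimal plain-Ruby sketch, with the names built the same way the fabricator sequence builds them:

```ruby
# Names as the sequence would generate them for the 9th and 10th user.
names = (9..10).map { |i| "Joe #{i}" }
p names      # => ["Joe 9", "Joe 10"]

# String comparison goes character by character: "1" sorts before "9",
# so "Joe 10" lands in front of "Joe 9".
p names.sort # => ["Joe 10", "Joe 9"]

# Explicitly assigned names sort the same way no matter how many
# users were fabricated earlier in the run.
p ["A", "B"].sort # => ["A", "B"]
```

With fewer than nine users fabricated before the spec, the generated names happen to sort in creation order, which is why the failure depends on the order in which specs run.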
 

Blame it all on MongoDB

Posted Tuesday, 20 December 2011

You may have read my previous post about the MongoDB 1.4.x Ruby driver hell.

We rolled back to 1.3.1 and were running fine in production for a long time. On Friday, we started seeing intermittent deadlock: recursive locking errors from the driver and our site was struggling to stay up. Very quickly the error rate rendered it unusable. A Google search yielded Mongo Ruby driver bug RUBY-274, describing the exact error, which pointed to a Ruby 1.9.2 threading issue #4266, explained in this blog post.

We were confused about why this suddenly started happening for no apparent reason, so we created a ticket with our MongoDB provider MongoHQ and bounced the replica set members one-by-one as well as our app on Heroku. It did nothing.

Kyle, the maintainer of the Ruby driver at 10gen, was replying to RUBY-274 and told us to upgrade the driver to 1.5.2. We did. All tests passed (we have over 2000) and our staging site was operating normally. But after pushing it to production, where we have a replica set, we were now seeing a different error: stack level too deep, with /app/.bundle/gems/ruby/1.9.1/gems/mongo-1.5.2/lib/mongo/util/pool.rb:72 on top of a cut-off stack trace. By then I hadn’t gotten up from my chair for six hours straight and you bet I was thinking the Ruby driver was the worst piece of crap as I was angrily typing RUBY-393, a knee-jerk reaction.

With a bit of calm and a better error to hold onto, someone on my team dug through this and hit the root cause. We made a mistake in a data model and ended up with a recursion in a query. Instead of reporting the expected stack level too deep error for this one particular request, the 1.3.1 Ruby driver blew up with deadlock: recursive locking and no stack trace, caused by a bug in Ruby 1.9.2, while other queries would immediately start failing, bringing the entire site down with this error. The newer version of the driver did much better and only failed the specific query, which was much easier to diagnose. Pilot error – lessons learned. Sincere apologies to everyone involved at 10gen and MongoHQ – you guys were there when I needed you.

The 1.5.2 driver has been running rock solid over the week-end. I am hearing good things about it from other teams too. If you’re on 1.3.1 or 1.4.x, you should consider upgrading.

 

Grape vs. Webmachine

Posted Tuesday, 13 December 2011

One of my favorite talks at QCon 2011 was about Webmachine. I was very curious to see what those well-disciplined Erlang people had come up with. At the end of the talk I had learned that Webmachine used a resource-based model that enabled well-behaved HTTP applications, which are RESTful by definition. So I went to NYC.rb today to hear about the Ruby version of Webmachine and to write a post about how these two frameworks compare.

Should you build your next RESTful API with Grape or Webmachine?

Both frameworks are saying that you should not force HTTP onto an MVC-shaped application. Both excel at serving HTTP resources.

Webmachine is an executable model for HTTP, while Grape is a DSL for RESTful APIs. This means that in Webmachine you don’t perform actions – you declare resources. In Grape you declare API methods and fill out the responses. In Grape you have to be disciplined about those API methods - they should represent resources, not RPC service endpoints. More differences appear in branching: halting execution in Webmachine is done by returning appropriate answers in resource-specific functions, while halting execution in Grape is done by throwing a specific exception that carries an HTTP error code. Routing-wise, in Webmachine you map URIs to resources, while in Grape you define namespaces and method paths that translate into invisible routes. In Webmachine you implement resource callbacks, while in Grape you use procedural logic within the API method implementation. Webmachine is trying to be a complete executable model and is therefore more structured, while Grape wants you to use middleware for aspects such as ETag-based caching and doesn’t try to prevent you from jumping in the water when you don’t know how to swim.

We had a long discussion about this outside of Pivotal Labs with @seancribbs and @johnjoseph (who even mentioned Prolog at some point). It helped me frame my opinion around mostly philosophical differences between the two frameworks. I could very well use Webmachine to build an API and be very happy with it (I would not be happy building an API in Rails). I would grant Webmachine an advantage over purity from the developer’s perspective – it’s harder to step outside of the programming model. I would grant Grape an advantage over favoring the API consumer, since it focuses on the expressiveness of the API. For example, Grape now has self-introspection for automatically generating documentation, a feature that seems harder and maybe even unnatural to build for Webmachine.

Fundamentally, Webmachine declares resources served via HTTP, while Grape declares an API. Your choice?

 

Measuring Activity in Open-Source Projects using Github Network Graph

Posted Sunday, 11 December 2011

When choosing to use an open source project you might want to know whether it’s still developed or at least maintained. For Ruby projects I used to check Rubygems release dates. Here’s Grape’s.

image

Nothing for the last six months? Not very good – there hasn’t been a release for a while, but that’s more a tribute to the stability of the project.

You could check the number of forks and watchers for projects on Github.

image

It measures the project’s popularity quite well, but maybe not its activity.

My favorite way of measuring a project’s activity is to look at the Github network graph. It’s an amazing and useful feature. Here’s Grape’s.

image

The project is clearly happening. But you can see how this is all over the place – that’s a typical picture for open-source efforts: few core and relatively irregular contributors, long branches and abandoned ends for feature attempts that don’t get merged.

When a small group of people truly collaborates, their feature branches make it into master most of the time. They are also constantly picking up commits from the source. What does a really dense collaborative project look like? Here’s a picture from one of our private repositories. If yours looks like this, you got a team!

image

Maybe someone can use this idea to build a nice feature to measure a Github project density?

 

Grape: Describing and Documenting an API

Posted Sunday, 11 December 2011

Building a software platform is not just an investment in the future, it’s a software architecture philosophy. A proper API is a manifestation of some of the core principles of domain driven design – spend a lot of time figuring out what your domain is, then build software that represents the immutable concepts behind an API and, finally, implement different businesses that can quickly thrive, die or pivot, on top of that. We’ve spent considerable amounts of time iterating on our own API and are constantly improving the artifacts around it as we learn from the good examples of Twilio, Stripe, etc.

There’re several ways to build an API reference: entirely by hand, generated from code comments or by adding metadata at runtime. The first one is inanity and the second one is not leveraging the magic of Ruby. Hence I am a huge fan of the last one, as it offers the best chance of creating something that actually reflects code.

You can now do this in Grape with desc blocks.

    # DELETE /api/v1/thing/:id
    desc "Delete an existing thing.", {
      :params => {
        "id" => { :description => "Thing id.", :required => true }
      }
    }
    delete ":id" do
      thing = Thing.find(params[:id])
      error!('Thing Not Found', 404) unless thing
      thing.destroy
      thing.as_json
    end

Aside from the description passed to desc, you can specify a hash with anything in it. There’re a few conventions, such as :params, which will merge with any values specified in the URL of the API call.

We can introspect the API at runtime, adding a Rake task, for example, that lists all API calls with their parameters.

    namespace :api do
      desc "Displays all API methods."
      task 'routes' => :environment do
        Api.routes.each do |route|
          route_path = route.route_path.gsub('(.:format)', '').gsub(':version', route.route_version)
          puts "#{route.route_method} #{route_path}"
          puts " #{route.route_description}" if route.route_description
          if route.route_params.is_a?(Hash)
            params = route.route_params.map do |name, desc|
              required = desc.is_a?(Hash) ? desc[:required] : false
              description = desc.is_a?(Hash) ? desc[:description] : desc.to_s
              [ name, required, "   * #{name}: #{description} #{required ? '(required)' : ''}" ]
            end
            puts "  parameters:"
            params.each { |p| puts p[2] }
          end
        end
      end
    end

Notice how we’ve used the required option for parameters – it’s, once again, a convention. Grape doesn’t care – it’s pure metadata attached to a route. You can create similar conventions in your own API – we have some “partner” and “admin” APIs that we’ve marked in a similar manner.
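The parameter convention can be tried out without Grape at all, since the metadata is just a hash. Here is a minimal sketch of the formatting step on its own; describe_params is a hypothetical helper, not part of Grape, and the sample hash only mimics what the Rake task above reads from a route.

```ruby
# Turns a params hash following the :description / :required convention
# into printable lines. Values that are not hashes are treated as plain
# descriptions of optional parameters.
def describe_params(params)
  params.map do |name, meta|
    required = meta.is_a?(Hash) ? meta[:required] : false
    description = meta.is_a?(Hash) ? meta[:description] : meta.to_s
    suffix = required ? " (required)" : ""
    "   * #{name}: #{description}#{suffix}"
  end
end

sample = {
  "id" => { :description => "Thing id.", :required => true },
  "format" => "Response format."
}

puts "  parameters:"
puts describe_params(sample)
```

Keeping the formatting in a small function like this makes it straightforward to reuse the same metadata for other outputs, such as generated documentation pages.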

This is now in the frontier branch of Grape, which is Grape v.next. Live on the edge.

 

Pushing Assets to S3 w/ Rake: Versioning and Cache Expiration

Posted Saturday, 10 December 2011

A while ago I wrote about how we package and push Rails assets to Amazon S3. We version assets with the Git hash – varying the assets by URL enables setting indefinite cache expiration and works well with a CDN. In that post you could find a Rake task that would delete any old assets and replace them with newer assets. It’s time for a revision with some new features.

The first problem we have solved is how long it takes to sync contents between a local folder and S3. The old task fetched the entire bucket file list, which grew quite a bit over time. The S3 API supports a prefix option.

    s3i.incrementally_list_bucket(to, prefix: "assets/") do |response|
      response[:contents].each do |existing_object|
        ...
      end
    end

The second issue is with asset rollback. We deploy assets to S3 and then code to Heroku. The asset deployment deletes the old assets. There’s a small window in which we have old code and new assets, which is obviously not okay. We’re actually saved by CloudFront which keeps a cache for extended periods of time. A solution is to keep two copies of the assets online: current and previous. The code preserves the most recent copy by looking at the :last_modified field of the S3 object.

Here’s the task with some shortcuts and a complete task as a gist.

    # uploads assets to s3 under assets/githash, deletes stale assets
    task :uploadToS3, [ :to ] => :environment do |t, args|
      from = File.join(Rails.root, 'public/assets')
      to = args[:to]
      hash = (`git rev-parse --short HEAD` || "").chomp

      # list only the objects under assets/ and remember when each git hash was uploaded
      logger.info("[#{Time.now}] fetching keys from #{to}")
      existing_objects_hash = {}
      existing_assets_hash = {}
      s3i.incrementally_list_bucket(to, prefix: "assets/") do |response|
        response[:contents].each do |existing_object|
          existing_objects_hash[existing_object[:key]] = existing_object
          previous_asset_hash = existing_object[:key].split('/')[1]
          existing_assets_hash[previous_asset_hash] ||= DateTime.parse(existing_object[:last_modified])
        end
      end

      # the most recently modified copy is the one to keep as "previous"
      logger.info("[#{Time.now}] #{existing_assets_hash.count} existing asset(s)")
      previous_hash = nil
      existing_assets_hash.each_pair do |asset_hash, last_modified|
        logger.info(" #{asset_hash} => #{last_modified}")
        previous_hash = asset_hash unless (previous_hash and existing_assets_hash[previous_hash] > last_modified)
      end
      logger.info("[#{Time.now}] keeping #{previous_hash}") if previous_hash

      # upload the current assets under assets/<git hash>/
      logger.info("[#{Time.now}] copying from #{from} to s3:#{to} @ #{hash}")
      Dir.glob(from + "/**/*").each do |entry|
        next if File.directory?(entry)
        File.open(entry) do |entry_file|
          content_options = {}
          content_options['x-amz-acl'] = 'public-read'
          content_options['content-type'] = MIME::Types.type_for(entry)[0]
          key = 'assets/'
          key += (hash + '/') if hash
          key += entry.slice(from.length + 1, entry.length - from.length - 1)
          existing_objects_hash.delete(key)
          logger.info("[#{Time.now}]  uploading #{key}")
          s3i.put(to, key, entry_file, content_options)
        end
      end

      # delete everything that was not just uploaded, except the previous copy
      existing_objects_hash.keys.each do |key|
        next if previous_hash and key.start_with?("assets/#{previous_hash}/")
        puts "deleting #{key}"
        s3i.delete(to, key)
      end
    end
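The part of the task that decides which copy to keep can be looked at in isolation. This is a sketch under assumptions: most_recent_asset_hash is a hypothetical helper, and the input is a plain hash of git hash to upload time, standing in for what the task collects from the :last_modified field of the S3 listing.

```ruby
# Returns the git hash whose assets were modified most recently,
# or nil when nothing has been deployed yet.
def most_recent_asset_hash(existing_assets)
  return nil if existing_assets.empty?
  existing_assets.max_by { |_git_hash, last_modified| last_modified }.first
end

# Made-up sample data: two earlier deployments.
deployed = {
  "1a2b3c4" => Time.utc(2011, 12, 1, 10, 0, 0),
  "5d6e7f8" => Time.utc(2011, 12, 9, 18, 30, 0)
}

previous_hash = most_recent_asset_hash(deployed)
puts "keeping #{previous_hash}" if previous_hash
```

After the new upload, everything under assets/ that belongs neither to the current hash nor to previous_hash is safe to delete, which is what closes the window between the asset push and the code push.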

Since we’re versioning assets with a Git hash in the URL, another improvement is to set cache expiration to something longer.

    content_options['cache-control'] = "public, max-age=#{365*24*60*60}"
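To see why a long expiration is safe here, it helps to put the key and the header side by side. A small sketch with hypothetical helper names (asset_key and cache_control_header are not from the task) and made-up paths:

```ruby
# Builds the S3 key for a local file: assets/<git hash>/<path relative to from>.
def asset_key(from, entry, git_hash)
  relative = entry.sub(/\A#{Regexp.escape(from)}\//, "")
  ["assets", git_hash, relative].reject { |part| part.nil? || part.empty? }.join("/")
end

# One year, expressed in seconds, as in the header above.
def cache_control_header
  "public, max-age=#{365 * 24 * 60 * 60}"
end

puts asset_key("/app/public/assets", "/app/public/assets/css/site.css", "abc1234")
# => assets/abc1234/css/site.css
puts cache_control_header
# => public, max-age=31536000
```

Every deployment produces a new git hash and therefore new URLs, so a browser or CDN holding a year-old copy of an old URL never serves it for a new release.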
 