Strong Opinions Weakly Typed

Andrew Warner

Software Engineer and technology enthusiast
CTO at Genius

Faster and Simpler With the Command Line: Deep-comparing Two 5GB JSON Files 3X Faster by Ditching the Code

This post also appeared on the Genius Engineering blog.

As part of our recently announced deal with Apple Music, you can now view Genius lyrics for your favorite music within the Apple Music app.

We deliver our lyrics to Apple via a nightly export of newline-delimited JSON objects. With millions of songs in our catalog, these dumps can easily get as big as 5 GB. It’s not quite “big data”, but it’s also not something you can easily open in vim.

Our first iteration of the code that generated these nightly exports was slow and failure-prone. So we recently did a ground-up rewrite focused on speed and reliability, which yielded significant improvements on both axes (stay tuned for a future blog post on that subject). But other than spot-checking with small data sets, how could we make sure that the new export process wasn’t introducing regressions? We decided to run both export processes concurrently and compare the exports each one generated, to make sure the new version was a comprehensive replacement.

Beware the Siren Song of Comments

Choosing a New Theme

About a week ago I finally decided that I wanted to start blogging again. I love talking about programming, but I often find it difficult to motivate myself to write a blog post about it. I sat down to write a post, and sure enough, I couldn’t think of anything to blog about. So instead I procrastinated by thinking about all of the things I wanted to do to make my blog better.

Git-getpull: Quickly Find the Pull Request That Merged Your Commit to Master

The 3 Ways to Get the Size of an Active Record Relation

If you’re reading this and your first thought is, “there are 3 ways to get the size of a relation?”, then you’ve come to the right place! Basically, given a relation like Post.all or User.first.posts, when you want to know the size, you’ve got 3 choices: size, length, and count. At first glance, it seems like these might do the same thing, right? Not so! There are some key differences between them.
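To make the distinction concrete, here is a minimal plain-Ruby sketch. ToyRelation is a hypothetical stand-in that I made up for illustration; it is not Active Record code, and it leaves out details of the real implementation. It only models the general behavior: count always issues a COUNT query, length loads the records and measures them, and size uses the loaded records when it can.

```ruby
# Toy stand-in for an Active Record relation (hypothetical, not Rails code).
# It records the "queries" each method would trigger.
class ToyRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend these rows live in the database
    @records = nil    # stays nil until the relation has been loaded
    @queries = []
  end

  def loaded?
    !@records.nil?
  end

  # Loads every row into memory, at most once.
  def load
    unless loaded?
      @queries << "SELECT *"
      @records = @rows.dup
    end
    self
  end

  # count: always asks the database.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.size
  end

  # length: loads the records, then measures the in-memory array.
  def length
    load
    @records.length
  end

  # size: reuses loaded records if present, otherwise falls back to count.
  def size
    loaded? ? @records.length : count
  end
end

posts = ToyRelation.new(%w[first second third])
posts.size    # not loaded yet, so this issues a COUNT query
posts.length  # loads all rows
posts.size    # already loaded, so no new query
posts.count   # always queries
p posts.queries
```

In this toy model, four calls produce three queries: the second size call is free because the records are already in memory.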

Use the Rails Router for Routing!

This is a quick one, and the title says most of it. Basically, you should never have code like this in your app:

# Anti-pattern: branching on the URL inside a controller action
class SomeController < ApplicationController
  def some_action
    if something_about_the_url?
      render :template => :foo
    else
      render :template => :baz
    end
  end
end

Simple Active Record Query Debugging in the Rails Console

Stop me if this sounds familiar. You’re tooling around in the Rails console, testing out some new code you’re working on (or debugging some slow/broken code), and you see a ton of repeat queries.

I have this experience frequently. Usually I can figure out what’s going on, but sometimes it can be quite tricky to track down the source of the extra queries. Whenever I want to figure out where a method is getting called from, one easy and lazy solution is to add a debugger statement in that code. But where the heck do I add a debugger for SQL statements?

My First Blog Post

This is my first blog post! I set up my blog using Octopress, which was incredibly easy. They’ve got some great guides on their site, but just to give you a sense of exactly how easy it is, I simply:

  • Created a repository on GitHub named - the standard naming convention that GitHub Pages expects if I want to resolve to this blog
  • Cloned Octopress via git clone git://, then ran bundle
  • Ran rake setup_github_pages
  • Then it’s as simple as rake generate and rake deploy!
  • Creating this blog post just involved running rake new_post["My first blog post"]

The writing process is extremely simple - just run rake preview until it looks right, and then rake deploy after committing your changes.

Not that I should be surprised, but using Octopress is really a breeze, and I highly recommend it to anybody looking to crank out a quick blog with minimal setup and maintenance.