MOVED

I’m now at http://helderribeiro.net.

Scaling and Tools Diversity: Google vs. Facebook

Steve Yegge in one of his posts talks about Google’s policy of standardizing on the use of only a few programming languages. He mentions how he, as a guy interested in different languages, was at first annoyed by that fact, but later came to realize it was the only sensible way of building systems as scalable as theirs have to be.

In what seems to contradict that view, in Facebook’s recent discussion of the decisions they had to make when designing Facebook Chat, they mention choosing Erlang because, well, it was made to do distributed, realtime systems with… message passing. Can’t get a better fit for a Chat project than that.

To make that code interface with their existing codebase, they used Thrift, their free software “framework for scalable cross-language services development”, whose white paper begins with the strong remark:

“In our implementation of these services [Facebook], various programming languages have been selected to optimize for the right combination of performance, ease and speed of development, availability of existing libraries, etc. By and large, Facebook’s engineering culture has tended towards choosing the best tools and implementations available over standardizing on any one programming language and begrudgingly accepting its inherent limitations.”

Now, of course Google and Facebook have very different needs (crawling, storing, rating and indexing the whole web on a regular basis must be a bit more challenging than showing user profiles, even if it’s showing many of them, many many times a day), and playing it safe has surely worked out well for Google so far. But sometimes that requires reimplementing Ruby on Rails in JavaScript just to suit a company requirement.

It’s good to see there is a place in the monster-traffic world for programming language enthusiasts :)

Making Gmail always use HTTPS without any Greasemonkey

It’s the stupidest solution, but I hadn’t thought of it before.

I had seen that the Better Gmail extension had this feature of making Gmail always run over HTTPS, but it also came with so much other stuff that I found it to be just too bloated, and kept wishing for a simpler solution (meanwhile exposing my privacy to all those sniffers out there — oh the danger!).

Of course, I could always replace “http” with “https” on the address bar, but it’s a pain doing that every time. If only I could set it to be always like that…

Wait a second: I always start Gmail as my home page. Always use Alt+Home when I want to go to it. Yes, people, I had this brilliant idea: why not just put the address with “HTTPS://” in the configs as my Home page?

And that I did. Now I’m a safe, happy Gmail user. No extensions, no glitchy Greasemonkey scripts. Just the ululating obvious.

When testing isn’t worth the price

I’m just starting to use RSpec for Rails instead of Test::Unit, and with it comes a little novelty: there are separate Controller and View tests (unlike TUnit’s functional tests). At first I thought “hm.. cool”. But after spending the first hours writing tests for views I started to feel very stupid, and the whole thing feels very awkward and unnecessary.

Views are too unstructured and change too often for it to be worth keeping it all tested, and most of the time you’re not testing Ruby code, but HTML, and I don’t think that’s what tests are for. If your controllers are well tested, views should do OK.

As convinced as I am about this, I was feeling a little guilty to just ditch testing like that, so I searched for some supporting opinion and found this post. It agrees with me, so it must be right :) The comments are also interesting. The main idea is that you should just test if the views render without errors and get on with life. Now I just have to find out how to test that little thing.
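For what it's worth, here is a minimal sketch of that "just check that it renders" idea in plain Ruby. It uses ERB from the standard library instead of RSpec or Rails, and the template names, the data and the helper names (`render_template`, `failing_templates`) are all made up for illustration, so take it as the shape of such a test and not as the way RSpec does it.

```ruby
require 'erb'

# Hypothetical stand-ins for view templates; in a real app these would be files.
TEMPLATES = {
  'users/show'  => '<h1><%= user[:name] %></h1>',
  'users/index' => '<ul><% users.each do |u| %><li><%= u[:name] %></li><% end %></ul>'
}

# Made-up data handed to every template as local variables.
LOCALS = { user: { name: 'Ann' }, users: [{ name: 'Ann' }, { name: 'Bob' }] }

# Renders one template source with the given locals and returns the HTML.
def render_template(source, locals)
  ERB.new(source).result_with_hash(locals)
end

# Smoke test: returns the names of the templates that raised while rendering.
def failing_templates(templates, locals)
  templates.select do |_name, source|
    begin
      render_template(source, locals)
      false
    rescue StandardError, SyntaxError
      true
    end
  end.keys
end
```

Calling `failing_templates(TEMPLATES, LOCALS)` gives back an empty list when everything renders, which is the only thing this kind of test cares about. Note that `result_with_hash` needs Ruby 2.5 or later.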

Yak Shaving: optimizing brain usage for code snippets

So you checked out how TextMate has all those wonderful snippets for every possible piece of code you could think of (and how now Emacs also does!), but you’re having second thoughts on whether it pays off to memorize all those little abbreviations and their meanings?

Not anymore!

With the cute one-liner below, you can fire up irb on your snippets directory and see instantly the TOP 5 winners in characters-saved / characters-typed !! Pretty neat huh? ;)

Here’s an example:

$ cd ~/.emacs.d/yasnippet/snippets/text-mode/ruby-mode
$ irb
irb(main):019:0> Dir['*'].map {|f| [f,File.readlines(f).reject{|s| s =~ /^#/}.join.size.to_f/f.size]}.sort {|a,b| b.last <=> a.last}.first(5).each {|f| puts "####" + f.first,File.read(f.first), "\n" *2}
####w
#name : attr_writer ...
# --
attr_writer :${attr_names}

####r
#name : attr_reader ...
# --
attr_reader :${attr_names}

####mm
#name : def method_missing ... end
# --
def method_missing(method, *args)
$0
end

####am
#name : alias_method new, old
# --
alias_method :${new_name}, :${old_name}

####bm
#name : Benchmark.bmbm(...) do ... end
# --
Benchmark.bmbm(${1:10}) do |x|
$0
end

=> [["w", 26.0], ["r", 26.0], ["mm", 21.0], ["am", 19.5], ["bm", 19.5]]

It would probably be nice also to consider how much typing it saves you by mirroring variable names and stuff… And also how frequent that particular construct actually is in your code… Come to think of it, this one-liner is pretty useless, but at least picking an arbitrary 5 or 6 snippets to add to your dynamic cheatsheet is better than trying to randomly memorize them.

In case you didn’t catch it, here it is in full color (and we discover WordPress doesn’t use a full parser for syntax highlighting, what a shame :P):

Dir['*'].map {|f| [f,File.readlines(f).reject{|s| s =~ /^#/}.join.size.to_f/f.size]}.sort {|a,b| b.last <=> a.last}.first(5).each {|f| puts "####" + f.first,File.read(f.first), "\n" *2}
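Spelled out over a few lines, the same ranking might look like the sketch below. It assumes what the one-liner assumes: every file in the directory is named after its abbreviation, and lines starting with `#` are metadata that you never get to "save" typing. The name `top_snippets` is made up, and `Dir.children` needs Ruby 2.5 or later.

```ruby
# Ranks snippet files by characters saved per character typed.
# Returns up to `count` pairs of [abbreviation, ratio], best ratio first.
def top_snippets(dir, count = 5)
  ranked = Dir.children(dir).map do |abbrev|
    path = File.join(dir, abbrev)
    next unless File.file?(path)

    # Drop the metadata lines and measure what is left of the snippet body.
    body = File.readlines(path).reject { |line| line.start_with?('#') }.join
    [abbrev, body.size.to_f / abbrev.size]
  end
  ranked.compact.sort_by { |_abbrev, ratio| -ratio }.first(count)
end
```

From irb that would be something like `top_snippets('.')` inside the snippets directory.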

Rails vs SCM: resolving conflicts between local and upstream Migrations

If you’re working on a local branch of a Rails project for long enough, you’re bound to run into this irritating problem: you create a new migration, it gets the smallest unique number from the ones you got from upstream, BUT, before you get the chance to commit it, someone does it first, and in your next update (svn up || git pull) you have that tangled migration mess.

This little rake task might help you out. Warning: it assumes that all your local migrations have already been run, and that *none* of the new migrations from upstream have been run.

The code is definitely not very DRY and doesn’t take much advantage of Rake (I’m pretty n00b on Rake), so I accept suggestions/patches :)

To use it, just throw it in your lib/tasks folder and call it using “rake db:migrate:fast_forward”.

Next (and easy) step is making it receive an SCM parameter (git/svn) so it’ll use the proper “mv” command.


require 'pp' # the task below uses pp to print the lists of migrations

namespace :db do
  namespace :migrate do
    desc <<STR
Resolves conflicts between local and upstream migrations.

This task assumes the following scenario:
During your local development, you've created migrations and run rake db:migrate;
Then, you updated from upstream (svn update || git svn rebase), and ended up with
pairs of migrations with the same number: one is the local one you created, and the
other is the one from upstream that someone committed before you.

Besides that, there might be some other non-overlapping migrations *after* the
overlapping zone that are *also* local (you had more local migrations than new ones
that came from upstream on the update).

This task takes *all your local migrations*, **reverts them** (in reverse order),
and moves them (in order) to the end of the line.

After that you can run rake db:migrate again and it'll run first the migrations
from upstream, and yours last.
STR

    task :fast_forward => :environment do
      migrator = ActiveRecord::Migrator.new(:down, 'db/migrate')
      puts "Looking for migrations with repeated numbers"
      all_migrations = Dir['db/migrate/*'].sort
      pairs = all_migrations.group_by{|migration| migration =~ /(\d+)/; $1}.
        select {|number, migrations| migrations.size == 2}
      pairs = pairs.sort {|x, y| x[0] <=> y[0]}

      # Pick the range of (local) migrations that will be slid to the end
      migrations_to_move = []
      # First the ones that overlap (disambiguated by user)
      pairs.map{|pair| pair[1]}.each do |mig1, mig2|
        begin
          puts "\n[1]\t#{mig1}"
          puts "[2]\t#{mig2}"
          puts "\nWhich one is part of the range to be slid to the end of the list?"
          option = STDIN.gets.to_i
        end until option == 1 || option == 2

        migrations_to_move << (option == 1 ? mig1 : mig2)

      end
      # Then the (local) ones past the overlap zone
      unless pairs.empty?
        idx_last_overlapping_migration = all_migrations.index(pairs.last[1].last)
        migrations_to_move += all_migrations[idx_last_overlapping_migration+1..-1]
        # Assumes all (and only) the local ones past the overlap zone have already been run
        migrations_to_move.reject! { |m| m =~ /(\d+)/; $1.to_i > migrator.current_version }
      end

      migrations_to_move.first =~ /(\d+)/
      schema_version = $1.to_i # set_schema_version subtracts one

      # Slide the chosen range to the end of the list
      upstream_migrations = all_migrations - migrations_to_move
      upstream_migrations.last =~ /(\d+)/
      next_number = $1.to_i + 1

      new_names = migrations_to_move.map { |migration|
        migration =~ /(\d+)(.*)/
        name_migration_to_move = $2

        new_name = 'db/migrate/' + ("%03d" % next_number) + name_migration_to_move
        next_number += 1
        new_name
      }

      # Confirm and execute
      unless migrations_to_move.empty?
        pp "Latest upstream migrations", upstream_migrations.last(5)
        pp "These are your local migrations: ", migrations_to_move
        pp "They will be reverted and renamed to: ", new_names
        puts "And the new schema version will be: #{schema_version-1}"

        begin
          puts "\nShould I proceed? [y/n] "
          option = STDIN.gets.strip.downcase
        end until option == 'y' || option == 'n'

        if option == 'y'
          # Revert
          migrations_to_move.reverse.each do |migration|
            require migration
            migration_class = migrator.send(:migration_class, *(migrator.send(:migration_version_and_name, migration).reverse))
            migration_class.down
          end
          migrator.send(:set_schema_version, schema_version)
          # Move to end of line
          migrations_to_move.zip(new_names) do |old_name, new_name|
            File.rename old_name, new_name
          end
        end
      else
        puts "No overlapping migrations. You can safely run rake db:migrate."
      end
    end
  end
end
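As for the SCM parameter mentioned earlier as the next step, a rough sketch could look like this. Nothing here is part of the task above: the helper names `scm_move_command` and `move_migration` are made up, and reading the value from something like `ENV['SCM']` is only an assumption about how it would be passed in. The one thing it relies on is that both `git mv` and `svn mv` take an old and a new path.

```ruby
# Hypothetical helper: builds the rename command for the given SCM.
# Returns nil for an unknown or missing SCM so the caller can rename on disk.
def scm_move_command(scm, old_name, new_name)
  case scm.to_s
  when 'git' then ['git', 'mv', old_name, new_name]
  when 'svn' then ['svn', 'mv', old_name, new_name]
  end
end

# Renames a migration through the SCM when one is given, otherwise on disk only.
def move_migration(scm, old_name, new_name)
  command = scm_move_command(scm, old_name, new_name)
  if command
    system(*command) or raise "#{command.join(' ')} failed"
  else
    File.rename(old_name, new_name)
  end
end
```

In the task, the `File.rename old_name, new_name` line would then become something like `move_migration(ENV['SCM'], old_name, new_name)`.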

Update: Just after writing this, a friend told me about the Git Migration Buddy. It is git specific and seems to handle multiple branches better. Mine is kinda 1-n (main (svn in my case) repo syncing with multiple local branches). There’s the enhanced_migrations plugin that supposedly stops the problem at the root, having timestamps instead of increasing numbers for migrations. Zach in the comments also mentions a great solution he’s coming up with: a post-checkout hook to change database.yml and have a different db for each branch (dunno if it works too well with big dbs, but it’s a great idea nonetheless).
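To see why timestamps stop the problem at the root, here is a tiny sketch of the naming idea. I haven't checked how enhanced_migrations actually formats its prefixes, so the 14-digit UTC format and the name `timestamped_migration_name` are assumptions for illustration only.

```ruby
# Builds a migration file name prefixed with a UTC timestamp
# instead of the next free sequence number.
def timestamped_migration_name(name, time = Time.now.utc)
  "#{time.strftime('%Y%m%d%H%M%S')}_#{name}.rb"
end
```

Two people creating migrations on different branches would have to do it in the same second to end up with the same prefix, so the renumbering dance above mostly goes away.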

Fast-forwarding through screencasts without ant voice

One thing that always stopped me from using screencasts as a viable learning tool is that they usually take too long. And most of the time it’s the guy moving around, or saying “uh”, “er…”, or doing stuff I already know. And when I tried skipping a few seconds ahead I usually skipped the very few meaty bits and had to go back and listen to them again. So what I did sometimes was increasing the playback speed (by pressing the ]-key on mplayer), but that made the guy speak as if breathing helium.

Not anymore!

Mplayer (from svn) has a fantastic new audio filter called scaletempo. It basically lets you change the playback speed without changing the sound pitch. The guy speaks faster, but in the same tone. Isn’t that amazing?! So, here’s how to do it (you really need the svn version as of now; 1.0rc2 won’t do it) on Ubuntu:

First, we need to install the dependencies for compiling the new package (without installing the package itself):

sudo apt-get build-dep mplayer

Now the usual checkout, compile and install:

cd /tmp
svn checkout svn://svn.mplayerhq.hu/mplayer/trunk mplayer
cd mplayer
./configure
make
sudo make install

And that’s it! Now you just open your videos like this:

mplayer screencast.ogm -af scaletempo

and use the keys [ and ] to adjust playback speed at will.

Edit: from my experience, you can speed up speech to 1.5x-1.75x without losing quality. That means you can watch a 1-hour video in 34-40 minutes!
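The arithmetic behind those numbers is just the video length divided by the playback speed. A throwaway sketch, with `watch_minutes` being a made-up name:

```ruby
# Returns how many minutes a video takes to watch at a given playback speed.
def watch_minutes(video_minutes, speed)
  video_minutes / speed.to_f
end
```

So `watch_minutes(60, 1.5)` is 40 minutes, and `watch_minutes(60, 1.75)` is a bit over 34.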
