Ruby Shift vs Multiply

Having a background in hardware development, I always assumed some operations were trivial and more efficient done some way rather than another.

One of those operations for example was the usage of shifting by 1 instead of multiplying by 2. The reason it would be more effective is due to the CPU complexity of multiplying by two as opposed to shifting bits which is equivalent to just selecting whatever is high on the clock cycle.

0b00000100100001000010001000100001
# => 75768352

0b00000100100001000010001000100001 * 2  #
# => 151536704

0b00000100100001000010001000100001 << 1  #
# => 0b00000100100001000010001000100001 << 1  #
# => 0b00001001000010000100010001000000
# => 151536704

While wondering about such performance, I decided to run a quick ruby program to measure the difference.

require 'benchmark'

repetition = 100_000_000
number = 0b00000100100001000010001000100001

Benchmark.bm do |x|
  x.report('<<') { repetition.times { number << 1 } }
  x.report('*2') { repetition.times { number * 2 } }
end

The response I would obtain was at first astonishing. Here is a sample output of what I would see (though results may vary per run and per machine.):

usersystemtotalreal
«11.5200000.06000011.580000(11.621933)
* 29.3500000.0500009.400000(9.434972)

I was really surprised and wanted to figure out why that was. Was I running a process in the background that was incorrect. Maybe the order in which they were run, so I inversed the order reporting first on the multiplier and then on the shifter. Damn! I was really hoping it was linked to pre-heating of the cache or something. That was not it. Mystery had yet to be resolved.

No matter what I did, I was always achieving the same results - multiplying by two was always about two seconds more efficient. I pondered on this for a while until I decided to look into Ruby’s source code for an explanation.

This was the first time opening up the ruby source code, and after some digging around, I found this in numeric.c

static VALUE
rb_fix_lshift(VALUE x, VALUE y){
	long val, width;
	val = NUM2LONG(x);
	if (!FIXNUM_P(y))
		return rb_big_lshift(rb_int2big(val), y);
	width = FIX2LONG(y);
	if (width < 0)
		return fix_rshift(val, (unsigned long)-width);
	return fix_lshift(val, width);
}

The most interesting thing I noticed was the line that said rb_int2big. This means it was converting my integer number to a Bignum. This was a side-effect I had not taken into consideration. Overflow was now being handled in the Bignum class instead of the Numeric class. This was probably it, but I had to prove it.

Looking at the Bignum documentation it does say:

Bignum objects are created automatically when integer calculations would otherwise overflow a Fixnum.

Followed by:

When a calculation involving Bignum objects returns a result that will fit in a Fixnum, the result is automatically converted.

Great, that was my explanation. The proof remained to be done so I wrote:

require 'benchmark'

repetition = 100_000_000
number = 0b00000100100001000010001000100001
big_number = (2**70)

# big_number.class #=> Bignum

Benchmark.bm do |x|
  x.report('Integer <<') { repetition.times { number << 1 } }
  x.report('Integer *2') { repetition.times { number * 2 } }

  x.report('Bignum <<') { repetition.times { big_number << 1 } }
  x.report('Bignum *2') { repetition.times { big_number * 2 } }
end

Now watch what happens when we start using a Bignum:

usersystemtotalreal
Integer «11.3200000.04000011.360000( 11.373461)
Integer * 29.4200000.0400009.460000( 9.473982)
Bignum «18.1800000.06000018.240000( 18.265900)
Bignum * 223.6300000.08000023.710000( 23.748442)

And here, shifting a Bignum was now more efficient than shifting an Fixnum.

That was it! After quite a bit of digging around I found it. My assumptions were wrong but what I had learned remained true. Ruby’s intrinsic conversion was to blame for this matter.

Conclusion

There is a difference in performance between multiplying and integer by two (:*) and shifting it (:<<) in Ruby. Be careful about some of the assumptions you make, they may be wrong.