
For the past five years I’ve been building software in Ruby. It’s a great language to work in, and for many situations its performance is more than adequate. Recently, though, I started to encounter situations that would benefit from something with a little more oomph! There are plenty of languages I could choose from that would give a good speed boost, and after much consideration I whittled the list down to Go and Java.

I’ve been playing around with Java for a couple of months (just some casual learning really) and I have no experience whatsoever with Golang. To gain some real-world experience with both, and to gather performance data for each, I decided to write a small test app and share the results here.

I won’t be going into details on why I chose Golang and Java (speed was just one of the criteria), but I will give you a little background on the app I wrote to test them.

UPDATED 2015-11-21: now includes benchmark results for both Go 1.4.2 and 1.5.1.

Some Background

Recently I finished up an R&D project for a client in which I developed an EPUB ebook toolchain in Ruby. Some of the features included: extracting/updating ebook metadata, watermarking, adding new publisher-related content, automated conversion from EPUB v2 to v3, and numerous other handy features.

When they hired me, they needed an expert in ebook production, not in a specific language. Performance wasn’t going to be an issue, though I did have limited development time, so they were happy to go with Ruby.

A couple of months back my client finished a full rebuild of their service, and that ebook toolchain has now become a core part of their USP. Although the performance is adequate for the moment, at their current growth rate it won’t be too long before we start to see performance-related issues. We’ve started to consider rebuilding the toolchain in a faster language, but would like some data to help us with that decision.

The Test App

I built the same app in Golang 1.4.2, Java 8, and Ruby 2.2. The Ruby version is based on our current implementation, so it serves as a baseline for the comparison.

One of the features of our current app is to pull out a bunch of metadata from an EPUB ebook when a user first uploads the file. This is a nice, simple job, so it makes a good initial test.

Here’s a breakdown of the required tasks:

  • Read an EPUB file (a ZIP archive).
  • Read and parse two XML files from inside the archive.
  • Extract various pieces of metadata.
  • Get a total word count - first stripping the HTML tags.
  • Output the collected information.
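
Of these tasks, the word count is the easiest to sketch without any gems. The following is a rough stand-in, not the code used in the benchmark: word_count is a hypothetical helper that strips tags with a regular expression, which is cruder than a real HTML parser (entities and script content are not handled).

```ruby
# Hypothetical helper, not the benchmarked code: counts the words in the
# <body> of an HTML string after stripping tags with a regular expression.
def word_count(html)
  # Keep only the body contents when a <body> element is present.
  body = html[%r{<body[^>]*>(.*)</body>}mi, 1] || html
  # Replace every tag with a space so adjacent words do not run together.
  text = body.gsub(/<[^>]+>/, ' ')
  text.split.size
end

sample = '<html><head><title>Ignored</title></head>' \
         '<body><h1>Chapter One</h1><p>It was a <em>dark</em> night.</p></body></html>'
puts word_count(sample)
```

The title in the head is skipped, so only the heading and paragraph words are counted.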

I wrote a Ruby script to iterate over all the files, using system calls to run the Go and Java programs. Benchmarking is done via Ruby’s Benchmark module.
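
For readers who haven’t used it, here is a minimal sketch of how such a driver can be wired up with Benchmark.bm. The run_comparison helper, the labels, and the placeholder jobs are assumptions for illustration; in the real script each job is the per-language loop over the EPUB files.

```ruby
require 'benchmark'

# Run each labelled job once and print one row per job in Benchmark's
# user/system/total/real format. Returns the Benchmark::Tms results.
def run_comparison(jobs)
  Benchmark.bm(6) do |bm|
    # `each` rather than `map`: if the block returns an Array of Tms
    # objects, Benchmark.bm prints those rows a second time.
    jobs.each { |label, job| bm.report(label) { job.call } }
  end
end

epubs = %w[one.epub two.epub] # placeholder file names

jobs = {
  'RUBY:' => -> { epubs.each { |epub| epub.upcase } },  # stand-in for the in-process Ruby work
  'GO:'   => -> { epubs.each { |epub| epub.reverse } }  # stand-in for shelling out to a binary
}

results = run_comparison(jobs)
```

Keeping the jobs in a hash means adding another language to the comparison is one more entry rather than another copy of the timing code.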

Ruby

In the Ruby version, I used the unzip command to extract the archive contents and the Nokogiri gem to parse the XML, using XPath to extract the metadata. Nokogiri was also used for the word count:

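require 'nokogiri'

count = 0
html_files.each do |file|
  # Parse the file, then count the whitespace-separated words in the body text.
  doc = Nokogiri::HTML(File.read(file))
  count += doc.xpath('//body//text()').to_a.join(' ').split.size
end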

It’s interesting to note that libxml2 (which Nokogiri uses) and unzip are both written in C, which makes our Ruby version of the app mostly a C program!

Golang

EPUBS.each do |epub|
  puts "GO BOOK: #{epub}"
  `~/go/bin/expose #{epub} 2>&1`
end

Java

EPUBS.each do |epub|
  puts "JAVA BOOK: #{epub}"
  `java -jar ~/java/Expose.jar #{epub} 2>&1`
end

A total of 1800 EPUB3 files were processed, varying in size from about 100KB to 5MB, with most word counts between 30,000 and 500,000. All tests were run on a MacBook Air.

The Results

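This report shows the user CPU time, system CPU time, total CPU time, and the elapsed real time. The unit of time is seconds. Note that Ruby’s Benchmark module adds the CPU time of child processes to the total column only, which is why the totals are larger than user plus system: for Go and Java almost all of the work happens in the spawned processes.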

           user     system       total         real
RUBY: 176.040000  23.810000  240.600000  (246.768792)
  GO:   0.280000   0.870000  118.730000  (115.939783)
JAVA:   0.340000   1.100000 2542.980000 (1197.375407)

So that’s pretty much 2 minutes for Go, 4 minutes for Ruby, and 20 minutes for Java.

What!?!

I really didn’t expect the Java warmup penalty to be so severe! Java is certainly a fast language, but in this test every ebook is handled by a fresh java process, so the JVM startup cost is paid 1800 times and, most likely, the JIT compiler never runs long enough to optimize the code.
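
To get a feel for how per-launch overhead adds up, the sketch below times a handful of launches of a do-nothing child process and projects the cost over 1800 files. It is only an illustration under stated assumptions: measure_launches is a hypothetical helper, and it launches the current Ruby interpreter as a stand-in, so the numbers say nothing about the JVM itself. Swapping in the real command would be needed for a meaningful figure.

```ruby
require 'benchmark'
require 'rbconfig'

# Launch a trivial child process `runs` times and time the whole loop.
# RbConfig.ruby is the path of the running interpreter, used here as a
# stand-in for a command such as `java -jar ...`.
def measure_launches(runs)
  Benchmark.measure do
    runs.times { system(RbConfig.ruby, '-e', '') }
  end
end

runs = 5
tms = measure_launches(runs)
per_launch = tms.real / runs

puts format('per launch: %.3fs, projected over 1800 files: %.1fs',
            per_launch, per_launch * 1800)
# Child CPU time is reported separately from the parent's user/system time
# and is only folded into the total.
puts format('parent CPU: %.2fs, child CPU: %.2fs, total: %.2fs',
            tms.utime + tms.stime, tms.cutime + tms.cstime, tms.total)
```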

Valuable information, but to get a better comparison I updated the Java and Go programs to iterate over and process the ebooks natively, so that each program is launched only once:

  GO: 0.000000   0.000000  90.630000 ( 92.980314)
JAVA: 0.000000   0.000000  72.250000 ( 62.275899)

That’s more like it.

When we give Java a decent amount of work to do it gets to show its capabilities: Java 8 finished in about 33% less time than Golang 1.4.2 (62.3 seconds versus 93.0 seconds of real time).

Update 2015-11-21: Go 1.4 vs 1.5 Performance

Since I originally ran these benchmarks, Go v1.5.1 has been released, and I thought it would be useful to re-run them and get a more up-to-date comparison against both Java and Go 1.4.2. This time I’ve decided to run the benchmarks four times consecutively, to get a more representative figure for each platform.

              user     system      total         real
Go 1.4.2:  0.000000   0.010000  93.320000 ( 95.186819)
Go 1.4.2:  0.000000   0.000000  96.390000 ( 97.443422)
Go 1.4.2:  0.000000   0.000000  94.840000 ( 97.212206)
Go 1.4.2:  0.000000   0.000000  93.030000 ( 94.894219)

Go 1.5.1:  0.000000   0.000000 107.600000 ( 86.864770)
Go 1.5.1:  0.000000   0.000000 113.090000 ( 90.166395)
Go 1.5.1:  0.000000   0.000000 111.240000 ( 88.054824)
Go 1.5.1:  0.000000   0.000000 108.320000 ( 85.088105)

And for reference, an up-to-date run against Java:

Java 8.45: 0.010000   0.000000  69.080000 ( 59.504218)
Java 8.45: 0.000000   0.000000  67.770000 ( 58.762161)
Java 8.45: 0.000000   0.000000  73.560000 ( 61.080194)
Java 8.45: 0.000000   0.000000  71.130000 ( 60.216356)

As you can see, there’s a marked improvement between the two versions of Go. Taking the median real time of the four runs, we get a 9% performance boost on v1.5.1. Comparing these new benchmarks against Java 8.45, Java finishes in about 31.5% less time than Go 1.5 and about 37.75% less time than Go 1.4.
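
The arithmetic behind those percentages can be reproduced with a few lines of Ruby. This is a sketch that assumes the figures were derived from the median of the real column, measured as time saved relative to the slower platform; the helper names are my own. Under that assumption the results land within about a tenth of a percentage point of the numbers quoted.

```ruby
# Wall-clock ("real") seconds copied from the benchmark tables.
REAL_SECONDS = {
  'Go 1.4.2'  => [95.186819, 97.443422, 97.212206, 94.894219],
  'Go 1.5.1'  => [86.864770, 90.166395, 88.054824, 85.088105],
  'Java 8.45' => [59.504218, 58.762161, 61.080194, 60.216356]
}

# Median of a list; for an even count, the mean of the two middle values.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Percentage of time saved by `fast` relative to `slow`.
def percent_less_time(slow, fast)
  (slow - fast) / slow.to_f * 100
end

medians = REAL_SECONDS.transform_values { |runs| median(runs) }
go14, go15, java = medians.values_at('Go 1.4.2', 'Go 1.5.1', 'Java 8.45')

puts format('Go 1.5.1 vs Go 1.4.2:  %.1f%% less time', percent_less_time(go14, go15))
puts format('Java 8.45 vs Go 1.5.1: %.1f%% less time', percent_less_time(go15, java))
puts format('Java 8.45 vs Go 1.4.2: %.1f%% less time', percent_less_time(go14, java))
```

Note that the choice of base matters: dividing by the faster time instead would give larger percentages for the same measurements.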

Conclusion

Before I ran this test I’d read that Java was faster than Go, and it goes without saying that both are faster than Ruby, but this test shows that that’s only half the truth.

I was certainly surprised that, when launched once per file for simple tasks like this, Java is actually almost five times slower than even Ruby, but once you give it a good chunk of work to do, it can really fly.

My conclusion is that for small tasks (image resizing would fall into this category), especially when you have large numbers of files that you’ll be processing with individual calls from your web app (Ruby on Rails, or perhaps even PHP if you’re that way inclined), Go is a much better choice than Java. These are exactly the kinds of requirements my client has. We’ll need to do more research and testing for sure, but it is looking very likely that I’ll be learning Go over the next few months.