28 August, 2018

Shrinking Go Binaries

As part of the effort to build a new artifact system, I wrote a CLI program to handle Taskcluster artifact uploads and downloads.  It is written in Go, and as a result the binaries are quite large.  I'd like this utility to be used broadly within Mozilla CI, which requires a reasonably sized binary, so I was curious about what the various methods for shrinking it are and what the trade-offs of each would be.

A bit of background: Go binaries are static binaries which have the Go runtime and standard library built into them.  This is great if you don't care about binary size, but not great if you do.

This graph shows the binary size in blue on the left Y axis, using a linear scale, and the number of nanoseconds spent per byte of size reduction on the right Y axis, using a logarithmic scale.

To reproduce my results, you can do the following:

go get -u -t -v github.com/taskcluster/taskcluster-lib-artifact-go
cd $GOPATH/src/github.com/taskcluster/taskcluster-lib-artifact-go
git checkout 6f133d8eb9ebc02cececa2af3d664c71a974e833
time (go build) && wc -c ./artifact
time (go build && strip ./artifact) && wc -c ./artifact
time (go build -ldflags="-s") && wc -c ./artifact
time (go build -ldflags="-w") && wc -c ./artifact
time (go build -ldflags="-s -w") && wc -c ./artifact
time (go build && upx -1 ./artifact) && wc -c ./artifact
time (go build && upx -9 ./artifact) && wc -c ./artifact
time (go build && strip ./artifact && upx -1 ./artifact) && wc -c ./artifact
time (go build && strip ./artifact && upx --brute ./artifact) && wc -c ./artifact
time (go build && strip ./artifact && upx --ultra-brute ./artifact) && wc -c ./artifact
time (go build && strip ./artifact && upx -9 ./artifact) && wc -c ./artifact

Since I was removing a lot of debugging information, I figured it'd be worthwhile checking that stack traces are still working. To ensure that I could definitely crash, I decided to panic with an error immediately on program startup.
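
A minimal sketch of that kind of crash test, assuming a standalone program rather than the actual artifact CLI (the `crash` helper and the error message are hypothetical, chosen only for illustration). Building it with each combination of flags and tools and then running the result shows whether the runtime still prints a usable stack trace:

```go
package main

import "errors"

// crash is a hypothetical helper: it panics with an error so that the Go
// runtime prints the panic value and a stack trace to stderr before exiting.
func crash() {
	panic(errors.New("deliberate crash to check stack traces"))
}

func main() {
	// Panic immediately on startup, before any other work is done.
	crash()
}
```

If the trace still names `main.crash` and `main.main` after stripping and compressing, the information the runtime needs for stack traces has survived.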

Even with binary stripping and the maximum compression, I'm still able to get valid stack traces.  A reduction from 9 MB to 2 MB is definitely significant.  The binaries are still large, but they're much smaller than what we started out with.  I'm curious whether we can apply this same configuration to other areas of the Taskcluster Go codebase with similar success, and whether the reduction in size is worthwhile there.
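
To put a number on that, here is a small sketch that works out the relative saving from the rounded figures above. The `reductionPercent` helper is a hypothetical name, and the inputs are the approximate 9 MB and 2 MB sizes rather than exact byte counts:

```go
package main

import "fmt"

// reductionPercent returns how much smaller after is than before, as a
// percentage of before. Both sizes must use the same unit.
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Rounded sizes from the text, in megabytes.
	fmt.Printf("%.1f%% smaller\n", reductionPercent(9, 2)) // prints "77.8% smaller"
}
```

Feeding in the exact `wc -c` byte counts from the commands above would give a more precise figure for each method.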

I think that using strip and upx -9 is probably the best path forward.  This combination provides enough of a benefit over the non-upx options that the time tradeoff is likely worth the effort.
