In Analyzing Go Build Times, I went over how to analyze and understand Go build times, and what factors impact build times. A close cousin to build times is build sizes.

Large binaries can lead to a variety of issues such as:

  • Generally, slower build times
  • Increased costs of storage
  • Increased costs and time to distribute
  • Increased memory usage at runtime (more on this in another article, hopefully)

So it's generally nice to keep them small.

Measuring build size

Measuring build size is not rocket science - just build it and check the size! I will build the Istio sidecar agent in these examples, as it's a fairly large real-world Go binary.

$ go build ./pilot/cmd/pilot-agent; du -sh pilot-agent
109M    pilot-agent
$ stat -c %s pilot-agent # If you prefer the exact size...
114237668
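
If you would rather measure from Go (say, in a CI check), here is a minimal sketch of the same measurement. The humanSize helper is a hypothetical name of my own, not part of any tool used in this article.

```go
package main

import (
	"fmt"
	"os"
)

// humanSize formats a byte count using binary (IEC) units, e.g. 1024 -> "1.0KiB".
// It is a hypothetical helper written for this sketch.
func humanSize(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%dB", n)
	}
	div, exp := int64(unit), 0
	for m := n / unit; m >= unit; m /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f%ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

func main() {
	// With no argument, measure this program's own executable.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	info, err := os.Stat(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Exact size in bytes, a rounded size, and the path.
	fmt.Printf("%d\t%s\t%s\n", info.Size(), humanSize(info.Size()), path)
}
```

For the 114237668 byte binary above, humanSize gives 108.9MiB, in line with the 109M that du reports.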

What is actually taking up space though? For that, we can reach for nm, which will "list symbols from object files".

With the right flags, we will find what symbols are taking up the most space:

$ nm -S --size-sort -t d pilot-agent | tail
0000000016675872 0000000000042060 T github.com/envoyproxy/go-control-plane/envoy/config/bootstrap/v3.(*Bootstrap).validate
0000000015690464 0000000000049950 T github.com/envoyproxy/go-control-plane/envoy/extensions/filters/network/http_connection_manager/v3.(*HttpConnectionManager).validate
0000000024892736 0000000000052344 T k8s.io/api/core/v1.init
0000000016214464 0000000000055942 T github.com/envoyproxy/go-control-plane/envoy/config/cluster/v3.(*Cluster).validate
0000000021548512 0000000000058506 T github.com/google/cel-go/common/stdlib.init.0
0000000082309760 0000000000065792 B runtime.trace
0000000053340256 0000000000072088 r runtime.itablink
0000000082375552 0000000000092792 B runtime.mheap_
0000000053225696 0000000000114540 r runtime.typelink
0000000053053056 0000000000168800 r runtime.findfunctab

Nice! The second column represents the size. Here our top 10 offenders appear to be some Envoy, Kubernetes, and CEL dependencies, along with core Go runtime components.

Go also provides a similar tool built in, which gives the same results (but in the reverse order):

$ go tool nm -size -sort size pilot-agent | head
 3298680     168800 r runtime.findfunctab
 32c28e0     114540 r runtime.typelink
 4e8f380      92792 D runtime.mheap_
 32de860      72088 r runtime.itablink
 4e7f280      65792 D runtime.trace
 148cde0      58506 T github.com/google/cel-go/common/stdlib.init.0
  f769c0      55942 T github.com/envoyproxy/go-control-plane/envoy/config/cluster/v3.(*Cluster).validate
 17bd540      52344 T k8s.io/api/core/v1.init
  ef6ae0      49950 T github.com/envoyproxy/go-control-plane/envoy/extensions/filters/network/http_connection_manager/v3.(*HttpConnectionManager).validate
  fe7420      42060 T github.com/envoyproxy/go-control-plane/envoy/config/bootstrap/v3.(*Bootstrap).validate

goda weight is another handy tool (which we will use later as well), which gives similar info in a nicer form.

So if we sum up all of the symbols, surely it will match the binary size (109M)...?

$ nm -S --size-sort -t d pilot-agent | awk '{print $2}'| paste -sd+ - | bc | numfmt --to=iec-i --suffix=B
33MiB

Nope!

Analyzing symbols is an okay way to get a rough view of what's in the binary, but it shouldn't be thought of as a direct mapping to binary size (more on this later).
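
The same sum can be sketched in Go with the standard library's debug/elf package, which avoids the awk/paste/bc pipeline. This assumes an ELF (Linux) binary. The sym type and summarize function are hypothetical names for this example, and the total may differ slightly from nm's, since nm's size sort skips some symbols.

```go
package main

import (
	"debug/elf"
	"errors"
	"fmt"
	"os"
	"sort"
)

// sym is the subset of a symbol table entry this sketch needs.
type sym struct {
	Name string
	Size uint64
}

// summarize returns the sum of all symbol sizes and the n largest symbols,
// largest first.
func summarize(syms []sym, n int) (total uint64, top []sym) {
	sorted := append([]sym(nil), syms...) // copy so the caller's slice keeps its order
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Size > sorted[j].Size })
	for _, s := range sorted {
		total += s.Size
	}
	if n > len(sorted) {
		n = len(sorted)
	}
	return total, sorted[:n]
}

func run(path string) error {
	f, err := elf.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	elfSyms, err := f.Symbols()
	if errors.Is(err, elf.ErrNoSymbols) {
		fmt.Println("no symbols (stripped binary?)")
		return nil
	}
	if err != nil {
		return err
	}
	syms := make([]sym, 0, len(elfSyms))
	for _, s := range elfSyms {
		syms = append(syms, sym{Name: s.Name, Size: s.Size})
	}
	total, top := summarize(syms, 10)
	for _, s := range top {
		fmt.Printf("%10d %s\n", s.Size, s.Name)
	}
	fmt.Printf("%10d total across %d symbols\n", total, len(syms))
	return nil
}

func main() {
	// With no argument, inspect this program's own executable.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err == nil {
		err = run(path)
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```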

Making things smaller

CGO and static linking

Our current build is dynamically linked due to CGO usage:

$ file pilot-agent
pilot-agent: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked ... debug_info, not stripped

Are there improvements (or regressions) with static linking?

$ CGO_ENABLED=0 go build ./pilot/cmd/pilot-agent; du -sh pilot-agent
109M    pilot-agent

Nope, about the same. The file output does give us a clue of where to look next, though: debug_info, not stripped.

Stripping debug information

Go offers some control during the build of what is included in the binary:

-s
	Omit the symbol table and debug information.
-w
	Omit the DWARF symbol table.

$ CGO_ENABLED=0 go build -ldflags '-w' ./pilot/cmd/pilot-agent; du -sh pilot-agent
89M     pilot-agent
$ CGO_ENABLED=0 go build -ldflags '-s' ./pilot/cmd/pilot-agent; du -sh pilot-agent
75M     pilot-agent
$ CGO_ENABLED=0 go build -ldflags '-s -w' ./pilot/cmd/pilot-agent; du -sh pilot-agent
75M     pilot-agent

Some nice improvements here! Interestingly, it seems passing just -s is sufficient, despite most projects using -s -w.

Another point of interest is that the improvement from removing symbols almost exactly matches the total symbol size reported by nm. This makes sense; nm reports the symbol sizes, and -s strips the symbols.

Another possibility:

        -trimpath
                remove all file system paths from the resulting executable.
                Instead of absolute file system paths, the recorded file names
                will begin either a module path@version (when using modules),
                or a plain import path (when using the standard library, or GOPATH).

$ CGO_ENABLED=0 go build -ldflags '-s' -trimpath ./pilot/cmd/pilot-agent; du -sh pilot-agent
75M     pilot-agent

This rounds to the same result, but does shave off 120KB total, so gives a modest improvement. It is more valuable as a means of reproducibility, though.

Surely these symbols provided some value though, so what are we losing?

$ nm pilot-agent
nm: pilot-agent: no symbols

Well, that is one thing! The other notable loss is when attaching a debugger. However, much of the core functionality (panic stack traces, pprof, etc.) remains fully functional (more on how this is possible, pclntab, later). As such, I think it's generally a good idea to use these flags for production builds.
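
To convince yourself that function names and line numbers survive stripping, here is a small sketch that asks the Go runtime where it is. The here function is a hypothetical helper. As far as I understand, the runtime answers this from its own tables rather than from the ELF symbol table, which is why -s should not affect it.

```go
package main

import (
	"fmt"
	"runtime"
)

// here reports the function, file, and line of its caller, using runtime
// metadata rather than the ELF symbol table.
func here() (function, file string, line int) {
	pcs := make([]uintptr, 1)
	// Skip two frames: runtime.Callers itself and here.
	if runtime.Callers(2, pcs) == 0 {
		return "", "", 0
	}
	frame, _ := runtime.CallersFrames(pcs).Next()
	return frame.Function, frame.File, frame.Line
}

func main() {
	function, file, line := here()
	fmt.Printf("%s\n\t%s:%d\n", function, file, line)
}
```

Building this with -ldflags '-s' should still print the function name and file:line. Adding -trimpath should only change the file prefix, per the flag description quoted above.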

Digging deeper

At this point we know ~33M of the binary was symbols, but we still have 75M left. Where is that from?

bloaty is a nice tool to help inspect a binary.

$ bloaty pilot-agent
    FILE SIZE        VM SIZE
 --------------  --------------
  30.3%  32.9Mi  44.1%  32.9Mi    .text
  23.8%  25.8Mi  34.6%  25.8Mi    .gopclntab
  12.6%  13.8Mi  18.4%  13.8Mi    .rodata
   9.7%  10.6Mi   0.0%       0    .strtab
   7.4%  8.04Mi   0.0%       0    .debug_info
   5.0%  5.47Mi   0.0%       0    .debug_loc
   4.1%  4.45Mi   0.0%       0    .debug_line
   3.0%  3.25Mi   0.0%       0    .symtab
   1.5%  1.65Mi   0.0%       0    .debug_ranges
   1.1%  1.19Mi   1.6%  1.19Mi    .noptrdata
   1.0%  1.13Mi   0.0%       0    .debug_frame
   0.0%       0   0.4%   324Ki    .bss
   0.3%   323Ki   0.4%   323Ki    .data
   0.1%   111Ki   0.1%   111Ki    .typelink
   0.0%       0   0.1%  90.1Ki    .noptrbss
   0.1%  70.3Ki   0.1%  70.3Ki    .itablink
   0.0%  14.8Ki   0.0%  14.8Ki    .go.buildinfo
   0.0%  3.45Ki   0.0%       0    [Unmapped]
   0.0%  2.07Ki   0.0%  2.07Ki    [LOAD #2 [RX]]
   0.0%  1.44Ki   0.0%  1.44Ki    [ELF Section Headers]
   0.0%  1.18Ki   0.0%     845    [8 Others]
 100.0%   108Mi 100.0%  74.6Mi    TOTAL

So we have:

  • .text: contains the executable code of a program or application.
  • .rodata: contains constants, string literals, type names, etc.
  • .gopclntab: contains the Program Counter to Line Number Table.
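
If bloaty isn't available, a rough version of the FILE SIZE column can be sketched with debug/elf. This is only an approximation under my own assumptions: it counts section contents only (bloaty also attributes headers and unmapped bytes), and the section type and largest function are hypothetical names.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"sort"
)

// section holds the name and on-disk size of one ELF section.
type section struct {
	Name     string
	FileSize uint64
}

// largest returns the n biggest sections by on-disk size, largest first,
// along with the total size of all sections.
func largest(secs []section, n int) (top []section, total uint64) {
	sorted := append([]section(nil), secs...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].FileSize > sorted[j].FileSize })
	for _, s := range sorted {
		total += s.FileSize
	}
	if n > len(sorted) {
		n = len(sorted)
	}
	return sorted[:n], total
}

func run(path string) error {
	f, err := elf.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	var secs []section
	for _, s := range f.Sections {
		size := s.FileSize
		if s.Type == elf.SHT_NOBITS {
			size = 0 // e.g. .bss occupies memory at runtime but no bytes on disk
		}
		secs = append(secs, section{Name: s.Name, FileSize: size})
	}
	top, total := largest(secs, 10)
	if total == 0 {
		return fmt.Errorf("%s: no section data", path)
	}
	for _, s := range top {
		fmt.Printf("%5.1f%%  %10d  %s\n", 100*float64(s.FileSize)/float64(total), s.FileSize, s.Name)
	}
	fmt.Printf("100.0%%  %10d  TOTAL (sections only)\n", total)
	return nil
}

func main() {
	// With no argument, inspect this program's own executable.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err == nil {
		err = run(path)
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```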

We can extract these to inspect directly if we wish:

$ objcopy -O binary --only-section=.rodata pilot-agent /tmp/rodata

The format isn't exactly human readable, but it's not entirely garbage.

Here is a short snippet, where we can clearly see some Kubernetes API fields:

MinDomainsson:"minDomains,omitempty"&*v1.PodResourceClaimApplyConfiguration
StringDatason:"stringData,omitempty"
TargetPortson:"targetPort,omitempty"
ClusterIPsson:"clusterIPs,omitempty"
IPFamiliesson:"ipFamilies,omitempty"&*v1beta1.EventSeriesApplyConfiguration&*v1beta1.IngressRuleApplyConfiguration&*v1beta1.IngressSpecApplyConfiguration&*v1.FlowSchemaStatusApplyConfiguration&*v1beta1.UserSubjectApplyConfiguration&*v1beta2.UserSubjectApplyConfiguration&*v1beta3.UserSubjectApplyConfiguration&*v1.IngressRuleValueApplyConfiguration&*v1.IngressClassSpecApplyConfiguration
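
The extraction and a crude strings-like pass over .rodata can also be sketched in Go. printableRuns is a hypothetical helper, and the 16 byte minimum run length is an arbitrary choice to cut down on noise. Again this assumes an ELF binary.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// printableRuns returns every run of at least minLen printable ASCII bytes in data.
func printableRuns(data []byte, minLen int) []string {
	var runs []string
	start := -1 // index where the current printable run began, or -1
	for i, b := range data {
		printable := b >= 0x20 && b <= 0x7e
		switch {
		case printable && start < 0:
			start = i
		case !printable && start >= 0:
			if i-start >= minLen {
				runs = append(runs, string(data[start:i]))
			}
			start = -1
		}
	}
	if start >= 0 && len(data)-start >= minLen {
		runs = append(runs, string(data[start:]))
	}
	return runs
}

func run(path string) error {
	f, err := elf.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	sec := f.Section(".rodata")
	if sec == nil {
		return fmt.Errorf("%s: no .rodata section", path)
	}
	data, err := sec.Data()
	if err != nil {
		return err
	}
	runs := printableRuns(data, 16)
	fmt.Printf(".rodata: %d bytes, %d printable runs of 16+ bytes\n", len(data), len(runs))
	for i, s := range runs {
		if i == 10 {
			break // only show a small sample
		}
		fmt.Printf("%q\n", s)
	}
	return nil
}

func main() {
	// With no argument, inspect this program's own executable.
	path, err := os.Executable()
	if len(os.Args) > 1 {
		path, err = os.Args[1], nil
	}
	if err == nil {
		err = run(path)
	}
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```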

Analyzing the remaining information

shotizam is a pretty neat tool to parse the rest of the binary. Unlike most other tools, it doesn't rely on the symbols (which we strip anyways!). Unfortunately, it hasn't been updated to accommodate some Go internals. A (partial) PR updates it though.

Let's give it a shot.

$ shotizam pilot-agent > /tmp/analysis
$ sqlite3
sqlite> .read /tmp/analysis
sqlite> select sum(size) from bin;
77721600
$ stat -c %s pilot-agent
77721600

Every byte is accounted for! This is great. Let's see what else we can get...

sqlite> select * from bin limit 10;
internal/abi.(*RegArgs).Dump|internal/abi|fixedheader|40
internal/abi.(*RegArgs).Dump|internal/abi|pcsp|20
internal/abi.(*RegArgs).Dump|internal/abi|pcfile|5
internal/abi.(*RegArgs).Dump|internal/abi|pcln|63
internal/abi.(*RegArgs).Dump|internal/abi|text|640
internal/abi.(*RegArgs).Dump|internal/abi|funcname|29
internal/abi.(*RegArgs).IntRegArgAddr|internal/abi|fixedheader|40
internal/abi.(*RegArgs).IntRegArgAddr|internal/abi|pcsp|16
internal/abi.(*RegArgs).IntRegArgAddr|internal/abi|pcfile|5
internal/abi.(*RegArgs).IntRegArgAddr|internal/abi|pcln|17

So we have Func, Pkg, What, and the Size... Let's see what types of space there are:

sqlite> select What, sum(size) from bin group by What order by 2 desc;
text|34527198
TODO|16003973
funcname|8916126
funcdata|5511128
fixedheader|4755160
pcln|4321870
pcsp|2389831
pcfile|1296314

Ah, so while we do add up to the exact bytes, some of that is marked as TODO. These are not "dark" or "non-useful" bytes, but rather just bytes the tool doesn't yet know how to categorize. On this binary, about 80% is categorized, which is good enough for most cases.

34MB is text (which aligns with what we saw above as .text section of the binary), 9MB is function names (new info!), and then a variety of other categories make up the rest.

I believe TODO corresponds primarily to .rodata, text to .text, and the rest to .gopclntab.
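
The roughly 80% figure follows directly from the per-category totals in the query output. Here is the arithmetic as a small sketch, with the numbers copied from that output (category and analyzedShare are hypothetical names):

```go
package main

import "fmt"

// category is one row of the "group by What" query output.
type category struct {
	What string
	Size int64
}

// breakdown copies the per-category totals reported for pilot-agent.
var breakdown = []category{
	{"text", 34527198},
	{"TODO", 16003973},
	{"funcname", 8916126},
	{"funcdata", 5511128},
	{"fixedheader", 4755160},
	{"pcln", 4321870},
	{"pcsp", 2389831},
	{"pcfile", 1296314},
}

// analyzedShare returns the total size and the fraction of bytes that are
// not in the TODO category.
func analyzedShare(cats []category) (total int64, share float64) {
	var todo int64
	for _, c := range cats {
		total += c.Size
		if c.What == "TODO" {
			todo += c.Size
		}
	}
	if total == 0 {
		return 0, 0
	}
	return total, float64(total-todo) / float64(total)
}

func main() {
	total, share := analyzedShare(breakdown)
	// Prints: total: 77721600 bytes, categorized: 79.4%
	fmt.Printf("total: %d bytes, categorized: %.1f%%\n", total, share*100)
}
```

The total matches the stat output exactly, and the categorized share comes out just under 80%.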

We can also query by package or function name (or whatever else we want):

sqlite> select Pkg, sum(size) from bin where Pkg <> '' group by Pkg order by 2 desc limit 10;
k8s.io/api/core/v1|3049990
github.com/google/cel-go/parser/gen|1459406
github.com/envoyproxy/go-control-plane/envoy/config/core/v3|1097645
github.com/envoyproxy/go-control-plane/envoy/config/route/v3|857938
github.com/antlr/antlr4/runtime/Go/antlr/v4|756050
net/http|716133
github.com/google/gnostic-models/openapiv2|697648
runtime|667034
k8s.io/apimachinery/pkg/apis/meta/v1|663585
github.com/google/gnostic-models/openapiv3|634038

sqlite> select Func, sum(size) from bin where Pkg <> '' group by Func order by 2 desc limit 10;
github.com/google/cel-go/common/stdlib.init.0|63368
github.com/envoyproxy/go-control-plane/envoy/config/cluster/v3.(*Cluster).validate|61394
k8s.io/api/core/v1.init|54903
github.com/envoyproxy/go-control-plane/envoy/extensions/filters/network/http_connection_manager/v3.(*HttpConnectionManager).validate|54772
github.com/envoyproxy/go-control-plane/envoy/config/bootstrap/v3.(*Bootstrap).validate|46190
github.com/envoyproxy/go-control-plane/envoy/config/route/v3.(*RouteAction).validate|43414
github.com/google/gnostic-models/openapiv3.NewSchema|38303
github.com/envoyproxy/go-control-plane/envoy/config/listener/v3.(*Listener).validate|35136
k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1.(*JSONSchemaProps).Unmarshal|33583
k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1.(*JSONSchemaProps).Unmarshal|33556

So here we see Kubernetes is a big factor (as we have seen in the past), as are Envoy and CEL.

Reducing the binary size

Aside from our one trick of stripping debug symbols, there aren't any easy fixes for reducing the size of the binary other than reducing the amount of code compiled into it. Above, we saw how to analyze what is taking up space. But understanding how to remove it can still be tricky.

For instance, in the above analysis we found github.com/google/cel-go using substantial space. Intuitively, I know this probably isn't actually used in pilot-agent, so it's likely an inadvertent dependency. However, tracking down why it's imported (possibly through a long chain of packages) can be tricky.

Again, goda can help. There are lots of handy tools within it, but I tend to use the tree reach(...) command. This shows the path from one package to another:

$ goda tree 'reach(./pilot/cmd/pilot-agent:all, github.com/google/cel-go/common/stdlib)' | hh cel-go
  ├ istio.io/istio/pilot/cmd/pilot-agent
    └ istio.io/istio/pilot/cmd/pilot-agent/app
      ├ istio.io/istio/pilot/cmd/pilot-agent/config
        ├ istio.io/istio/pkg/bootstrap
          ├ istio.io/istio/pilot/pkg/model
            ├ istio.io/istio/pkg/config/mesh
              └ istio.io/istio/pkg/config/validation
                └ github.com/google/cel-go/cel
                  ├ github.com/google/cel-go/checker
                    └ github.com/google/cel-go/common/stdlib
                  └ github.com/google/cel-go/common/stdlib ~

Here we can see a fairly long series of imports before we finally import github.com/google/cel-go. From there, we can do the refactoring work needed to prune it out of the dependency tree.
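
Conceptually, finding such a chain is a shortest-path search over the import graph. The sketch below is not how goda is implemented (I haven't checked); it is a plain breadth-first search over a hand-copied graph. The istioImports map holds just the edges as I read them from the tree above, and importChain is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// istioImports holds only the edges read off the tree output above;
// a real tool would load the full import graph instead.
var istioImports = map[string][]string{
	"istio.io/istio/pilot/cmd/pilot-agent":        {"istio.io/istio/pilot/cmd/pilot-agent/app"},
	"istio.io/istio/pilot/cmd/pilot-agent/app":    {"istio.io/istio/pilot/cmd/pilot-agent/config"},
	"istio.io/istio/pilot/cmd/pilot-agent/config": {"istio.io/istio/pkg/bootstrap"},
	"istio.io/istio/pkg/bootstrap":                {"istio.io/istio/pilot/pkg/model"},
	"istio.io/istio/pilot/pkg/model":              {"istio.io/istio/pkg/config/mesh"},
	"istio.io/istio/pkg/config/mesh":              {"istio.io/istio/pkg/config/validation"},
	"istio.io/istio/pkg/config/validation":        {"github.com/google/cel-go/cel"},
	"github.com/google/cel-go/cel": {
		"github.com/google/cel-go/checker",
		"github.com/google/cel-go/common/stdlib",
	},
	"github.com/google/cel-go/checker": {"github.com/google/cel-go/common/stdlib"},
}

// importChain returns the shortest chain of imports leading from one package
// to another in the given import graph, or nil if there is none.
func importChain(imports map[string][]string, from, to string) []string {
	parent := map[string]string{from: ""} // doubles as the visited set
	queue := []string{from}
	for len(queue) > 0 {
		pkg := queue[0]
		queue = queue[1:]
		if pkg == to {
			// Walk the parent links back to the start to rebuild the chain.
			var chain []string
			for p := to; p != ""; p = parent[p] {
				chain = append([]string{p}, chain...)
			}
			return chain
		}
		for _, dep := range imports[pkg] {
			if _, seen := parent[dep]; !seen {
				parent[dep] = pkg
				queue = append(queue, dep)
			}
		}
	}
	return nil
}

func main() {
	chain := importChain(istioImports,
		"istio.io/istio/pilot/cmd/pilot-agent",
		"github.com/google/cel-go/common/stdlib")
	for depth, pkg := range chain {
		fmt.Printf("%s└ %s\n", strings.Repeat("  ", depth), pkg)
	}
}
```

Each link in the printed chain is a candidate place to cut the dependency.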

In this particular case, it was easier to mask the unexpected dependency behind a build flag; in many other cases it's possible to simply refactor away dependencies.

Takeaways

  • -ldflags '-s -w' can help reduce binary sizes by stripping debug symbols; these are typically not required.
  • There aren't other (reliable) magic binary reduction methods; you'll need to understand what is in the binary and remove it.
  • go tool nm or other symbol analysis tools can give a rough idea of what's in a binary, but are probably not a reliable source of data, given these symbols should be stripped anyways.
  • shotizam, on the other hand, analyzes parts of the binary beyond symbols, so is very useful to analyze the contents of a binary.
  • goda is a great tool to analyze dependencies.