Sam Doshi personal ramblings

AVR32 RAM and ROM usage

When developing for embedded devices it’s incredibly important to keep track of your RAM and ROM usage. To do so, we can use the Unix utility size; for the AVR32 in particular, we use the avr32-size command.

If you want the quick answer:

Run avr32-size -A <insert name>.elf

The .heap size is effectively your free RAM (malloc usage aside).

Your ROM size is approximately .data + .rodata + .text, to which you can add your .flash_nvram to get an approximation of total flash ROM use.
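As a rough sketch, the sum above can be computed with a small awk filter. The function name rom_usage is made up for this example, and the sample lines are the section sizes from the Teletype build shown further down; in real use you would pipe avr32-size -A <insert name>.elf into it.

```shell
# Approximate flash usage from `avr32-size -A` output read on stdin.
# rom_usage is a hypothetical helper name, not part of the AVR32 toolchain.
rom_usage() {
    awk '
        # Sections that occupy flash: code, constants, initial values of .data
        $1 == ".text" || $1 == ".rodata" || $1 == ".data" { rom += $2 }
        # Variables explicitly placed in NVRAM
        $1 == ".flash_nvram"                              { nvram = $2 }
        END {
            printf "ROM (approx): %d bytes\n", rom
            printf "ROM + NVRAM:  %d bytes\n", rom + nvram
        }'
}

# Real use: avr32-size -A teletype.elf | rom_usage
# Here the relevant lines from the Teletype example are fed in directly.
rom_usage <<'EOF'
.text                55604   2147491880
.rodata              16664   2147548184
.data                 7484           28
.flash_nvram        147076   2147745792
EOF
```

For the Teletype numbers this gives 79752 bytes of ROM and 226828 bytes once .flash_nvram is added.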

Read on for the gory details…

Output formats

Two output formats for the avr32-size command exist, the simpler ‘Berkeley’ style, e.g. avr32-size -B teletype.elf:

   text	   data	    bss	    dec	    hex	filename
  81034	   7504	 237872	 326410	  4fb0a	teletype.elf

Or the more complex ‘SysV’ style, e.g. avr32-size -A teletype.elf:

teletype.elf  :
section               size         addr
.reset                8204   2147483648
.rela.got                0   2147491852
.init                   26   2147491852
.text                55604   2147491880
.exception             512   2147547648
.fini                   24   2147548160
.rodata              16664   2147548184
.dalign                  4            4
.ctors                   8            8
.dtors                   8           16
.jcr                     4           24
.got                     0           28
.data                 7484           28
.balign                  0         7512
.bss                  8064         7512
.heap                74536        15576
.comment                47            0
.debug_aranges        8624            0
.debug_pubnames      20090            0
.debug_info         279439            0
.debug_abbrev        38429            0
.debug_line         161921            0
.debug_frame         25408            0
.debug_str           53842            0
.debug_loc           76994            0
.debug_macinfo    34421509            0
.stack                8192        90112
.flash_nvram        147076   2147745792
.debug_ranges        14936            0
Total             35427649

We’re going to use the SysV style, as the Berkeley style doesn’t include a useful value for bss [1].

Enhancing the output of avr32-size

Before we go further, let’s enhance the output of avr32-size. Firstly, I would like both the decimal and hexadecimal values side by side:

pr -w 85 -m -t \
   <(avr32-size -Ad teletype.elf) \
   <(avr32-size -Ax teletype.elf | cut -c 19-)

Next, let’s remove the .debug_* lines and the headers and totals:

pr -w 85 -m -t \
   <(avr32-size -Ad teletype.elf) \
   <(avr32-size -Ax teletype.elf | cut -c 19-) \
| grep '^\..*' \
| grep -v '^\.debug'

And finally, let’s sort by address:

pr -w 85 -m -t \
   <(avr32-size -Ad teletype.elf) \
   <(avr32-size -Ax teletype.elf | cut -c 19-) \
| grep '^\..*' \
| grep -v '^\.debug' \
| sort -s -n -k 3,3

Understanding the output

Let’s take our enhanced command and run it against a built version of Teletype [2]. We get the following output, to which I have added a header.

Name            Size (dec)   Addr (dec)  Size (hex)   Addr (hex)
================================================================
.comment                47            0        0x2f          0x0
.dalign                  4            4         0x4          0x4
.ctors                   8            8         0x8          0x8
.dtors                   8           16         0x8         0x10
.jcr                     4           24         0x4         0x18
.got                     0           28         0x0         0x1c
.data                 7484           28      0x1d3c         0x1c
.balign                  0         7512         0x0       0x1d58
.bss                  8064         7512      0x1f80       0x1d58
.heap                74536        15576     0x12328       0x3cd8
.stack                8192        90112      0x2000      0x16000
.reset                8204   2147483648      0x200c   0x80000000
.rela.got                0   2147491852         0x0   0x8000200c
.init                   26   2147491852        0x1a   0x8000200c
.text                55604   2147491880      0xd934   0x80002028
.exception             512   2147547648       0x200   0x8000fa00
.fini                   24   2147548160        0x18   0x8000fc00
.rodata              16664   2147548184      0x4118   0x8000fc18
.flash_nvram        147076   2147745792     0x23e84   0x80040000

The AT32UC3B0512 [3] used by the Teletype has 96 KB of RAM and 512 KB of flash ROM. The address space is unified, so a 32-bit pointer can refer to a location in RAM or in ROM. The RAM starts at location 0x0, the ROM starts at 0x80000000. For those MCUs that support SDRAM, that would start at 0xD0000000.

The above table gives the name, size and address for each section of the unified address space, sorted by memory address. There can be gaps between sections. For example, there is a gap of 180,944 bytes between the end of .rodata and the start of .flash_nvram. Later on we’ll see that there is a copy of .data stored there for use at initialisation.
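The size of that gap can be checked with shell arithmetic, using the addresses and sizes from the table above. This is only a worked example; the variable names are arbitrary.

```shell
# End of .rodata = its start address plus its size (both from the table).
rodata_end=$(( 0x8000fc18 + 0x4118 ))
# Start of .flash_nvram.
nvram_start=$(( 0x80040000 ))
# Unused address range between the two sections.
gap=$(( nvram_start - rodata_end ))

printf 'end of .rodata: 0x%x\n' "$rodata_end"
printf 'gap:            %d bytes\n' "$gap"
```

The end of .rodata works out as 0x80013d30, which leaves 180944 bytes before 0x80040000.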

RAM

We aren’t interested in all the sections in RAM. For example, .comment contains the name of the compiler [4] and will be stripped out before the final .hex file is created.

We are interested in the following (see Wikipedia for more info on these):

.data
This is used for initialised data. It takes up space in both RAM and ROM. Unfortunately the ROM address is not given in the output of avr32-size; we’ll need avr32-objdump for that (see later).
.bss
This is used for uninitialised data, it only takes up space in RAM.
.stack
The call stack. The size is set by the linker script and may be changed by updating the linker variable __stack_size__, ideally via LDFLAGS in config.mk, e.g. LDFLAGS = -Wl,-e,_trampoline,--defsym=__stack_size__=0x1000 for a 4096 byte stack.
.heap
By default all unused RAM is allocated to the heap for use with malloc and such. Override the linker variable __heap_size__ if you wish to change this (though I can’t think of a reason as to why you would).
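To put numbers on the four sections above, here is a minimal sketch that totals .data, .bss and .stack and reports the heap alongside. The helper name ram_report is hypothetical, and the total RAM size is passed in by hand (98304 bytes for the 96 KB part used by Teletype), since avr32-size does not know it.

```shell
# Summarise RAM usage from `avr32-size -A` output read on stdin.
# $1 is the total RAM in bytes. ram_report is a hypothetical helper name.
ram_report() {
    awk -v total="$1" '
        # Sections whose size we control directly
        $1 == ".data" || $1 == ".bss" || $1 == ".stack" { used += $2 }
        # Whatever is left over is handed to the heap by the linker script
        $1 == ".heap"                                   { heap = $2 }
        END {
            printf "static + stack: %d bytes (%.1f%% of RAM)\n", used, 100 * used / total
            printf "heap (free):    %d bytes\n", heap
        }'
}

# Real use: avr32-size -A teletype.elf | ram_report 98304
ram_report 98304 <<'EOF'
.data                 7484           28
.bss                  8064         7512
.heap                74536        15576
.stack                8192        90112
EOF
```

With the Teletype figures, .data, .bss and .stack add up to 23740 bytes. Together with the 74536 byte heap and the 28 bytes below the start of .data, that accounts for the full 98304 bytes.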

ROM

The ROM contains your code and any read-only data, as well as the initial value for any read-write data. It also contains the NVRAM [5], which can be read directly, but writes must go via the flash controller interface.

.text
This is the code segment, it contains your program.
.data
A read-only copy of your initialised data. It is copied from ROM to RAM after the bootloader has run.
.rodata
Constant read-only data.
.flash_nvram
The location of the NVRAM data, plus the size of any variables you’ve explicitly stored in there via __attribute__((__section__(".flash_nvram"))). The total size of flash that you wish to dedicate to NVRAM usage is configured with the linker variable __flash_nvram_size__ (ideally set in LDFLAGS).

Initialised vs. uninitialised vs. constant data

Understanding the difference between initialised, uninitialised and constant data will help you understand where they are stored. The following only applies to global data. Local data lives on the stack.

Initialised data
This takes the form int x = 10; and is stored in .data, which occupies space in both ROM and RAM.
Uninitialised data
This takes the form int x; and is stored in .bss in RAM only.
Constant data
This takes the form const int x = 10; and is stored in .rodata in ROM only.

It’s worth being aware that the compiler can and will try to make optimisations for you, so constants will be in-lined if possible.
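One way to see where a particular global ended up is the symbol table. The sketch below assumes the address, size, type, name layout that binutils nm prints with -S, and maps the type letters D, B and R onto the three kinds of data above. The helper name classify_symbols and the sample symbols are invented for illustration. The type letters are coarse, so symbols in other sections without file contents may also be reported as B, and a constant that has been in-lined may not appear at all.

```shell
# Classify global data symbols by the type letter from `nm -S` output on stdin.
# classify_symbols is a hypothetical helper name.
classify_symbols() {
    awk '
        $3 ~ /^[Dd]$/ { print $4 ": initialised (.data, ROM and RAM)" }
        $3 ~ /^[Bb]$/ { print $4 ": uninitialised (.bss, RAM only)" }
        $3 ~ /^[Rr]$/ { print $4 ": constant (.rodata, ROM only)" }'
}

# Real use (assuming avr32-nm from the same toolchain):
#   avr32-nm -S teletype.elf | classify_symbols
# The symbol names and addresses below are made up for the example.
classify_symbols <<'EOF'
00000020 00000004 D counter
00001d60 00000040 B buffer
8000fc20 00000010 R lookup_table
80002028 00000008 T main
EOF
```

Code symbols (type T) are deliberately ignored, so only the three data lines are printed.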

VMA and LMA, using avr32-objdump

Unfortunately avr32-size -A doesn’t list the ROM address for .data. To discover the address we need to use avr32-objdump and understand the difference between VMA and LMA.

Quoting AVR32795: Using the GNU Linker Scripts on AVR UC3 Devices:

3.3 VMA and LMA

Every loadable or allocatable output section has two addresses. The first is the VMA, or virtual memory address. This is the address the section will have when the output file is run. The second is the LMA, or load memory address. This is the address at which the section will be loaded. In most cases the two addresses will be the same.

An example of when the LMA and VMA might be different is when a data section is loaded into ROM, and then copied into RAM when the program starts up (a technique often used to initialize global variables in a ROM-based system). In this case, the ROM address would be the LMA and the RAM address would be the VMA.

We can use avr32-objdump to list the VMA and LMA:

avr32-objdump -h teletype.elf

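I have omitted the .debug_* sections. The sections to compare are .data and .rodata: the VMA and LMA are different for .data, but not for .rodata. (The sizes of .text, .data and .heap differ from the earlier avr32-size listing, so this output appears to come from a different build.)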

teletype.elf:     file format elf32-avr32

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .reset        0000200c  80000000  80000000  00000400  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rela.got     00000000  8000200c  8000200c  0000240c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .init         0000001a  8000200c  8000200c  0000240c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  3 .text         0000d944  80002028  80002028  00002428  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  4 .exception    00000200  8000fa00  8000fa00  0000fe00  2**9
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  5 .fini         00000018  8000fc00  8000fc00  00010000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  6 .rodata       00004118  8000fc18  8000fc18  00010018  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .dalign       00000004  00000004  00000004  00000000  2**0
                  ALLOC
  8 .ctors        00000008  00000008  80013d30  00014408  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  9 .dtors        00000008  00000010  80013d38  00014410  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 10 .jcr          00000004  00000018  80013d40  00014418  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 11 .got          00000000  0000001c  80013d44  0001441c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 12 .data         00005d3c  0000001c  80013d44  0001441c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 13 .balign       00000000  00005d58  80019a80  0001a158  2**0
                  ALLOC
 14 .bss          00001f80  00005d58  00005d58  00000000  2**2
                  ALLOC
 15 .heap         0000e328  00007cd8  00007cd8  00000000  2**0
                  ALLOC
 16 .comment      0000002f  00000000  00000000  0001a158  2**0
                  CONTENTS, READONLY
 26 .stack        00002000  00016000  00016000  00000000  2**0
                  ALLOC
 27 .flash_nvram  00023e84  80040000  80019a80  0001a400  2**1
                  ALLOC
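Rather than comparing the VMA and LMA columns by eye, a short awk filter can pick out the sections where they differ. This is a sketch that assumes the column layout shown above; the function name split_sections is made up.

```shell
# List sections whose load address (LMA) differs from their run address (VMA),
# reading `avr32-objdump -h` output on stdin.
# split_sections is a hypothetical helper name.
split_sections() {
    # Section lines start with a numeric index:
    #   Idx Name Size VMA LMA File-off Algn
    # Appending "" forces a string comparison, so hex values that happen to
    # look like decimal or exponent notation are not compared as numbers.
    awk '$1 ~ /^[0-9]+$/ && ($4 "") != ($5 "") {
        printf "%-14s VMA=0x%s LMA=0x%s\n", $2, $4, $5
    }'
}

# Real use: avr32-objdump -h teletype.elf | split_sections
# Here two entries from the listing above are fed in directly.
split_sections <<'EOF'
  6 .rodata       00004118  8000fc18  8000fc18  00010018  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 12 .data         00005d3c  0000001c  80013d44  0001441c  2**2
                  CONTENTS, ALLOC, LOAD, DATA
EOF
```

Only .data is printed for this sample, with its RAM address as the VMA and its ROM address as the LMA.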

Summary

RAM usage is governed by the size of .data, .bss and .stack. These are the things we have control over. Any free RAM is allocated to the .heap; in effect this is how much free RAM you have (malloc use aside).

The flash ROM is divided into ROM and NVRAM as determined by the linker variable __flash_nvram_size__. The size of ROM is primarily determined by the size of .text, .data and .rodata. NVRAM usage via named sections in your source code is given by .flash_nvram; this will not include any manual flash usage via the flash controller.


  1. It appears to include .flash_nvram, .heap and .stack in its calculation of bss. In the example above, 237872 is exactly the sum of .dalign, .bss, .heap, .stack and .flash_nvram.
  2. Commit 220ea9d
  3. Also used by Ansible, see the Atmel website. White Whale, Meadowphysics and Earthsea all use the AT32UC3B0256, which has 32 KB of RAM and 256 KB of flash ROM.
  4. avr32-readelf -p .comment teletype.elf
  5. Non-volatile RAM