The purpose of this test was to analyze at a gross level the dynamic footprint of the embedded component as it ran through several dozen URLs. Conventional wisdom and anecdotal evidence indicate that the embedded component can only run several URLs before blowing the memory budget.
The graph below shows the embedded browser component's memory usage pattern over time as several dozen URLs are loaded in rapid sequence via an automated "browser buster".
VM Size reflects the total amount of virtual memory
required by the browser and the test harness. For devices
without virtual memory, this number is probably the most
significant: it reflects the amount of RAM that Gecko will occupy.
RSS is the Resident Set Size, which is RAM
actually consumed by the application on a system with virtual memory.
This number is significant for devices with virtual memory, as
non-resident pages do not consume RAM.
Data Size reflects the portion of the
VM Size that is attributed to data, including the
application's heap, statically initialized data, and stack.
Code Size reflects the portion of the
VM Size that is attributed to code.
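On Linux, figures of this kind can be read from `/proc/<pid>/status`. The sketch below is an illustration, not the script used for this study. It assumes that VM Size corresponds to `VmSize`, RSS to `VmRSS`, Data Size roughly to `VmData` plus `VmStk`, and Code Size roughly to `VmExe` plus `VmLib`. The function names are hypothetical.

```python
import re

# Fields of /proc/<pid>/status that are assumed to map onto the four
# measures described above (all values are reported in kB).
WANTED = {"VmSize", "VmRSS", "VmData", "VmStk", "VmExe", "VmLib"}

def parse_status(text):
    """Extract the selected Vm* fields (in kB) from the text of a status file."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)\s+kB", line)
        if match and match.group(1) in WANTED:
            fields[match.group(1)] = int(match.group(2))
    return fields

def summarize(fields):
    """Combine raw fields into the four measures (assumed mapping, in kB)."""
    return {
        "vm_size": fields.get("VmSize", 0),
        "rss": fields.get("VmRSS", 0),
        "data_size": fields.get("VmData", 0) + fields.get("VmStk", 0),
        "code_size": fields.get("VmExe", 0) + fields.get("VmLib", 0),
    }

def read_status(pid="self"):
    """Read and summarize /proc/<pid>/status for a running process."""
    with open(f"/proc/{pid}/status") as status_file:
        return summarize(parse_status(status_file.read()))
```

Calling `read_status(pid)` at a fixed interval and logging the result would give a time series comparable to the one plotted here.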
Note the steady climb in data size from 5MB to 15MB over the first 15 minutes of operation, followed by slow but steady growth to just beneath 20MB over the next 45 minutes. The initial rapid rise might be explained by caches filling to capacity; the subsequent slow growth is presumably due to leaks.
To produce the above data, we used the gtkEmbed test
harness.
The gtkEmbed app was modified slightly to prevent
window.open() from creating new top-level windows.
The harness ran against a "browser buster" variant that loaded 135
"top" sites in rapid succession, with twenty to thirty seconds between
each site. The "buster" is a CGI script that generates HTML which
loads the target site into a nested IFRAME. Sites are
cycled by embedding an HTTP "refresh" command in the HTML. A secondary
script "watched" the process using the Linux /proc
filesystem to collect raw information.
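The page-cycling mechanism described above can be sketched as follows. This is a minimal illustration, not the actual "buster" script. The script name `buster.cgi`, the `index` query parameter, the 25-second delay and the use of a `meta` refresh element are all assumptions made for the example.

```python
from html import escape

def buster_page(sites, index, script_url="buster.cgi", delay=25):
    """Return HTML that shows sites[index] in an IFRAME, then asks the
    browser to reload the script for the next site after `delay` seconds."""
    next_index = (index + 1) % len(sites)  # wrap around after the last site
    refresh = f"{delay}; URL={script_url}?index={next_index}"
    return (
        "<html><head>\n"
        f'<meta http-equiv="refresh" content="{escape(refresh)}">\n'
        "</head><body>\n"
        f'<iframe src="{escape(sites[index])}" width="100%" height="100%"></iframe>\n'
        "</body></html>\n"
    )
```

A CGI wrapper would read `index` from the query string, print a `Content-Type: text/html` header and then print the result of `buster_page(sites, index)`.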
Certainly some of the footprint comes from memory leaks in Gecko. The
graph below shows an extremely conservative estimate of the amount of
memory leaked "per URL" for gtkEmbed. Numbers are also
provided for a similar Seamonkey build.
The "bytes leaked" are the bytes leaked at shutdown as detected by the XPCOM "bloat log" after loading n URLs. This is a conservative estimate because:
- XPCOM objects are counted only if they implement nsISupports and use the NS_IMPL_ISUPPORTS macro (or its cousins). (Almost all of the XPCOM objects in Seamonkey do this.)
- Non-XPCOM objects are counted only if they have been instrumented with the MOZ_COUNT_CTOR and MOZ_COUNT_DTOR macros. Many, but not all, non-XPCOM objects have been instrumented this way.
- The bloat log records only the sizeof the object, and does not include dynamically allocated structures that an object may create. (For example, an nsString is recorded as being about 40 bytes regardless of how large the buffer is that it owns.)
In other words, it does not account for much of the memory in the application that is allocated in an ad hoc fashion.
Note that the number of bytes leaked by the Seamonkey build is significantly less for an identical test suite. We presume this is due to bugs in the current embedding harness. For example, the lack of proper focus management might cause us to leak the DOM element that would otherwise be receiving the focus.
To produce the above data, we used the "XPCOM Bloat Log" as described
above. We ran a debug gtkEmbed build on the "browser
buster", allowing it to load successively larger numbers of
URLs. At shutdown, we collected the "total bytes leaked" from the
bloat log, and recorded that number.
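One way to turn a series of such runs into a "per URL" figure is to fit a line through the (URLs loaded, bytes leaked) pairs and take its slope. The source does not say how the per-URL number was derived, so the least-squares approach below is an assumption, and `leak_per_url` is a hypothetical helper.

```python
def leak_per_url(samples):
    """Least-squares slope of bytes leaked versus number of URLs loaded.

    `samples` is a list of (urls_loaded, bytes_leaked) pairs, one per run,
    and must contain at least two distinct URL counts.
    """
    count = len(samples)
    mean_x = sum(x for x, _ in samples) / count
    mean_y = sum(y for _, y in samples) / count
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx
```

Using the slope rather than dividing the total by n keeps any fixed, one-time leak at startup or shutdown from being attributed to individual URLs.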
More than a third of gtkEmbed's memory is consumed by
code, even after running for almost 45 minutes against the browser
buster! The chart below shows a breakdown of the ten largest
libraries that are resident when gtkEmbed is running, as
a percentage of all gtkEmbed code that is loaded into
virtual memory.
Note that some libraries (e.g., libc-2.1.2.so) may be
shared with other processes that are running.3
The table below details the memory consumed by the .text
segment of the ten largest libraries.
| Library | .text in VM |
|---|---|
| libgklayout.so | 3.17MB |
| libgtk-1.2.so | 1.14MB |
| libc-2.1.2.so | 963KB |
| libjsdom.so | 688KB |
| libxpcom.so | 680KB |
| libX11.so | 647KB |
| librdf.so | 512KB |
| libnecko.so | 487KB |
| libmozjs.so | 381KB |
| libhtmlpars.so | 369KB |
| Other (58 libraries) | 4.5MB |
| Total | 13.5MB |
To collect this information, we examined the
/proc/<pid>/maps file, which contains detailed
information about the memory map of a process, including the VM start
and end addresses for each text and data segment the process owns. We
sorted the file, ignoring all segments except text. We used the same
gtkEmbed build that was used to collect the "gross
dynamic footprint" information, so the same caveats apply.4
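The procedure just described can be sketched as below. This is an illustration rather than the script actually used. It assumes that a text segment is any file-backed mapping with execute permission, and `text_sizes` is a hypothetical name.

```python
from collections import defaultdict

def text_sizes(maps_text):
    """Sum the size in bytes of executable, file-backed mappings per file.

    Each line of /proc/<pid>/maps has the form
    "start-end perms offset dev inode pathname"; the pathname is absent
    for anonymous mappings.
    """
    sizes = defaultdict(int)
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping, no backing file
        address_range, perms, path = parts[0], parts[1], parts[5].strip()
        if "x" not in perms or not path.startswith("/"):
            continue  # keep only text segments that come from a file
        start, end = (int(value, 16) for value in address_range.split("-"))
        sizes[path] += end - start
    # Largest first, to match the ordering used in the table.
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

Passing the contents of `/proc/<pid>/maps` for a running process yields a list of (library path, bytes) pairs, from which the ten largest entries can be read off directly.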
1 Note that Seamonkey builds are not stripped, resulting in binaries that are roughly 40% larger.
2,4 dougt has done a fair amount of work reducing the set of XPCOM components required to bring up a minimal browser. Unfortunately, we were unable to use this "minimal configuration" for the purposes of these tests because of several known problems (e.g., HTTP refresh does not work). That will tend to skew the code size numbers unfavorably.
3 We'll have to dig a bit deeper to understand how Linux accounts for
shared code: in the process tables (e.g., accessible via the
/proc filesystem and the ps command), it
appears that each process is credited with what seems to be
"shared" text.