Sunday, June 4, 2000

Spent the day tinkering around with the rest of the inline-block bugs. I think I have a pretty good handle on 'em. Just one or two left to go. Fixed 35935, 34862, and 41521. Respectively, we needed to delete all the "special" frames when doing a ContentInserted; the relatively positioned inline code had to do the special splitting magic; and "style changed" reflows need to be savvy with respect to special frames. Fun fun.

Cleaned up some warnings while I was in there; tweaked some of the HAVE_CPP_* macros to get them consistent.

Monday, June 5, 2000

Cleaned up the patch for 35935, I needed to make sure that the anynonymous block frame picked up the position: relative style.

Spent about an hour reviewing Ben Bucksch's crappy MIME patch. What a waste of time: rhp punted and left me holding the bag (sigh).

Spent forever (and then some) discussing rayw's ideas for versioning. Sigh. The group consensus: maybe just doing it by hand like MS COM does isn't so bad. Bitched to brendan about how much time these meetings are taking: I should bug warren tomorrow.

Tinkered around a bit with 27211 which illustrates how screwed up the block-in-inline code gets when you start trying to tinker with it via the DOM. I'm tempted to say "screw this", and say these are cases that we don't handle. It'll take me another couple of days to get this all cleaned up if it really is important. I'll bug buster tomorrow.

Tuesday, June 6, 2000

Not much accomplished today. Gecko meeting, marathon architecture meetings (group, then bloat, then group angst).

Started poking around bug 25963, which is my only "dogfood+" bug. I'm pretty sure this has nothing to do with resolving anchor tags and everything to do with that boneheaded stuff that pink found. Anyway, sent a mail to simon to see if he could try it. I managed to get a profile of a big-ass LXR page, and I'm seeing shades of the bug that pavlov and I were looking at a long time ago with nsVoidArray eating all of the time it takes to lay out a long text document.

We spend about 12% of the total time laying out http://lxr.mozilla.org/mozilla/source/rdf/content/src/nsXULTemplateBuilder.cpp in nsVoidArray::InsertElementAt. Half the time comes from nsCheapVoidArray (Vidur's content model hackery to optimize the zero- or one-child common cases), and the other half comes from nsLineLayout, which I suspect could also benefit from the zero- or one-child optimization.

InsertElementAt() is dominated by 8,500 calls to memcpy() and 16,000 calls to ::operator new() and ::operator delete(). So, it's getting clobbered copying the array around, which I believe it grows 4 slots at a time! This makes me want to take Vidur's hack one further, and create something that's good for zero, one, a small number, and a large number. I think that's the btree magic I was tinkering around with at davidmc's prompting several months ago.

Wednesday, June 7, 2000

Implemented nsAdaptiveVoidArray, which uses vidur's one element hack, and then faults to a b-tree if you start allocating more elements. I'm rebuilding layout now to see if it'll work or not. Adapted my old nsSegmentedSupportsArray, and cleaned it up a bit to be a bit more compact (e.g., avoided extra allocation and packed bits). It's still a bit sloppy in some ways (e.g., IndexOf() just uses ElementAt() and a loop to find the element, which'll be O(n log16n) instead of O(n).

It seems to work, but I'll need to do some more measurement to see how wonderful it really is. warren and I talked about it a bit, and he recommended that I try it against a power-of-two in terms of bloat and speed trade-off. My initial reason for doing this was to avoid wastage risk that power-of-two entails; but, we'll see.

* * *

Spent some time last night trying to do an analysis of the situation. My gut feeling here is that using a B-tree is probably not going to buy us much: in order to keep wastage down in the small case we need to use a fairly small node size (e.g., 16). This ends up equating to about 1.2 words required for each entry. The only advantages that I can think of are:

I'm not sure how I could analyze these. The down sides are:

Anyway, I'm looking forward to hearing back from warren and brendan on the analysis.

Thursday, June 8, 2000

Moved the nsAdaptiveVoidArray changes over to my optimized tree: they shave about 4 seconds off of the 40s page load time (10%, as expected) for a long LXR page.

Realized that I'm not being "honest" with respect to what the node overhead is: I should really count the number of words required for each heap allocation: probably something like two. That makes the ratio be (b+4)/(b-1), so for b=16, we should really be comparing to a power-of-n array with a growth rate of 1.33, instead of 1.2.

Furthermore, I'm comparing the power-of-n array growth's worst case with the B-tree's best case, which isn't fair; however, I don't have the smarts to compute what the power-of-n's overhead is in the "average case". Maybe brendan will have some ideas.

Friday, June 9, 2000

Put the wraps on this b-tree implementation and checked it in. I haven't turned it on in the build yet because I have to add it to the friggin' Mac project file. I'll do that this weekend: maybe take this sad old PowerTower Pro in to work and install the latest Mac juju on it.

Started to get in to grokking the line layout stuff a bit more. I'll have a handle on this nsLineLayout::FindNextText() problem in no time. I feel lucky, baby!

Saturday, June 10, 2000

Hrm. I put some code in to instrument the size of the nsTextRun arrays, and saw this peaking at about twenty. So there is no need to make that into an nsVoidBTree. I think I was seeing it come up on the radar so frequently because of then IndexOf that gets slammed when it linearly searches through the list of text runs.

Sunday, June 11, 2000

Okay, I was wrong about nsTextRun being constrained. On LXR, I really did see this go through the roof. I just untangled some weirdness in the parser that was causing LXR to not display right, and saw a text run of about 500 elements on a short page!

Monday, June 12, 2000

Allright, I implemented nsStatistics last night and gathered some information about the distribution of nsGenericHTMLContainer's children. Turns out 72% of them have zero or one element, 92% have less than 4 elements, and 98% have less than eight elements. This was on a subset of the "top 100" URLs.

This makes me question the wisdom of the b-tree implementation. Certainly b=16 is way too big: this'll create 12 words of wasted space 30% of the time! Decided to pack the bits even harder so that the b-tree overhead is one word per node rather than two. This makes the best-case wastage drop to (b+1)/(b-1), and makes b=8 come out at 1.28 words per node where n>>b (as opposed to 1.42, previously).

Ok, after thinking about this a bit more, I thought I'd go off and write a little program that would generate utilization data (which is the reciprocal of overhead, I guess). I came up with the chart to the right. Utilization Comparison The blue line is nsVoidArray, as it stands today. The pink line is nsVoidArray implemented to increase exponentially by a power of 1.333 after 16 elements have been inserted. The yellow line is nsVoidBTree with b=8. The X-axis is the number of elements in the array, and is logarithmic. The y-axis is memory utilization.

It's pretty clear that exponential expansion of nsVoidArray does better in terms of memory utilization until we get up past about 1000 elements. At which point, nsVoidBTree really only starts to marginally beat the worst case! It's pretty darn horrible at utilization down in the 10 to 100 node range: it only really gets about 50% utilization on average.

I haven't been able to quantify the time required to do memcpy() during the growth of the array, but considering that the case is "so rare" (well, it's rare if you look at the top 100 sites, anyway), that I'm hard pressed to say we need a data structure that can accomodate it.

The only argument left for doing nsVoidBTree is the random insertion and removal argument. And this really only holds water for the (I would claim) extremely rare case that you'd be doing DOM manipulation on a very large content model.

Or does it? the chart to the left shows how many words are actually wasted by nsVoidArray with exponential growth compared to nsVoidBTree. Wasted Words This shows that, even though the utilization ratio may not be good, the number of words actually burned is relatively small. It's a log-log graph, with element count along the x-axis, and words wasted along the y-axis. Between 10 and 100 nodes, nsVoidBTree is pretty consistently burning about twenty words of space; after that, things start to sync up with the nsVoidArray.

When plotted this way, it's easy to see how nsVoidBTree levels off at a consistent burn rate. nsVoidArray does too, but with a much wider "spread". And more interesting, if you were to plot the mean burn of nsVoidArray, it would be lower than nsVoidBTree by a pretty significant amount.

Anyway, I've send a message to people asking for some feedback on this stuff. I think that the middle insertion and removal case would be good for XUL (mostly because of mailnews), although I'm hard pressed to see it being really all that useful in vanilla HTML, it sort of seems like a wash in the other areas of comparison, so you might as well us nsVoidBTree and get the cheap insertion.

Tuesday, June 13, 2000

Got changes to nsVoidBTree and nsVoidArray checked in late last night. nsVoidArray now doubles in size after reaching 16 elements.

Proposed breaking up librdf.so to factor XUL stuff into a "transitional" DLL that would live over in the layout source tree. Want to get some buy-in from everyone who'd be affected, so I'm gonna wait a day or two for the idea to sink in.

While I was sitting in the bloat meeting today, I discovered why <pre>-formatted text containing <span>'s was ending up with such long nsTextRun's. It's treating the entire block of text as a single, long text run; i.e., it's not splitting it into lines, like I'd have thought it would.

Wednesday, June 14, 2000

Spent the day with Brad Quinn from Gateway going over the leak tools n' stuff.

Monday, June 19, 2000

Got buster's block regression tests checked in. Checked in a fix for the stack overflow you'd get loading an XML file that contained XUL. Got rid of a bogus property in the TREE_PROPERTY_HACK code in nsXULTemplateBuilder. Figured out what was wrong with deleting from an inline-block and sent the patch off to nisheeth for review.

Tuesday, June 20, 2000

Got fixes checked in for the 39211, which also fixes a couple of other bugs. I think that's about as much as I'm gonna do for inline blocks right now!

Spent the rest of the day profiling the test case from 39133, which is a gargantuan table. I put the obvious stuff into the bug report, but have a way to go yet. Here's what I've uncovered so far:

- 3% of the time spent in nsHTMLTags::LookupTag(). It's all spent
  grovelling through the AVL tree. This should probably be a
  hashtable. I guess 4.x computed a perfect hash?

- CNavDTD::WillHandleStartTag() is using NS_ConvertToString() where it
  should be using NS_LITERAL_STRING() while notifying the observer
  service. Altogether, inflating the string and notifying the observer
  is accounting for 1.77% of the time. Went ahead and changed this:
  although we'll still have to pay for the observer service, we won't
  need to inflate.

- nsBlockFrame::Reflow() accounts for 30% of the time. Of that:

  - 5% is spent in nsLineLayout::AddText(). 70% of that time is spent
    in ::operator new() creating nsTextRun objects. We're down in the
    noise, but it may make sense to recycle these or something.

  - 9% is spent nsBlockFrame::SlideLine() placing frame views. This is
    dominated by nsContainerFrame::PositionChildViews(). It's a recursive
    function, dominated by the cost of the recursion itself and
    the cost of addref-ing and release-ing atoms. Atoms are expensive
    to addref and release because it appears that they're assumed to
    be threadsafe. Argh! Sheesh, why even use atoms for this? They're
    tokens! Use enums!

Need to keep poking around reflow to see what's going on there. Also need to look into frame construction and content construction. Joy!

Wednesday, June 21, 2000

Finished writing peer reviews. Ran a leak report and filed some bugs on evaughan to fix the temple and obelisk baloney.

Thursday, June 22, 2000

Mostly worked on leaks today. Figured out why the nsIOService was leaking: re-entering the service manager in an evil way. Spent a couple of hours with warren trying to figure out why the FTP protocol was leaking an nsSocketTransport and some nsPipe objects, but to no avail.

Friday, June 23, 2000

Boioioing. Turns out the nsIOService leak was from ruslan's attempt to avoid assertions from the string bundle. So I promptly busted his code. Oh well, I didn't like his solution anyway! (Seriously though, I think warren and rpotts have a whole grand scheme figured out for string mapping anyway.)

Shepherded a couple more leak fixes from Inaky Gonzales.

Started looking at av's plugin bugs. I thought that I was finished with these block-in-inline bugs, but it turns out I really did miss one where the <object> can't be loaded and we've got a block-in-inline frame replacement happening. This one could be nasty to fix, because it's not clear that I can just punt and wipe the containing block. I guess I really have to understand what's going on now.

Damn. So part of the problem here is that when we do the frame replacement, we all of a sudden make the inline that was previously a "normal" inline into a "special" block-in-inline beast. So to get this right, I need to pull in pieces from WipeContainingBlock() and ConstructInline(). Woof.

Tuesday, June 27, 2000

Raptor meeting. Talked with editor folks afterwards about retaining whitespace in tables; harishd is going to flip the switch to make that happen. Talked with mjudge and karnaze about performance of ender-lite inside tables; karnaze is going to profile. warren's meeting, footprint meeting.

Spent the rest of the evening tinkering with leaks with dbaron and looking at (what I believe to be) the last block-in-inline bug. It's a nasty one, because it happens when we're replacing the an <object> tag's frame with the "default" content. I've been playing around with splitting the frame tree back up to the nearest containing block without much luck.

Thursday, June 29, 2000

Finally got the CantRenderReplacedElement() hackery to work when the inline is suddenly changed into a block on Wednesday. And then a tree fell on my car and knocked out the power! Doh!

Tinkered with problem with a plugin's JavaScript description not being available on Linux, and uncovered a real rat's nest of evil code in the plugin initialization stuff. I guess this is what av meant when he said, "we haven't really figured out how to initialize new-style plugins yet." Anyway, what should have been going through the ns4xPlugin object has just been hard-coded into this abomination that derives from nsFileSpec. I'm not sure if we want to spend the time to untangle it (who's writing new-style plugins, anyway?) or if we should really fix it.

Friday, June 30, 2000

Tinkered with maubi.net to get my machine back on line again. Approved some minor bug fixes from BenB and Andreas.