Saturday, July 1, 2000

Started looking over some of the block and line reflow code last night after thinking about "how one flattens implicit recursion, in general". Wow. Now I understand why Kipp didn't want to go and do this. And, in looking at the code, I started to wonder if maybe some of the bugs might be better served by really digging in and seeing what the hell is going wrong with block and line layout. For example, the bug dealing with the long parts list was a ton of nested <ul> tags: maybe it's running into the problem with list numbering?

Anyway, I had a few ideas with respect to how to proceed with the interruptable reflow stuff that seemed cool before I started really reading the code. I'll write 'em down, because maybe they'll turn out to make a bit of sense.

So, while this all sounds fine an nice, it turns out that block and line frame layout is recursive in a pretty indirect way; we wind our way through about three or four different routines before we end up back in nsBlockFrame::Reflow(). And at each level, there is so much pre- and post-recursion cruft going on, I can see that this may be very difficult to do interruptably.

Which brings us back to Kipp's (and Troy's) assertion that we just need to make sure that we reflow frequently enough to avoid getting into a state where we're reflowing the entire world. By which is meant, tinker with the content sink notificiations! Now, about two or three months ago, an assertion was made that we reflow too much, and Nisheeth went off and fixed this by coalescing reflows together. I wonder if the real win here is to avoid over coalescing reflows.

* * *

Okay, so this SplitToContainingBlock() stuff is kicking my ass. Here is what I think needs to happen.

  1. In the initial state, you've got a block frame B that contains a bunch of inline frames eventually winding down to the <object> frame o

    A-->B-->C
        |
        a-->1-->a'
            |
            b-->2-->b'
                |
                o
    
  2. Now the object frame o gets split into the left inlines (o), the block frames (O), and the right inlines (o').

            o O o'
    
  3. We call SplitToContainingBlock(), which will split 2 as follows.

      b-->2   2*  2'-->b'
          |   |   |
          o   O   o'
    

    Note that 2 gets split into 2* and 2', which correspond to the anonymous block and inline frames, respectively.

  4. We recurse again to split 1.

      a-->1      1*  1'-->a'
          |      |   |
          b-->2  2*  2'-->b'
              |  |   |
              o  O   o'
    
  5. When we hit B, the recursion terminates and the frame model is fat and happy.

    A-->B-->C
        |
        a-->1------>1*-->1'-->a'      
            |       |    |
            b-->2   2*   2'-->b'
                |   |    |
                o   O    o'
    

    Since B is a block, it is allowed to contain both block and inline frames, so we can let 1* and 1' be "real siblings" of 1.

So this all seems great, but there's a problem. A bad problem.

As we recurse up, I need to make it so that the "left inline frame" (e.g., 2) gives its next sibling (e.g., b' to the "right inline frame" (e.g., 2'). Unfortunately, the line boxes all assume that the number of frames in the frame list is correct, and when I twiddle with the frames directly like this, it totally screws that count. And makes us crash while reflowing.

Furthermore, there are no "noisy" frame manipulation methods that would let me temporarily remove a frame from the frame list and wedge it in someplace else: RemoveFrame() nukes the frame that you want to remove!

So here's the way I'm going to try to solve the problem. I'll "defer" the transfer of the left inline frame's successors to the right inline frame by "one level" in the recursion. This will mean that when the recursion terminates at the block frame, all that will happen is that the block frame and the right inline frame will be inserted into the containing block. The "left inline frame" won't have prematurely had its successors nuked, so it's line box will still be intact. This certainly makes the logic more squirrely. But hopefully it will work. (The big question is, how much of the nested frame's line box state will be screwed if I do this...)

So a more realistic illustration of what happens would be:

  1. In the initial state, you've got a block frame B that contains a bunch of inline frames eventually winding down to the <object> frame o

    A-->B-->C
        |
        a-->1-->a'
            |
            b-->2-->b'
                |
                o
    
  2. Now the object frame o gets split into the left inlines (o), the block frames (O), and the right inlines (o').

            o O o'
    
  3. We call SplitToContainingBlock(), which will split 2 as follows.

            .--------.
           /          \
      b-->2   2*  2'   b'
          |   |   |
          o   O   o'
    

    Note that 2 gets split into 2* and 2', which correspond to the anonymous block and inline frames, respectively. Furthermore, note that 2 still refers to b' as its next sibling, and that 2' does not.

  4. We recurse again to split 1. At this point, we'll break the link between 2 and b', and make b' be the next sibling of 2'.

            .-----------.
           /             \
      a-->1      1*  1'   a'
          |      |   |
          b-->2  2*  2'-->b'
              |  |   |
              o  O   o'
    

    As before, 1 retains a' as its next sibling.

  5. When we hit B, the recursion terminates. We now insert 1* and 1' immediately after 1, resulting in the following frame model. This is done using the "normal" frame insertion mechanism, nsIFrame::InsertFrames(), which properly recomputes the line boxes.

    A-->B-->C
        |
        a-->1------>1*-->1'-->a'      
            |       |    |
            b-->2   2*   2'-->b'
                |   |    |
                o   O    o'
    

    Since B is a block, it is allowed to contain both block and inline frames, so we can let 1* and 1' be "real siblings" of 1.

With this logic, everything seems to work out pretty much right. (I needed to juggle some of the code that was setting the "special sibling" stuff, and had to make sure that everything got reparented properly.)

* * *

Heh. Dammit! I tried running my test with viewer set to a really narrow window, and everything went to hell again. Grr. I think that there are problems if the inline spans multiple line boxes. Well, I guess that gives me something to poke at tomorrow.

Sunday, July 2, 2000

I managed to convince myself late last night that it should be okay that the only frame we do "noisy" manipulation on is the containing block. All the nested frames contained by the block should be inline frames, so the only line boxes we need to worry about will be in the containing block.

I created a test case to experiment with wrapped inlines. It looks what I'm botching are continuation frames. Instead of splitting the first-in-flow, I'm trying to break a continuation frame. Can I just walk to the first-in-flow, and split it? Here's trying...

Well, it didn't work: it inserted the block frames directly after the first inline, and then left the continuation frames fat and happy afterwards:

before before before before before before before before

block block block block block block
block block block block block block
block block block block block block

before before before before before before before before
before before before after after after after after
after after after after after after after after after

So now what I'm going to do is not walk back to the first-in-flow, but try to split the continuation frame itself (like I was doing originally). Bug this time, I'll break the next-in-flow and prev-in-flow pointers. Whee!

Woof. That seemed to do it.

* * *

So I'm trying my hand at figuring out where the time goes on Linux loading a long LXR page. Thought I'd write down what I find so I can bug people about it later.

  1. The parser's "topic" stuff does string compares to determine whether a topic matches. These seem like they should be atoms.

  2. I found nisheeth and vidur's "magic prefs" that control how the content sink notifies the back end.

    I tried tweaking these a bit; specifically, setting the "back-off count" (which should probably be called "turn-off count") to a really large number, and cranking down on the notifications to every 100ms. I managed to really bring this thing to a crawl!

  3. From profiling on Win32, I recall that nsLineLayout::FindNextText() is a real dog. It does a linear walk back to the first-in-flow, and then iterates linealy through the text runs to find a frame.

  4. It's particularly disturbing that Linux is so much less responsive that Win32 while doing a load. If I recall correctly, Win32 would actually scroll fairly decently while loading an LXR page (until it slammed against the back-off count pouring through several thousand lines of text). Why is Linux so crappy?

* * *

So speaking of nsTextRun dying on long inline frames, I went ahead and re-wrote it to use PLHashTable in a righteous way so that access time to individual frames is O(1). (I stuffed the frame pointer in as the key, and use the value field to hold the next frame in the run.) It would be even more righteous if I had the double-hash table, because then I'd waste a lot less space. But, it hasn't made an appearance in NSPR yet, so -- oh well.

Wednesday, July 5, 2000

Talked to mjudge about the incremental reflow problems in block layout and promised to look at them. This is killing EnderLite so bad that he might have to back stuff out.

Went through fix for bug 22037 with Nisheeth; all looked good, but I need to still check what happens for absolutely positioned frames.

Spent some time with dbaron, warren, and ruslan looking at the service manager assertions and concluded that there are some problems with the threading model. Also discussed the necko string bundle hackery and discussed a couple of alternatives to fix, including just doing away with the string bundle stuff altogether (which is what warren wants to do anyway, I guess).

I need to take a look at the bug with images with percentage widths in table cells. I don't understand the code around there enough to make a good decision on what to do.

Re-wrote my patch for the {ib} performance bug where nsLineLayout::FindNextText() was sucking down time. I changed it to use brendan's new JSDHashTable stuff. It speeds up a long LXR document by about 10%; that is, 6s out of 58s. I'm pretty depressed that it wasn't better.

Took a look at a couple of bugs that dealt with building deeply nested content models because of 4.x-class content that doesn't close <font> tags; specifically, one that dealt with crashing, and another that's really just a performance bug in disguise. I almost threw up when I read some of the bogo-computer science commentary in the former. Especially when you look at 4.x, which load the page in under a second. I gave up after three minutes in Mozilla. We're doomed.

Thursday, July 6, 2000

Pinkerton helped me figure out why tooltips weren't working (I am an idiot: I'd left off the "hook" in the main file that would let the tooltip stuff overlay properly). So now I should be able to make progress on the tooltip bug that's assigned to me.

Communal hand wringing with warren, then the whole architecture group, over the sorry state of our application. Vidur says, "just do it"; dougt says, "ship and be happy". I think those are the right attitudes to take at this point.

Talked to brendan for a while about interruptable reflow. We rediscovered the idea that troy (and rickg) had been talking about, which was to use the "reflow command" architecture to manage interruptability. Specifically, it might be possible to stop reflow once you've reached an invariant, and then enqueue a reflow command to restart the reflow where you'd left off.

Talked with dougt for a bit about event queues and why the Unix build is so unresponsive when the event queue is flooded. I don't understand that code well enough to have really taken away much from our conversation. Need to go read it, I suppose.

Friday, July 7, 2000

Taking a look at the incremental reflow problem that mjudge was talking about. nsIReflowCommand::BuildPath()

Last night, brendan and I were talking about using the reflow commands to do interruptable reflow. I see a member variable mChildFrame that seems like it could be used to maintain the partial state...

Spent some time looking at the tooltip bug, but unfortunately, it looks like it can't be reproduced on Linux. At least, it can't be reproduced on Linux by me. I'll reboot in Win32 one of these days and try to reproduce it there.

Fix the bookmark URL updating bug: it turned out that I'd failed to re-implement the OnChange() logic when I re-wrote all the template builder stuff a couple of months ago. Oops! Anyway, the fix was easy.

Spent a bit of time trying to get jprof to run, but to no avail. Need to figure out how to run profiles with that thing...

Monday, July 10, 2000

Meeting to go over the "plan for performance" for nsbeta3. I'm supposed to go off and figure out "where we are" for embedding. I figure I'll run viewer, go through the top 100 sites and see how much memory we use, try a couple of degenerate sites, and report back.

Looking at the <embed hidden="true"> bug. This is not going to be all that easy. It appears to be the case that there are plenty of platform-specific hacks floating around. I went back through the Classic codebase and here's what I've found:

I think that's as far as I need to care about: FE_CreateEmbedWindow() is only be called if the LO_ELE_HIDDEN attribute is not set, and the LO_ELE_HIDDEN attribute is only set in the case of an <applet> or <embed> tag. In other words, if I encounter an <applet> or <embed> tag with a hidden="true" attribute (well, see this code for the exact logic: looks like hidden set to "no", "false", or "off"), then we'll simply punt on creating the plugin window.

* * *

Well, I guess things are never that easy. I started poking around nsObjectFrame.cpp and was quickly disheartened. I also found this gem, which seems to indicate that on Mac, we always need to call SetWindow() on the plugin.

Tuesday, July 11, 2000

Did some profiling on a page with a ton of Ender widgets: discovered that the bulk of the time was going to create native widgets for each entry field. Duly noted in the bug.

Wednesday, July 12, 2000

Last night, I went and ripped out nsTextRun. I spent the better part of the day figuring out how to make it work. Turns out that there was a bunch of code that troy put in to lazily fixup an overflow frame's parent pointer when constructing new line boxes. Since the frame pointers hadn't been fixed up by the time nsLineLayout::FindNextText() was being called, I was trying to walk this crazy frame tree.

Troy put this code in to fix bug 5588, but using the ultra-scientific "one mississippi, two mississippi" method of profiling, I couldn't see much of a difference between an optimized build with his changes, and one without. Sent him a message to see if he can remember why he did this. (Since it was back in March, he should hopefully be able to reconstruct his thinking...)

Talked to warren for a bit about optimizing nsXULTemplateBuilder::CheckTemplate(). We eventually decided that the best way to do it would be to add HasArcOut() and HasArcIn() methods. We'll do the "end around" interface later, to make bookmarks and drag & drop faster.

Thursday, July 13, 2000

Some prose I wrote to rationalize nsTextRun-ectomy. Might need it to fight PDT...

nsTextRun is a datastructure that stores the frame pointers to all of
the primary text frames in a block. It is an implementation detail of
nsLineLayout::FindNextText(), which is called by the line-breaking
code to find the next text frame in the block while measuring the text
to fit in a line box.

The problem is that we are spending an inordinate amount of time
keeping the nsTextRun data structure up-to-date. We frequently need to
update it during incremental reflow, and the datastructure can grow to
be extremely large on long text documents. Furthermore,
FindNextText()'s implementation ends up degenerating to a linear crawl
through the array!

I propose (in bug 19051) that we eliminate this altogether, and
replace nsLineLayout::FindNextText()'s implementation with one that
directly crawl's a block's frame hierarchy. There are patches attached
to the bug that do just this. This changes FindNextText() from an O(n)
operation (where n is the number of text frames in the block) to a
worst-case O(log n) operation, with the base of the logarithm being
the fan-out in the block's frame subtree. Typically, it so close to
O(1) as to be indistinguishable. Furthermore, it eliminates all the
code that was used to maintain the data structure.

Did a bit of performance analysis, comparing the new code and the old code. Sent some email back and forth with Troy. I'm seeing about a 10-20% slowdown in the new code on documents that end up pushing to overflow lines, so after I verify that my eager parent-setting is the cause, I'm going to see if I can't get both lazy parent setting and tree grovelling.

Friday, July 14, 2000

Got the lazy parent setting stuff working last night: it was pretty simple after all. I just needed to ensure that as we traversed "across" frames, we made sure to set any frames further along in the child list to have the same parent as those to their left. Protects the assumption that you enter nsLineLayout::FindNextText() with a node that has proper heritage.

Went round and round a couple times with edburns on a bug that is supposed to use the alt attribute on an applet to display the text if there is no alternate content between the applet tags. Turns out that it doesn't really fix the bug: there are deeper problems in the object frame that cause the "plugin downloader plugin" to display.

Spent some time banging around with may patch to fix the <embed hidden="true"> bug. Found out that I was being an idiot, and that plugins are working on Win32. Just some JavaScript sniffing stuff appears to be broken. I'll try to wrap these bugs up this weekend.

Friday, July 21, 2000

For bug 31933,

Monday, July 24, 2000

Spent most of last week getting plug-in stuff wrapped up for nsbeta2. Oh, my head. Applying dbaron's GetLinkState() changes to my tree. Tried to do it with scc's L-prefix turned on for NS_LITERAL_STRING(), but my build barfed all over. I'll leave fixing that to him: looks like nsString and friends need some love.

Tuesday, July 25, 2000

Spent the day looking at bug 46013, which had to do with not being able to use any of the chrome registry the first time you ever start the browser. Ends up this is caused by RDF needing to initialize the layout DLL to create a namespace manager. I started by moving the namespace manager code into the parser, but ended up deciding that the way to go was to do one-off namespace management. Turns out that the namepsace manager APIs are a bad match for the way RDF uses them anyway.

Wednesday, July 26, 2000

Profiled "new window creation" after applying patches for a bunch of the content baloney that hyatt and I worked out on Monday. We spend half of the time (!) calling OLE's RegisterDragDrop() method six times. Oof. Not sure why we need to do it six times (or even at all). Of course this may be a "lie", especially if the funtion's wall clock time was being measured instead of its actual CPU time. Pinged pink about it; looks like rods wrote most of this code. I'll bug him about it...

Wonderful orgy of tree dumping as nsbeta2 goes off on a branch today. Ahh. That felt good.

Thursday, July 27, 2000

Bug 46013, redux. This bug got re-opened today :-(. Turns out that there was also a refcounting problem that was causing this bug to manifest itself in a different and wonderful way. Changed the composite datasource from using an nsISupportsArray to a nsVoidArray so that it wouldn't addref & release while notifying. Cleaned up refcounting and ownership over in the nsChromeUIDataSource.

My nsTextRun evisceration caused an editor regression because I'd failed to bring over the logic that broke a text run at a non-inline frame. I put back a method on nsIFrame called CanContinueTextRun(), which asks a frame if a text run can be continued through it. Everyone except text, first-letter, and inline frames answer "no".

Friday, July 28, 2000

Moved nsFixedSizeAllocator out of RDF and into xpcom/ds for use by harishd, alecf, and whoever else feels the urge.

Bug triage fun with jud, warren, and vidur.

Saturday, July 29, 2000

Looked into a problem with the XUL template builder where calling rebuild() twice causes a crash. Turns out that we clobber the conflict set, leaving the content support map with dangling pointers into it. I'm not sure why mail/news doesn't crash all over when switching folders. I put together a patch, but am not really happy with it (for a couple of reasons). First, I'd like to understand better why mail/news doesn't crash. Second, I'd like for the code that just changes the ref attribute on the root to also dump the conflict set. Third, I think that it may make sense to move the content support map into the conflict set: I'm skeptical that these really need to be separate objects.

Monday, July 31, 2000

Spent the evening scratching my head over this problem with the template builder. I've pretty much convinced myself that the ContentSupportMap is not necessary, and have gone so far as to yank it out completely (along with commenting a good porition of the code, which I had a hard time remembering...)

Anyway, as far as I can tell, the only reason that I'd made the ContentSupportMap was so that I could pick up where I'd left off in CreateTemplateContents(). Specifically, I needed to record the Match for which we'd left off building content.

The thing is, we can get back to this match using the ClusterKey, constructed from the member and container variables (which we can fetch by walking up the content model), and then selecting the Match with the highest priority from that set of matches. Seems to work, doubt it's much slower or faster, but it uses less memory. Only problem is that the friggin' bookmarks window (not the sidebar panel; just the window) doesn't display! So I need to debug that. I'm also not convinced that it's pulling all the right rules out. Need to regression test a bit more carefully.