Saturday, December 18, 2010

Improved Hardware Acceleration in Fennec

On Thursday night, after the all-hands party, Matt Woodrow landed a beautiful refactoring of our texture upload code. This should give a noticeable improvement in scrolling performance when accelerated layers are enabled, and it hopefully fixes some of the problems people were seeing there. It also improves texture upload performance on OS X.

Unfortunately, there are still two bugs that are keeping us from enabling accelerated layers by default:

  • Bug 619615 - Hangs on Nexus One

  • Bug 619539 - Startup crashes on Droid

Any help debugging these problems would be greatly appreciated.

Wednesday, December 15, 2010

Hardware Acceleration on Fennec

It's now possible with current nightlies to use OpenGL for compositing in Fennec on Android. To turn it on, go to about:config, set "layers.accelerate-all" to "true", and restart. If it's working, the Graphics section of about:support will say "1/1 OpenGL".

It would be great if people could test it and let me know how it goes.

Monday, November 8, 2010

Dealing with mach_kernel in Shark

Sometimes when profiling, a bunch of time ends up in mach_kernel. Figuring out why isn't always easy, but here are two tips that should help a bit:

  • You can get better symbols for mach_kernel by downloading a KernelDebugKit.
    This can help a bit when trying to figure out what's happening in the kernel. For example, _dtrace_get_cpu_int_stack_t becomes _mach_call_munger.

  • Shark has a System Trace profiling mode. This can show you what code is causing the kernel to do work. It can break down time by system call or by VM fault, which should account for most things.

    While trying this out, I noticed we were spending a fair amount of time in ChildViewMouseTracker::WindowForEvent(NSEvent*). This gave me an idea about why Firefox causes the WindowServer process to start using a huge amount of CPU: we tell the WindowServer to give us all of the mouse events instead of only the ones targeted at our window. Presumably this causes the WindowServer to build up a very large queue of events when the Firefox process is stopped, and thus to use lots of CPU. This turns out to be the case. nsToolkit::RegisterForAllProcessMouseEvents causes us to listen to all mouse events, and disabling the code there fixes the problem. Bug 611068 tracks the problem.

Friday, May 28, 2010

Reviewing in vim

Bugzilla's review interface is poor. I find a mild improvement is possible by copying the review text into an editor and reviewing it there. One of the things that makes this experience better is syntax highlighting. Here's a modification of vim's diff highlighting script that works with quoted patches. Adding the following to one's .vimrc will get it used for .review files:
au BufNewFile,BufRead *.review setf review

Friday, January 22, 2010

Reproducing bugs on complicated webpages

A while back, I was running into hangs on Facebook with the HTML5 parser. I wanted to reproduce the problem locally and try to minimize it. However, Facebook is pretty complicated and loads content using Ajax, so the traditional approach of saving the page didn't work. I needed a different approach, so I hacked up an HTTP proxy that can record and replay server interactions. It works by writing out the result of each GET request to a file named after the Request-URI. After you're done recording an interaction with the server, you can switch to replay mode, and instead of proxying, the proxy plays the recording back to the browser.

I should mention a couple of limitations to this approach. First, it assumes that the resources at particular URLs don't change over the recording session. Second, it assumes that the resources requested depend only on the data recorded and not on the time or anything like that. Even with these limitations, the proxy seems to work well enough.
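The record/replay idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not the code from the repository: the function names (path_for, fetch_from_network, handle_get) are made up for the example, and percent-encoding the Request-URI to get a safe file name is an assumption about how "named after the Request-URI" could be done.

```python
import os
import urllib.parse
import urllib.request


def path_for(record_dir, request_uri):
    """Map a Request-URI to a file inside record_dir.

    Percent-encoding the whole URI is an assumption made for this sketch;
    it keeps '/' and '?' out of the file name.
    """
    return os.path.join(record_dir, urllib.parse.quote(request_uri, safe=""))


def fetch_from_network(request_uri):
    """Default upstream fetch, only used while recording."""
    with urllib.request.urlopen(request_uri) as response:
        return response.read()


def handle_get(request_uri, record_dir, replay, fetch=fetch_from_network):
    """Return the body for a GET request.

    In record mode the body is fetched upstream and written to disk.
    In replay mode the network is never touched and the recorded file
    is returned instead.
    """
    path = path_for(record_dir, request_uri)
    if replay:
        with open(path, "rb") as recorded:
            return recorded.read()
    body = fetch(request_uri)
    with open(path, "wb") as recorded:
        recorded.write(body)
    return body
```

Because the file name depends only on the Request-URI, a second response for the same URI overwrites the first, which is exactly the first limitation mentioned above. A very long URI could also exceed the file system's name length limit; hashing the URI would avoid that at the cost of readable file names.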

It's currently pretty hacky, but you can grab it from git:// There's a README in the repository explaining how to run and replay. I'd love to hear if this is of any use to anyone else or if you have patches for improving it.

Graphics performance in Firefox 3.6

One of the performance improvements included in Firefox 3.6 is a new path rasterizer for use on Windows. This new rasterizer improves vector graphics performance substantially.

The previous rasterizer was designed for an XRender trapezoid model of rasterization. In this model, to fill a polygon, we first tessellate it into a collection of trapezoids. Next, each of these trapezoids is rasterized and the result is added to a mask image. Finally, this mask is used to composite the filled polygon. This design can work well if we rasterize multiple trapezoids in parallel, as could be done on a GPU. However, when we're using the CPU to sequentially rasterize and blend each trapezoid, it's not the most efficient approach.

Scanline rasterization is the textbook method for filling polygons, and when using a CPU it can be more efficient than tessellating. Instead of breaking the polygon into trapezoids, we iterate over each scanline of the mask image. For each scanline, we iterate over the edges that it intersects and fill the pixels in between the edges to produce the mask image.
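To make the scanline idea concrete, here is a minimal Python sketch of the textbook algorithm. It is not cairo's rasterizer: it samples each scanline once at the pixel centre, uses the even-odd fill rule, and writes a binary mask, whereas a real rasterizer computes antialiased coverage. The function name scanline_fill and the list-of-lists mask are choices made for this example.

```python
import math


def scanline_fill(polygon, width, height):
    """Fill a polygon into a width x height mask of 0s and 1s.

    polygon is a list of (x, y) vertices; the last vertex connects back
    to the first. Pixels are filled using the even-odd rule.
    """
    mask = [[0] * width for _ in range(height)]
    count = len(polygon)
    for y in range(height):
        sample_y = y + 0.5  # sample the scanline at the pixel centre
        crossings = []
        for i in range(count):
            x0, y0 = polygon[i]
            x1, y1 = polygon[(i + 1) % count]
            if y0 == y1:
                continue  # a horizontal edge never crosses a scanline
            # Half-open test so a vertex shared by two edges counts once.
            if min(y0, y1) <= sample_y < max(y0, y1):
                t = (sample_y - y0) / (y1 - y0)
                crossings.append(x0 + t * (x1 - x0))
        crossings.sort()
        # Consecutive pairs of crossings bound the spans inside the polygon.
        for left, right in zip(crossings[0::2], crossings[1::2]):
            # Fill the pixels whose centre lies in [left, right).
            start = max(0, math.ceil(left - 0.5))
            end = min(width, math.ceil(right - 0.5))
            for x in range(start, end):
                mask[y][x] = 1
    return mask
```

Each pixel of the mask is written at most once per span, and no intermediate trapezoids are built, which is the property that makes this approach a better fit for sequential CPU rasterization.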

M Joonas Pihlaja contributed a scanline rasterizer to cairo as part of a Google Summer of Code project and it's included in Firefox 3.6. This new rasterizer makes a pretty significant difference when filling complex paths. For example, this test draws a spinning map of the world using canvas. In Firefox 3.5, I get about 6-7fps. Using Firefox 3.6, it's nearly 4x faster with about 19-24fps.