<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1386948037384435441</id><updated>2012-01-24T22:40:35.620-08:00</updated><title type='text'>Jeff Muizelaar</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>19</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-1607666791795567725</id><published>2011-10-19T13:17:00.000-07:00</published><updated>2011-10-19T13:17:11.717-07:00</updated><title type='text'>Moving patches between git and hg</title><content type='html'>Moving patches between git and hg is currently not very easy. I found a script that converts in one direction and added a script that goes in the other direction. 
The scripts are available here: &lt;a href="https://github.com/jrmuizel/patch-converter"&gt;https://github.com/jrmuizel/patch-converter&lt;/a&gt;. Hopefully, this will make it a bit easier.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-1607666791795567725?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/1607666791795567725/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=1607666791795567725' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/1607666791795567725'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/1607666791795567725'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/10/moving-patches-between-git-and-hg.html' title='Moving patches between git and hg'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-1221486019482053995</id><published>2011-06-16T14:04:00.000-07:00</published><updated>2011-06-16T14:54:29.238-07:00</updated><title type='text'>WebGL considered harmful?</title><content type='html'>Today Microsoft posted an article titled "WebGL considered harmful". It seems like a lot of their arguments against WebGL also apply to Silverlight 5's XNA 3D graphics support. It, like WebGL, allows authors to write shaders using HLSL. 
I wonder, if you reframe their article by replacing WebGL with Silverlight 5, is anything untrue? If so, how does Microsoft solve these problems?&lt;blockquote&gt;&lt;h2&gt;Silverlight XNA 3D considered harmful&lt;/h2&gt;Microsoft's Silverlight 5 XNA 3D technology is a low-level 3D graphics API for the web.&lt;br /&gt;&lt;br /&gt;One of the functions of MSRC Engineering is to analyze various technologies in order to understand how they can potentially affect Microsoft products and customers. As part of this charter, we recently took a look at XNA 3D. Our analysis has led us to conclude that Microsoft products supporting XNA 3D would have difficulty passing Microsoft’s &lt;a href="http://www.microsoft.com/security/sdl/default.aspx"&gt;Security Development Lifecycle&lt;/a&gt; requirements. Some key concerns include:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Browser support for Silverlight 5 directly exposes hardware functionality to the web in a way that we consider to be overly permissive&lt;/span&gt;&lt;br /&gt;The security of Silverlight 5 as a whole depends on lower levels of the system, including OEM drivers, upholding security guarantees they never really need to worry about before. Attacks that may have previously resulted only in local elevation of privilege may now result in remote compromise. While it may be possible to mitigate these risks to some extent, the large attack surface exposed by Silverlight 5 remains a concern. We expect to see bugs that exist only on certain platforms or with certain video cards, potentially facilitating targeted attacks.&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;   &lt;li&gt;    &lt;span style="font-weight: bold;"&gt;Browser support for Silverlight 5 security servicing responsibility relies too heavily on third parties to secure the web experience&lt;/span&gt;&lt;br /&gt;As Silverlight 5 vulnerabilities are uncovered, they will not always manifest in the Silverlight 5 API itself. 
The problems may exist in the various OEM and system components delivered by IHV’s. While it has been suggested that Silverlight 5 implementations may block the use of affected hardware configurations, this strategy does not seem to have been successfully put into use to address existing vulnerabilities. It is our belief that as configurations are blocked, increasing levels of customer disruption may occur. Without an efficient security servicing model for video card drivers (eg: Windows Update), users may either choose to override the protection in order to use Silverlight 5 on their hardware, or remain insecure if a vulnerable configuration is not properly disabled. Users are not accustomed to ensuring they are up-to-date on the latest graphics card drivers, as would be required for them to have a secure web experience. In some cases where OEM graphics products are included with PCs, retail drivers are blocked from installing. OEMs often only update their drivers once per year, a reality that is just not compatible with the needs of a security update process.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Problematic system DoS scenarios&lt;br /&gt;&lt;/span&gt; Modern operating systems and graphics infrastructure were never designed to fully defend against attacker-supplied shaders and geometry. Although mitigations such as Direct3D 10 may help, they have not proven themselves capable of comprehensively addressing the DoS threat. While traditionally client-side DoS is not a high severity threat, if this problem is not addressed holistically it will be possible for any web site to freeze or reboot systems at will. This is an issue for some important usage scenarios such as in critical infrastructure.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;We believe that Silverlight 5 will likely become an ongoing source of hard-to-fix vulnerabilities. 
In its current form, XNA 3D in Silverlight 5 is not a technology Microsoft can endorse from a security perspective.&lt;br /&gt;&lt;br /&gt;We recognize the need to provide solutions in this space however it is our goal that all such solutions are secure by design, secure by default, and secure in deployment.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;The problems Microsoft is worried about are real, and they don't have any easy solutions. At the same time, I don't think we need to wait for perfect answers before trying. With Silverlight 5's 3D support, it looks like Microsoft feels the same way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-1221486019482053995?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/1221486019482053995/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=1221486019482053995' title='10 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/1221486019482053995'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/1221486019482053995'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/06/webgl-considered-harmful.html' title='WebGL considered harmful?'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>10</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-721624392910701759</id><published>2011-04-20T07:26:00.000-07:00</published><updated>2011-05-24T18:20:55.019-07:00</updated><title type='text'>WebP</title><content type='html'>&lt;p&gt;Overall the reception to &lt;a href="http://code.google.com/speed/webp/faq.html#whatis"&gt;WebP&lt;/a&gt; that I've seen so far has been pretty negative.  Jason Garrett-Glaser wrote a &lt;a href="http://x264dev.multimedia.cx/archives/541"&gt;popular review&lt;/a&gt;, but there have been similar responses from others like &lt;a href="http://cbloomrants.blogspot.com/2010/10/10-02-10-webp.html"&gt;Charles  Bloom&lt;/a&gt;. Since these reviews, the WebP encoder has improved on the example used by Jason (&lt;a href="http://x264.nl/developers/Dark_Shikari/imagecoding/vp8.png"&gt;old&lt;/a&gt; vs. &lt;a href="http://people.mozilla.org/%7Ejmuizelaar/webp/parkjoy-webp.png"&gt;new&lt;/a&gt;) but it's still not a lot better than a decent &lt;a href="http://people.mozilla.org/%7Ejmuizelaar/webp/parkjoy.jpg"&gt;JPEG encoding&lt;/a&gt;. I also have a couple of thoughts on the format that I'd like to share.&lt;/p&gt;   &lt;p&gt; Google &lt;a href="http://code.google.com/speed/webp/docs/c_study.html"&gt;claims it's better&lt;/a&gt; than JPEG but this study has some problems and, as a result, isn't very convincing (&lt;span style="font-weight: bold;"&gt;Update: &lt;/span&gt;Google has a &lt;a href="http://code.google.com/speed/webp/docs/webp_study.html"&gt;new study&lt;/a&gt; that's better). First, they recompress existing JPEGs. This is unconventional.  Perhaps recompressing JPEGs is their target market, but I find that a little weird and it should at least be explained in the study. Second, they use PSNR as a comparison metric. This is even more confusing. 
PSNR has, for a while now, been accepted as a poor measure of visual quality and I can't understand why Google continues to use it. I think it would help the format's credibility if Google did a study that used uncompressed source images, SSIM as a metric and provided enough information about the methodology so that others could reproduce their results. &lt;/p&gt; &lt;p&gt; WebP also comes across as half-baked. Currently, it only supports a subset of the features that JPEG has. It lacks support for any color representation other than 4:2:0 YCrCb. JPEG supports 4:4:4 as well as other color representations like CMYK. WebP also seems to lack support for EXIF data and ICC color profiles, both of which have become quite important for photography. Further, it has yet to include any features missing from JPEG like alpha channel support. These features can still be added, but the longer they remain unspecified, the more difficult it will be to adopt. &lt;/p&gt; &lt;p&gt; &lt;a href="http://en.wikipedia.org/wiki/JPEG_XR"&gt;JPEG XR&lt;/a&gt; provides a good example of what features you'd want from a replacement for JPEG. It has support for an alpha channel and &lt;a href="http://en.wikipedia.org/wiki/High_dynamic_range_imaging"&gt;HDR&lt;/a&gt; among &lt;a href="http://en.wikipedia.org/wiki/JPEG_XR#Capabilities"&gt;others&lt;/a&gt;.  Microsoft has also put in the effort to have it formally standardized. However, it too is not without problems. The compression improvements it claims haven't matched evaluations other parties have done. I don't know enough about JPEG XR to say whether this is because the &lt;a href="http://x264dev.multimedia.cx/archives/164"&gt;encoders are bad&lt;/a&gt; or because the format is not really that great. &lt;/p&gt; &lt;p&gt; Every image format that becomes “part of the Web platform” exacts a cost for all time: all clients have to support that format forever, and there's also a cost for authors having to choose which format is best for them. 
This cost is no less for WebP than any other format because progressive decoding requires using a separate library instead of reusing the existing WebM decoder. This adds security risk and also eliminates much of the benefit of having bitstream compatibility with WebM. It makes me wonder, why not just change the bitstream so that it's more suitable for a still image codec? Given every format has a cost, if we're going to have a new image format for the Web we really need to make it the best we can make it with today's (royalty-free) technology.&lt;/p&gt; &lt;p&gt; Where does that leave us? WebP gives a subset of JPEG's functionality with more modern compression techniques and no additional IP risk to those already shipping WebM. I'm really not sure it's worth adding a new image format for that. Even if WebP were a clear winner in compression, large image hosts don't seem to care that much about image size. Flickr compresses their images at libjpeg quality of 96 and Facebook at 85: both quite a bit higher than the recommended 75 for &lt;a href="http://google.com/codesearch/p?hl=en#M3EzZdztQo0/pub/graphics/packages/jpeg/jpegsrc.v6.tar.gz%7Cu6QbQHjGtGQ/jpeg-6/jcparam.c&amp;amp;l=70"&gt;“very good quality”&lt;/a&gt;. Neither of them optimizes the Huffman tables, which gives a lossless 4–7% improvement in size. Further, switching to progressive JPEG gives an even larger improvement of 8–20%.&lt;/p&gt; &lt;p&gt; History has shown that adoption of image formats on the internet is slow. JPEG 2000 has mostly failed on the internet. PNG took a very long time, despite having large advantages. I expect that adoption may even be slower now than it was in the past, because there is no driving force. I would also be surprised if Microsoft adopted WebP because of their stance on WebM and their involvement in JPEG XR. Can WebP succeed without being adopted by all of the major web browsers? It's hard to say, but it wouldn't be easy. 
Personally, I'd rather the effort being spent on WebP be spent on an &lt;a href="http://cbloomrants.blogspot.com/2010/10/10-08-10-optimal-baseline-jpeg.html"&gt;improved JPEG encoder&lt;/a&gt; or even an improved JPEG XR encoder. &lt;/p&gt; &lt;p&gt; Is JPEG still great? No. Is there a great replacement for it? It doesn't feel like we're there yet. &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-721624392910701759?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/721624392910701759/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=721624392910701759' title='24 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/721624392910701759'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/721624392910701759'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/04/webp.html' title='WebP'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>24</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-1588501744036403364</id><published>2011-03-02T21:07:00.000-08:00</published><updated>2011-03-02T21:51:57.008-08:00</updated><title type='text'>Drawing Sprites: Minimizing draw calls</title><content type='html'>One reason OpenGL is so fast is that it allows applications to provide large chunks of work to be done in parallel. 
When drawing sprites with WebGL, it's important to make an effort to take advantage of this by minimizing the number of &lt;a href="http://www.khronos.org/registry/webgl/specs/latest/#5.13.11"&gt;draw calls&lt;/a&gt;. This is true with OpenGL, but even more so with WebGL because each draw call requires extra validation.&lt;br /&gt;&lt;br /&gt;Unfortunately, minimizing draw calls isn't always easy. It's often impractical or impossible to draw all your geometry at once because the geometry must share the same texture(s). FishIE used a single &lt;a href="http://blog.mozilla.com/webdev/2009/03/27/css-spriting-tips/"&gt;sprite&lt;/a&gt; from the beginning, which made it easy to draw everything at once. Move as many sprites as you can into the same texture, and sort or group sprites that use the same texture into a single draw call. It may also be possible to use multi-texturing, but depending on the GPU architecture, this can cause all textures to be read for each sprite, which can have a dramatic impact on performance because of limitations on texture bandwidth.&lt;br /&gt;&lt;br /&gt;The performance difference between drawing sprites individually versus all at once can be pretty big. I made another version of the &lt;a href="http://people.mozilla.org/%7Ejmuizelaar/fishie/fishie-gl-individual.html"&gt;FishIE demo that draws each sprite individually&lt;/a&gt;. This version draws 2000 fish at 10fps on my test system, while the original &lt;a href="http://people.mozilla.org/%7Ejmuizelaar/fishie/fishie-gl.html"&gt;WebGL FishIE&lt;/a&gt; can do 4000 fish at 60fps on the same system. 
Since the same texture is used for all sprites I did not have to rebind the texture for each sprite; doing so would likely decrease performance further.&lt;br /&gt;&lt;br /&gt;Designing an application around these limitations can be tricky, but often the application is in a better position to make compromises or take shortcuts than a more general Canvas 2D implementation would be.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-1588501744036403364?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/1588501744036403364/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=1588501744036403364' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/1588501744036403364'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/1588501744036403364'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/03/drawing-sprites-minimizing-draw-calls.html' title='Drawing Sprites: Minimizing draw calls'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-3543947908205238312</id><published>2011-02-28T13:29:00.000-08:00</published><updated>2011-03-01T18:03:24.598-08:00</updated><title type='text'>Drawing Sprites: Canvas 2D vs. 
WebGL</title><content type='html'>Lately I've seen a lot of graphics benchmarks that basically just test image blitting/sprite performance. These include &lt;a href="http://ie.microsoft.com/testdrive/Performance/FlyingImages/"&gt;Flying Images&lt;/a&gt;, &lt;a href="http://ie.microsoft.com/testdrive/Performance/FishIETank/Default.html"&gt;FishIE&lt;/a&gt;, &lt;a href="http://ie.microsoft.com/testdrive/Performance/SpeedReading/Default.html"&gt;Speed Reading&lt;/a&gt; and &lt;a href="https://developers.facebook.com/blog/post/460"&gt;JSGameBench&lt;/a&gt;(&lt;b&gt;Update:&lt;/b&gt; I just saw the blog post for the &lt;a href="http://developers.facebook.com/blog/post/468"&gt;WebGL JSGameBench&lt;/a&gt;. This further confirms my claim that WebGL is a better way to do sprites). They all try to draw a bunch of images in a short amount of time. They mostly use two techniques: positioned images or canvas' drawImage. Neither of these methods is particularly well suited to this task. Positioned images have typically been used for document layout and the Canvas 2D API was designed as a JavaScript binding to CoreGraphics which owes most of its design to Postscript. Neither were designed for high performance interactive graphics. However, OpenGL, and its web counterpart WebGL, was designed for exactly this.&lt;br /&gt;&lt;br /&gt;To show off some of the potential performance difference available, I ported the FishIE benchmark to WebGL. Along the way I discovered some different problems and ways to solve them.&lt;br /&gt;&lt;br /&gt;The problem, once the overhead of Canvas 2D is removed, is that FishIE very quickly becomes texture read bound. I noticed that the FishIE sprites have a lot of horizontal padding. This padding was included in the drawImage calls which causes us to do a bunch of texture reads for transparent pixels. 
Trimming this down a little gave a noticeable framerate boost.&lt;br /&gt;&lt;br /&gt;An even bigger cause of texture bandwidth waste is that the demo uses a large sprite to draw a small fish. Fortunately, OpenGL has a great solution to this problem: &lt;a href="http://en.wikipedia.org/wiki/Mipmap"&gt;mipmaps&lt;/a&gt;. &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-2Q7pPwc0xhA/TWwc-5zEEWI/AAAAAAAAACI/b8FWXFSFTX8/s1600/mipmap-out.png"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 80px; height: 80px;" src="http://2.bp.blogspot.com/-2Q7pPwc0xhA/TWwc-5zEEWI/AAAAAAAAACI/b8FWXFSFTX8/s400/mipmap-out.png" title="without mipmaps" alt="without mipmaps" id="BLOGGER_PHOTO_ID_5578865905397666146" border="0" /&gt;&lt;/a&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/-I2KqTQ_xcBI/TWwc5wXkdlI/AAAAAAAAACA/xjzXequhCEw/s1600/alias-out.png"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 80px; height: 80px;" src="http://2.bp.blogspot.com/-I2KqTQ_xcBI/TWwc5wXkdlI/AAAAAAAAACA/xjzXequhCEw/s400/alias-out.png" alt="" id="BLOGGER_PHOTO_ID_5578865816967083602" border="0" /&gt;&lt;/a&gt;Mipmaps let the GPU use smaller textures when drawing smaller fish, which can dramatically reduce the texture bandwidth required. They also improve the quality of small fish by eliminating the aliasing that occurs when downscaling by large amounts.&lt;br /&gt;&lt;br /&gt;Mipmapping is a good example of the flexibility that WebGL allows. Canvas 2D aims to be an easy to use API for drawing pictures, but this ease of use comes at some cost. First, the Canvas 2D implementation has to guess the intents of the author. For example drawImage on OS X does a high quality lanczos down scaling of the image. Direct2D just does a quick bilinear down scale. 
This makes it difficult for authors to know how fast drawImage will be. Further, because the design of Canvas 2D is inspired by an API for describing print jobs, it's not well suited to reusing data between paints.&lt;br /&gt;&lt;br /&gt;Try out the difference with these two modified versions of FishIE:&lt;ol&gt;&lt;li&gt;&lt;a href="http://people.mozilla.org/%7Ejmuizelaar/fishie/fishie.html"&gt;The original FishIE modified only to allow more fish&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href="http://people.mozilla.org/%7Ejmuizelaar/fishie/fishie-gl.html"&gt;FishIE ported to WebGL&lt;/a&gt;.&lt;/li&gt;&lt;/ol&gt; The method I used to port FishIE to WebGL is pretty straightforward so I expect that any of the other benchmarks listed above could also be easily ported to WebGL.&lt;br /&gt;&lt;h2&gt;Pushing the limits&lt;/h2&gt;Once the number of fish becomes high enough we run into JavaScript performance problems. FishIE has some JavaScript problems that make things worse than they need to be. First, it loops over the fish with "for (var fishie in fish) {". This can end up using 10% of the total CPU time. The problem with this code is that it converts all of the array indices to strings and then uses those strings to index into the array. It also has the problem that any additional properties added to the array will also show up as index values, which is likely not the intent of the author.&lt;br /&gt;&lt;br /&gt;Second, each fish object includes a swim() method. Unfortunately, in the FishIE source swim() is a closure inside the Fish() object. This means that every Fish gets its own swim() function object, which makes things worse for JavaScript engines.&lt;br /&gt;&lt;br /&gt;Fixing both of these problems, and making the fish really small lets us get an idea of how many sprites we can actually push around. Here's a &lt;a href="http://people.mozilla.org/%7Ejmuizelaar/fishie/fishie-fast.html"&gt;final version&lt;/a&gt;. 
If I disable the method jit (&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=637878"&gt;bug 637878&lt;/a&gt;) and run at an even window size (&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=637894"&gt;bug 637894&lt;/a&gt;) I can do 60000 fish at 30fps, which I think is pretty impressive compared to the 1000 that the original Microsoft demo does.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-3543947908205238312?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/3543947908205238312/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=3543947908205238312' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/3543947908205238312'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/3543947908205238312'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/02/drawing-sprites-canvas-2d-vs-webgl.html' title='Drawing Sprites: Canvas 2D vs. 
WebGL'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-2Q7pPwc0xhA/TWwc-5zEEWI/AAAAAAAAACI/b8FWXFSFTX8/s72-c/mipmap-out.png' height='72' width='72'/><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-8622897834688961818</id><published>2011-02-18T12:55:00.000-08:00</published><updated>2011-02-18T13:03:49.210-08:00</updated><title type='text'>Updated mozilla-cvs-history git repo</title><content type='html'>I recently ran git gc --aggressive on the CVS history git repository mentioned &lt;a href="http://muizelaar.blogspot.com/2010/02/historical-mozilla-central-git.html"&gt;here&lt;/a&gt;. It's now 543M, down from 986M. I've also uploaded a copy to &lt;a href="https://github.com/jrmuizel/mozilla-cvs-history"&gt;GitHub&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-8622897834688961818?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/8622897834688961818/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=8622897834688961818' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8622897834688961818'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8622897834688961818'/><link rel='alternate' type='text/html' 
href='http://muizelaar.blogspot.com/2011/02/updated-mozilla-cvs-history-git-repo.html' title='Updated mozilla-cvs-history git repo'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-8229365047654231461</id><published>2011-02-10T10:43:00.000-08:00</published><updated>2011-02-10T11:33:07.235-08:00</updated><title type='text'>Clone timings</title><content type='html'>Chris Atlee was wondering how clone times differ between git and mercurial so I ran a quick test on a fast linux machine.&lt;br /&gt;&lt;br /&gt;$ time git clone git://github.com/doublec/mozilla-central.git&lt;br /&gt;real 1m33.478s&lt;br /&gt;&lt;br /&gt;$ time git clone mozilla-central moz2&lt;br /&gt;real 0m2.559s&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;$ time hg clone http://hg.mozilla.org/mozilla-central/&lt;br /&gt;real 3m22.510s&lt;br /&gt;&lt;br /&gt;$ time hg clone mozilla-central moz2&lt;br /&gt;real 0m20.660s&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-8229365047654231461?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/8229365047654231461/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=8229365047654231461' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8229365047654231461'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8229365047654231461'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/02/clone-timings.html' title='Clone timings'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-341477275038911580</id><published>2011-01-12T16:37:00.000-08:00</published><updated>2011-01-12T13:38:40.531-08:00</updated><title type='text'>historical mozilla-central git repository</title><content type='html'>A number of people use git to work with the mozilla hg tree. In the past I've wanted the entire history as a git repo so I converted the old CVS repository to git and put it up on people.mozilla.org.&lt;br /&gt;&lt;br /&gt;You can set it up as follows:&lt;br /&gt;&lt;code style="font-size: small"&gt;&lt;br /&gt;git clone http://people.mozilla.org/~jmuizelaar/mozilla-cvs-history.git&lt;br /&gt;git clone git://bluishcoder.co.nz/git/mozilla-central.git&lt;br /&gt;&lt;br /&gt;cd mozilla-central/.git/objects/pack&lt;br /&gt;# set up symbolic links to cvs-history pack files&lt;br /&gt;ln -s ../../../../mozilla-cvs-history/.git/objects/pack/pack-5b5d604ab48cf7bc2a6b4495292fa8700a987c5f.pack .&lt;br /&gt;ln -s ../../../../mozilla-cvs-history/.git/objects/pack/pack-5b5d604ab48cf7bc2a6b4495292fa8700a987c5f.idx .&lt;br /&gt;cd ../../&lt;br /&gt;&lt;br /&gt;# add a graft from the last revision in the mozilla-central repo&lt;br /&gt;# to the first revision in the cvs-history&lt;br /&gt;echo 2514a423aca5d1273a842918589e44038d046a51 3229d5d8b7f8376cfb7936e7be810635a14a486b &gt; info/grafts&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Now you have a git repository containing all of the history. 
You can update the mozilla-central repository as you normally would. The conversion isn't perfect, but it's been good enough to have working blame back into cvs time.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-341477275038911580?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/341477275038911580/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=341477275038911580' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/341477275038911580'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/341477275038911580'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/02/historical-mozilla-central-git.html' title='historical mozilla-central git repository'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-4254343743114925704</id><published>2011-01-11T14:05:00.000-08:00</published><updated>2011-01-11T15:24:54.234-08:00</updated><title type='text'>Firefox acceleration prefs changing</title><content type='html'>I just landed a changeset that changes the names of the layer acceleration prefs in Firefox.&lt;br /&gt;&lt;br /&gt;The old prefs were:&lt;br /&gt; layers.accelerate-all&lt;br /&gt; layers.accelerate-none&lt;br /&gt;&lt;br /&gt;The new prefs are:&lt;br /&gt; layers.acceleration.disabled&lt;br 
/&gt; layers.acceleration.force-enabled&lt;br /&gt;&lt;br /&gt;layers.accelerate-all previously defaulted to 'true' on Windows and OS X, which meant that there was no easy way to force layer acceleration on if your card had been blacklisted for some reason. The new prefs allow the blacklist to be overridden. The old prefs are not being migrated over to the new names. If you have a problem with the defaults, please file bugs.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-4254343743114925704?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/4254343743114925704/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=4254343743114925704' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/4254343743114925704'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/4254343743114925704'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/01/firefox-acceleration-prefs-changing.html' title='Firefox acceleration prefs changing'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-8900392212168496708</id><published>2011-01-08T18:42:00.001-08:00</published><updated>2011-01-09T18:44:17.659-08:00</updated><title type='text'>Trying out AVX</title><content type='html'>Intel's new &lt;a 
href="http://www.realworldtech.com/page.cfm?ArticleID=RWT091810191937&amp;amp;p=1"&gt;Sandy Bridge&lt;/a&gt; CPUs came out this week and they support a new set of instructions called &lt;a href="http://software.intel.com/en-us/avx/"&gt;AVX&lt;/a&gt;. The AVX instructions are a much bigger change than the usual SSE revisions in the past few micro-architectures. First of all, they double the 128 bit SSE registers to 256 bits. Second, they introduce an entirely new instruction &lt;a href="http://en.wikipedia.org/wiki/VEX_prefix"&gt;encoding&lt;/a&gt;. The new encoding switches from 2 operand instructions to 3 operand instructions, allowing the destination register to be different from the source registers. For example:&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;addps  r0, r1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;# (r0 = r0 + r1)&lt;/code&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;vs.&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;vaddps r0, r1, r2&amp;nbsp;&amp;nbsp;# (r0 = r1 + r2)&lt;/code&gt;&lt;br /&gt;This new encoding is not only used for the new 256 bit instructions, but also for the 128 bit AVX versions of all the old SSE instructions. This means that existing SSE code can be improved without requiring a switch to 256 bit registers. Finally, AVX introduces some new data movement instructions, which should help improve code efficiency.&lt;br /&gt;&lt;br /&gt;I decided to see what kind of performance difference using AVX could make in &lt;a href="http://mxr.mozilla.org/mozilla-central/source/gfx/qcms/transform-sse2.c#13"&gt;qcms&lt;/a&gt; with minimal effort. If you use SSE compiler intrinsics, like qcms does, switching to AVX is very easy; simply recompile with -mavx. 
In addition to using -mavx, I also took advantage of some of the new data movement instructions by replacing the following:&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;vec_r = _mm_load_ss(r);&lt;br /&gt;&amp;nbsp;&amp;nbsp;vec_r = _mm_shuffle_ps(vec_r, vec_r, 0);&lt;/code&gt;&lt;br /&gt;with the new vbroadcastss instruction:&lt;br /&gt;&lt;code&gt;&amp;nbsp;&amp;nbsp;vec_r = _mm_broadcast_ss(r);&lt;/code&gt;&lt;br /&gt;Overall, this change reduces the inner loop by 3 instructions.&lt;br /&gt;&lt;br /&gt;The performance results were positive, but not what I expected. Here's what the timings were:&lt;table style="width: auto; margin-left: 1em; -moz-font-feature-settings: &amp;quot;tnum=1&amp;quot;;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="line-height: auto"&gt;SSE2:&lt;/td&gt;&lt;td style="line-height: auto; color: #444444"&gt;75798 usecs&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td style="line-height: auto"&gt;AVX (-mavx):&lt;/td&gt;&lt;td style="line-height: auto"&gt;69687 usecs&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;AVX w/ vbroadcastss:&lt;/td&gt;&lt;td style="line-height: auto;"&gt;72917 usecs&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;Switching to the AVX encoding improves performance by more than I expected: about 8%. But adding the new &lt;code&gt;vbroadcastss&lt;/code&gt; instruction, in addition to the AVX encoding, not only doesn't help, but actually makes things worse. I tried analyzing the code with the &lt;a href="http://software.intel.com/en-us/articles/intel-architecture-code-analyzer/"&gt;Intel Architecture Code Analyzer&lt;/a&gt;, but the analyzer also thought that using &lt;code&gt;vbroadcastss&lt;/code&gt; should be faster. If anyone has any ideas why &lt;code&gt;vbroadcastss&lt;/code&gt; would be slower, I'd love to hear them.&lt;br /&gt;&lt;br /&gt;Despite this weird performance problem, AVX seems like a good step forward and should provide good opportunities for improving performance beyond what's possible with SSE. 
For more information, check out this &lt;a href="http://software.intel.com/file/24742"&gt;presentation&lt;/a&gt;, which gives a good overview of how to take advantage of AVX.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-8900392212168496708?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/8900392212168496708/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=8900392212168496708' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8900392212168496708'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8900392212168496708'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2011/01/trying-out-avx.html' title='Trying out AVX'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-3180100475542770145</id><published>2010-12-18T21:53:00.000-08:00</published><updated>2010-12-19T18:57:06.523-08:00</updated><title type='text'>Improved Hardware Acceleration in Fennec</title><content type='html'>On Thursday night, after the all-hands party, Matt Woodrow landed a beautiful &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=604101"&gt;refactoring&lt;/a&gt; of our texture upload code. 
This should give a noticeable improvement in scrolling performance when accelerated layers are enabled and hopefully fixes some of the problems people were seeing there. It also improves texture upload performance on OS X.&lt;br /&gt;&lt;br /&gt;Unfortunately, there are still two bugs that are keeping us from enabling accelerated layers by default:&lt;br /&gt;&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=619615"&gt;Bug 619615&lt;/a&gt; - Hangs on Nexus One&lt;br /&gt;&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=619539"&gt;Bug 619539&lt;/a&gt; - Startup crashes on Droid&lt;br /&gt;&lt;br /&gt;Any help debugging these problems would be greatly appreciated.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-3180100475542770145?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/3180100475542770145/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=3180100475542770145' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/3180100475542770145'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/3180100475542770145'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/12/improved-hardware-acceleration-in.html' title='Improved Hardware Acceleration in Fennec'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-2418227055563538743</id><published>2010-12-15T12:24:00.000-08:00</published><updated>2010-12-15T13:23:34.965-08:00</updated><title type='text'>Hardware Acceleration on Fennec</title><content type='html'>It's now possible with current nightlies to use OpenGL for compositing in Fennec on Android. To turn it on, go to about:config and set "layers.accelerate-all" to "true" and restart. If it's working you can go to about:support and the Graphics section will say "1/1 OpenGL".&lt;br /&gt;&lt;br /&gt;It would be great if people can test it and let me know how it goes.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-2418227055563538743?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/2418227055563538743/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=2418227055563538743' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/2418227055563538743'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/2418227055563538743'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/12/hardware-acceleration-on-fennec.html' title='Hardware Acceleration on Fennec'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-8877613920810925606</id><published>2010-11-08T11:54:00.000-08:00</published><updated>2010-11-10T11:16:03.710-08:00</updated><title type='text'>Dealing with mach_kernel in Shark</title><content type='html'>Sometimes when profiling, a bunch of time ends up in mach_kernel. Figuring out why isn't always easy, but here are two tips that should help a bit:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;You can get better symbols for mach_kernel by downloading a &lt;a href="http://developer.apple.com/hardwaredrivers/download/kerneldebugkits.html"&gt;KernelDebugKit&lt;/a&gt;&lt;br /&gt;This can help a bit when trying to figure out what's happening in the kernel. For example, _dtrace_get_cpu_int_stack_t becomes _mach_call_munger.&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Shark has a System Trace profiling mode. This can show you what code is causing the kernel to do work. It can break down time by system call or by vm fault, which should account for most things.&lt;br /&gt;&lt;br /&gt;While trying this out I noticed we were spending a fair amount of time in ChildViewMouseTracker::WindowForEvent(NSEvent*). This gave me the idea that the reason that Firefox causes the WindowServer process to start using a huge amount of CPU is because we tell the WindowServer to give us all of the mouse events instead of only the ones targeted at our window. Presumably this causes the WindowServer to build up a very large queue of events when the Firefox process is stopped and thus use lots of CPU. This turns out to be the case. nsToolkit::RegisterForAllProcessMouseEvents causes us to listen to all mouse events and disabling the code there fixes the problem. 
Bug &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=611068"&gt;611068&lt;/a&gt; tracks the problem.&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-8877613920810925606?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/8877613920810925606/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=8877613920810925606' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8877613920810925606'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8877613920810925606'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/11/dealing-with-machkernel-in-shark.html' title='Dealing with mach_kernel in Shark'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-5744425288648737284</id><published>2010-05-28T14:22:00.000-07:00</published><updated>2010-05-28T14:36:03.693-07:00</updated><title type='text'>Reviewing in vim</title><content type='html'>Bugzilla's review interface is poor. I find a mild improvement is possible by copying the review text into an editor and reviewing it there. One of the things that makes this experience better is syntax highlighting. 
&lt;a href="http://people.mozilla.org/%7Ejmuizelaar/vim/review.vim"&gt;Here&lt;/a&gt;'s a modification of vim's diff highlighting script that works with quoted patches. Adding the following to one's .vimrc will get it used for .review files:&lt;code&gt;&lt;br /&gt;au BufNewFile,BufRead *.review                  setf review&lt;br /&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-5744425288648737284?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/5744425288648737284/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=5744425288648737284' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/5744425288648737284'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/5744425288648737284'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/05/reviewing-in-vim.html' title='Reviewing in vim'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-968347364178078478</id><published>2010-01-22T14:52:00.000-08:00</published><updated>2010-02-27T14:27:19.004-08:00</updated><title type='text'>Reproducing bugs on complicated webpages</title><content type='html'>A while back, I was running into &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=536097"&gt;hangs&lt;/a&gt; on Facebook with the 
html5 parser. I wanted to reproduce the problem locally and try to minimize it. However, Facebook is pretty complicated and loads content using Ajax, so the traditional approach of saving the page didn't work. I needed a different approach, so I hacked up an HTTP proxy to be able to record and replay server interactions. It works by writing out the result of each GET request to a file named after the Request-URI. After you're done recording an interaction with the server, you can switch to replay mode and instead of proxying, the recording will be played back to the browser.&lt;br /&gt;&lt;br /&gt;I should mention a couple of limitations to this approach. First, it assumes that the resources specified at particular URLs don't change over the recording session. Second, it assumes that the resources requested are dependent only on the data recorded and not on the time or anything like that. Even with these limitations, the proxy seems to work well enough.&lt;br /&gt;&lt;br /&gt;It's currently pretty hacky, but you can grab it from git://anongit.freedesktop.org/~jrmuizel/http-recording-proxy. There's a README in the repository explaining how to run and replay. 
I'd love to hear if this is of any use to anyone else or if you have patches for improving it.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-968347364178078478?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/968347364178078478/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=968347364178078478' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/968347364178078478'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/968347364178078478'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/01/reproducing-bugs-on-complicated.html' title='Reproducing bugs on complicated webpages'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-742899532081502026</id><published>2010-01-22T07:47:00.000-08:00</published><updated>2010-01-22T14:51:40.134-08:00</updated><title type='text'>Graphics performance in Firefox 3.6</title><content type='html'>One of the performance improvements included in Firefox 3.6 is a new path &lt;a href="http://en.wikipedia.org/wiki/Rasterisation"&gt;rasterizer&lt;/a&gt; for use on Windows. 
This new rasterizer improves vector graphics performance substantially.&lt;br /&gt;&lt;br /&gt;The previous rasterizer is designed for an &lt;a href="http://cgit.freedesktop.org/xorg/proto/renderproto/plain/renderproto.txt"&gt;XRender&lt;/a&gt; trapezoid model of rasterization. In this model, to fill a polygon, we first &lt;a href="http://ect.bell-labs.com/who/hobby/93_2-27.pdf"&gt;tessellate&lt;/a&gt; it into a collection of trapezoids. Next, each of these trapezoids is rasterized and the result is added to a mask image. Finally, this mask is used to composite the filled polygon. This design can work well if we rasterize multiple trapezoids in parallel, as could be done on a GPU. However, when we're using the CPU to sequentially rasterize and blend each trapezoid, it's not the most efficient approach.&lt;br /&gt;&lt;br /&gt;Scanline rasterization is the &lt;a href="http://books.google.com/books?id=t2cOddQJQPYC&amp;amp;lpg=PP1&amp;amp;dq=Computer%20Graphics%20Principles%20and%20Practice&amp;amp;pg=PA92#v=onepage&amp;amp;q=&amp;amp;f=false"&gt;textbook&lt;/a&gt; method for filling polygons and when using a CPU it can be more efficient than tessellating. Instead of breaking the polygon into trapezoids, we iterate over each scanline of the mask image. For each scanline, we iterate over the edges that it intersects and fill the pixels in between the edges to produce the mask image.&lt;br /&gt;&lt;br /&gt;M Joonas Pihlaja contributed a scanline rasterizer to &lt;a href="http://cairographics.org/"&gt;cairo&lt;/a&gt; as part of a Google Summer of Code project and it's included in Firefox 3.6. This new rasterizer makes a pretty significant difference when filling complex paths. For example, this &lt;a href="http://people.mozilla.org/%7Ejmuizelaar/world-map-it.html"&gt;test&lt;/a&gt; draws a spinning map of the world using canvas. In Firefox 3.5, I get about 6-7fps. 
Using Firefox 3.6, it's more than 3x faster with about 19-24fps.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-742899532081502026?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/742899532081502026/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=742899532081502026' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/742899532081502026'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/742899532081502026'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2010/01/graphics-performance-in-firefox-36.html' title='Graphics performance in Firefox 3.6'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-7588413996152060894</id><published>2009-10-07T21:46:00.000-07:00</published><updated>2009-10-14T11:00:53.326-07:00</updated><title type='text'>Geometry Homework</title><content type='html'>Drawing arcs is a common task in computer graphics. Arcs are typically drawn using an approximation by a set of cubic Bezier splines. These splines are passed through the regular rendering pipeline where they are usually subdivided into a set of line segments and then rasterized. 
A circle, for example, can be represented by four cubic splines: one for each 90 degree arc.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_twJ4AAJXouo/StVZq0N0ArI/AAAAAAAAABk/Zb_3nomf9p8/s1600-h/arc.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img border="0" id="BLOGGER_PHOTO_ID_5392314720951993010" alt="" src="http://3.bp.blogspot.com/_twJ4AAJXouo/StVZq0N0ArI/AAAAAAAAABk/Zb_3nomf9p8/s400/arc.png" style="float: right; margin: 0pt 0pt 10px 10px; cursor: pointer; width: 141px; height: 120px;" /&gt;&lt;/a&gt;An arc of up to 90 degrees can be approximated by a single cubic Bezier spline reasonably well. To find an appropriate approximating spline we set the start and end points of the spline to match the arc and then find the middle two points using the following formula, where A is the start angle and B is the end angle:&lt;br /&gt;&lt;br /&gt;h = 4/3 * tan ((B - A)/4)&lt;br /&gt;&lt;br /&gt;pt1 = {center.x + r*cos(A),              center.y + r*sin(A)}&lt;br /&gt;pt2 = {center.x + r*cos(A) - h*r*sin(A), center.y + r*sin(A) + h*r*cos(A)}&lt;br /&gt;pt3 = {center.x + r*cos(B) + h*r*sin(B), center.y + r*sin(B) - h*r*cos(B)}&lt;br /&gt;pt4 = {center.x + r*cos(B),              center.y + r*sin(B)}&lt;br /&gt;&lt;br /&gt;This approach works quite well and is currently used by &lt;a href="http://www.cairographics.org"&gt;cairo&lt;/a&gt;. However, the problem with this approach is that it requires converting to polar coordinates and then back to Cartesian ones. 
This conversion is slow because it requires the use of trigonometric functions.&lt;br /&gt;&lt;br /&gt;Fortunately, this &lt;a href="http://itc.ktu.lt/itc354/Riskus354.pdf"&gt;paper&lt;/a&gt; gives a solution that avoids the conversion to polar coordinates:&lt;br /&gt;&lt;br /&gt;ax = pt1.x - center.x&lt;br /&gt;ay = pt1.y - center.y&lt;br /&gt;bx = pt4.x - center.x&lt;br /&gt;by = pt4.y - center.y&lt;br /&gt;&lt;br /&gt;q1 = ax*ax + ay*ay&lt;br /&gt;q2 = q1 + ax*bx + ay*by&lt;br /&gt;&lt;br /&gt;k2 = 4/3 * (sqrt(2*q1*q2) - q2) / (ax*by - ay*bx)&lt;br /&gt;&lt;br /&gt;pt2.x = pt1.x - k2*ay&lt;br /&gt;pt2.y = pt1.y + k2*ax&lt;br /&gt;pt3.x = pt4.x + k2*by&lt;br /&gt;pt3.y = pt4.y - k2*bx&lt;br /&gt;&lt;br /&gt;Unfortunately, there is no explanation, derivation or proof for this formula. Even more problematic, the formula for k2 becomes less stable as pt1 approaches pt4, becoming NaN when pt1 equals pt4.&lt;br /&gt;&lt;br /&gt;Therefore, the homework problem has two parts:&lt;br /&gt;&lt;br /&gt;1. Give an explanation or derivation for the formula for k2 provided above.&lt;br /&gt;2. 
Provide a similar formulation for pt2 and pt3 that doesn't degenerate as pt1 and pt4 become close or prove that one doesn't exist.&lt;br /&gt;&lt;br /&gt;The best answer will be credited in the new cairo stroking code.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1386948037384435441-7588413996152060894?l=muizelaar.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/7588413996152060894/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=7588413996152060894' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/7588413996152060894'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/7588413996152060894'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2009/10/geometry-homework.html' title='Geometry Homework'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_twJ4AAJXouo/StVZq0N0ArI/AAAAAAAAABk/Zb_3nomf9p8/s72-c/arc.png' height='72' width='72'/><thr:total>17</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-8292731199390851158</id><published>2009-10-02T12:51:00.000-07:00</published><updated>2009-10-06T22:37:00.125-07:00</updated><title type='text'>qcms — now faster</title><content type='html'>Thanks to some optimization &lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=512865"&gt;work&lt;/a&gt; 
by Steve Snyder, qcms is even faster than before.&lt;br /&gt;&lt;br /&gt;What follows is a chart with the new performance numbers:&lt;a href="http://3.bp.blogspot.com/_twJ4AAJXouo/SsZaRyEf9UI/AAAAAAAAABE/taWxkzUNgkI/s1600-h/qcms2-speed.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img border="0" id="BLOGGER_PHOTO_ID_5388093265740297538" alt="" src="http://3.bp.blogspot.com/_twJ4AAJXouo/SsZaRyEf9UI/AAAAAAAAABE/taWxkzUNgkI/s400/qcms2-speed.png" style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 345px;" /&gt;&lt;/a&gt;The benchmark is the same as last time but run on a slightly slower computer using OS X v10.6 instead of 10.5. As the chart shows, the new qcms code is more than twice as fast as the previous code. In addition to the performance improvement, the new code includes a version that only uses SSE1 instructions. This will be especially helpful for those with older computers where the time spent doing color correction isn't as negligible as it is on faster computers.&lt;br /&gt;&lt;br /&gt;When running this benchmark again, I noticed that the performance of lcms had drastically improved since the last time I had run the benchmark. Why was lcms so much faster on 10.6? What had changed? The default architecture target: in OS X 10.6, the compiler builds 64 bit binaries by default. Still, it was surprising that compiling for 64 bit could nearly double the performance.&lt;br /&gt;&lt;br /&gt;The large difference, it turns out, can largely be attributed to the &lt;a href="http://hg.mozilla.org/releases/mozilla-1.9.1/file/2b21457f4f4f/modules/lcms/src/cmsmtrx.c#l788"&gt;&lt;code&gt;MAT3evalW&lt;/code&gt;&lt;/a&gt;&lt;sup&gt;&lt;a href="#footnote1"&gt;1&lt;/a&gt;&lt;/sup&gt; function. This function multiplies a 1&amp;times;3 matrix with a 3&amp;times;3 one using 9 32&amp;times;32&amp;rarr;64 multiplications. 
GCC can usually optimize these multiplications by using the 32&amp;times;32&amp;rarr;64 multiply instructions; however, that wasn't happening in 32 bit mode. Instead of the expected 9 multiplies, we get 18 multiplies and a bunch of housekeeping work, likely caused by the 64 bit additions and additional register pressure. In 64 bit mode, however, we get the code that you'd expect. The 64 bit build takes only 38 instructions, versus the 169 instructions the 32 bit build uses. With a difference like that in the inner loop, it's easy to see why the 64 bit build is so much faster.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size:78%;"&gt;&lt;a name="footnote1"&gt;1.&lt;/a&gt; &lt;code&gt;MAT3evalW&lt;/code&gt; has a handwritten assembly version that should be faster than the one that GCC generates; unfortunately, it is MSVC only.&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/8292731199390851158/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=8292731199390851158' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8292731199390851158'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/8292731199390851158'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2009/10/qcms-now-faster.html' title='qcms &amp;mdash; now faster'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_twJ4AAJXouo/SsZaRyEf9UI/AAAAAAAAABE/taWxkzUNgkI/s72-c/qcms2-speed.png' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1386948037384435441.post-2967229466817312890</id><published>2009-06-02T08:57:00.000-07:00</published><updated>2009-06-04T19:38:48.475-07:00</updated><title type='text'>qcms - color management for the web</title><content type='html'>&lt;span style="font-size:100%;"&gt;qcms is a brand-new color management system replacing Mozilla's previous solution, &lt;a href="http://www.littlecms.com/"&gt;lcms&lt;/a&gt;. The browser's color management system allows us to use data in photos on the web and data about your monitor to make sure that colors look the same everywhere. This helps photographers and site designers control the appearance of their images more precisely, and makes the web a prettier place.&lt;p&gt;&lt;/p&gt;&lt;p&gt;Normally, rewriting a large chunk of code from scratch is a bad idea. However, we were in a situation that made rewriting pretty attractive. First, we were only using a small portion of lcms's functionality, but we were paying for the large code base in both maintainability and threat surface. Second, we were already maintaining substantial modifications, which was an additional maintenance burden. Finally, our requirements for a color management library are different from the goals of lcms: we need something fast and secure, while lcms seems more focused on functionality, completeness and correctness. &lt;/p&gt;&lt;/span&gt; &lt;h2&gt;What's new?&lt;/h2&gt;&lt;p&gt; qcms is made up of two main parts: the ICC profile parser and the transformation engine. The transformation engine reuses a lot of code from lcms. The ICC profile parser is completely new and written with security and robustness in mind. 
&lt;/p&gt; &lt;p&gt; The new parser has some key design differences compared to lcms. One of these is the I/O model. lcms has an I/O abstraction layer that covers both reading from memory and reading from a file. It uses this layer to read what it needs from the profile as it's needed. This means that the full file or memory copy of the profile needs to be kept around for the lifetime of the profile object.&lt;sup&gt;&lt;a href="https://bugzilla.mozilla.org/show_bug.cgi?id=491572"&gt;1&lt;/a&gt;&lt;/sup&gt; &lt;/p&gt; &lt;p&gt;qcms uses a simpler model. The entire profile is read into memory and then parsed to construct a profile object. We only keep the information that's needed to construct a transformation between color spaces. After parsing we can close the file or free the memory used to store the profile. Furthermore, since we only parse in-memory, we also don't have to deal with any possible read errors or other file I/O problems during parsing. &lt;/p&gt; &lt;p&gt; Error handling during parsing can be tricky and riddled with security holes. To help deal with this problem, qcms adopts an error handling strategy similar to &lt;a href="http://cairographics.org/manual/cairo-error-status.html"&gt;cairo&lt;/a&gt;'s. Instead of trying to deal with the error immediately and returning an error result up the call stack, we often set a flag to note the brokenness, putting us into an error state, and continue on. To continue successfully, all of the following operations must be completable even while in an error state. However, ensuring this is usually easy, especially if the results will be discarded. When we do get to a place where it's convenient to return, we do so. The big advantage to this approach is that it keeps the error state control flow as similar to the normal control flow as possible. This makes the code easier to read, easier to reason about, and easier to test. 
&lt;/p&gt;&lt;h2&gt;Speed&lt;/h2&gt; &lt;p&gt;Last summer, Bobby Holley did a bunch of &lt;a href="http://bholley.wordpress.com/2008/09/12/so-many-colors/"&gt;work&lt;/a&gt; to make lcms faster. I was able to reuse that work in qcms. The result is that qcms is one of the fastest color management systems around. Here's a simple test that transforms all the possible RGB components from one RGB colorspace to another. It compares lcms, qcms and ColorSync, the system color management system on OS X. &lt;/p&gt;&lt;h2&gt;&lt;a href="http://1.bp.blogspot.com/_twJ4AAJXouo/Sigshwk7IsI/AAAAAAAAAAc/7_mgyxPTtPY/s1600-h/qcms-speed.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 267px;" src="http://1.bp.blogspot.com/_twJ4AAJXouo/Sigshwk7IsI/AAAAAAAAAAc/7_mgyxPTtPY/s400/qcms-speed.png" alt="" id="BLOGGER_PHOTO_ID_5343569916362171074" border="0" /&gt;&lt;/a&gt;&lt;/h2&gt; &lt;p&gt; The tests are &lt;a href="http://cgit.freedesktop.org/%7Ejrmuizel/qcms/tree/lcms-compare.c"&gt;here&lt;/a&gt; and &lt;a href="http://cgit.freedesktop.org/%7Ejrmuizel/qcms/tree/util/colorsync-perf.c"&gt;here&lt;/a&gt;. &lt;/p&gt; &lt;h2&gt;Current Limitations&lt;/h2&gt;  &lt;p&gt; qcms currently only supports transformations to and from RGB colorspaces. This covers the vast majority of uses on the web; however, it means there's no support for CMYK and many additional profile types. If you are interested in making this code better, let me know!&lt;/p&gt; &lt;h2&gt;Conclusion&lt;/h2&gt;In the end, we have a library about a tenth of the size and 5 times as fast as what we had previously. I've put the code up at &lt;a href="http://cgit.freedesktop.org/%7Ejrmuizel/qcms/"&gt;git://git.freedesktop.org/~jrmuizel/qcms&lt;/a&gt;. It's portable C and licensed under the same license as lcms so it's pretty easy to change from one API to another. 
Hopefully other projects can find it useful. Finally, a big thanks to Marti Maria for lcms, without which qcms wouldn't have been possible.</content><link rel='replies' type='application/atom+xml' href='http://muizelaar.blogspot.com/feeds/2967229466817312890/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=1386948037384435441&amp;postID=2967229466817312890' title='17 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/2967229466817312890'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1386948037384435441/posts/default/2967229466817312890'/><link rel='alternate' type='text/html' href='http://muizelaar.blogspot.com/2009/06/qcms-color-management-for-web.html' title='qcms - color management for the web'/><author><name>Jeff Muizelaar</name><uri>http://www.blogger.com/profile/17483047845050494642</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_twJ4AAJXouo/Sigshwk7IsI/AAAAAAAAAAc/7_mgyxPTtPY/s72-c/qcms-speed.png' height='72' width='72'/><thr:total>17</thr:total></entry></feed>
